5.4.16.8. Setting a RAID fault policy


LVM RAID handles device failures in an automatic fashion based on the preferences defined by the raid_fault_policy field in the lvm.conf file.
  • If the raid_fault_policy field is set to allocate, the system will attempt to replace the failed device with a spare device from the volume group. If there is no available spare device, this will be reported to the system log.
  • If the raid_fault_policy field is set to warn, the system will produce a warning and the log will indicate that a device has failed. This allows the user to determine the course of action to take.
As long as there are enough devices remaining to support usability, the RAID logical volume will continue to operate.
5.4.16.8.1. The allocate RAID Fault Policy
In the following example, the raid_fault_policy field has been set to allocate in the lvm.conf file. The RAID logical volume is laid out as follows.
# lvs -a -o name,copy_percent,devices my_vg
  LV               Copy%  Devices                                     
  my_lv            100.00 my_lv_rimage_0(0),my_lv_rimage_1(0),my_lv_rimage_2(0)
  [my_lv_rimage_0]        /dev/sde1(1)                                
  [my_lv_rimage_1]        /dev/sdf1(1)                                
  [my_lv_rimage_2]        /dev/sdg1(1)                                
  [my_lv_rmeta_0]         /dev/sde1(0)                                
  [my_lv_rmeta_1]         /dev/sdf1(0)                                
  [my_lv_rmeta_2]         /dev/sdg1(0)
If the /dev/sde device fails, the system log will display error messages.
# grep lvm /var/log/messages 
Jan 17 15:57:18 bp-01 lvm[8599]: Device #0 of raid1 array, my_vg-my_lv, has failed.
Jan 17 15:57:18 bp-01 lvm[8599]: /dev/sde1: read failed after 0 of 2048 at
250994294784: Input/output error
Jan 17 15:57:18 bp-01 lvm[8599]: /dev/sde1: read failed after 0 of 2048 at
250994376704: Input/output error
Jan 17 15:57:18 bp-01 lvm[8599]: /dev/sde1: read failed after 0 of 2048 at 0:
Input/output error
Jan 17 15:57:18 bp-01 lvm[8599]: /dev/sde1: read failed after 0 of 2048 at
4096: Input/output error
Jan 17 15:57:19 bp-01 lvm[8599]: Couldn't find device with uuid
3lugiV-3eSP-AFAR-sdrP-H20O-wM2M-qdMANy.
Jan 17 15:57:27 bp-01 lvm[8599]: raid1 array, my_vg-my_lv, is not in-sync.
Jan 17 15:57:36 bp-01 lvm[8599]: raid1 array, my_vg-my_lv, is now in-sync.
Since the raid_fault_policy field has been set to allocate, the failed device is replaced with a new device from the volume group.
# lvs -a -o name,copy_percent,devices vg
  Couldn't find device with uuid 3lugiV-3eSP-AFAR-sdrP-H20O-wM2M-qdMANy.
  LV            Copy%  Devices                                     
  lv            100.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0]        /dev/sdh1(1)                                
  [lv_rimage_1]        /dev/sdf1(1)                                
  [lv_rimage_2]        /dev/sdg1(1)                                
  [lv_rmeta_0]         /dev/sdh1(0)                                
  [lv_rmeta_1]         /dev/sdf1(0)                                
  [lv_rmeta_2]         /dev/sdg1(0)
Note that even though the failed device has been replaced, the display still indicates that LVM could not find the failed device. This is because, although the failed device has been removed from the RAID logical volume, the failed device has not yet been removed from the volume group. To remove the failed device from the volume group, you can execute vgreduce --removemissing VG.
If the raid_fault_policy has been set to allocate but there are no spare devices, the allocation will fail, leaving the logical volume as it is. If the allocation fails, you have the option of fixing the drive, then deactivating and activating the logical volume, as described in Section 5.4.16.8.2, “The warn RAID Fault Policy”. Alternately, you can replace the failed device, as described in Section 5.4.16.9, “Replacing a RAID device”.
Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.