16.20. Replacing a failed disk in RAID


You can reconstruct the data from a failed disk by using the remaining disks. The RAID level and the total number of disks determine the minimum number of remaining disks needed for a successful data reconstruction.
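
As a rough illustration of how the RAID level and disk count set that minimum, the following sketch covers the standard levels 0, 1, 4, 5, and 6. The function name min_remaining_disks is a hypothetical helper, not part of mdadm, and RAID 10 is left out because its tolerance depends on the layout and on which disks fail.

```shell
# Print the minimum number of working disks an array needs to stay
# readable, given its RAID level and total number of member disks.
# Hypothetical helper for illustration; it is not part of mdadm.
min_remaining_disks() {
    level=$1
    total=$2
    case "$level" in
        0)   echo "$total" ;;          # striping only: no redundancy
        1)   echo 1 ;;                 # mirroring: any one copy is enough
        4|5) echo $((total - 1)) ;;    # single parity: tolerates one failure
        6)   echo $((total - 2)) ;;    # double parity: tolerates two failures
        *)   echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

# A four-disk RAID 5 array, like /dev/md0 in this procedure:
min_remaining_disks 5 4   # prints 3
```

With three of four disks still required, such an array survives the loss of /dev/sdd but not a second failure before the rebuild completes.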

In this procedure, the /dev/md0 RAID contains four disks. The /dev/sdd disk has failed and you need to replace it with the /dev/sdf disk.

Prerequisites

  • A spare disk for replacement.
  • The mdadm package is installed.

Procedure

  1. Check the failed disk:

    1. View the kernel logs:

      # journalctl -k -f
    2. Search for a message similar to the following:

      md/raid:md0: Disk failure on sdd, disabling device.
      
      md/raid:md0: Operation continuing on 3 devices.
    3. Press Ctrl+C on your keyboard to exit the journalctl program.
  2. Mark the failed disk as faulty:

    # mdadm --manage /dev/md0 --fail /dev/sdd
  3. Optional: Check if the failed disk was marked correctly:

    # mdadm --detail /dev/md0

    At the end of the output is a list of disks in the /dev/md0 RAID where the disk /dev/sdd has the faulty status:

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       -       0        0        2      removed
       3       8       64        3      active sync   /dev/sde
    
       2       8       48        -      faulty   /dev/sdd
  4. Remove the failed disk from the RAID:

    # mdadm --manage /dev/md0 --remove /dev/sdd
    Warning

    If your RAID cannot withstand another disk failure, do not remove any disk until the new disk has the active sync status. You can monitor the progress using the watch cat /proc/mdstat command.

  5. Add the new disk to the RAID:

    # mdadm --manage /dev/md0 --add /dev/sdf

    The /dev/md0 RAID now includes the new disk /dev/sdf, and the mdadm service automatically starts copying data to it from the other disks.
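
The fail, remove, and add steps above can be sketched as one small script. This is only an illustration: replace_raid_disk, run, and DRY_RUN are hypothetical names introduced here, and the sketch defaults to printing the mdadm commands rather than running them so that it can be reviewed safely first.

```shell
# With DRY_RUN=1 (the default in this sketch) commands are only printed,
# so nothing touches a real array. Set DRY_RUN=0 to execute them as root.
DRY_RUN=${DRY_RUN:-1}

# Print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mark the failed disk as faulty, remove it, then add the replacement.
# Stops at the first mdadm command that fails.
replace_raid_disk() {
    array=$1
    failed=$2
    new=$3
    run mdadm --manage "$array" --fail "$failed"   || return 1
    run mdadm --manage "$array" --remove "$failed" || return 1
    run mdadm --manage "$array" --add "$new"
}

replace_raid_disk /dev/md0 /dev/sdd /dev/sdf
```

In dry-run mode the call prints the three commands from steps 2, 4, and 5 in order. The optional check in step 3 is deliberately left out so that you can inspect the array state yourself before removing anything.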

Verification

  • Check the details of the array:

    # mdadm --detail /dev/md0

    At the end of the output is a list of disks in the /dev/md0 RAID. If the new disk has the spare rebuilding status, data is still being copied to it from the other disks:

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       4       8       80        2      spare rebuilding   /dev/sdf
       3       8       64        3      active sync   /dev/sde

    After the data copying is finished, the new disk has the active sync status.

  • Check the progress of synchronization:

    # cat /proc/mdstat
    Personalities : [raid4] [raid5] [raid6]
    md0 : active raid5 sdf[5] sde[4] sdc[1] sdb[0]
          6282240 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]
          [==============>......]  recovery = 72.0% (1509820/2094080) finish=0.0min speed=215688K/sec
    
    unused devices: <none>
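
If you want to script these two checks, the following sketch extracts the state of one member disk and the recovery percentage from output shaped like the examples above. The helper names member_state and recovery_percent are assumptions made for this illustration, and the parsing relies on the column layout shown in this section, which might differ between mdadm and kernel versions.

```shell
# Print the State column for one member disk, reading `mdadm --detail`
# output on stdin. Hypothetical helper, not part of mdadm.
member_state() {
    awk -v dev="$1" '$NF == dev {
        # Fields 1-4 are Number, Major, Minor, and RaidDevice; the state
        # is everything between them and the device path.
        state = $5
        for (i = 6; i < NF; i++) state = state " " $i
        print state
    }'
}

# Print the recovery percentage, reading /proc/mdstat-style text on stdin.
# Prints nothing when no recovery is in progress.
recovery_percent() {
    sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p'
}

# Demonstration with lines copied from the example output above:
printf '%s\n' '   4       8       80        2      spare rebuilding   /dev/sdf' |
    member_state /dev/sdf     # prints: spare rebuilding
printf '%s\n' '  [==============>......]  recovery = 72.0% (1509820/2094080)' |
    recovery_percent          # prints: 72.0
```

On a live system, the same helpers would read real data, for example `mdadm --detail /dev/md0 | member_state /dev/sdf` and `recovery_percent < /proc/mdstat`. To block until the rebuild is done instead of polling, `mdadm --wait /dev/md0` waits for any resync or recovery activity on the array to finish.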