2.3. Simulating a disk failure
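There are two disk failure scenarios: hard and soft. A hard failure requires replacing the disk. A soft failure might be caused by the device driver or some other software component.
With a soft failure, replacing the disk might not be necessary. If you do replace a disk, follow the steps to remove the failed disk and add the replacement disk to Ceph. The best way to simulate a soft disk failure is to delete the device: choose a device and delete it from the system.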
Prerequisites
- A healthy, running Red Hat Ceph Storage cluster.
- Root-level access to the Ceph OSD node.
Procedure
Remove the block device from sysfs:
Syntax
echo 1 > /sys/block/BLOCK_DEVICE/device/delete
Example
[root@osd ~]# echo 1 > /sys/block/sdb/device/delete
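The delete step can be wrapped in a small helper that checks the device name before writing to sysfs. This is a minimal sketch, not part of the documented procedure: the function name `delete_block_device` and the `SYSFS_ROOT` variable are hypothetical, and `SYSFS_ROOT` exists only so the function can be tried against a fake directory tree before it is pointed at the real /sys.

```shell
#!/usr/bin/env bash
# Sketch: remove a block device through sysfs after basic sanity checks.
# SYSFS_ROOT is a hypothetical override used for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

delete_block_device() {
    local dev="$1"
    local node="${SYSFS_ROOT}/block/${dev}/device/delete"

    # Expect a bare kernel name such as "sdb", not a path such as "/dev/sdb".
    case "$dev" in
        ""|*/*)
            echo "usage: delete_block_device BLOCK_DEVICE (for example: sdb)" >&2
            return 2
            ;;
    esac

    # Refuse to continue if the device has no delete node.
    if [ ! -e "$node" ]; then
        echo "no delete node for ${dev}: ${node}" >&2
        return 1
    fi

    # Same write as the documented command: echo 1 > .../device/delete
    echo 1 > "$node"
}
```

Run as root on the OSD node, `delete_block_device sdb` performs the same write as the example above. Double-check the device name first, because removing the wrong device takes a different OSD offline.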
On the OSD node, check the Ceph OSD log to confirm that Ceph detected the failure and automatically started the recovery process.
Example
[root@osd ~]# tail -50 /var/log/ceph/ceph-osd.1.log
2020-09-02 15:50:50.187067 7ff1ce9a8d80 1 bdev(0x563d263d4600 /var/lib/ceph/osd/ceph-2/block) close
2020-09-02 15:50:50.440398 7ff1ce9a8d80 -1 osd.2 0 OSD:init: unable to mount object store
2020-09-02 15:50:50.440416 7ff1ce9a8d80 -1 ^[[0;31m ** ERROR: osd init failed: (5) Input/output error^[[0m
2020-09-02 15:51:10.633738 7f495c44bd80 0 set uid:gid to 167:167 (ceph:ceph)
2020-09-02 15:51:10.633752 7f495c44bd80 0 ceph version 12.2.12-124.el7cp (e8948288b90d312c206301a9fcf80788fbc3b1f8) luminous (stable), process ceph-osd, pid 36209
2020-09-02 15:51:10.634703 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
2020-09-02 15:51:10.635749 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
2020-09-02 15:51:10.636642 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
2020-09-02 15:51:10.637535 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
2020-09-02 15:51:10.641256 7f495c44bd80 0 pidfile_write: ignore empty --pid-file
2020-09-02 15:51:10.669317 7f495c44bd80 0 load: jerasure load: lrc load: isa
2020-09-02 15:51:10.669387 7f495c44bd80 1 bdev create path /var/lib/ceph/osd/ceph-2/block type kernel
2020-09-02 15:51:10.669395 7f495c44bd80 1 bdev(0x55a423da9200 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
2020-09-02 15:51:10.669611 7f495c44bd80 1 bdev(0x55a423da9200 /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000, 466GiB) block_size 4096 (4KiB) rotational
2020-09-02 15:51:10.670320 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
2020-09-02 15:51:10.670328 7f495c44bd80 1 bdev(0x55a423da9200 /var/lib/ceph/osd/ceph-2/block) close
2020-09-02 15:51:10.924727 7f495c44bd80 1 bluestore(/var/lib/ceph/osd/ceph-2) _mount path /var/lib/ceph/osd/ceph-2
2020-09-02 15:51:10.925582 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
2020-09-02 15:51:10.925628 7f495c44bd80 1 bdev create path /var/lib/ceph/osd/ceph-2/block type kernel
2020-09-02 15:51:10.925630 7f495c44bd80 1 bdev(0x55a423da8600 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
2020-09-02 15:51:10.925784 7f495c44bd80 1 bdev(0x55a423da8600 /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000, 466GiB) block_size 4096 (4KiB) rotational
2020-09-02 15:51:10.926549 7f495c44bd80 -1 bluestore(/var/lib/ceph/osd/ceph-2/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-2/block: (5) Input/output error
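Because the log repeats the same error many times, it can help to summarize it instead of reading it line by line. The following sketch assumes only that the log contains the strings shown in the example above ("Input/output error" and "osd init failed"); the function names are hypothetical helpers, not Ceph commands.

```shell
#!/usr/bin/env bash
# Sketch: summarize I/O errors in a Ceph OSD log file.

# Print the number of log lines that report an input/output error.
count_io_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status clean in that case.
    grep -c -F 'Input/output error' "$1" || true
}

# Succeed if the log shows that the OSD failed to initialize.
osd_init_failed() {
    grep -q -F 'osd init failed' "$1"
}
```

For example, `count_io_errors /var/log/ceph/ceph-osd.1.log` prints the number of matching lines, and `osd_init_failed /var/log/ceph/ceph-osd.1.log && echo "OSD failed to start"` reports a failed OSD start.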
Check the Ceph OSD disk tree to confirm that the disk is offline.
Example
[root@osd ~]# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.28976 root default
-2 0.09659     host ceph3
 1 0.09659         osd.1     down  1.00000          1.00000
-3 0.09659     host ceph1
 2 0.09659         osd.2       up  1.00000          1.00000
-4 0.09659     host ceph2
 0 0.09659         osd.0       up  1.00000          1.00000
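On a larger cluster, the down OSDs can be picked out of the tree output with a small filter. This is a sketch under the assumption that each OSD row contains a field of the form `osd.N` immediately followed by its up/down status, as in the example output above; `list_down_osds` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read "ceph osd tree" output on stdin and print the names of down OSDs.
list_down_osds() {
    awk '{
        # Scan each row for an "osd.N" field whose next field is "down".
        # Searching by field content avoids depending on the column count,
        # which can differ between Ceph releases.
        for (i = 1; i < NF; i++)
            if ($i ~ /^osd\.[0-9]+$/ && $(i + 1) == "down")
                print $i
    }'
}
```

Typical usage is `ceph osd tree | list_down_osds`, which prints one OSD name per line and prints nothing when every OSD is up.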