2.3. Simulating Disk Failures
There are two disk failure scenarios: hard and soft. A hard failure means the disk must be replaced. A soft failure might be an issue with the device driver or some other software component.
In the case of a soft failure, replacing the disk might not be necessary. If the disk does need to be replaced, follow the steps to remove the failed disk and add a replacement disk to Ceph. The best way to simulate a soft disk failure is to delete the device: choose a device and delete it from the system.
echo 1 > /sys/block/$DEVICE/device/delete
Example
[root@ceph1 ~]# echo 1 > /sys/block/sdb/device/delete
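To bring the deleted device back after testing, rescanning the SCSI host that owned it usually makes the kernel rediscover the disk. The host number below (host0) is an assumption; check /sys/class/scsi_host/ on your node for the host that owns the device.

Example

[root@ceph1 ~]# echo "- - -" > /sys/class/scsi_host/host0/scan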
In the Ceph OSD log on the OSD node, you can see that Ceph detected the failure and automatically started the recovery process.

Example
[root@ceph1 ~]# tail -50 /var/log/ceph/ceph-osd.1.log
2017-02-02 12:15:27.490889 7f3e1fa3d800 -1 ^[[0;31m ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-1: (5) Input/output error^[[0m
2017-02-02 12:34:17.777898 7fb7df1e7800  0 set uid:gid to 167:167 (ceph:ceph)
2017-02-02 12:34:17.777933 7fb7df1e7800  0 ceph version 10.2.3-17.el7cp (ca9d57c0b140eb5cea9de7f7133260271e57490e), process ceph-osd, pid 1752
2017-02-02 12:34:17.788885 7fb7df1e7800  0 pidfile_write: ignore empty --pid-file
2017-02-02 12:34:17.870322 7fb7df1e7800  0 filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342)
2017-02-02 12:34:17.871028 7fb7df1e7800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-02-02 12:34:17.871035 7fb7df1e7800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-02-02 12:34:17.871059 7fb7df1e7800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: splice is supported
2017-02-02 12:34:17.897839 7fb7df1e7800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-02-02 12:34:17.897985 7fb7df1e7800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: extsize is disabled by conf
2017-02-02 12:34:17.921162 7fb7df1e7800  1 leveldb: Recovering log #22
2017-02-02 12:34:17.947335 7fb7df1e7800  1 leveldb: Level-0 table #24: started
2017-02-02 12:34:18.001952 7fb7df1e7800  1 leveldb: Level-0 table #24: 810464 bytes OK
2017-02-02 12:34:18.044554 7fb7df1e7800  1 leveldb: Delete type=0 #22
2017-02-02 12:34:18.045383 7fb7df1e7800  1 leveldb: Delete type=3 #20
2017-02-02 12:34:18.058061 7fb7df1e7800  0 filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-02-02 12:34:18.105482 7fb7df1e7800  1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 18: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-02-02 12:34:18.130293 7fb7df1e7800  1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 18: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-02-02 12:34:18.130992 7fb7df1e7800  1 filestore(/var/lib/ceph/osd/ceph-1) upgrade
2017-02-02 12:34:18.136547 7fb7df1e7800  0 <cls> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
2017-02-02 12:34:18.142863 7fb7df1e7800  0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
2017-02-02 12:34:18.255019 7fb7df1e7800  0 osd.1 51 crush map has features 2200130813952, adjusting msgr requires for clients
2017-02-02 12:34:18.255041 7fb7df1e7800  0 osd.1 51 crush map has features 2200130813952 was 8705, adjusting msgr requires for mons
2017-02-02 12:34:18.255048 7fb7df1e7800  0 osd.1 51 crush map has features 2200130813952, adjusting msgr requires for osds
2017-02-02 12:34:18.296256 7fb7df1e7800  0 osd.1 51 load_pgs
2017-02-02 12:34:18.561604 7fb7df1e7800  0 osd.1 51 load_pgs opened 152 pgs
2017-02-02 12:34:18.561648 7fb7df1e7800  0 osd.1 51 using 0 op queue with priority op cut off at 64.
2017-02-02 12:34:18.562603 7fb7df1e7800 -1 osd.1 51 log_to_monitors {default=true}
2017-02-02 12:34:18.650204 7fb7df1e7800  0 osd.1 51 done with init, starting boot process
2017-02-02 12:34:19.274937 7fb7b78ba700  0 -- 192.168.122.83:6801/1752 >> 192.168.122.81:6801/2620 pipe(0x7fb7ec4d1400 sd=127 :6801 s=0 pgs=0 cs=0 l=0 c=0x7fb7ec42e480).accept connect_seq 0 vs existing 0 state connecting
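The recovery can also be confirmed from the cluster side with the standard status commands; expect the cluster to report degraded placement groups while it rebalances.

Example

[root@ceph1 ~]# ceph -s
[root@ceph1 ~]# ceph health detail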
Looking at the OSD tree, you can also see that the disk is offline.
[root@ceph1 ~]# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.28976 root default
-2 0.09659     host ceph3
 1 0.09659         osd.1     down  1.00000          1.00000
-3 0.09659     host ceph1
 2 0.09659         osd.2       up  1.00000          1.00000
-4 0.09659     host ceph2
 0 0.09659         osd.0       up  1.00000          1.00000
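If the failure turns out to be a hard one and the disk must be replaced, the usual sequence for removing the failed OSD from the cluster is sketched below. It assumes osd.1 is the failed OSD hosted on ceph3, as in the tree above; verify the OSD ID and host on your cluster before running it, and add the replacement disk as a new OSD afterwards.

Example

[root@ceph1 ~]# ceph osd out osd.1
[root@ceph3 ~]# systemctl stop ceph-osd@1
[root@ceph1 ~]# ceph osd crush remove osd.1
[root@ceph1 ~]# ceph auth del osd.1
[root@ceph1 ~]# ceph osd rm osd.1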