5.2. Replacing operational or failed storage devices on IBM Power
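You can replace an object storage device (OSD) in OpenShift Data Foundation deployed using local storage devices on IBM Power.
One or more underlying storage devices might need to be replaced.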
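Prerequisites
- Red Hat recommends that replacement devices are configured with similar infrastructure and resources to the device being replaced.
- Ensure that the data is resilient.
  - In the OpenShift Web Console, click Storage → Data Foundation.
  - Click the Storage Systems tab, and then click ocs-storagecluster-storagesystem.
  - In the Status card of the Block and File dashboard, under the Overview tab, verify that Data Resiliency has a green tick mark.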
Procedure
Identify the OSD that needs to be replaced and the OpenShift Container Platform node that has the OSD scheduled on it.
$ oc get -n openshift-storage pods -l app=rook-ceph-osd -o wide
Example output:
rook-ceph-osd-0-86bf8cdc8-4nb5t 0/1 CrashLoopBackOff 0 24h 10.129.2.26 worker-0 <none> <none>
rook-ceph-osd-1-7c99657cfb-jdzvz 1/1 Running 0 24h 10.128.2.46 worker-1 <none> <none>
rook-ceph-osd-2-5f9f6dfb5b-2mnw9 1/1 Running 0 24h 10.131.0.33 worker-2 <none> <none>
In this example, rook-ceph-osd-0-86bf8cdc8-4nb5t needs to be replaced, and worker-0 is the RHOCP node on which the OSD is scheduled.
Note: If the OSD to be replaced is healthy, the status of the pod is Running.
Scale down the OSD deployment for the OSD to be replaced.
$ osd_id_to_remove=0
$ oc scale -n openshift-storage deployment rook-ceph-osd-${osd_id_to_remove} --replicas=0
where osd_id_to_remove is the integer in the pod name immediately after the rook-ceph-osd prefix. In this example, the deployment name is rook-ceph-osd-0.
Example output:
deployment.extensions/rook-ceph-osd-0 scaled
Verify that the rook-ceph-osd pod is terminated.
$ oc get -n openshift-storage pods -l ceph-osd-id=${osd_id_to_remove}
Example output:
No resources found in openshift-storage namespace.
Important: If the rook-ceph-osd pod is in the terminating state for more than a few minutes, use the force option to delete the pod.
$ oc delete -n openshift-storage pod rook-ceph-osd-0-86bf8cdc8-4nb5t --grace-period=0 --force
Example output:
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "rook-ceph-osd-0-86bf8cdc8-4nb5t" force deleted
Remove the old OSD from the cluster so that you can add a new OSD.
Identify the DeviceSet associated with the OSD to be replaced.
$ oc get -n openshift-storage -o yaml deployment rook-ceph-osd-${osd_id_to_remove} | grep ceph.rook.io/pvc
Example output:
ceph.rook.io/pvc: ocs-deviceset-localblock-0-data-0-64xjl
ceph.rook.io/pvc: ocs-deviceset-localblock-0-data-0-64xjl
In this example, the Persistent Volume Claim (PVC) name is ocs-deviceset-localblock-0-data-0-64xjl.
Identify the Persistent Volume (PV) associated with the PVC.
$ oc get -n openshift-storage pvc ocs-deviceset-<x>-<y>-<pvc-suffix>
where x, y, and pvc-suffix are the values in the DeviceSet identified in the previous step.
Example output:
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ocs-deviceset-localblock-0-data-0-64xjl Bound local-pv-8137c873 256Gi RWO localblock 24h
In this example, the associated PV is local-pv-8137c873.
Determine the name of the device to be replaced.
$ oc get pv local-pv-<pv-suffix> -o yaml | grep path
where pv-suffix is the value in the PV name identified in the previous step.
Example output:
path: /mnt/local-storage/localblock/vdc
In this example, the device name is vdc.
Identify the prepare-pod associated with the OSD to be replaced.
$ oc describe -n openshift-storage pvc ocs-deviceset-<x>-<y>-<pvc-suffix> | grep Used
where x, y, and pvc-suffix are the values in the DeviceSet identified in an earlier step.
Example output:
Used By: rook-ceph-osd-prepare-ocs-deviceset-localblock-0-data-0-64knzkc
In this example, the prepare-pod name is rook-ceph-osd-prepare-ocs-deviceset-localblock-0-data-0-64knzkc.
Delete any old ocs-osd-removal jobs.
$ oc delete -n openshift-storage job ocs-osd-removal-job
Example output:
job.batch "ocs-osd-removal-job" deleted
Note: The above command must reach the Completed state before you continue to the next step. This can take more than ten minutes.
Change to the openshift-storage project.
$ oc project openshift-storage
Remove the old OSD from the cluster.
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} FORCE_OSD_REMOVAL=false | oc create -n openshift-storage -f -
The FORCE_OSD_REMOVAL value must be changed to "true" in clusters that have only three OSDs, or in clusters that do not have enough space to restore all three replicas of the data after the OSD is removed.
Warning: This step removes the OSD completely from the cluster. Ensure that the correct value of osd_id_to_remove is provided.
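Because the removal job acts on whatever ID it receives, one option is to derive osd_id_to_remove from the pod name rather than typing it by hand. The following is a minimal sketch and not part of the documented procedure. The function name osd_id_from_pod is hypothetical, and it assumes pod names follow the rook-ceph-osd-<id>-<hash> pattern shown in the example output above.

```shell
# Hypothetical helper: print the OSD ID embedded in a rook-ceph-osd pod name.
# Fails (non-zero exit, no output) if the name does not match the pattern.
osd_id_from_pod() {
    pod_name=$1
    # Strip the fixed prefix; if nothing was stripped, the prefix was absent.
    rest=${pod_name#rook-ceph-osd-}
    [ "$rest" != "$pod_name" ] || return 1
    # Keep the text before the next hyphen.
    id=${rest%%-*}
    # Accept only a non-empty run of digits.
    case $id in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$id"
}

# Pod name taken from the example output in this section.
osd_id_to_remove=$(osd_id_from_pod rook-ceph-osd-0-86bf8cdc8-4nb5t) || exit 1
echo "$osd_id_to_remove"   # prints 0
```

A name such as rook-ceph-osd-prepare-... is rejected because the text after the prefix is not numeric, so a prepare pod cannot be mistaken for an OSD pod.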
Verify that the OSD was removed successfully by checking the status of the ocs-osd-removal-job pod. A status of Completed confirms that the OSD removal job succeeded.
$ oc get pod -l job-name=ocs-osd-removal-job -n openshift-storage
Ensure that the OSD removal is completed.
$ oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'
Example output:
2022-05-10 06:50:04.501511 I | cephosd: completed removal of OSD 0
Important: If the ocs-osd-removal-job fails and the pod is not in the expected Completed state, check the pod logs for further debugging. For example:
$ oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1
If encryption was enabled at the time of installation, remove the dm-crypt managed device-mapper mapping from the OSD devices that were removed from the respective OpenShift Data Foundation nodes.
Get the PVC name of the replaced OSD from the logs of the ocs-osd-removal-job pod.
$ oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'pvc|deviceset'
Example output:
2021-05-12 14:31:34.666000 I | cephosd: removing the OSD PVC "ocs-deviceset-xxxx-xxx-xxx-xxx"
For each of the previously identified nodes, do the following:
Create a debug pod and chroot to the host on the storage node.
$ oc debug node/<node name>
<node name> is the name of the node.
$ chroot /host
Find the relevant device name based on the PVC name identified in the previous step.
$ dmsetup ls | grep <pvc name>
<pvc name> is the name of the PVC.
Example output:
ocs-deviceset-xxx-xxx-xxx-xxx-block-dmcrypt (253:0)
Remove the mapped device.
$ cryptsetup luksClose --debug --verbose ocs-deviceset-xxx-xxx-xxx-xxx-block-dmcrypt
Important: If the above command gets stuck due to insufficient privileges, run the following commands:
- Press CTRL+Z to exit the above command.
- Find the PID of the process that was stuck.
  $ ps -ef | grep crypt
- Terminate the process using the kill command.
  $ kill -9 <PID>
  <PID> is the process ID.
- Verify that the device name is removed.
  $ dmsetup ls
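The dmsetup lookup above can be expressed as a small filter that returns only the mapping name, which is the argument cryptsetup luksClose expects. This is a sketch under assumptions: the function name dmcrypt_mappings_for_pvc is hypothetical, and the mapping name is assumed to be the first whitespace-separated field of each dmsetup ls line, as in the example output.

```shell
# Hypothetical helper: read `dmsetup ls` output on stdin and print the names
# of the mappings whose name contains the given PVC name.
dmcrypt_mappings_for_pvc() {
    pvc_name=$1
    # index() does a plain substring match, so characters in the PVC name
    # are never interpreted as a regular expression.
    awk -v pvc="$pvc_name" 'index($1, pvc) > 0 { print $1 }'
}
```

On the node, it could be used as dmsetup ls | dmcrypt_mappings_for_pvc <pvc name>. An empty result after cryptsetup luksClose indicates that the mapping is gone.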
Find the PV that needs to be deleted.
$ oc get pv -L kubernetes.io/hostname | grep localblock | grep Released
Example output:
local-pv-d6bf175b 1490Gi RWO Delete Released openshift-storage/ocs-deviceset-0-data-0-6c5pw localblock 2d22h compute-1
Delete the PV.
$ oc delete pv <pv-name>
<pv-name> is the name of the PV.
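The chained grep above matches the words anywhere on the line. A stricter filter can compare the STATUS and STORAGECLASS columns directly. The sketch below is illustrative only: pv_names_by_status is a hypothetical name, and it assumes the default oc get pv column layout seen in the examples, where STATUS is the fifth field and the storage class is the seventh field when a claim is present or the sixth when the CLAIM column is empty.

```shell
# Hypothetical helper: read `oc get pv` output on stdin and print the names
# of PVs with the given storage class and status.
# Usage: pv_names_by_status <storage-class> <status>
pv_names_by_status() {
    storage_class=$1
    status=$2
    # Field 5 is STATUS. The storage class is field 7 for PVs that have a
    # claim (Bound, Released) and field 6 when CLAIM is empty (Available).
    awk -v sc="$storage_class" -v st="$status" \
        '$5 == st && ($6 == sc || $7 == sc) { print $1 }'
}
```

For example, oc get pv | pv_names_by_status localblock Released lists the candidates for deletion, and the same helper called with Available can confirm that a replacement PV has appeared later in the procedure.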
Replace the old device and use the new device to create a new OpenShift Container Platform PV.
Log in to the OpenShift Container Platform node with the device to be replaced. In this example, the OpenShift Container Platform node is worker-0.
$ oc debug node/worker-0
Example output:
Starting pod/worker-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.88.21
If you don't see a command prompt, try pressing enter.
# chroot /host
Record the /dev/disk that is to be replaced, using the device name vdc identified earlier.
# ls -alh /mnt/local-storage/localblock
Example output:
total 0
drwxr-xr-x. 2 root root 17 Nov 18 15:23 .
drwxr-xr-x. 3 root root 24 Nov 18 15:23 ..
lrwxrwxrwx. 1 root root 8 Nov 18 15:23 vdc -> /dev/vdc
Find the name of the LocalVolume CR, and remove or comment out the device /dev/disk that is to be replaced.
$ oc get -n openshift-local-storage localvolume
Example output:
NAME AGE
localblock 25h
$ oc edit -n openshift-local-storage localvolume localblock
Make sure to save the changes after editing the CR.
Log in to the OpenShift Container Platform node with the device to be replaced, and remove the old symlink.
$ oc debug node/worker-0
Example output:
Starting pod/worker-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.88.21
If you don't see a command prompt, try pressing enter.
# chroot /host
Identify the old symlink for the device name to be replaced. In this example, the device name is vdc.
# ls -alh /mnt/local-storage/localblock
Example output:
total 0
drwxr-xr-x. 2 root root 17 Nov 18 15:23 .
drwxr-xr-x. 3 root root 24 Nov 18 15:23 ..
lrwxrwxrwx. 1 root root 8 Nov 18 15:23 vdc -> /dev/vdc
Remove the symlink.
# rm /mnt/local-storage/localblock/vdc
Verify that the symlink is removed.
# ls -alh /mnt/local-storage/localblock
Example output:
total 0
drwxr-xr-x. 2 root root 6 Nov 18 17:11 .
drwxr-xr-x. 3 root root 24 Nov 18 15:23 ..
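The symlink checks in the steps above read the link target from an ls listing. As an alternative, the target can be resolved directly before the symlink is removed. This is a sketch only: device_for_symlink is a hypothetical helper name, and it assumes readlink -f is available in the chroot environment, as provided by GNU coreutils.

```shell
# Hypothetical helper: print the fully resolved target of a symlink.
# Fails (non-zero exit, no output) if the path is not a symlink, which is
# also the expected result once the symlink has been removed.
device_for_symlink() {
    link_path=$1
    [ -L "$link_path" ] || return 1
    readlink -f "$link_path"
}
```

With the paths from this example, device_for_symlink /mnt/local-storage/localblock/vdc would be run before the rm step to record the device, and a failing call afterwards confirms that the symlink no longer exists.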
Replace the old device with the new device.
Log back in to the correct OpenShift Container Platform node and identify the device name for the new drive. The device name must change unless you are resetting the same device.
# lsblk
In this example, the new device name is vdd.
After the new /dev/disk is available, you can add a new disk entry to the LocalVolume CR.
Edit the LocalVolume CR and add the new /dev/disk. In this example, the new device is /dev/vdd.
$ oc edit -n openshift-local-storage localvolume localblock
Make sure to save the changes after editing the CR.
Verify that there is a new PV in the Available state and of the correct size.
$ oc get pv | grep 256Gi
Example output:
local-pv-1e31f771 256Gi RWO Delete Bound openshift-storage/ocs-deviceset-localblock-2-data-0-6xhkf localblock 24h
local-pv-ec7f2b80 256Gi RWO Delete Bound openshift-storage/ocs-deviceset-localblock-1-data-0-hr2fx localblock 24h
local-pv-8137c873 256Gi RWO Delete Available localblock 32m
Create a new OSD for the new device.
Deploy the new OSD. You need to restart the rook-ceph-operator to force operator reconciliation.
Identify the name of the rook-ceph-operator.
$ oc get -n openshift-storage pod -l app=rook-ceph-operator
Example output:
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-85f6494db4-sg62v 1/1 Running 0 1d20h
Delete the rook-ceph-operator.
$ oc delete -n openshift-storage pod rook-ceph-operator-85f6494db4-sg62v
Example output:
pod "rook-ceph-operator-85f6494db4-sg62v" deleted
In this example, the rook-ceph-operator pod name is rook-ceph-operator-85f6494db4-sg62v.
Verify that the rook-ceph-operator pod is restarted.
$ oc get -n openshift-storage pod -l app=rook-ceph-operator
Example output:
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-85f6494db4-wx9xx 1/1 Running 0 50s
Creation of the new OSD may take several minutes after the operator restarts.
Delete the ocs-osd-removal job.
$ oc delete -n openshift-storage job ocs-osd-removal-job
Example output:
job.batch "ocs-osd-removal-job" deleted
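When using an external key management system (KMS) with data encryption, the old OSD encryption key can be removed from the Vault server because it is now an orphan key.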
Verification steps
Verify that there is a new OSD running.
$ oc get -n openshift-storage pods -l app=rook-ceph-osd
Example output:
rook-ceph-osd-0-76d8fb97f9-mn8qz 1/1 Running 0 23m
rook-ceph-osd-1-7c99657cfb-jdzvz 1/1 Running 1 25h
rook-ceph-osd-2-5f9f6dfb5b-2mnw9 1/1 Running 0 25h
Verify that a new PVC is created.
$ oc get -n openshift-storage pvc | grep localblock
Example output:
ocs-deviceset-localblock-0-data-0-q4q6b Bound local-pv-8137c873 256Gi RWO localblock 10m
ocs-deviceset-localblock-1-data-0-hr2fx Bound local-pv-ec7f2b80 256Gi RWO localblock 1d20h
ocs-deviceset-localblock-2-data-0-6xhkf Bound local-pv-1e31f771 256Gi RWO localblock 1d20h
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
Identify the nodes where the new OSD pods are running.
$ oc get -n openshift-storage -o=custom-columns=NODE:.spec.nodeName pod/<OSD-pod-name>
<OSD-pod-name> is the name of the OSD pod.
For example:
$ oc get -n openshift-storage -o=custom-columns=NODE:.spec.nodeName pod/rook-ceph-osd-0-544db49d7f-qrgqm
Example output:
NODE
compute-1
For each of the previously identified nodes, do the following:
Create a debug pod and open a chroot environment for the selected host.
$ oc debug node/<node name>
<node name> is the name of the node.
$ chroot /host
Check for the crypt keyword beside the ocs-deviceset names.
$ lsblk
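The visual check for the crypt keyword can also be done with a filter over the lsblk output. The following is a sketch under assumptions: encrypted_devicesets is a hypothetical name, and it assumes the default lsblk column order (NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT), so that TYPE is the sixth field.

```shell
# Hypothetical helper: read default `lsblk` output on stdin and print the
# ocs-deviceset entries whose TYPE column is "crypt".
encrypted_devicesets() {
    awk 'index($1, "ocs-deviceset") > 0 && $6 == "crypt" {
        # Drop any tree-drawing characters that lsblk puts before the name.
        print substr($1, index($1, "ocs-deviceset"))
    }'
}
```

Running lsblk | encrypted_devicesets on the node prints one line per encrypted device set. No output suggests that the new OSD device is not encrypted or that the column layout differs from the assumed default.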
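Log in to the OpenShift Web Console and check the status card in the OpenShift Data Foundation dashboard under Storage.
Full data recovery may take longer depending on the volume of data being recovered.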
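The check that every OSD pod is running can be made mechanical by filtering the pod listing on its STATUS column. This is a minimal sketch, not part of the documented procedure: osd_pods_not_running is a hypothetical name, and it assumes the default oc get pods layout in which STATUS is the third field.

```shell
# Hypothetical helper: read `oc get pods` output on stdin and print the
# names of rook-ceph-osd pods whose STATUS is anything other than Running.
osd_pods_not_running() {
    awk '$1 ~ /^rook-ceph-osd-[0-9]+-/ && $3 != "Running" { print $1 }'
}
```

For example, oc get -n openshift-storage pods -l app=rook-ceph-osd | osd_pods_not_running prints nothing when all OSD pods are Running. Applied to the listing from the first step of the procedure, it would print the pod that needs attention.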