Chapter 1. OpenShift Data Foundation deployed using dynamic devices

1.1. OpenShift Data Foundation deployed on AWS

To replace an operational node, see:
- Section 1.1.1, “Replacing an operational AWS node on user-provisioned infrastructure”.
- Section 1.1.2, “Replacing an operational AWS node on installer-provisioned infrastructure”.
To replace a failed node, see:
- Section 1.1.3, “Replacing a failed AWS node on user-provisioned infrastructure”.
- Section 1.1.4, “Replacing a failed AWS node on installer-provisioned infrastructure”.

1.1.1. Replacing an operational AWS node on user-provisioned infrastructure

Prerequisites

Ensure that the replacement nodes are configured with similar infrastructure and resources to the node that you replace.
You must be logged into the OpenShift Container Platform cluster.

Note

When replacing an AWS node on user-provisioned infrastructure, you must create the new node in the same AWS zone as the original node.

Procedure

Identify the node that you need to replace.
Mark the node as unschedulable:
```
$ oc adm cordon <node_name>
```
<node_name>
Specify the name of the node that you need to replace.
Drain the node:
```
$ oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
```
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Delete the node:
```
$ oc delete nodes <node_name>
```
Create a new Amazon Web Services (AWS) machine instance with the required infrastructure. See Platform requirements.
Create a new OpenShift Container Platform node using the new AWS machine instance.
Check for the Certificate Signing Requests (CSRs) related to OpenShift Container Platform that are in Pending state:
```
$ oc get csr
```
Approve all the required OpenShift Container Platform CSRs for the new node:
```
$ oc adm certificate approve <certificate_name>
```
<certificate_name>
Specify the name of the CSR.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
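
The CSR steps in the procedure above can be scripted when several requests are pending. The following is a minimal sketch, not part of the official procedure: the function names `pending_csrs` and `approve_pending_csrs` are hypothetical, and the sketch assumes that the last column of `oc get csr` output is the CONDITION column. Review the pending list before approving anything on a shared cluster.

```shell
# List the names of CSRs whose CONDITION (the last column) is Pending.
pending_csrs() {
  oc get csr --no-headers | awk '$NF == "Pending" { print $1 }'
}

# Approve every CSR that pending_csrs reports.
# Assumption: all pending CSRs belong to the new node.
approve_pending_csrs() {
  for csr in $(pending_csrs); do
    oc adm certificate approve "$csr"
  done
}
```

A new node commonly raises a second CSR after the first one is approved, so `approve_pending_csrs` might need to be run more than once until `pending_csrs` prints nothing.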

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.
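
The encryption check above can also be done with a filter instead of reading the `lsblk` tree by eye. This is a minimal sketch under stated assumptions: the function name `encrypted_ocs_devices` is hypothetical, and its input is expected to be the two-column list produced by `lsblk -ln -o NAME,TYPE` (list format, no headings).

```shell
# Read "NAME TYPE" pairs on stdin and print every device whose name
# contains ocs-deviceset and whose TYPE is crypt.
encrypted_ocs_devices() {
  awk '$1 ~ /ocs-deviceset/ && $2 == "crypt" { print $1 }'
}
```

Inside the chroot environment on the node, `lsblk -ln -o NAME,TYPE | encrypted_ocs_devices` would print one line per encrypted device. Empty output suggests that no encrypted ocs-deviceset device was found on that node.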

1.1.2. Replacing an operational AWS node on installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the node that you need to replace. Note its Machine Name.
Mark the node as unschedulable:
```
$ oc adm cordon <node_name>
```
<node_name>
Specify the name of the node that you need to replace.
Drain the node:
```
$ oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
```
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Click Compute → Machines. Search for the required machine.
Beside the required machine, click Action menu (⋮) → Delete Machine.
Click Delete to confirm the machine deletion. A new machine is automatically created.
Wait for the new machine to start and transition into Running state.
Important
This activity might take 5 to 10 minutes or more.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
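
The Machine Name that the procedure above reads from the web console can also be looked up from the command line. The following is a minimal sketch: the function name `machine_for_node` is hypothetical, and it assumes that the node carries the `machine.openshift.io/machine` annotation in the form `<namespace>/<machine_name>`, as is usual on installer-provisioned clusters.

```shell
# Print the machine name stored in the node's machine.openshift.io/machine
# annotation, with the leading "<namespace>/" removed.
machine_for_node() {
  oc get node "$1" \
    -o jsonpath='{.metadata.annotations.machine\.openshift\.io/machine}' |
    sed 's|^.*/||'
}
```

For example, `machine_for_node <node_name>` prints only the machine name, which can then be searched for on the Machines page.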

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.
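
The pod checks in the verification steps above can be narrowed to the pods that still need attention. This is a minimal sketch, not an official check: the function name `pods_not_running_on_node` is hypothetical, and treating `Completed` pods as healthy is an assumption that suits short-lived job pods.

```shell
# Print the names of pods in the openshift-storage namespace that are
# scheduled on the given node and are neither Running nor Completed.
# The STATUS value is the third column of `oc get pods` output.
pods_not_running_on_node() {
  oc get pods -n openshift-storage --no-headers \
    --field-selector "spec.nodeName=$1" |
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

Empty output from `pods_not_running_on_node <new_node_name>` indicates that every storage pod on that node has reached Running or Completed state.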

1.1.3. Replacing a failed AWS node on user-provisioned infrastructure

Prerequisites

Ensure that the replacement nodes are configured with similar infrastructure and resources to the node that you replace.
You must be logged into the OpenShift Container Platform cluster.

Procedure

Identify the Amazon Web Services (AWS) machine instance of the node that you need to replace.
Log in to AWS, and terminate the AWS machine instance that you identified.
Create a new AWS machine instance with the required infrastructure. See Platform requirements.
Create a new OpenShift Container Platform node using the new AWS machine instance.
Check for the Certificate Signing Requests (CSRs) related to OpenShift Container Platform that are in Pending state:
```
$ oc get csr
```
Approve all the required OpenShift Container Platform CSRs for the new node:
```
$ oc adm certificate approve <certificate_name>
```
<certificate_name>
Specify the name of the CSR.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Execute the following command to apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
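
The labeling step above can be combined with an immediate check that the label took effect. The following is a minimal sketch: the function name `label_storage_node` is hypothetical, and it relies on the label selector `-l cluster.ocs.openshift.io/openshift-storage`, which matches any node that has the label key.

```shell
# Apply the OpenShift Data Foundation label to a node, then succeed only
# if that node appears among the nodes that carry the label.
label_storage_node() {
  oc label node "$1" cluster.ocs.openshift.io/openshift-storage="" &&
    oc get nodes -l cluster.ocs.openshift.io/openshift-storage -o name |
      grep -qxF "node/$1"
}
```

The function returns a nonzero exit status if either the labeling fails or the node is missing from the labeled list, so it can be used in a condition such as `label_storage_node <new_node_name> || echo "label not applied"`.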

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.1.4. Replacing a failed AWS node on installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the faulty node, and click its Machine Name.
Click Actions → Edit Annotations, and click Add More.
Add machine.openshift.io/exclude-node-draining, and click Save.
Click Actions → Delete Machine, and click Delete.
A new machine is automatically created. Wait for the new machine to start.
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
Optional: If the failed Amazon Web Services (AWS) instance is not removed automatically, terminate the instance from the AWS console.

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.2. OpenShift Data Foundation deployed on VMware

To replace an operational node, see:
- Section 1.2.1, “Replacing an operational VMware node on user-provisioned infrastructure”.
- Section 1.2.2, “Replacing an operational VMware node on installer-provisioned infrastructure”.
To replace a failed node, see:
- Section 1.2.3, “Replacing a failed VMware node on user-provisioned infrastructure”.
- Section 1.2.4, “Replacing a failed VMware node on installer-provisioned infrastructure”.

1.2.1. Replacing an operational VMware node on user-provisioned infrastructure

Prerequisites

Ensure that the replacement nodes are configured with similar infrastructure and resources to the node that you replace.
You must be logged into the OpenShift Container Platform cluster.

Procedure

Identify the node and its Virtual Machine (VM) that you need to replace.
Mark the node as unschedulable:
```
$ oc adm cordon <node_name>
```
<node_name>
Specify the name of the node that you need to replace.
Drain the node:
```
$ oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
```
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Delete the node:
```
$ oc delete nodes <node_name>
```
Log in to VMware vSphere, and terminate the VM that you identified.
Important
Delete the VM only from the inventory and not from the disk.
Create a new VM on VMware vSphere with the required infrastructure. See Platform requirements.
Create a new OpenShift Container Platform worker node using the new VM.
Check for the Certificate Signing Requests (CSRs) related to OpenShift Container Platform that are in Pending state:
```
$ oc get csr
```
Approve all the required OpenShift Container Platform CSRs for the new node:
```
$ oc adm certificate approve <certificate_name>
```
<certificate_name>
Specify the name of the CSR.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
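
The cordon and drain steps in the procedure above are always run as a pair, so they can be wrapped in one function that stops if the first step fails. This is a minimal sketch: the function name `cordon_and_drain` is hypothetical, and the flags are the same ones used in the procedure.

```shell
# Mark a node as unschedulable, then drain it.
# The drain is skipped if the cordon command fails.
cordon_and_drain() {
  oc adm cordon "$1" &&
    oc adm drain "$1" --force --delete-emptydir-data=true --ignore-daemonsets
}
```

Running `cordon_and_drain <node_name>` leaves the node ready for the deletion step that follows in the procedure.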

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.2.2. Replacing an operational VMware node on installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the node that you need to replace. Note its Machine Name.
Mark the node as unschedulable:
```
$ oc adm cordon <node_name>
```
<node_name>
Specify the name of the node that you need to replace.
Drain the node:
```
$ oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
```
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Click Compute → Machines. Search for the required machine.
Beside the required machine, click Action menu (⋮) → Delete Machine.
Click Delete to confirm the machine deletion. A new machine is automatically created.
Wait for the new machine to start and transition into Running state.
Important
This activity might take 5 to 10 minutes or more.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.2.3. Replacing a failed VMware node on user-provisioned infrastructure

Prerequisites

Ensure that the replacement nodes are configured with similar infrastructure and resources to the node that you replace.
You must be logged into the OpenShift Container Platform cluster.

Procedure

Identify the node and its Virtual Machine (VM) that you need to replace.
Delete the node:
```
$ oc delete nodes <node_name>
```
<node_name>
Specify the name of the node that you need to replace.
Log in to VMware vSphere and terminate the VM that you identified.
Important
Delete the VM only from the inventory and not from the disk.
Create a new VM on VMware vSphere with the required infrastructure. See Platform requirements.
Create a new OpenShift Container Platform worker node using the new VM.
Check for the Certificate Signing Requests (CSRs) related to OpenShift Container Platform that are in Pending state:
```
$ oc get csr
```
Approve all the required OpenShift Container Platform CSRs for the new node:
```
$ oc adm certificate approve <certificate_name>
```
<certificate_name>
Specify the name of the CSR.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.2.4. Replacing a failed VMware node on installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the faulty node, and click its Machine Name.
Click Actions → Edit Annotations, and click Add More.
Add machine.openshift.io/exclude-node-draining, and click Save.
Click Actions → Delete Machine, and click Delete.
A new machine is automatically created. Wait for the new machine to start.
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
Optional: If the failed Virtual Machine (VM) is not removed automatically, terminate the VM from VMware vSphere.
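
The annotate and delete steps that the procedure above performs in the web console have command-line equivalents. The following is a minimal sketch, not part of the official procedure: the function name `replace_failed_machine` is hypothetical, and it assumes that the machine lives in the `openshift-machine-api` namespace, which is the default on installer-provisioned clusters. Deleting a machine is destructive, so confirm the machine name first.

```shell
# Annotate a machine so that node draining is skipped, then delete it.
# The delete is skipped if the annotate command fails.
replace_failed_machine() {
  oc annotate machine "$1" -n openshift-machine-api \
    machine.openshift.io/exclude-node-draining="" &&
    oc delete machine "$1" -n openshift-machine-api
}
```

After `replace_failed_machine <machine_name>` returns, the machine set is expected to create a replacement machine, as described in the procedure.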

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.3. OpenShift Data Foundation deployed on Microsoft Azure

1.3.1. Replacing operational nodes on Azure installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the node that you need to replace. Note its Machine Name.
Mark the node as unschedulable:
```
$ oc adm cordon <node_name>
```
<node_name>
Specify the name of the node that you need to replace.
Drain the node:
```
$ oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
```
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Click Compute → Machines. Search for the required machine.
Beside the required machine, click the Action menu (⋮) → Delete Machine.
Click Delete to confirm the machine deletion. A new machine is automatically created.
Wait for the new machine to start and transition into Running state.
Important
This activity might take 5 to 10 minutes or more.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Execute the following command to apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
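
The wait for the new node to reach Ready state, which the procedure above checks in the web console, can also be done with `oc wait`. This is a minimal sketch: the function name `wait_for_node_ready` is hypothetical, and the default timeout of 600 seconds is an assumption based on the 5 to 10 minute estimate in the procedure.

```shell
# Block until the node reports the Ready condition or the timeout expires.
# $1 is the node name; $2 is an optional timeout (default 600s).
wait_for_node_ready() {
  oc wait --for=condition=Ready "node/$1" --timeout="${2:-600s}"
}
```

A longer limit can be passed as the second argument, for example `wait_for_node_ready <new_node_name> 15m`.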

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.3.2. Replacing failed nodes on Azure installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the faulty node, and click its Machine Name.
Click Actions → Edit Annotations, and click Add More.
Add machine.openshift.io/exclude-node-draining, and click Save.
Click Actions → Delete Machine, and click Delete.
A new machine is automatically created. Wait for the new machine to start.
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved after you label the new node and it is functional.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
Optional: If the failed Azure instance is not removed automatically, terminate the instance from the Azure console.

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.4. OpenShift Data Foundation deployed on Google Cloud

1.4.1. Replacing operational nodes on Google Cloud installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the node that you need to replace. Note its Machine Name.
Mark the node as unschedulable using the following command:
```
$ oc adm cordon <node_name>
```
Drain the node using the following command:
```
$ oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
```
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved when the new node is labeled and functional.
Click Compute → Machines. Search for the required machine.
Beside the required machine, click the Action menu (⋮) → Delete Machine.
Click Delete to confirm the machine deletion. A new machine is automatically created.
Wait for the new machine to start and transition into Running state.
Important
This activity might take 5 to 10 minutes or more.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Execute the following command to apply the OpenShift Data Foundation label to the new node:
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.

1.4.2. Replacing failed nodes on Google Cloud installer-provisioned infrastructure

Procedure

Log in to the OpenShift Web Console, and click Compute → Nodes.
Identify the faulty node, and click its Machine Name.
Click Actions → Edit Annotations, and click Add More.
Add machine.openshift.io/exclude-node-draining, and click Save.
Click Actions → Delete Machine, and click Delete.
A new machine is automatically created. Wait for the new machine to start.
Important
This activity might take 5 to 10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved when the new node is labeled and functional.
Click Compute → Nodes. Confirm that the new node is in Ready state.
Apply the OpenShift Data Foundation label to the new node using any one of the following:
From the user interface
For the new node, click Action Menu (⋮) → Edit Labels.
Add cluster.ocs.openshift.io/openshift-storage, and click Save.
From the command-line interface
Apply the OpenShift Data Foundation label to the new node:
```
$ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
```
<new_node_name>
Specify the name of the new node.
Optional: If the failed Google Cloud instance is not removed automatically, terminate the instance from the Google Cloud console.

Verification steps

Verify that the new node is present in the output:

$ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1

Click Workloads → Pods. Confirm that at least the following pods on the new node are in Running state:
- csi-cephfsplugin-*
- csi-rbdplugin-*
Verify that all the other required OpenShift Data Foundation pods are in Running state.
Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
```
$ oc get pods -o wide -n openshift-storage| egrep -i <new_node_name> | egrep osd
```
Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
For each of the new nodes identified in the previous step, do the following:
1. Create a debug pod and open a chroot environment for the one or more selected hosts:
```
$ oc debug node/<node_name>
```
```
$ chroot /host
```
2. Display the list of available block devices:
```
$ lsblk
```
  Check for the crypt keyword beside the one or more ocs-deviceset names.
If the verification steps fail, contact Red Hat Support.
