Dieser Inhalt ist in der von Ihnen ausgewählten Sprache nicht verfügbar.

Chapter 13. Replacing storage nodes


You can choose one of the following procedures to replace storage nodes:

Use this procedure to replace an operational node on Google Cloud installer-provisioned infrastructure (IPI).

Procedure

  1. Log in to OpenShift Web Console and click Compute Nodes.
  2. Identify the node that needs to be replaced. Take a note of its Machine Name.
  3. Mark the node as unschedulable using the following command:

    $ oc adm cordon <node_name>
    Copy to Clipboard Toggle word wrap
  4. Drain the node using the following command:

    $ oc adm drain <node_name> --force --delete-local-data --ignore-daemonsets
    Copy to Clipboard Toggle word wrap
    Important

    This activity may take at least 5-10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved when the new node is labeled and functional.

  5. Click Compute Machines. Search for the required machine.
  6. Besides the required machine, click the Action menu (⋮) Delete Machine.
  7. Click Delete to confirm the machine deletion. A new machine is automatically created.
  8. Wait for new machine to start and transition into Running state.

    Important

    This activity may take at least 5-10 minutes or more.

  9. Click Compute Nodes, confirm if the new node is in Ready state.
  10. Apply the OpenShift Container Storage label to the new node using any one of the following:

    From User interface
    1. For the new node, click Action Menu (⋮) Edit Labels
    2. Add cluster.ocs.openshift.io/openshift-storage and click Save.
    From Command line interface
    • Execute the following command to apply the OpenShift Container Storage label to the new node:

      $ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
      Copy to Clipboard Toggle word wrap

Verification steps

  1. Execute the following command and verify that the new node is present in the output:

    $ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1
    Copy to Clipboard Toggle word wrap
  2. Click Workloads Pods, confirm that at least the following pods on the new node are in Running state:

    • csi-cephfsplugin-*
    • csi-rbdplugin-*
  3. Verify that all other required OpenShift Container Storage pods are in Running state.
  4. Verify that new OSD pods are running on the replacement node.

    $ oc get pods -o wide -n openshift-storage| egrep -i new-node-name | egrep osd
    Copy to Clipboard Toggle word wrap
  5. (Optional) If data encryption is enabled on the cluster, verify that the new OSD devices are encrypted.

    For each of the new nodes identified in previous step, do the following:

    1. Create a debug pod and open a chroot environment for the selected host(s).

      $ oc debug node/<node name>
      $ chroot /host
      Copy to Clipboard Toggle word wrap
    2. Run “lsblk” and check for the “crypt” keyword beside the ocs-deviceset name(s)

      $ lsblk
      Copy to Clipboard Toggle word wrap
  6. If verification steps fail, contact Red Hat Support.

Perform this procedure to replace a failed node which is not operational on Google Cloud installer-provisioned infrastructure (IPI) for OpenShift Container Storage.

Procedure

  1. Log in to OpenShift Web Console and click Compute Nodes.
  2. Identify the faulty node and click on its Machine Name.
  3. Click Actions Edit Annotations, and click Add More.
  4. Add machine.openshift.io/exclude-node-draining and click Save.
  5. Click Actions Delete Machine, and click Delete.
  6. A new machine is automatically created, wait for new machine to start.

    Important

    This activity may take at least 5-10 minutes or more. Ceph errors generated during this period are temporary and are automatically resolved when the new node is labeled and functional.

  7. Click Compute Nodes, confirm if the new node is in Ready state.
  8. Apply the OpenShift Container Storage label to the new node using any one of the following:

    From the web user interface
    1. For the new node, click Action Menu (⋮) Edit Labels
    2. Add cluster.ocs.openshift.io/openshift-storage and click Save.
    From the command line interface
    • Execute the following command to apply the OpenShift Container Storage label to the new node:

      $ oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
      Copy to Clipboard Toggle word wrap
  9. [Optional]: If the failed Google Cloud instance is not removed automatically, terminate the instance from Google Cloud console.

Verification steps

  1. Execute the following command and verify that the new node is present in the output:

    $ oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1
    Copy to Clipboard Toggle word wrap
  2. Click Workloads Pods, confirm that at least the following pods on the new node are in Running state:

    • csi-cephfsplugin-*
    • csi-rbdplugin-*
  3. Verify that all other required OpenShift Container Storage pods are in Running state.
  4. Verify that new OSD pods are running on the replacement node.

    $ oc get pods -o wide -n openshift-storage| egrep -i new-node-name | egrep osd
    Copy to Clipboard Toggle word wrap
  5. (Optional) If data encryption is enabled on the cluster, verify that the new OSD devices are encrypted.

    For each of the new nodes identified in previous step, do the following:

    1. Create a debug pod and open a chroot environment for the selected host(s).

      $ oc debug node/<node name>
      $ chroot /host
      Copy to Clipboard Toggle word wrap
    2. Run “lsblk” and check for the “crypt” keyword beside the ocs-deviceset name(s)

      $ lsblk
      Copy to Clipboard Toggle word wrap
  6. If verification steps fail, contact Red Hat Support.
Nach oben
Red Hat logoGithubredditYoutubeTwitter

Lernen

Testen, kaufen und verkaufen

Communitys

Über Red Hat Dokumentation

Wir helfen Red Hat Benutzern, mit unseren Produkten und Diensten innovativ zu sein und ihre Ziele zu erreichen – mit Inhalten, denen sie vertrauen können. Entdecken Sie unsere neuesten Updates.

Mehr Inklusion in Open Source

Red Hat hat sich verpflichtet, problematische Sprache in unserem Code, unserer Dokumentation und unseren Web-Eigenschaften zu ersetzen. Weitere Einzelheiten finden Sie in Red Hat Blog.

Über Red Hat

Wir liefern gehärtete Lösungen, die es Unternehmen leichter machen, plattform- und umgebungsübergreifend zu arbeiten, vom zentralen Rechenzentrum bis zum Netzwerkrand.

Theme

© 2025 Red Hat