Chapter 3. Replacing data plane nodes


You can replace pre-provisioned and provisioned data plane nodes without scaling the Red Hat OpenStack Services on OpenShift (RHOSO) cloud.

3.1. Replacing a pre-provisioned node

If your replacement node has the same hostname and IP address as the faulty node, you only create a new OpenStackDataPlaneDeployment CR to deploy the new node.

If your replacement node has a different IP address from the faulty node, you set a new fixedIP for the control plane network and a new ansibleHost for the node in the OpenStackDataPlaneNodeSet CR. Then you create a new OpenStackDataPlaneDeployment CR to deploy the new node.

When you replace a pre-provisioned node, you manually clean up the node you are removing.

For information about errors, data plane conditions and states, and returned statuses that can occur when you modify an OpenStackDataPlaneNodeSet CR, see Modifying an OpenStackDataPlaneNodeSet CR in Customizing the Red Hat OpenStack Services on OpenShift deployment.

Prerequisites

  • You are logged in to the Red Hat OpenShift Container Platform (RHOCP) cluster as a user with cluster-admin privileges.
  • The workloads on the Compute nodes have been migrated to other Compute nodes.

Procedure

  1. If the faulty node is still reachable, perform clean-up tasks to ensure that the faulty node does not impact the new node.

    1. Use SSH to connect to the faulty node, and stop the OVN and nova-compute containers:

      $ ssh -i <key_file_name> cloud-admin@<node_IP_address>
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute
      • Replace <key_file_name> with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes.
      • Replace <node_IP_address> with the IP address of the faulty node.
    2. Remove the systemd unit files that manage the OVN and nova-compute containers. This prevents the agents from being started automatically and registered in the database if the faulty node is rebooted:

      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller.service
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent.service
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_compute.service
    3. Disconnect from the node:

      $ exit
  2. If your replacement node has the same IP address as the faulty node, skip to step 3 to create the OpenStackDataPlaneDeployment CR that deploys the new node. If your replacement node has a different IP address from the faulty node, open the OpenStackDataPlaneNodeSet CR file for the node set that you want to update, for example, openstack_data_plane.yaml, and make the following changes:

    1. Update the fixedIP for the control plane network for the node you are replacing to specify the IP address of the new node:

      Example:

        nodes:
          edpm-compute-0:
            hostName: edpm-compute-0
            networks:
            - name: ctlplane
              subnetName: subnet1
              defaultRoute: true
              fixedIP: 192.168.122.100
            ...
    2. Update the ansibleHost value for the node you are replacing to specify the IP address of the new node:

      Example:

        nodes:
          edpm-compute-0:
            hostName: edpm-compute-0
            ...
            ansible:
              ansibleHost: 192.168.122.100
              ansibleUser: cloud-admin
              ansibleVars:
                fqdn_internal_api: edpm-compute-0.example.com
      ...
    3. Save the OpenStackDataPlaneNodeSet CR file.
    4. Apply the updated OpenStackDataPlaneNodeSet CR configuration:

      $ oc apply -f openstack_data_plane.yaml
    5. Verify that the data plane resource has been updated by confirming that the status is SetupReady:

      $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

      When the status is SetupReady, the command returns a condition met message; otherwise, it returns a timeout error.

  3. Create a file on your workstation, for example, openstack_data_plane_deploy.yaml, to define the OpenStackDataPlaneDeployment CR that deploys the new node:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
  4. In the spec section, list the OpenStackDataPlaneNodeSet CR that contains the new node, where <nodeSet_name> is the name of that CR:

    spec:
      nodeSets:
        - <nodeSet_name>
  5. Save the OpenStackDataPlaneDeployment CR deployment file.
  6. Deploy the OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10
  7. Verify that the OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     Setup Complete

    $ oc get openstackdataplanenodeset -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     NodeSet Ready
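
The steps that define the OpenStackDataPlaneDeployment CR can also be scripted. The following is a minimal sketch, not a supported tool: the function names is_valid_cr_name and write_deployment_cr are hypothetical, and the regular expression is one reading of the naming rule stated in the procedure (lower case alphanumeric characters, hyphen or period, starting and ending with an alphanumeric character). It writes only the CR fields shown in the procedure.

```shell
# Hypothetical helpers for illustration; they are not part of RHOSO.

# Succeed if the name contains only lower case alphanumerics, '-' or '.',
# and starts and ends with an alphanumeric character.
# LC_ALL=C keeps the [a-z] range limited to ASCII lower case letters.
is_valid_cr_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$'
}

# Write an OpenStackDataPlaneDeployment CR that references one node set.
# Usage: write_deployment_cr <deployment_name> <nodeSet_name> <output_file>
write_deployment_cr() {
  if ! is_valid_cr_name "$1"; then
    echo "invalid deployment name: $1" >&2
    return 1
  fi
  cat > "$3" <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: $1
spec:
  nodeSets:
    - $2
EOF
}

# Example use (the deployment name is an assumed value):
#   write_deployment_cr replace-edpm-compute-0 openstack-data-plane \
#     openstack_data_plane_deploy.yaml
#   oc create -f openstack_data_plane_deploy.yaml -n openstack
```

Checking the name before writing the file catches a rejected name locally instead of at oc create time. The uniqueness requirement is not checked here, because that depends on the CRs that already exist in the cluster.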

3.2. Replacing a provisioned node

To replace a faulty provisioned node in the data plane without scaling the Red Hat OpenStack Services on OpenShift (RHOSO) cloud, you delete the faulty bare metal host (BMH). The OpenStackBaremetalSet CR is then reconciled to provision an available BMH, and the deployment status of the OpenStackDataPlaneNodeSet is reset. You then create a new OpenStackDataPlaneDeployment CR to deploy on the newly provisioned node.

Prerequisites

  • You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
  • The workloads on the Compute nodes have been migrated to other Compute nodes.

Procedure

  1. Verify that you have a spare BMH in the available state that you can use to replace the faulty node:

    $ oc get bmh

    Example output:

    NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
    leaf0-0   available                 false            11h
    leaf0-1   provisioned   nodeset-0   true             11h
    leaf1-0   provisioned   nodeset-1   true             11h
    leaf1-1   provisioned   nodeset-1   true             11h
  2. Delete the BMH for the faulty node, for example, leaf0-1:

    $ oc delete bmh leaf0-1

    Example output:

    baremetalhost.metal3.io "leaf0-1" deleted

    After the BMH is deleted, the OpenStackBaremetalSet CR is reconciled to provision an available BMH, and the deployment status of the OpenStackDataPlaneNodeSet is reset.

  3. Wait for the node set that contained the faulty node to reach the SetupReady state:

    $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

    When the status is SetupReady, the command returns a condition met message; otherwise, it returns a timeout error.

  4. Create a file on your workstation, for example, openstack_data_plane_deploy.yaml, to define the OpenStackDataPlaneDeployment CR:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.

      Tip

      Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.

  5. In the spec section, list the OpenStackDataPlaneNodeSet CR that contained the faulty node, where <nodeSet_name> is the name of that CR:

    spec:
      nodeSets:
        - <nodeSet_name>
  6. Save the OpenStackDataPlaneDeployment CR deployment file.
  7. Deploy the modified OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10
  8. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     Setup Complete

    $ oc get openstackdataplanenodeset -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     NodeSet Ready
  9. Verify that the spare BMH has replaced the faulty BMH:

    $ oc get bmh

    Example output:

    NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
    leaf0-0   provisioned   nodeset-0   true             13h
    leaf1-0   provisioned   nodeset-1   true             13h
    leaf1-1   provisioned   nodeset-1   true             13h