Chapter 3. Replacing data plane nodes
You can replace pre-provisioned and provisioned data plane nodes without scaling the Red Hat OpenStack Services on OpenShift (RHOSO) cloud.
3.1. Replacing a pre-provisioned node
When you replace a faulty pre-provisioned node, the replacement node normally has the same hostname and IP address as the faulty node. In that case, you create a new OpenStackDataPlaneDeployment CR to deploy the new node.
If the replacement node has a different IP address from the faulty node, you first set a new fixedIP for the control plane network for the node and a new ansibleHost for the node in the OpenStackDataPlaneNodeSet CR. Then you create a new OpenStackDataPlaneDeployment CR to deploy the new node.
When you replace a pre-provisioned node, you manually clean up the node you are removing.
For information about errors, data plane conditions and states, and returned statuses that can occur when you modify an OpenStackDataPlaneNodeSet CR, see Modifying an OpenStackDataPlaneNodeSet CR in Customizing the Red Hat OpenStack Services on OpenShift deployment.
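The manual clean-up consists of stopping three EDPM services on the faulty node and removing their unit files, as detailed in the procedure in this section. As a rough sketch, the following helper only prints those commands so that you can review them before you run anything. The function name `edpm_cleanup_commands` is hypothetical, and the service names and paths are copied from this procedure without modification.

```shell
#!/usr/bin/env bash
# Sketch only: prints the clean-up commands from this procedure, one per line,
# without running them. The function name is hypothetical.
edpm_cleanup_commands() {
  local svc
  local services="edpm_ovn_controller edpm_ovn_metadata_agent edpm_nova_compute"
  # Stop the containers first.
  for svc in $services; do
    printf 'sudo systemctl stop %s\n' "$svc"
  done
  # Then remove the unit files, using the paths exactly as given in this procedure.
  for svc in $services; do
    printf 'sudo rm -f /etc/systemd/system/%s\n' "$svc"
  done
}
```

Call `edpm_cleanup_commands` on your workstation, review the six printed lines, and then run them in your SSH session on the faulty node.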
Prerequisites
- You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
- The workloads on the Compute nodes have been migrated to other Compute nodes.
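The first prerequisite can be approximated with a quick check before you start. The sketch below assumes that being allowed to perform any verb on any resource in all namespaces is a good enough proxy for cluster-admin privileges. The function name `has_cluster_admin` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only when the logged-in user can perform any
# verb on any resource in all namespaces (used here as a proxy for cluster-admin).
has_cluster_admin() {
  local answer
  # `oc auth can-i` prints "yes" or "no"; ignore its exit status and compare the text.
  answer="$(oc auth can-i '*' '*' --all-namespaces 2>/dev/null || true)"
  [ "$answer" = "yes" ]
}
```

For example, `has_cluster_admin || echo "log in as a cluster-admin user first"` prints the reminder only when the check fails.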
Procedure
1. If the faulty node is still reachable, perform clean-up tasks to ensure that the faulty node does not impact the new node.

   a. SSH into the removed node, and stop the ovn and nova-compute containers:

      $ ssh -i <key_file_name> cloud-admin@<node_IP_address>
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute

      - Replace <key_file_name> with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes.
      - Replace <node_IP_address> with the IP address for the removed node.

   b. Remove the systemd unit files that manage the ovn and nova-compute containers to prevent the agents from being automatically started and registered in the database if the removed node is rebooted:

      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_compute

   c. Disconnect from the node:

      $ exit
2. If your replacement node has the same IP address as the faulty node, proceed to Step 3 to create the OpenStackDataPlaneDeployment CR to deploy the new node. If your replacement node has a different IP address from the faulty node, open the OpenStackDataPlaneNodeSet CR file for the node set you want to update, for example, openstack_data_plane.yaml, and make the following changes.

   a. Update the fixedIP for the control plane network for the node you are replacing to specify the IP address of the new node:

      Example:

      nodes:
        edpm-compute-0:
          hostName: edpm-compute-0
          networks:
          - name: ctlplane
            subnetName: subnet1
            defaultRoute: true
            fixedIP: 192.168.122.100
          ...

   b. Update the ansibleHost value for the node you are replacing to specify the IP address of the new node:

      Example:

      nodes:
        edpm-compute-0:
          hostName: edpm-compute-0
          ...
          ansible:
            ansibleHost: 192.168.122.100
            ansibleUser: cloud-admin
            ansibleVars:
              fqdn_internal_api: edpm-compute-0.example.com
          ...

   c. Save the OpenStackDataPlaneNodeSet CR file.

   d. Apply the updated OpenStackDataPlaneNodeSet CR configuration:

      $ oc apply -f openstack_data_plane.yaml

   e. Verify that the data plane resource has been updated by confirming that the status is SetupReady:

      $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

      When the status is SetupReady, the command returns a "condition met" message. Otherwise, it returns a timeout error.
3. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR to deploy the new node:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneDeployment
      metadata:
        name: <node_set_deployment_name>

   - Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
4. Add the OpenStackDataPlaneNodeSet CR:

      spec:
        nodeSets:
          - <nodeSet_name>

5. Save the OpenStackDataPlaneDeployment CR deployment file.

6. Deploy the OpenStackDataPlaneNodeSet CR:

      $ oc create -f openstack_data_plane_deploy.yaml -n openstack

   You can view the Ansible logs while the deployment executes:

      $ oc get pod -l app=openstackansibleee -w
      $ oc logs -l app=openstackansibleee -f --max-log-requests 10

7. Verify that the OpenStackDataPlaneNodeSet CR is deployed:

      $ oc get openstackdataplanedeployment -n openstack
      NAME                   STATUS   MESSAGE
      openstack-data-plane   True     Setup Complete

      $ oc get openstackdataplanenodeset -n openstack
      NAME                   STATUS   MESSAGE
      openstack-data-plane   True     NodeSet Ready
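The naming rule and the CR layout from the procedure above can be combined into a small generator. This is a minimal sketch, not part of the product: the function names `valid_cr_name` and `render_deployment_cr` are hypothetical, and the regular expression is one reading of the stated rule (lower case alphanumerics, hyphen or period, alphanumeric at both ends). It does not check that the name is unique in the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks a CR name against the rule stated in this procedure.
valid_cr_name() {
  local LC_ALL=C   # keep [a-z] from matching upper case letters in some locales
  [[ "$1" =~ ^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$ ]]
}

# Hypothetical helper: prints an OpenStackDataPlaneDeployment CR for one node set.
# $1 is the deployment name, $2 is the node set name.
render_deployment_cr() {
  local name="$1" nodeset="$2"
  valid_cr_name "$name" || { echo "invalid CR name: $name" >&2; return 1; }
  cat <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: ${name}
spec:
  nodeSets:
    - ${nodeset}
EOF
}
```

For example, `render_deployment_cr edpm-replace-compute-0 openstack-data-plane > openstack_data_plane_deploy.yaml` writes a file that you can pass to `oc create -f`. The deployment name in this example is illustrative only.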
3.2. Replacing a provisioned node
To replace a faulty provisioned node from the data plane without scaling the Red Hat OpenStack Services on OpenShift (RHOSO) cloud, you delete the faulty bare metal host (BMH). The OpenStackBaremetalSet CR is reconciled to provision a new available BMH and reset the deployment status of the OpenStackDataPlaneNodeSet, prompting you to create a new OpenStackDataPlaneDeployment CR for deploying on the newly provisioned node.
Prerequisites
- You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
- The workloads on the Compute nodes have been migrated to other Compute nodes.
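Because the replacement relies on a spare BMH in the available state, it can help to script that check before you delete anything. The following sketch filters the table printed by `oc get bmh`. The function name `available_bmh` is hypothetical, and the filter assumes the default column order, with NAME first and STATE second.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `oc get bmh` table output on stdin and prints the
# NAME of every host whose STATE column is "available".
available_bmh() {
  # NR > 1 skips the header row; $1 is NAME and $2 is STATE.
  awk 'NR > 1 && $2 == "available" { print $1 }'
}
```

Run it as `oc get bmh | available_bmh`. If the command prints nothing, no spare host is available and the deleted host cannot be replaced.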
Procedure
1. Verify that you have a spare BMH in the available state that you can use to replace the faulty node:

      $ oc get bmh

   Example output:

      NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
      leaf0-0   available                 false            11h
      leaf0-1   provisioned   nodeset-0   true             11h
      leaf1-0   provisioned   nodeset-1   true             11h
      leaf1-1   provisioned   nodeset-1   true             11h

2. Delete the faulty node:

      $ oc delete bmh leaf0-1

   Example output:

      baremetalhost.metal3.io "leaf0-1" deleted

   The OpenStackBaremetalSet CR is reconciled to provision a new available BMH and reset the deployment status of the OpenStackDataPlaneNodeSet, prompting you to create a new OpenStackDataPlaneDeployment CR for deploying on the newly provisioned node.

3. Wait for the node set that contained the faulty node to reach the SetupReady state:

      $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

   When the status is SetupReady, the command returns a "condition met" message. Otherwise, it returns a timeout error.

4. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneDeployment
      metadata:
        name: <node_set_deployment_name>

   - Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.

   Tip: Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.
5. Add the OpenStackDataPlaneNodeSet CR that you modified:

      spec:
        nodeSets:
          - <nodeSet_name>

6. Save the OpenStackDataPlaneDeployment CR deployment file.

7. Deploy the modified OpenStackDataPlaneNodeSet CR:

      $ oc create -f openstack_data_plane_deploy.yaml -n openstack

   You can view the Ansible logs while the deployment executes:

      $ oc get pod -l app=openstackansibleee -w
      $ oc logs -l app=openstackansibleee -f --max-log-requests 10

8. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

      $ oc get openstackdataplanedeployment -n openstack
      NAME                   STATUS   MESSAGE
      openstack-data-plane   True     Setup Complete

      $ oc get openstackdataplanenodeset -n openstack
      NAME                   STATUS   MESSAGE
      openstack-data-plane   True     NodeSet Ready

9. Verify that the spare BMH has replaced the faulty BMH:

      $ oc get bmh

   Example output:

      NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
      leaf0-0   provisioned   nodeset-0   true             13h
      leaf1-0   provisioned   nodeset-1   true             13h
      leaf1-1   provisioned   nodeset-1   true             13h
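The deployment verification in this procedure comes down to reading the STATUS column of the `oc get` output. A minimal sketch of automating that check follows. The function name `all_status_true` is hypothetical, and the filter assumes the three-column NAME, STATUS, MESSAGE layout shown in the example output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `oc get ...` table output on stdin and succeeds only
# if there is at least one data row and every row has "True" in the STATUS column.
all_status_true() {
  awk '
    NR > 1 { rows++; if ($2 != "True") bad++ }   # NR > 1 skips the header row
    END    { if (rows > 0 && bad == 0) exit 0; exit 1 }
  '
}
```

For example, `oc get openstackdataplanedeployment -n openstack | all_status_true && echo "deployment complete"` prints the message only when every listed deployment reports True. The same filter works for the `oc get openstackdataplanenodeset -n openstack` output.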