Maintaining the Red Hat OpenStack Services on OpenShift deployment


Red Hat OpenStack Services on OpenShift 18.0

Maintaining a Red Hat OpenStack Services on OpenShift environment on a Red Hat OpenShift Container Platform cluster

OpenStack Documentation Team

Abstract

Perform maintenance and management operations on your Red Hat OpenStack Services on OpenShift (RHOSO) deployment.

Providing feedback on Red Hat documentation

We appreciate your feedback. Tell us how we can improve the documentation.

To provide documentation feedback for Red Hat OpenStack Services on OpenShift (RHOSO), create a Jira issue in the OSPRH Jira project.

Procedure

  1. Log in to the Red Hat Atlassian Jira.
  2. Click the following link to open a Create Issue page: Create issue
  3. Complete the Summary and Description fields. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue.
  4. Click Create.
  5. Review the details of the bug you created.

Chapter 1. Accessing the RHOSO cloud

You can access your Red Hat OpenStack Services on OpenShift (RHOSO) cloud to perform actions on your data plane by either accessing the OpenStackClient pod through a remote shell from your workstation, or by using a browser to access the Dashboard service (horizon) interface.

1.1. Accessing the OpenStackClient pod

You can execute Red Hat OpenStack Services on OpenShift (RHOSO) commands on the deployed data plane by using the OpenStackClient pod through a remote shell from your workstation. The OpenStack Operator created the OpenStackClient pod as a part of the OpenStackControlPlane resource. The OpenStackClient pod contains the client tools and authentication details that you require to perform actions on your data plane.

Prerequisites

  • You are logged on to a workstation that has access to the Red Hat OpenShift Container Platform (RHOCP) cluster as a user with cluster-admin privileges.

Procedure

  1. Access the remote shell for the OpenStackClient pod:

    $ oc rsh -n openstack openstackclient
  2. Run your openstack commands. For example, you can create a default network with the following command:

    $ openstack network create default
  3. Exit the OpenStackClient pod:

    $ exit

1.2. Accessing the Dashboard service (horizon) interface

You can access the OpenStack Dashboard service (horizon) interface by providing the Dashboard service endpoint URL in a browser.

Prerequisites

  • The Dashboard service is enabled on the control plane. For information about how to enable the Dashboard service, see Enabling the Dashboard service (horizon) interface in Customizing the Red Hat OpenStack Services on OpenShift deployment.
  • You must log in to the Dashboard as the admin user.

Procedure

  1. Retrieve the admin password from the AdminPassword parameter in the osp-secret secret:

    $ oc get secret osp-secret -o jsonpath='{.data.AdminPassword}' | base64 -d
  2. Retrieve the Dashboard service endpoint URL:

    $ oc get horizons horizon -o jsonpath='{.status.endpoint}'
  3. Open a browser.
  4. Enter the Dashboard endpoint URL.
  5. Log in to the Dashboard by providing the username of admin and the admin password.
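
Step 1 of the procedure works because Kubernetes stores secret values base64-encoded: the jsonpath expression selects the encoded AdminPassword field, and base64 -d decodes it. A minimal sketch of that decode round trip, using a stand-in value rather than a real osp-secret:

```shell
# Stand-in for the encoded value that `oc get secret ... -o jsonpath=...` returns.
encoded=$(printf 'sup3rs3cret' | base64)
# Decode it the same way the password-retrieval command does.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```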

Chapter 2. Scaling data plane nodes

You can scale out your data plane by adding new nodes to existing node sets and by adding new node sets. You can scale in your data plane by removing nodes from node sets and by removing node sets.

2.1. Adding nodes to a node set

You can scale out your data plane by adding new nodes to the nodes section of an existing OpenStackDataPlaneNodeSet custom resource (CR).

Prerequisites

  • If you are adding unprovisioned nodes, then a BareMetalHost CR must be registered and inspected for each bare-metal data plane node. Each bare-metal node must be in the Available state after inspection.

Procedure

  1. Open the OpenStackDataPlaneNodeSet CR definition file for the node set you want to update, for example, openstack_data_plane.yaml.
  2. Add the new node to the node set:

    • Pre-Provisioned:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneNodeSet
      metadata:
        name: openstack-node-set
      spec:
        preProvisioned: True
        nodes:
        ...
          edpm-compute-2:
            hostName: edpm-compute-2
            ansible:
              ansibleHost: 192.168.122.102
            networks:
            - name: ctlplane
              subnetName: subnet1
              defaultRoute: true
              fixedIP: 192.168.122.102
            - name: internalapi
              subnetName: subnet1
            - name: storage
              subnetName: subnet1
            - name: tenant
              subnetName: subnet1
        ...
    • Unprovisioned:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneNodeSet
      metadata:
        name: openstack-node-set
      spec:
        preProvisioned: False
        nodes:
        ...
          edpm-compute-2:
            hostName: edpm-compute-2
        ...

    For information about the properties you can use to configure common node attributes, see OpenStackDataPlaneNodeSet CR spec properties in Deploying Red Hat OpenStack Services on OpenShift.

  3. Save the OpenStackDataPlaneNodeSet CR definition file.
  4. Apply the updated OpenStackDataPlaneNodeSet CR configuration:

    $ oc apply -f openstack_data_plane.yaml
  5. Verify that the data plane resource has been updated by confirming that the status is SetupReady:

    $ oc wait openstackdataplanenodeset openstack-node-set --for condition=SetupReady --timeout=10m

    When the status is SetupReady, the command returns a condition met message; otherwise, it returns a timeout error. For information about the data plane conditions and states, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

  6. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.
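
The naming rules above follow the RFC 1123 subdomain format that Kubernetes enforces for resource names. A sketch for checking a proposed name locally before you create the CR; the function name and sample names are illustrative:

```shell
# Succeeds when the name uses only lowercase alphanumerics, '-' or '.',
# and starts and ends with an alphanumeric character.
valid_name() {
  [[ "$1" =~ ^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$ ]]
}

valid_name "scale-out-compute" && echo "valid"
valid_name "Scale_Out" || echo "invalid"
```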

  7. Add the OpenStackDataPlaneNodeSet CR that you modified:

    spec:
      nodeSets:
        - <nodeSet_name>
  8. Save the OpenStackDataPlaneDeployment CR deployment file.
  9. Deploy the modified OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

    If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit
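
When this error appears, one approach is to set --max-log-requests to at least the number of streams that the error reports. A sketch of that sizing, with the stream count hard-coded here instead of taken from the pod count:

```shell
# Number of log streams reported by the error (hypothetical, hard-coded here).
streams=19
# Keep the default limit of 10 unless more streams are needed.
limit=$(( streams > 10 ? streams : 10 ))
echo "$limit"
```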
  10. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                  NODESETS                  STATUS  MESSAGE
    openstack-data-plane  ["openstack-data-plane"]  True    Setup Complete
    
    $ oc get openstackdataplanenodeset -n openstack
    NAME                  STATUS  MESSAGE
    openstack-data-plane  True    NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.

  11. If the new nodes are Compute nodes, you must bring them online:

    1. Map the Compute nodes to the Compute cell that they are connected to:

      $ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose

      If you did not create additional cells, this command maps the Compute nodes to cell1.

    2. Verify that the hypervisor hostname is a fully qualified domain name (FQDN):

      $ hostname -f

      If the hypervisor hostname is not an FQDN, for example, if it was registered as a short name or full name instead, contact Red Hat Support.

    3. Access the remote shell for the openstackclient pod and verify that the deployed Compute nodes are visible on the control plane:

      $ oc rsh -n openstack openstackclient
      $ openstack hypervisor list
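
The FQDN verification in the earlier step can be approximated with a simple check: a hypervisor hostname that contains no dot was registered as a short name rather than a fully qualified one. A sketch with hypothetical hostnames:

```shell
# Succeeds when the hostname contains at least one dot, i.e. looks like an FQDN.
is_fqdn() { [[ "$1" == *.* ]]; }

is_fqdn "edpm-compute-2.example.com" && echo "FQDN"
is_fqdn "edpm-compute-2" || echo "short name"
```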

2.2. Adding a new node set to the data plane

You can scale out your data plane by adding a new OpenStackDataPlaneNodeSet CR to the data plane. To add the new node set to an existing data plane, you must create a new OpenStackDataPlaneDeployment CR that deploys the new OpenStackDataPlaneNodeSet CR. If you want to perform move operations, such as instance migration and resize, between your new node set and other node sets on your data plane, you must also create an additional OpenStackDataPlaneDeployment CR that runs the ssh-known-hosts service on all the node sets involved in those move operations.

Procedure

  1. Create a file on your workstation to define the new OpenStackDataPlaneNodeSet CR.
  2. Define the node set. For details about how to create a node set, see one of the following procedures:

  3. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR to deploy the new node set:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the new node set.

  4. Add your new OpenStackDataPlaneNodeSet CR to the list of node sets to deploy:

    spec:
      nodeSets:
        - <nodeSet_name>
  5. Save the OpenStackDataPlaneDeployment CR deployment file.
  6. Deploy the new OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

    If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit
  7. Verify that the new OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                  NODESETS                  STATUS  MESSAGE
    openstack-data-plane  ["openstack-data-plane"]  True    Setup Complete
    
    $ oc get openstackdataplanenodeset -n openstack
    NAME                  STATUS  MESSAGE
    openstack-data-plane  True    NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.

  8. If you want to migrate workloads between your new node set and other node sets on your data plane, or perform resize operations between them, you must create an additional OpenStackDataPlaneDeployment CR that runs the ssh-known-hosts service on all the node sets involved in those move operations:

    1. Create a file on your workstation to define an OpenStackDataPlaneDeployment CR that enables move operations:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneDeployment
      metadata:
        name: enable-move-operations
    2. Add the new OpenStackDataPlaneNodeSet CR, and every existing OpenStackDataPlaneNodeSet CR that move operations must work with, to the list of node sets:

      spec:
        nodeSets:
          - <new_nodeSet_name>
          - ...
          - <nodeSet_name>
    3. In the move operations OpenStackDataPlaneDeployment CR, specify that only the ssh-known-hosts service runs on the listed node sets when they are deployed:

      spec:
        ...
        servicesOverride:
          - ssh-known-hosts
    4. Save the OpenStackDataPlaneDeployment CR deployment file.
    5. Deploy the ssh-known-hosts service to enable move operations between the new node set and the other specified node sets on the data plane:

      $ oc create -f enable_move_operations.yaml -n openstack
  9. If the new nodes are Compute nodes, you must bring them online:

    1. Map the Compute nodes to the Compute cell that they are connected to:

      $ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose

      If you did not create additional cells, this command maps the Compute nodes to cell1.

    2. Verify that the hypervisor hostname is a fully qualified domain name (FQDN):

      $ hostname -f

      If the hypervisor hostname is not an FQDN, for example, if it was registered as a short name or full name instead, contact Red Hat Support.

    3. Access the remote shell for the openstackclient pod and verify that the deployed Compute nodes are visible on the control plane:

      $ oc rsh -n openstack openstackclient
      $ openstack hypervisor list

2.3. Removing a Compute node from the data plane

You can remove a Compute node from a node set on the data plane. If you remove all the nodes from a node set, then you must also remove the node set from the data plane.

Prerequisites

  • You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
  • The workloads on the Compute nodes have been migrated to other Compute nodes.

Procedure

  1. Access the remote shell for the openstackclient pod:

    $ oc rsh -n openstack openstackclient
  2. Retrieve the IP address of the Compute node that you want to remove:

    $ openstack hypervisor list
  3. Retrieve a list of your Compute nodes to identify the name and UUID of the node that you want to remove:

    $ openstack compute service list
  4. Disable the nova-compute service on the Compute node to be removed:

    $ openstack compute service set <hostname> nova-compute --disable
    Tip

    Use the --disable-reason option to add a short explanation of why the service is being disabled. This is useful if you intend to redeploy the Compute service.

  5. Exit the OpenStackClient pod:

    $ exit
  6. SSH into the Compute node to be removed and stop the ovn and nova-compute containers:

    $ ssh -i <key_file_name> cloud-admin@<node_IP_address>
    [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller
    [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent
    [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute
    • Replace <key_file_name> with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes.
    • Replace <node_IP_address> with the IP address for the Compute node that you retrieved in step 2.
  7. Remove the systemd unit files that manage the ovn and nova-compute containers to prevent the agents from being automatically started and registered in the database if the removed node is rebooted:

    [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller.service
    [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent.service
    [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_compute.service
  8. Disconnect from the Compute node:

    $ exit
  9. Access the remote shell for openstackclient:

    $ oc rsh -n openstack openstackclient
  10. Delete the network agents for the Compute node to be removed:

    $ openstack network agent list [--host <hostname>]
    $ openstack network agent delete <agent_id>
  11. Delete the nova-compute service for the Compute node to be removed:

    $ openstack compute service delete <node_uuid>
    • Replace <node_uuid> with the UUID of the node to be removed that you retrieved in step 3.
  12. Exit the OpenStackClient pod:

    $ exit
  13. Remove the node from the OpenStackDataPlaneNodeSet CR:

    $ oc patch openstackdataplanenodeset/<node_set_name> --type json --patch '[{ "op": "remove", "path": "/spec/nodes/<node_name>" }]'
    • Replace <node_set_name> with the name of the OpenStackDataPlaneNodeSet CR that the node belongs to.
    • Replace <node_name> with the name of the node defined in the nodes section of the OpenStackDataPlaneNodeSet CR.
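
The --patch argument in the previous step is a JSON Patch document with a single remove operation. A sketch that builds the patch string for a hypothetical node name before passing it to oc patch:

```shell
# Hypothetical node name from the nodes section of the OpenStackDataPlaneNodeSet CR.
node_name="edpm-compute-2"
# Single-operation JSON Patch that removes the node entry from /spec/nodes.
patch="[{ \"op\": \"remove\", \"path\": \"/spec/nodes/${node_name}\" }]"
echo "$patch"
```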
  14. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR to update the node set with the Compute node removed:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.

  15. Add the OpenStackDataPlaneNodeSet CR that you removed the node from:

    spec:
      nodeSets:
        - <nodeSet_name>
  16. Save the OpenStackDataPlaneDeployment CR deployment file.
  17. Deploy the OpenStackDataPlaneDeployment CR to delete the removed nodes:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

    If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit
  18. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                  NODESETS                  STATUS  MESSAGE
    openstack-data-plane  ["openstack-data-plane"]  True    Setup Complete
    
    $ oc get openstackdataplanenodeset -n openstack
    NAME                  STATUS  MESSAGE
    openstack-data-plane  True    NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.

2.4. Removing a node set from the data plane

You can remove a whole node set from the data plane. To remove an OpenStackDataPlaneNodeSet resource, you must perform the following tasks:

  • Stop the ovn and nova-compute containers running on each Compute node in the node set.
  • Disable and delete the nova-compute service from each Compute node in the node set.
  • Delete the network agent from each Compute node in the node set.
  • Remove the SSH host keys of the removed nodes from the nodes in the remaining node sets.
  • Delete the node set and remove the node set from the data plane.

Prerequisites

  • You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
  • The workloads on the node set Compute nodes have been migrated to Compute nodes on another node set.

Procedure

  1. Access the remote shell for the openstackclient pod:

    $ oc rsh -n openstack openstackclient
  2. Retrieve the IP address of each Compute node you want to remove:

    $ openstack hypervisor list
  3. Retrieve a list of your Compute nodes to identify the name and UUID of each node that you want to remove:

    $ openstack compute service list
  4. Disable the nova-compute service on each Compute node to be removed:

    $ openstack compute service set <hostname> nova-compute --disable
    Tip

    Use the --disable-reason option to add a short explanation of why the service is being disabled. This is useful if you intend to redeploy the Compute service.

  5. Exit the OpenStackClient pod:

    $ exit
  6. Perform the following operations on each Compute node to be removed:

    1. SSH into the Compute node to be removed:

      $ ssh -i <key_file_name> cloud-admin@<node_IP_address>
      • Replace <key_file_name> with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes.
      • Replace <node_IP_address> with the IP address for the Compute node that you retrieved in step 2.
    2. Stop the ovn and nova-compute containers:

      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute
    3. Remove the systemd unit files that manage the ovn and nova-compute containers to prevent the agents from being automatically started and registered in the database if the removed node is rebooted:

      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller.service
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent.service
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_compute.service
    4. Disconnect from the Compute node:

      $ exit
  7. Access the remote shell for openstackclient:

    $ oc rsh -n openstack openstackclient
  8. Delete the network agents for each Compute node in the node set:

    $ openstack network agent list [--host <hostname>]
    $ openstack network agent delete <agent_id>
  9. Delete the nova-compute service for each Compute node in the node set:

    $ openstack compute service delete <node_uuid>
    • Replace <node_uuid> with the UUID of the node to be removed that you retrieved in step 3.
  10. Exit the openstackclient pod:

    $ exit
  11. Retrieve the node set definition and search the output of the following command for the string secretHashes to identify the secrets in the node set to be deleted:

    $ oc get openstackdataplanenodeset <node_set_name> -n openstack -o yaml

    The secretHashes field lists all the node set secrets in key-value pair format: <key>:<value>. The following example illustrates the secretHashes format in YAML output:

    secretHashes:
      cert-libvirt-default-compute-4drna21w-0: n68chbfh678h5dfhcfh576h546h566h5c4h5cdh679hffh67h79h98h56fhc4h588h58fhb4h548h59fh554h54fh5cdh646h577hffhbdh569h5f9h68bq
      cert-libvirt-default-compute-4drna21w-1: n68dhdbh65chdbh5f7h695h7h54chcdh654h59ch564h5d6hdch66h54bh66ch556h649h666h76h55hc7h564h65dh5fch5c7h5fbh8bh55hcbh5b5q
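
Because a later step deletes each secret by the key of its secretHashes key-value pair, it can help to extract those keys from the saved output. A sketch using sample data modeled on the example above; the file path and hash values are placeholders:

```shell
# Sample secretHashes block, as it appears in the node set YAML output.
cat > /tmp/secret_hashes.yaml <<'EOF'
secretHashes:
  cert-libvirt-default-compute-4drna21w-0: n68chbfh678hffq
  cert-libvirt-default-compute-4drna21w-1: n68dhdbh65chcbq
EOF

# Print only the keys (the secret names) so each can be passed to `oc delete secret`.
keys=$(awk -F': ' '/^  /{gsub(/^ +/, "", $1); print $1}' /tmp/secret_hashes.yaml)
echo "$keys"
```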
  12. Delete the node set:

    $ oc delete openstackdataplanenodeset/<node_set_name> -n openstack
    • Replace <node_set_name> with the name of the OpenStackDataPlaneNodeSet CR to be deleted.
  13. Delete the node set secrets:

    $ oc delete secret <secret_name>
    • Replace <secret_name> with the key of the secretHashes key-value pair you retrieved in the previous step, for example, cert-libvirt-default-compute-4drna21w-0.
    Note

    You can ensure that secrets created by cert-manager get removed automatically by setting the --enable-certificate-owner-ref flag for the cert-manager Operator for Red Hat OpenShift. For more information, see Deleting a TLS secret automatically upon Certificate removal.

  14. If the node set that you removed included the global ssh-known-hosts service, you must add the ssh-known-hosts service to one of the remaining OpenStackDataPlaneNodeSet CRs listed in the OpenStackDataPlaneDeployment CR. Open the definition file for one of the remaining OpenStackDataPlaneNodeSet CRs on your workstation, and add the ssh-known-hosts service to the services field in the order in which it should run relative to the other services:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: <node_set_name>
    spec:
      services:
        - download-cache
        - bootstrap
        - configure-network
        - validate-network
        - install-os
        - configure-os
        - ssh-known-hosts
        - run-os
        - libvirt
        - nova
        - ovn
        - neutron-metadata
        - telemetry
    Note

    When adding the ssh-known-hosts service to the services list in a node set definition, you must include all the required services, including the default services. If you include only the ssh-known-hosts service in the services list, then that is the only service that is deployed.

  15. Save the updated OpenStackDataPlaneNodeSet CR definition file.
  16. Apply the updated OpenStackDataPlaneNodeSet CR configuration:

    $ oc apply -f <node_set_name>.yaml
  17. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR that removes the node set from the data plane:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.

  18. Add the remaining OpenStackDataPlaneNodeSet CRs in the data plane to the list of node sets to deploy:

    spec:
      nodeSets:
        - <nodeSet_name>
  19. Specify that the OpenStackDataPlaneDeployment CR should only run the ssh-known-hosts service when deploying the listed node sets:

    spec:
      ...
      servicesOverride:
        - ssh-known-hosts
  20. Save the OpenStackDataPlaneDeployment CR deployment file.
  21. Deploy the ssh-known-hosts service to delete the removed nodes from the known hosts lists on the remaining nodes:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

    If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit
  22. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                  NODESETS                  STATUS  MESSAGE
    openstack-data-plane  ["openstack-data-plane"]  True    Setup Complete
    
    $ oc get openstackdataplanenodeset -n openstack
    NAME                  STATUS  MESSAGE
    openstack-data-plane  True    NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.

Chapter 3. Replacing data plane nodes

You can replace pre-provisioned and provisioned data plane nodes without scaling the Red Hat OpenStack Services on OpenShift (RHOSO) cloud.

3.1. Replacing a pre-provisioned node

When you replace a faulty pre-provisioned node, your replacement node has the same hostname and IP address as the faulty node. You create a new OpenStackDataPlaneDeployment CR to deploy the new node.

If your replacement node has a different IP address from the faulty node, you set a new fixedIP for the control plane network for the node and a new ansibleHost for the node in the OpenStackDataPlaneNodeSet CR. Then you create a new OpenStackDataPlaneDeployment CR to deploy the new node.

When you replace a pre-provisioned node, you manually clean up the node you are removing.

For information about errors, data plane conditions and states, and returned statuses that can occur when you modify an OpenStackDataPlaneNodeSet CR, see Modifying an OpenStackDataPlaneNodeSet CR in Customizing the Red Hat OpenStack Services on OpenShift deployment.

Prerequisites

  • You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
  • The workloads on the Compute nodes have been migrated to other Compute nodes.

Procedure

  1. If the faulty node is still reachable, perform clean-up tasks to ensure that the faulty node does not impact the new node.

    1. SSH into the removed node, and stop the ovn and nova-compute containers:

      $ ssh -i <key_file_name> cloud-admin@<node_IP_address>
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent
      [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute
      • Replace <key_file_name> with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes.
      • Replace <node_IP_address> with the IP address for the removed node.
    2. Remove the systemd unit files that manage the ovn and nova-compute containers to prevent the agents from being automatically started and registered in the database if the removed node is rebooted:

      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller.service
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent.service
      [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_compute.service
    3. Disconnect from the node:

      $ exit
  2. If your replacement node has the same IP address as the faulty node, proceed to Step 3 to create the OpenStackDataPlaneDeployment CR to deploy the new node. If your replacement node has a different IP address from the faulty node, open the OpenStackDataPlaneNodeSet CR file for the node set you want to update, for example, openstack_data_plane.yaml, and make the following changes.

    1. Update the fixedIP for the control plane network for the node you are replacing to specify the IP address of the new node:

      Example:

        nodes:
          edpm-compute-0:
            hostName: edpm-compute-0
            networks:
            - name: ctlplane
              subnetName: subnet1
              defaultRoute: true
              fixedIP: 192.168.122.100
            ...
    2. Update the ansibleHost value for the node you are replacing to specify the IP address of the new node:

      Example:

        nodes:
          edpm-compute-0:
            hostName: edpm-compute-0
            ...
            ansible:
              ansibleHost: 192.168.122.100
              ansibleUser: cloud-admin
              ansibleVars:
                fqdn_internal_api: edpm-compute-0.example.com
            ...
    3. Save the OpenStackDataPlaneNodeSet CR file.
    4. Apply the updated OpenStackDataPlaneNodeSet CR configuration:

      $ oc apply -f openstack_data_plane.yaml
    5. Verify that the data plane resource has been updated by confirming that the status is SetupReady:

      $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

      When the status is SetupReady, the command returns a condition met message; otherwise, it returns a timeout error.
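Before applying the updated node set file, a quick local sanity check can confirm that both fields now reference the replacement node's IP address. This is a sketch against an inline sample; in practice, run the grep against your openstack_data_plane.yaml instead of the here-doc (the IP address is illustrative):

```shell
# Count the lines in the node set definition that reference the
# replacement node's IP; both fixedIP and ansibleHost should match.
new_ip="192.168.122.100"
grep -cE "(fixedIP|ansibleHost): ${new_ip}" <<EOF
      fixedIP: 192.168.122.100
      ansibleHost: 192.168.122.100
EOF
```

A count of 2 confirms that both fields were updated.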

  3. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR to deploy the new node:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
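The naming rules above can be checked locally before you create the CR. The following sketch encodes the rules as stated (lower-case alphanumeric characters, hyphen or period, starting and ending with an alphanumeric character); the name shown is illustrative:

```shell
# Validate a proposed OpenStackDataPlaneDeployment CR name against the
# stated naming rules before creating the resource.
name="openstack-data-plane-replace"
if [[ "$name" =~ ^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$ ]]; then
  echo "valid name: $name"
else
  echo "invalid name: $name"
fi
```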
  4. Add the OpenStackDataPlaneNodeSet CR:

    spec:
      nodeSets:
        - <nodeSet_name>
  5. Save the OpenStackDataPlaneDeployment CR deployment file.
  6. Deploy the OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10
  7. Verify that the OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     Setup Complete


    $ oc get openstackdataplanenodeset -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     NodeSet Ready

3.2. Replacing a provisioned node

To replace a faulty provisioned node on the data plane without scaling the Red Hat OpenStack Services on OpenShift (RHOSO) cloud, delete the faulty bare metal host (BMH). The OpenStackBaremetalSet CR is then reconciled to provision a new available BMH and to reset the deployment status of the OpenStackDataPlaneNodeSet CR, prompting you to create a new OpenStackDataPlaneDeployment CR to deploy on the newly provisioned node.

Prerequisites

  • You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
  • The workloads on the Compute nodes have been migrated to other Compute nodes.

Procedure

  1. Verify that you have a spare BMH in the available state that you can use to replace the faulty node:

    $ oc get bmh

    Example output:

    NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
    leaf0-0   available                 false            11h
    leaf0-1   provisioned   nodeset-0   true             11h
    leaf1-0   provisioned   nodeset-1   true             11h
    leaf1-1   provisioned   nodeset-1   true             11h
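The availability check in step 1 can be scripted. The following sketch filters the tabular `oc get bmh` output for hosts in the available state; it is shown here against the captured sample output above, and on a live cluster you would replace the here-doc with `oc get bmh`:

```shell
# Print the names of BareMetalHosts whose STATE column is "available",
# skipping the header line.
awk 'NR > 1 && $2 == "available" {print $1}' <<'EOF'
NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
leaf0-0   available                 false            11h
leaf0-1   provisioned   nodeset-0   true             11h
EOF
```

For the sample above, this prints `leaf0-0`, the spare host that can replace the faulty node.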
  2. Delete the faulty node:

    $ oc delete bmh leaf0-1

    Example output:

    baremetalhost.metal3.io "leaf0-1" deleted

    The OpenStackBaremetalSet CR is reconciled to provision a new available BMH and reset the deployment status of the OpenStackDataPlaneNodeSet, prompting you to create a new OpenStackDataPlaneDeployment CR for deploying on the newly provisioned node.

  3. Wait for the node set that contained the faulty node to reach the SetupReady state:

    $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

    When the status is SetupReady, the command returns a condition met message; otherwise, it returns a timeout error.

  4. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.

      Tip

      Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.

  5. Add the OpenStackDataPlaneNodeSet CR that you modified:

    spec:
      nodeSets:
        - <nodeSet_name>
  6. Save the OpenStackDataPlaneDeployment CR deployment file.
  7. Deploy the modified OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10
  8. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     Setup Complete


    $ oc get openstackdataplanenodeset -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     NodeSet Ready
  9. Verify that the spare BMH has replaced the faulty BMH:

    $ oc get bmh

    Example output:

    NAME      STATE         CONSUMER    ONLINE   ERROR   AGE
    leaf0-0   provisioned   nodeset-0   true             13h
    leaf1-0   provisioned   nodeset-1   true             13h
    leaf1-1   provisioned   nodeset-1   true             13h

Chapter 4. Rebooting data plane nodes

You might need to reboot the nodes on your data plane. The reboot method is determined by the type of node. You can also configure how your data plane handles the reboot process based on the node type.

You can configure how the data plane handles the shut down and restart of instances that are hosted on Compute nodes if the instances are not migrated off the host node before the host Compute node is rebooted.

Prerequisites

  • You are logged on to a workstation that has access to the Red Hat OpenShift Container Platform (RHOCP) cluster as a user with cluster-admin privileges.

Procedure

  1. Open the nova-extra-config.yaml definition file for the default Compute service (nova) ConfigMap custom resource (CR) named nova-extra-config and add the following configuration:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nova-extra-config
      namespace: openstack
    data:
      30-nova-reboot.conf: |
        [DEFAULT]
        resume_guests_state_on_host_boot=true 1
        shutdown_timeout=300 2
    1
    Set to true to return instances to the same state on the Compute node after the reboot. When set to false, the instances remain down and you must start them manually.
    2
    Specify the number of seconds to wait for an instance to perform a controlled, clean shutdown before it is powered off and rebooted. Do not set this value to 0. A value of 0 (zero) means that the instance is powered off immediately, with no opportunity for instance OS clean-up. The default value is 60.
  2. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that reflect the purpose of the node sets in the deployment.

  3. Add the OpenStackDataPlaneNodeSet CRs that configure the Compute nodes on the data plane that run the default nova service:

    spec:
      nodeSets:
        - <nodeSet_name>
    • Replace <nodeSet_name> with the names of the OpenStackDataPlaneNodeSet CRs that you want to include in your data plane deployment.
  4. Save the OpenStackDataPlaneDeployment CR deployment file.
  5. Deploy the modified OpenStackDataPlaneNodeSet CRs:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10
  6. Verify that the modified OpenStackDataPlaneNodeSet CRs are deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     Setup Complete


    $ oc get openstackdataplanenodeset -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in the Deploying Red Hat OpenStack Services on OpenShift guide.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in the Deploying Red Hat OpenStack Services on OpenShift guide.

To reboot Object Storage service (swift) nodes, complete the following steps for every Object Storage node in your cluster. Reboot the nodes one at a time instead of all at the same time to ensure that the cluster remains available while a node is rebooting.

Procedure

  1. Log in to an Object Storage node.
  2. Reboot the node:

    $ sudo reboot
  3. Wait until the node boots.
  4. Repeat the reboot for each Object Storage node in the cluster.

4.3. Rebooting Compute nodes

You might need to reboot the Compute nodes on the Red Hat OpenStack Services on OpenShift (RHOSO) data plane. To ensure minimal downtime of instance workloads in your RHOSO environment, migrate the virtual machine instances hosted on the Compute nodes that you want to reboot to other hosts before you perform the reboot. If you do not migrate the instances off the host Compute nodes before the reboot, how the instances are restarted depends on how the data plane is configured to handle instances on host reboot. For more information about Compute node reboot behavior, see Configuring the data plane Compute node reboot behavior.

Note

When you restart the Compute service (nova) and your Compute host detects a name change, you must discover why the Compute host name changed and correct that condition to return the hostname to the original value. When you correct the issue, you must restart the Compute service. For more information, see Section 4.3.1, “Troubleshooting Compute hostname change detection”.

Note

If you have a Multi-RHEL environment, and you want to migrate virtual machines from a Compute node that is running RHEL 9.4 or 9.6 to a Compute node that is running RHEL 9.2, only cold migration is supported. For more information about cold migration, see Cold migrating an instance in Configuring the Compute service for instance creation.

Procedure

  1. Access the remote shell for the openstackclient pod:

    $ oc rsh -n openstack openstackclient
  2. Retrieve a list of your Compute nodes to identify the host name of the nodes that you want to reboot:

    $ openstack compute service list
  3. Disable the Compute service on each Compute node that you want to reboot so that it does not provision new instances on the nodes:

    $ openstack compute service set <hostname> nova-compute --disable
    • Replace <hostname> with the host name of the Compute node on which you are disabling the service.
  4. List all the instances that are hosted on each Compute node:

    $ openstack server list --host <hostname> --all-projects
  5. Optional: Migrate the instances to another Compute node:

    $ openstack server migrate --live-migration [--host <dest>] <instance> --wait
    • Replace <instance> with the name or ID of the instance to migrate.
    • Optional: Include the --host option and replace <dest> with the name or ID of the destination Compute node. If you do not specify the --host option, the nova-scheduler service selects the destination node for the instance.

    Repeat this migration command for each instance until none remain on the Compute node.
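Rather than repeating the command by hand, the per-instance migration commands can be generated from the server listing. The following sketch runs against illustrative captured output; on a live system, replace the here-doc with `openstack server list --host <hostname> --all-projects -f value -c ID`:

```shell
# Emit one live-migration command per instance ID; pipe the result to
# `sh` (or review it first) to drain the Compute node.
awk '{print "openstack server migrate --live-migration " $1 " --wait"}' <<'EOF'
instance-uuid-1
instance-uuid-2
EOF
```

The instance IDs above are placeholders; the pattern is the same for any number of instances on the node.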

  6. Exit the openstackclient pod:

    $ exit
  7. Create an OpenStackDataPlaneDeployment CR to reboot the nodes, and save it to a file named compute_reboot_nodes_deploy.yaml on your workstation:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: compute-node-reboot
      namespace: openstack
    spec:
      nodeSets:
      - <nodeSet_name>
      - ...
      - <nodeSet_name>
      servicesOverride:
      - reboot-os
      ansibleExtraVars:
        edpm_reboot_strategy: force
      ansibleLimit: <node_hostname>,...,<node_hostname>
    • Replace <nodeSet_name> with the names of the OpenStackDataPlaneNodeSet CRs that contain the nodes that you are rebooting.
    • Replace <node_hostname> with the names of the nodes in the node set to reboot. If ansibleLimit is not set, all the nodes in the node set are rebooted at the same time.
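The ansibleLimit value is a single comma-separated string. Given the node hostnames as a space-separated list, it can be assembled as follows (the hostnames are illustrative):

```shell
# Join a space-separated list of node hostnames into the
# comma-separated form that ansibleLimit expects.
hosts="edpm-compute-0 edpm-compute-1 edpm-compute-2"
ansible_limit=$(printf '%s' "$hosts" | tr ' ' ',')
echo "ansibleLimit: ${ansible_limit}"
```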
  8. Deploy the data plane:

    $ oc create -f compute_reboot_nodes_deploy.yaml -n openstack
  9. Verify that the compute-node-reboot deployment completed:

    $ oc get openstackdataplanedeployment
    NAME                  STATUS   MESSAGE
    compute-node-reboot   True     Setup complete

    If the deployment fails, see Troubleshooting data plane creation and deployment in the Deploying Red Hat OpenStack Services on OpenShift guide.

  10. Access the remote shell for the openstackclient pod:

    $ oc rsh -n openstack openstackclient
  11. Re-enable the Compute service on each rebooted Compute node:

    $ openstack compute service set <hostname> nova-compute --enable
  12. Confirm that the Compute service on each Compute node is enabled:

    $ openstack compute service list
  13. If you did not migrate the instances that are hosted on the Compute node before reboot, then check that the instances have restarted and are in the correct state:

    $ openstack server list --host <host> --all-projects
  14. Exit the openstackclient pod:

    $ exit

4.3.1. Troubleshooting Compute hostname change detection

If you start the Compute service (nova) and it detects that the Compute host name has changed, you receive the following error message:

Found 7 instances on hypervisor but no ComputeNode record exists with host compute01.example.com node compute01.example.com; potential rename detected - aborting startup

To recover from this error, you must determine why the hostname changed. The following are three possible reasons for the hostname change:

  • You have changed the name of the Compute host.
  • The DNS is not working correctly and has assigned a different name to the Compute host.
  • There is another deployment issue causing a Compute hostname to be incorrect.
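One quick way to confirm the mismatch described above is to compare the node's current FQDN with the hostname in the error message. The values below are illustrative; on the node itself you would set current to the output of `hostname -f`:

```shell
# Compare the node's current FQDN with the hostname recorded in the
# error message; a mismatch indicates the rename that aborts startup.
current="compute01.localdomain"      # illustrative; use "$(hostname -f)" on the node
expected="compute01.example.com"     # the hostname from the error message
if [ "$current" = "$expected" ]; then
  echo "hostname unchanged"
else
  echo "hostname mismatch: $current (expected $expected)"
fi
```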

When you resolve the issue, you must restart the Compute service.

Note

The nova-libvirt container contains information about the hostname. If you resolve an issue with a Compute hostname, restart both the nova-libvirt and nova-compute containers to ensure that all components contain the correct name.

For further assistance, contact the Red Hat Technical Support Team.

You can remove a Red Hat OpenStack Services on OpenShift (RHOSO) deployment from the Red Hat OpenShift Container Platform (RHOCP) environment if you no longer require the RHOSO deployment. You can also remove any Operators that were installed on your RHOCP cluster for use only by the RHOSO deployment.

Note

To create a new RHOSO deployment with the same data plane nodes after RHOSO is removed from the RHOCP environment, you must reprovision the nodes.

You can remove the resources created for your Red Hat OpenStack Services on OpenShift (RHOSO) deployment from the Red Hat OpenShift Container Platform (RHOCP) environment if you no longer require the RHOSO deployment.

Prerequisites

  • You are logged on to a workstation that has access to the Red Hat OpenShift Container Platform (RHOCP) cluster as a user with cluster-admin privileges.

Procedure

  1. Delete all the OpenStackDataPlaneDeployment objects in the RHOSO namespace:

    $ oc delete OpenStackDataPlaneDeployment --all -n openstack
  2. Delete all the OpenStackDataPlaneNodeSet objects in the RHOSO namespace:

    $ oc delete OpenStackDataPlaneNodeSet --all -n openstack
  3. Delete all the OpenStackDataPlaneService objects in the RHOSO namespace:

    $ oc delete OpenStackDataPlaneService --all -n openstack
  4. Delete the OpenStackControlPlane object from the RHOSO namespace:

    $ oc delete openstackcontrolplane \
     -l core.openstack.org/openstackcontrolplane -n openstack
  5. Confirm that all the deployment pods are deleted:

    $ oc get pods -n openstack
  6. Optional: Delete all Persistent Volume Claims (PVCs) from the RHOSO namespace:

    $ oc delete --all PersistentVolumeClaim -n openstack
    Tip

    Create a backup of your PVCs before deletion if you need to reuse them. For information about how to create a PVC backup, see the documentation for the storage backend used in your RHOSO environment.

  7. Optional: Release the PV to make it available for other applications:

    $ oc patch PersistentVolume <pv_name> -p '{"spec":{"claimRef": null}}'
    Note

    If you intend to re-use the released PV, ensure that the PV is cleaned before use. For more information, see Lifecycle of a volume and claim in the RHOCP Storage guide.

  8. Delete the Secret objects that contain the certificates issued for the control plane services and the data plane nodes when the RHOSO environment was deployed:

    $ oc delete secret -l service-cert -n openstack
    $ oc delete secret -l ca-cert -n openstack
    $ oc delete secret -l osdp-service -n openstack

    If the above commands return the message "No resources found", then the certificate secrets were automatically deleted when you deleted the control plane and data plane resources. For information about configuring the cert-manager Operator before RHOSO deployment to automatically delete the certificate secrets, see Deleting a TLS secret automatically upon Certificate removal in the RHOCP Security and Compliance guide.

  9. Verify that the resources have been deleted from the namespace:

    $ oc get all
    No resources found.
  10. Optional: Delete the namespace:

    $ oc delete namespace openstack
    Note

    You do not need to delete the namespace if you plan to create a new RHOSO deployment in the same namespace.

    If the namespace deletion is stuck in the Terminating state:

    1. Check if any remaining objects have a finalizer:

      $ oc get $(oc api-resources|grep openstack.org|cut -d" " -f1 |paste -sd "," -),all -o custom-columns=Kind:.kind,Name:.metadata.name,Finalizers:.metadata.finalizers -n <namespace>
    2. Repeat the following command for each remaining object to remove the object finalizers and unblock deletion of each object:

      $ oc patch -n <namespace> <object-name> -p '{"metadata":{"finalizers":[]}}' --type=merge

You can remove any Operators that were installed on your Red Hat OpenShift Container Platform (RHOCP) cluster for use only by the Red Hat OpenStack Services on OpenShift (RHOSO) deployment. For more information about how to remove an Operator and uninstall all the resources associated with the Operator, see Deleting Operators from a cluster in the RHOCP Operators guide.

To perform maintenance on your Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must shut down and start up the Red Hat OpenShift Container Platform (RHOCP) cluster and all the data plane nodes in a specific order to minimize issues when you restart your cluster and data plane nodes.

Prerequisites

  • An operational RHOSO environment.
  • You are logged on to a workstation that has access to the RHOSO control plane as a user with cluster-admin privileges.
  • The oc command line tool is installed on the workstation.

6.1. RHOSO deployment shutdown order

To shut down the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must shut down the instances that host the workloads, the data plane nodes, and the Red Hat OpenShift Container Platform (RHOCP) cluster nodes in the following order:

  1. Shut down instances hosted on the Compute nodes on the data plane.
  2. If your data plane includes hyperconverged infrastructure (HCI) nodes, shut down the Red Hat Ceph Storage cluster.
  3. Shut down Compute nodes.
  4. Shut down the RHOCP cluster nodes.

To shut down the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must first shut down all instances hosted on Compute nodes before shutting down the Compute nodes.

Procedure

  1. Access the remote shell for the OpenStackClient pod from your workstation:

    $ oc rsh -n openstack openstackclient
  2. List all running instances:

    $ openstack server list --all-projects
  3. Stop each instance:

    $ openstack server stop <instance_UUID>

    Repeat this step for each instance until you stop all running instances.
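The per-instance stop commands can also be generated from the server listing instead of being typed one at a time. The sketch below runs against illustrative captured output; on a live system, replace the here-doc with `openstack server list --all-projects -f value -c ID -c Status`:

```shell
# Emit a `server stop` command for every instance whose status is
# ACTIVE; already-stopped instances are skipped.
awk '$2 == "ACTIVE" {print "openstack server stop " $1}' <<'EOF'
uuid-aaa ACTIVE
uuid-bbb SHUTOFF
EOF
```

Review the generated commands, then run them inside the OpenStackClient pod.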

  4. Exit the OpenStackClient pod:

    $ exit

If your data plane includes hyperconverged infrastructure (HCI) nodes, shut down the Red Hat Ceph Storage cluster. For more information about how to shut down the Red Hat Ceph Storage cluster, see "Powering down and rebooting the cluster using the Ceph Orchestrator" in the Red Hat Ceph Storage Administration Guide.

6.4. Shutting down Compute nodes

As a part of shutting down the Red Hat OpenStack Services on OpenShift (RHOSO) environment, log in to and shut down each Compute node. For information about safely powering off bare-metal hosts, see Powering off bare-metal hosts in the RHOCP Scalability and performance guide.

Prerequisites

  • You have stopped all instances hosted on the Compute nodes.

Procedure

  1. Retrieve a list of the Compute nodes:

    $ oc rsh -n openstack openstackclient openstack compute service list
  2. If your Compute nodes are bare-metal nodes that were provisioned when deploying the data plane, annotate the BareMetalHost custom resource (CR) for each Compute node to prevent the Cluster Baremetal Operator (CBO) from rebooting the node:

    $ oc annotate bmh <compute_node_name> -n openshift-machine-api 'baremetalhost.metal3.io/detached=""'
  3. Log in as the root user to a Compute node and shut down the node:

    # shutdown -h now

    Repeat this step for each Compute node until you shut down all Compute nodes.

  4. Verify that the Compute nodes are shut down:

    $ oc rsh -n openstack openstackclient openstack hypervisor list -c ID -c State
    +--------------------------------------+-------+
    | ID                                   | State |
    +--------------------------------------+-------+
    | 756968fd-272e-48d2-8f4a-54ef772b2acb | down  |
    +--------------------------------------+-------+
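This verification can be scripted as a single pass/fail check. The sketch below runs against illustrative captured output; on a live system, replace the here-doc with `oc rsh -n openstack openstackclient openstack hypervisor list -f value -c State`:

```shell
# Exit the awk filter with status 1 if any hypervisor reports a state
# other than "down", so the if-branch reflects the overall result.
if awk '$1 != "down" {exit 1}' <<'EOF'
down
down
EOF
then
  echo "all hypervisors down"
else
  echo "some hypervisors still up"
fi
```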

6.5. Shutting down the RHOCP cluster

As a part of shutting down the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must shut down the Red Hat OpenShift Container Platform (RHOCP) cluster that hosts the RHOSO environment. For information about how to shut down a RHOCP cluster, see Shutting down the cluster gracefully in the RHOCP Backup and restore guide.

6.6. RHOSO deployment startup order

To start the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must start the Red Hat OpenShift Container Platform (RHOCP) cluster and data plane nodes in the following order:

  1. Start the RHOCP cluster.
  2. If your data plane includes hyperconverged infrastructure (HCI) nodes, start up the Red Hat Ceph Storage cluster.
  3. Start Compute nodes.
  4. Start instances on the Compute nodes.

6.7. Starting the RHOCP cluster

As a part of starting up the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must start the Red Hat OpenShift Container Platform (RHOCP) cluster that hosts the RHOSO environment. For information about how to start up a RHOCP cluster, see Restarting the cluster gracefully in the RHOCP Backup and restore guide.

If your data plane includes hyperconverged infrastructure (HCI) nodes, you must use the cephadm utility to unset the noout, norecover, norebalance, nobackfill, nodown, and pause flags. For more information about how to start the Red Hat Ceph Storage cluster, see "Powering down and rebooting the cluster using the Ceph Orchestrator" in the Red Hat Ceph Storage Administration Guide.

6.9. Starting Compute nodes

As a part of starting the Red Hat OpenStack Services on OpenShift (RHOSO) environment, power on each Compute node and check the services on the node.

Prerequisites

  • The Compute nodes are powered down.

Procedure

  1. Power on each Compute node.
  2. Re-attach Compute nodes that you detached from the Cluster Baremetal Operator (CBO) during shut down:

    $ oc annotate bmh <compute_node_name> -n openshift-machine-api 'baremetalhost.metal3.io/detached-'

Verification

  1. Log in to each Compute node as the root user.
  2. Check the services on the Compute node:

    $ systemctl -t service

6.10. Starting instances on Compute nodes

As a part of starting the Red Hat OpenStack Services on OpenShift (RHOSO) environment, start the instances on the Compute nodes.

Procedure

  1. Access the remote shell for the OpenStackClient pod from your workstation:

    $ oc rsh -n openstack openstackclient
  2. List all the instances:

    $ openstack server list --all-projects
  3. Start an instance:

    $ openstack server start <instance_UUID>

    Repeat this step for each instance until you start all the instances.

  4. Exit the OpenStackClient pod:

    $ exit

Legal Notice

Copyright © Red Hat.
Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution–Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.
The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.
All other trademarks are the property of their respective owners.