Chapter 11. Performing a minor update of the RHOSP overcloud with director Operator
A minor update of your Red Hat OpenStack Platform (RHOSP) environment involves updating the RPM packages and containers on the overcloud nodes. You might also need to update the configuration of some services. The data plane and control plane are fully available during the minor update. You must complete each of the following steps to update your RHOSP environment:
- Prepare your RHOSP environment for the minor update.
- Optional: Update the ovn-controller container.
- Update Controller nodes and composable nodes that contain Pacemaker services.
- Update Compute nodes.
- Update Red Hat Ceph Storage nodes.
- Update the Red Hat Ceph Storage cluster.
- Reboot the overcloud nodes.
Prerequisites
- You have a backup of your RHOSP deployment. For more information, see Backing up and restoring a director Operator deployed overcloud.
11.1. Preparing director Operator for a minor update
To prepare your Red Hat OpenStack Platform (RHOSP) environment to perform a minor update with director Operator (OSPdO), complete the following tasks:
- Update the openstackclient pod.
- Lock the RHOSP environment to a Red Hat Enterprise Linux (RHEL) release.
- Update RHOSP repositories.
- Update the container image preparation file.
- Disable fencing in the overcloud.
11.1.1. Updating the openstackclient pod
Update the openstackclient pod container image to use the correct director heat templates and Ansible roles.
Procedure
Change to the openstack project:

$ oc project openstack

Edit the CSV file:

$ oc edit csv

Update the following values to the new RHOSP minor version:

OPENSTACKCLIENT_IMAGE_URL_DEFAULT
HEAT_API_IMAGE_URL_DEFAULT
HEAT_ENGINE_IMAGE_URL_DEFAULT
MARIADB_IMAGE_URL_DEFAULT
RABBITMQ_IMAGE_URL_DEFAULT

Delete any existing ephemeral Heat instances:

$ oc delete openstackephemeralheat --all

Remove the current imageURL from the openstackclient custom resource to update the pod to the new image:

$ oc patch openstackclient -n openstack openstackclient --type=json -p="[{'op': 'remove', 'path': '/spec/imageURL'}]"
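To confirm that the openstackclient pod restarted with the new default image, you can inspect the image reference on the running pod. This is an optional sketch, not part of the official procedure; it assumes the pod is named openstackclient:

$ oc get pod openstackclient -n openstack -o jsonpath='{.spec.containers[*].image}'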
11.1.2. Locking the RHOSP environment to a RHEL release
Red Hat OpenStack Platform (RHOSP) 17.1 is supported on Red Hat Enterprise Linux (RHEL) 9.2. Before you perform the update, lock the overcloud repositories to the RHEL 9.2 release to avoid upgrading the operating system to a newer minor release.
Procedure
Copy the overcloud subscription management environment file, rhsm.yaml, to the openstackclient pod:

$ oc cp rhsm.yaml openstackclient:/home/cloud-admin/rhsm.yaml

Access the remote shell for the openstackclient pod:

$ oc rsh openstackclient

Open the rhsm.yaml file and check if your subscription management configuration includes the rhsm_release parameter. If the rhsm_release parameter is not present, add it and set it to 9.2:

parameter_defaults:
  RhsmVars:
    …
    rhsm_username: "myusername"
    rhsm_password: "p@55w0rd!"
    rhsm_org_id: "1234567"
    rhsm_pool_ids: "1a85f9223e3d5e43013e3d6e8ff506fd"
    rhsm_method: "portal"
    rhsm_release: "9.2"

Save the rhsm.yaml file.

Create a playbook named set_release.yaml that contains a task to lock the operating system version to RHEL 9.2 on all nodes:

- hosts: all
  gather_facts: false
  tasks:
    - name: set release to 9.2
      command: subscription-manager release --set=9.2
      become: true

Run the set_release.yaml playbook on the openstackclient pod:

$ ansible-playbook -i /home/cloud-admin/ctlplane-ansible-inventory /home/cloud-admin/set_release.yaml --limit Controller,Compute

Use the --limit option to apply the content to all RHOSP nodes. Do not run this playbook against Red Hat Ceph Storage nodes because you might have a different subscription for these nodes.

Note: To manually lock a node to a version, log in to the node and run the subscription-manager release command:

$ sudo subscription-manager release --set=9.2

Exit the remote shell for the openstackclient pod:

$ exit
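As an optional spot check, you can confirm the release lock from the openstackclient pod with an ad hoc Ansible command against the same inventory; this sketch is not part of the official procedure:

$ ansible -i /home/cloud-admin/ctlplane-ansible-inventory Controller,Compute -b -m command -a "subscription-manager release --show"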
11.1.3. Updating RHOSP repositories
Update your repositories to use Red Hat OpenStack Platform (RHOSP) 17.1.
Procedure
Open the rhsm.yaml file and update the rhsm_repos parameter to the correct repository versions:

parameter_defaults:
  RhsmVars:
    rhsm_repos:
      - rhel-9-for-x86_64-baseos-e4s-rpms
      - rhel-9-for-x86_64-appstream-e4s-rpms
      - rhel-9-for-x86_64-highavailability-e4s-rpms
      - openstack-17.1-for-rhel-9-x86_64-rpms
      - fast-datapath-for-rhel-9-x86_64-rpms

Save the rhsm.yaml file.

Access the remote shell for the openstackclient pod:

$ oc rsh openstackclient

Create a playbook named update_rhosp_repos.yaml that contains a task to set the repositories to RHOSP 17.1 on all nodes:

- hosts: all
  gather_facts: false
  tasks:
    - name: change osp repos
      command: subscription-manager repos --enable=openstack-17.1-for-rhel-9-x86_64-rpms
      become: true

Run the update_rhosp_repos.yaml playbook on the openstackclient pod:

$ ansible-playbook -i /home/cloud-admin/ctlplane-ansible-inventory /home/cloud-admin/update_rhosp_repos.yaml --limit Controller,Compute

Use the --limit option to apply the content to all RHOSP nodes. Do not run this playbook against Red Hat Ceph Storage nodes because they use a different subscription.

Create a playbook named update_ceph_repos.yaml that contains a task to set the repositories to RHOSP 17.1 on all Red Hat Ceph Storage nodes:

- hosts: all
  gather_facts: false
  tasks:
    - name: change ceph repos
      command: subscription-manager repos --enable=openstack-17.1-deployment-tools-for-rhel-9-x86_64-rpms
      become: true

Run the update_ceph_repos.yaml playbook on the openstackclient pod:

$ ansible-playbook -i /home/cloud-admin/ctlplane-ansible-inventory /home/cloud-admin/update_ceph_repos.yaml --limit CephStorage

Use the --limit option to apply the content to Red Hat Ceph Storage nodes.

Exit the remote shell for the openstackclient pod:

$ exit
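Similarly, you can list the enabled repositories on the updated nodes to confirm the change; an optional sketch using the same inventory:

$ ansible -i /home/cloud-admin/ctlplane-ansible-inventory Controller,Compute -b -m command -a "subscription-manager repos --list-enabled"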
11.1.4. Updating the container image preparation file
The container preparation file contains the ContainerImagePrepare parameter. You use this file to define the rules for obtaining container images for the overcloud.
Before you update your environment, check the file to ensure that you obtain the correct image versions.
Procedure
Edit the container preparation file. The default name for this file is containers-prepare-parameter.yaml.

Ensure the tag parameter is set to 17.1 for each rule set:

parameter_defaults:
  ContainerImagePrepare:
    - push_destination: false
      set:
        ...
        tag: '17.1'
        tag_from_label: '{version}-{release}'

Note: If you do not want to use a specific tag for the update, such as 17.1 or 17.1.1, remove the tag key-value pair and specify tag_from_label only. The tag_from_label tag uses the installed Red Hat OpenStack Platform (RHOSP) version to determine the value for the tag to use as part of the update process. For more information about version tagging, see Guidelines for container image tagging in Customizing your Red Hat OpenStack Platform deployment.

Save the containers-prepare-parameter.yaml file.
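For reference, a rule set that omits the tag key-value pair and relies on tag_from_label only might look like the following sketch, where the ellipsis stands for the other set values shown above:

parameter_defaults:
  ContainerImagePrepare:
    - push_destination: false
      set:
        ...
        tag_from_label: '{version}-{release}'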
11.1.5. Disabling fencing in the overcloud
Before you update the overcloud, ensure that fencing is disabled.
If fencing is deployed in your environment, the overcloud might detect certain nodes as disabled during the Controller node update process and attempt fencing operations, which can cause unintended results.
If you have enabled fencing in the overcloud, you must temporarily disable fencing for the duration of the update.
Procedure
Access the remote shell for the openstackclient pod:

$ oc rsh openstackclient

Log in to a Controller node and run the Pacemaker command to disable fencing:

$ ssh <controller-0.ctlplane> "sudo pcs property set stonith-enabled=false"

Replace <controller-0.ctlplane> with the name of your Controller node.

Exit the remote shell for the openstackclient pod:

$ exit
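To confirm that fencing is disabled before you continue, you can query the cluster property from the same remote shell; a sketch, noting that recent pcs releases use pcs property config while older releases use pcs property show:

$ ssh <controller-0.ctlplane> "sudo pcs property config stonith-enabled"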
11.2. Running the overcloud update preparation for director Operator
To prepare the overcloud for the update process, generate an update prepare configuration, which creates updated Ansible playbooks and prepares the nodes for the update.
Procedure
Create a file on your workstation named osconfiggenerator-update-prepare.yaml to define the OpenStackConfigGenerator resource:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackConfigGenerator
metadata:
  name: "update"
  namespace: openstack
spec:
  gitSecret: git-secret
  enableFencing: false
  heatEnvs:
    - lifecycle/update-prepare.yaml
    - ssl/tls-endpoints-public-dns.yaml
    - ssl/enable-tls.yaml
  heatEnvConfigMap: heat-env-config-update
  tarballConfigMap: tripleo-tarball-config-update

Under heatEnvs, list all the heat environment files, including any custom heat environment files that you created for your deployment.

Apply the configuration:

$ oc apply -f osconfiggenerator-update-prepare.yaml

Wait until the update preparation process completes.
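You can watch the OpenStackConfigGenerator resource to follow the preparation; a minimal sketch, noting that the exact status output depends on your OSPdO version:

$ oc get openstackconfiggenerator update -n openstack -w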
11.3. Updating the ovn-controller container on all overcloud servers
If you deployed your overcloud with the Modular Layer 2 Open Virtual Network mechanism driver (ML2/OVN), update the ovn-controller container to the latest Red Hat OpenStack Platform (RHOSP) 17.1 version. The update occurs on every overcloud server that runs the ovn-controller container.
This procedure updates the ovn-controller containers on Compute nodes before it updates the ovn-northd service on Controller nodes. If you accidentally update the ovn-northd service first, you might not be able to reach your virtual machine instances or create new instances or virtual networks. Following this procedure restores connectivity.
Procedure
Create an OpenStackDeploy custom resource (CR) named osdeploy-ovn-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: ovn-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: external-update
  advancedSettings:
    tags:
      - ovn

Apply the updated configuration:

$ oc apply -f osdeploy-ovn-update.yaml

Wait until the ovn-controller container update completes.
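To follow the progress of the deployment resource, you can describe it; a sketch, noting that the status fields vary by OSPdO version:

$ oc describe openstackdeploy ovn-update -n openstack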
11.4. Updating all Controller nodes
Update all the Controller nodes to the latest Red Hat OpenStack Platform (RHOSP) 17.1 version.
Procedure
Create an OpenStackDeploy custom resource (CR) named osdeploy-controller-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: controller-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: update
  advancedSettings:
    limit: Controller

Apply the updated configuration:

$ oc apply -f osdeploy-controller-update.yaml

Wait until the Controller node update completes.
11.5. Updating all Compute nodes
Update all Compute nodes to the latest Red Hat OpenStack Platform (RHOSP) 17.1 version. To update Compute nodes, create an OpenStackDeploy custom resource (CR) with the limit: Compute option to restrict operations only to the Compute nodes.
Procedure
Create an OpenStackDeploy CR named osdeploy-compute-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: compute-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: update
  advancedSettings:
    limit: Compute

Apply the updated configuration:

$ oc apply -f osdeploy-compute-update.yaml

Wait until the Compute node update completes.
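If you prefer to update Compute nodes in batches, the limit value is passed through to Ansible, so an Ansible-style slice pattern might work; treat the following CR as a hypothetical sketch and verify the slice syntax against your OSPdO version:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: compute-update-batch0
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: update
  advancedSettings:
    # hypothetical: Ansible slice notation selects the first ten Compute hosts
    limit: 'Compute[0:9]'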
11.6. Updating all HCI Compute nodes
Update the Hyperconverged Infrastructure (HCI) Compute nodes to the latest Red Hat OpenStack Platform (RHOSP) 17.1 version. To update the HCI Compute nodes, create an OpenStackDeploy custom resource (CR) with the limit: ComputeHCI option to restrict operations to only the HCI nodes. You must also create an OpenStackDeploy CR with the mode: external-update and tags: ["ceph"] options to perform an update to a containerized Red Hat Ceph Storage 4 cluster.
Procedure
Create an OpenStackDeploy CR named osdeploy-computehci-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: computehci-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: update
  advancedSettings:
    limit: ComputeHCI

Apply the updated configuration:

$ oc apply -f osdeploy-computehci-update.yaml

Wait until the ComputeHCI node update completes.

Create an OpenStackDeploy CR named osdeploy-ceph-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: ceph-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: external-update
  advancedSettings:
    tags:
      - ceph

Apply the updated configuration:

$ oc apply -f osdeploy-ceph-update.yaml

Wait until the Red Hat Ceph Storage node update completes.
11.7. Updating all Red Hat Ceph Storage nodes
Update the Red Hat Ceph Storage nodes to the latest Red Hat OpenStack Platform (RHOSP) 17.1 version.
RHOSP 17.1 is supported on RHEL 9.2. However, hosts that are mapped to the CephStorage role update to the latest major RHEL release. For more information, see Red Hat Ceph Storage: Supported configurations.
Procedure
Create an OpenStackDeploy custom resource (CR) named osdeploy-cephstorage-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: cephstorage-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: update
  advancedSettings:
    limit: CephStorage

Apply the updated configuration:

$ oc apply -f osdeploy-cephstorage-update.yaml

Wait until the Red Hat Ceph Storage node update completes.

Create an OpenStackDeploy CR named osdeploy-ceph-update.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: ceph-update
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: external-update
  advancedSettings:
    tags:
      - ceph

Apply the updated configuration:

$ oc apply -f osdeploy-ceph-update.yaml

Wait until the Red Hat Ceph Storage node update completes.
11.8. Updating the Red Hat Ceph Storage cluster
Update the director-deployed Red Hat Ceph Storage cluster to the latest version that is compatible with Red Hat OpenStack Platform (RHOSP) 17.1 by using the cephadm Orchestrator.
This procedure uses cephadm to upgrade your deployment. If you are using pre-provisioned nodes, cephadm is available by default on the first Controller node. To access the cephadm shell on the other Controller nodes, install cephadm on them manually.
For more information about installing cephadm, see the Red Hat Ceph Storage 6 Installation Guide.
Procedure
Access the remote shell for the openstackclient pod:

$ oc rsh openstackclient

Log in to the first Controller node:

$ ssh <controller-0.ctlplane>

Replace <controller-0.ctlplane> with the name of the first Controller node in your deployment.

Upgrade your Red Hat Ceph Storage cluster by using cephadm. For more information, see Upgrade a Red Hat Ceph Storage cluster using cephadm in the Red Hat Ceph Storage 6 Upgrade Guide.

Exit the remote shell for the openstackclient pod:

$ exit
11.9. Performing online database updates
Some overcloud components require an online update or migration of their database tables. Online database updates apply to the following components:
- Block Storage service (cinder)
- Compute service (nova)
Procedure
Create an OpenStackDeploy custom resource (CR) named osdeploy-online-migration.yaml:

apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackDeploy
metadata:
  name: online-migration
spec:
  configVersion: <config_version>
  configGenerator: update
  mode: external-update
  advancedSettings:
    tags:
      - online_upgrade

Apply the updated configuration:

$ oc apply -f osdeploy-online-migration.yaml
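If you want to see whether any Compute service data migrations remain, a hypothetical spot check is to run nova-manage inside the nova_api container on a Controller node; nova_api is the default TripleO container name and might differ in your deployment:

$ ssh <controller-0.ctlplane> "sudo podman exec nova_api nova-manage db online_data_migrations"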
11.10. Re-enabling fencing in the overcloud
After you update to the latest Red Hat OpenStack Platform (RHOSP) 17.1 version, re-enable fencing in the overcloud.
Procedure
Access the remote shell for the openstackclient pod:

$ oc rsh openstackclient

Log in to a Controller node and run the Pacemaker command to enable fencing:

$ ssh <controller-0.ctlplane> "sudo pcs property set stonith-enabled=true"

Replace <controller-0.ctlplane> with the name of your Controller node.

Exit the remote shell for the openstackclient pod:

$ exit
11.11. Rebooting the overcloud
After you perform a minor Red Hat OpenStack Platform (RHOSP) update to the latest 17.1 version, reboot your overcloud. The reboot refreshes the nodes with any associated kernel, system-level, and container component updates. These updates provide performance and security benefits. Plan downtime to perform the reboot procedures.
Use the following guidance to understand how to reboot different node types:
- If you reboot all nodes in one role, reboot each node individually. If you reboot all nodes in a role simultaneously, service downtime can occur during the reboot operation.
Complete the reboot procedures on the nodes in the following order:
11.11.1. Rebooting Controller and composable nodes
Reboot Controller nodes and standalone nodes based on composable roles, and exclude Compute nodes and Ceph Storage nodes.
Procedure
- Log in to the node that you want to reboot.
Optional: If the node uses Pacemaker resources, stop the cluster:

[tripleo-admin@overcloud-controller-0 ~]$ sudo pcs cluster stop

Reboot the node:

[tripleo-admin@overcloud-controller-0 ~]$ sudo reboot

Wait until the node boots.

Verification

Verify that the services are enabled.

If the node uses Pacemaker services, check that the node has rejoined the cluster:

[tripleo-admin@overcloud-controller-0 ~]$ sudo pcs status

If the node uses Systemd services, check that all services are enabled:

[tripleo-admin@overcloud-controller-0 ~]$ sudo systemctl status

If the node uses containerized services, check that all containers on the node are active:

[tripleo-admin@overcloud-controller-0 ~]$ sudo podman ps
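To narrow the container check to potential problems, you can filter for containers that exited; an optional sketch using standard podman filters:

[tripleo-admin@overcloud-controller-0 ~]$ sudo podman ps -a --filter status=exited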
11.11.2. Rebooting a Ceph Storage (OSD) cluster
Complete the following steps to reboot a cluster of Ceph Storage (OSD) nodes.
Prerequisites
On a Ceph Monitor or Controller node that is running the ceph-mon service, check that the Red Hat Ceph Storage cluster status is healthy and the pg status is active+clean:

$ sudo cephadm shell -- ceph status

If the Ceph cluster is healthy, it returns a status of HEALTH_OK.

If the Ceph cluster status is unhealthy, it returns a status of HEALTH_WARN or HEALTH_ERR. For troubleshooting guidance, see the Red Hat Ceph Storage 5 Troubleshooting Guide or the Red Hat Ceph Storage 6 Troubleshooting Guide.
Procedure
Log in to a Ceph Monitor or Controller node that is running the ceph-mon service, and disable Ceph Storage cluster rebalancing temporarily:

$ sudo cephadm shell -- ceph osd set noout
$ sudo cephadm shell -- ceph osd set norebalance

Note: If you have a multistack or distributed compute node (DCN) architecture, you must specify the Ceph cluster name when you set the noout and norebalance flags. For example:

$ sudo cephadm shell -c /etc/ceph/<cluster>.conf -k /etc/ceph/<cluster>.client.keyring

Select the first Ceph Storage node that you want to reboot and log in to the node.

Reboot the node:

$ sudo reboot

Wait until the node boots.

Log in to the node and check the Ceph cluster status:

$ sudo cephadm shell -- ceph status

Check that the pgmap reports all pgs as normal (active+clean).

Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.

When complete, log in to a Ceph Monitor or Controller node that is running the ceph-mon service and enable Ceph cluster rebalancing:

$ sudo cephadm shell -- ceph osd unset noout
$ sudo cephadm shell -- ceph osd unset norebalance

Note: If you have a multistack or distributed compute node (DCN) architecture, you must specify the Ceph cluster name when you unset the noout and norebalance flags. For example:

$ sudo cephadm shell -c /etc/ceph/<cluster>.conf -k /etc/ceph/<cluster>.client.keyring

Perform a final status check to verify that the cluster reports HEALTH_OK:

$ sudo cephadm shell -- ceph status
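If you have many Ceph Storage nodes, you can script the one-node-at-a-time loop. The following bash sketch runs on a Ceph Monitor or Controller node, assumes a hypothetical ceph_nodes.txt file that lists the storage node host names with passwordless SSH access, and waits for HEALTH_OK before it moves to the next node:

# Reboot Ceph Storage nodes one at a time, waiting for the cluster
# to report HEALTH_OK between reboots.
while read -r node; do
  ssh "$node" 'sudo reboot' || true   # the SSH session drops when the node reboots
  sleep 120                           # allow time for the node to go down and come back
  until sudo cephadm shell -- ceph health | grep -q HEALTH_OK; do
    echo "Waiting for the cluster to recover after rebooting $node ..."
    sleep 30
  done
done < ceph_nodes.txt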
11.11.3. Rebooting Compute nodes
To ensure minimal downtime of instances in your Red Hat OpenStack Platform environment, the Migrating instances workflow outlines the steps you must complete to migrate instances from the Compute node that you want to reboot.
Migrating instances workflow
- Decide whether to migrate instances to another Compute node before rebooting the node.
- Select and disable the Compute node that you want to reboot so that it does not provision new instances.
- Migrate the instances to another Compute node.
- Reboot the empty Compute node.
- Enable the empty Compute node.
Prerequisites
Before you reboot the Compute node, you must decide whether to migrate instances to another Compute node while the node is rebooting.
Review the list of migration constraints that you might encounter when you migrate virtual machine instances between Compute nodes. For more information, see Migration constraints in Configuring the Compute service for instance creation.
Note: If you have a Multi-RHEL environment, and you want to migrate virtual machines from a Compute node that is running RHEL 9.2 to a Compute node that is running RHEL 8.4, only cold migration is supported. For more information about cold migration, see Cold migrating an instance in Configuring the Compute service for instance creation.
If you cannot migrate the instances, you can set the following core template parameters to control the state of the instances after the Compute node reboots:
NovaResumeGuestsStateOnHostBoot
    Determines whether to return instances to the same state on the Compute node after reboot. When set to False, the instances remain down and you must start them manually. The default value is False.
NovaResumeGuestsShutdownTimeout
    Number of seconds to wait for an instance to shut down before rebooting. It is not recommended to set this value to 0. The default value is 300.

For more information about overcloud parameters and their usage, see Overcloud parameters.
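For example, a minimal sketch of a custom environment file that sets these parameters; the file name is illustrative, and you include it with your deployment like any other environment file:

parameter_defaults:
  # Return instances that were running before the reboot to a running state
  NovaResumeGuestsStateOnHostBoot: true
  # Wait up to 300 seconds for each instance to shut down cleanly
  NovaResumeGuestsShutdownTimeout: 300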
Procedure
Log in to the undercloud as the stack user.

Retrieve a list of your Compute nodes to identify the host name of the node that you want to reboot:

(undercloud)$ source ~/overcloudrc
(overcloud)$ openstack compute service list

Identify the host name of the Compute node that you want to reboot.

Disable the Compute service on the Compute node that you want to reboot:

(overcloud)$ openstack compute service list
(overcloud)$ openstack compute service set <hostname> nova-compute --disable

Replace <hostname> with the host name of your Compute node.

List all instances on the Compute node:

(overcloud)$ openstack server list --host <hostname> --all-projects

Optional: To migrate the instances to another Compute node, complete the following steps:

If you decide to migrate the instances to another Compute node, use one of the following commands:

To migrate the instance to a different host, run the following command:

(overcloud)$ openstack server migrate <instance_id> --live <target_host> --wait

Replace <instance_id> with your instance ID.

Replace <target_host> with the host that you are migrating the instance to.

Let nova-scheduler automatically select the target host:

(overcloud)$ nova live-migration <instance_id>

Live migrate all instances at once:

$ nova host-evacuate-live <hostname>

Note: The nova command might cause some deprecation warnings, which are safe to ignore.

Wait until migration completes.

Confirm that the migration was successful:

(overcloud)$ openstack server list --host <hostname> --all-projects

Continue to migrate instances until none remain on the Compute node.

Log in to the Compute node and reboot the node:

[tripleo-admin@overcloud-compute-0 ~]$ sudo reboot

Wait until the node boots.

Re-enable the Compute node:

$ source ~/overcloudrc
(overcloud)$ openstack compute service set <hostname> nova-compute --enable

Check that the Compute node is enabled:

(overcloud)$ openstack compute service list
11.11.4. Validating RHOSP after the overcloud update
After you update your Red Hat OpenStack Platform (RHOSP) environment, validate your overcloud with the tripleo-validations playbooks.
For more information about validations, see Using the validation framework in Installing and managing Red Hat OpenStack Platform with director.
Procedure
Log in to the undercloud host as the stack user.

Source the stackrc undercloud credentials file:

$ source ~/stackrc

Run the validation:

$ validation run -i ~/overcloud-deploy/<stack>/config-download/<stack>/tripleo-ansible-inventory.yaml --group post-update

Replace <stack> with the name of the stack.
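To see which validations belong to the group before you run them, you can list them with the same CLI; an optional sketch:

$ validation list --group post-update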
Verification
- To view the results of the validation report, see Viewing validation history in Installing and managing Red Hat OpenStack Platform with director.
If a host is not found when you run a validation, the command reports the status as SKIPPED. A status of SKIPPED means that the validation was not executed, which is expected. If a validation’s pass criteria are not met, the command reports the status as FAILED. A FAILED validation does not prevent you from using your updated RHOSP environment, but it can indicate an issue with your environment.