Chapter 14. Upgrading the Compute node operating system
You can upgrade the operating system on all of your Compute nodes to RHEL 9.2, or upgrade some Compute nodes while the rest remain on RHEL 8.4.
If your deployment includes hyperconverged infrastructure (HCI) nodes, you must upgrade all HCI nodes to RHEL 9. For more information about upgrading to RHEL 9, see Upgrading all Compute nodes to RHEL 9.2.
For information about the duration and impact of this upgrade procedure, see Upgrade duration and impact.
Prerequisites
If you are using Red Hat Ceph Storage, before you perform the Leapp upgrade, verify whether the ceph-common package is present on your Compute nodes. If the ceph-common package is present on a node, take the precautions described in "Upgrading RHCS 5 hosts from RHEL 8 to RHEL 9 removes ceph-common package. Services fail to start." so that the Red Hat Ceph Storage services restart when the Compute node reboots after the Leapp upgrade.
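The package check in this prerequisite can be scripted. The following is a minimal sketch, assuming that it runs on the Compute node itself and relies on `rpm -q`, which exits with status 0 when a package is installed. The function name `has_package` and its `run` parameter are illustrative and are not part of any Red Hat tooling.

```python
import subprocess


def has_package(name, run=subprocess.run):
    """Return True if the RPM package `name` is installed on this host.

    `rpm -q <name>` exits with status 0 when the package is installed
    and with a non-zero status otherwise. The `run` parameter exists
    only so that the check can be exercised without calling rpm.
    """
    result = run(["rpm", "-q", name], capture_output=True, text=True)
    return result.returncode == 0
```

Calling `has_package("ceph-common")` on each Compute node before the Leapp upgrade indicates which nodes need the precautions described above.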
14.1. Selecting Compute nodes for upgrade testing
The overcloud upgrade process allows you to either:
- Upgrade all nodes in a role.
- Upgrade individual nodes separately.
To ensure a smooth overcloud upgrade process, test the upgrade on a few individual Compute nodes in your environment before you upgrade all Compute nodes. Testing on a small set of nodes helps you find major issues early while keeping downtime for your workloads to a minimum.
Use the following recommendations to help choose test nodes for the upgrade:
- Select two or three Compute nodes for upgrade testing.
- Select nodes without any critical instances running.
If necessary, migrate critical instances from the selected test Compute nodes to other Compute nodes. Review which migration scenarios are supported:
| Source Compute node RHEL version | Destination Compute node RHEL version | Supported/Not supported |
|---|---|---|
| RHEL 8 | RHEL 8 | Supported |
| RHEL 8 | RHEL 9 | Supported |
| RHEL 9 | RHEL 9 | Supported |
| RHEL 9 | RHEL 8 | Not supported |
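The supported migration scenarios can be encoded as a small lookup so that a migration plan can be checked before any instance is moved. This is a sketch only: the names `SUPPORTED_MIGRATIONS` and `is_migration_supported` are illustrative, and RHEL versions are represented by their major number.

```python
# Supported (source, destination) RHEL major-version pairs,
# copied from the migration scenarios listed above.
SUPPORTED_MIGRATIONS = {(8, 8), (8, 9), (9, 9)}


def is_migration_supported(source_rhel, destination_rhel):
    """Return True if migrating instances from a Compute node on
    `source_rhel` to a Compute node on `destination_rhel` is supported."""
    return (source_rhel, destination_rhel) in SUPPORTED_MIGRATIONS
```

The only unsupported direction is from a RHEL 9 node back to a RHEL 8 node, so instances that land on an upgraded test node cannot be migrated back to nodes that remain on RHEL 8.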
14.2. Upgrading all Compute nodes to RHEL 9.2
Upgrade all your Compute nodes to RHEL 9.2 to take advantage of the latest features and to reduce downtime.
Prerequisites
- If your deployment includes hyperconverged infrastructure (HCI) nodes, place hosts in maintenance mode to prepare the Red Hat Ceph Storage cluster on each HCI node for reboot. For more information, see Placing hosts in the maintenance mode using the Ceph Orchestrator in The Ceph Operations Guide.
If you are using RHOSP version 17.1.3 or earlier, before you run the system upgrade, ensure that no guests are running on the Compute hosts. Any guests that are running go into an error state. To avoid this issue, either live migrate your workloads or shut them down. For more information about live migration, see Live migrating an instance in Configuring the Compute service for instance creation.
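The version condition in this prerequisite is a simple comparison. The sketch below assumes that you already know your RHOSP version as a dotted string; the helper names `parse_version` and `guests_must_be_cleared` are hypothetical, and the code does not discover the installed version.

```python
def parse_version(text):
    """Convert a dotted version string such as '17.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def guests_must_be_cleared(rhosp_version):
    """Return True for RHOSP 17.1.3 or earlier, where guests that are
    still running during the system upgrade go into an error state."""
    return parse_version(rhosp_version) <= (17, 1, 3)
```

When the function returns True, live migrate or shut down the workloads on the selected Compute hosts before you run the system upgrade.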
Procedure
- Log in to the undercloud host as the stack user.
- Source the stackrc undercloud credentials file:

    $ source ~/stackrc

- In the container-image-prepare.yaml file, ensure that only the tags specified in the ContainerImagePrepare parameter are included, and the MultiRhelRoleContainerImagePrepare parameter is removed. For example:

    parameter_defaults:
      ContainerImagePrepare:
      - tag_from_label: "{version}-{release}"
        set:
          namespace:
          name_prefix:
          name_suffix:
          tag:
          rhel_containers: false
          neutron_driver: ovn
          ceph_namespace:
          ceph_image:
          ceph_tag:

- In the roles_data.yaml file, replace the OS::TripleO::Services::NovaLibvirtLegacy service with the OS::TripleO::Services::NovaLibvirt service that is required for RHEL 9.2.
- Include the -e system_upgrade.yaml argument and the other required -e environment file arguments in the overcloud_upgrade_prepare.sh script as shown in the following example:

    $ openstack overcloud upgrade prepare --yes
    …
    -e /home/stack/system_upgrade.yaml
    …

- Run the overcloud_upgrade_prepare.sh script.
- Upgrade the operating system on the Compute nodes to RHEL 9.2. Use the --limit option with a comma-separated list of nodes that you want to upgrade. The following example upgrades the compute-0, compute-1, and compute-2 nodes.

    $ openstack overcloud upgrade run --yes --tags system_upgrade --stack <stack> --limit compute-0,compute-1,compute-2

  - Replace <stack> with the name of your stack.
- Upgrade the containers on the Compute nodes to RHEL 9.2. Use the --limit option with a comma-separated list of nodes that you want to upgrade. The following example upgrades the compute-0, compute-1, and compute-2 nodes.

    $ openstack overcloud upgrade run --yes --stack <stack> --limit compute-0,compute-1,compute-2
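The last two steps of this procedure always run as a pair against the same node list: first the operating system upgrade with the system_upgrade tag, then the container upgrade. The following sketch builds both command lines from one list so that the two --limit values cannot drift apart. The function name `upgrade_run_commands` and the stack name `overcloud` used in the example are assumptions; the command-line options are the ones shown in the procedure.

```python
import shlex


def upgrade_run_commands(stack, nodes):
    """Return the two `openstack overcloud upgrade run` invocations as
    argument lists: the system upgrade first, then the container upgrade."""
    if not nodes:
        raise ValueError("at least one node is required for --limit")
    limit = ",".join(nodes)
    base = ["openstack", "overcloud", "upgrade", "run", "--yes"]
    tail = ["--stack", stack, "--limit", limit]
    return [
        base + ["--tags", "system_upgrade"] + tail,  # operating system upgrade
        base + tail,                                 # container upgrade
    ]


if __name__ == "__main__":
    # "overcloud" is a placeholder stack name used only for illustration.
    for command in upgrade_run_commands("overcloud", ["compute-0", "compute-1", "compute-2"]):
        print(shlex.join(command))
```

The sketch only prints the commands; run them from the undercloud in the order shown, and only after the overcloud_upgrade_prepare.sh script has completed.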
14.3. Upgrading Compute nodes to a Multi-RHEL environment
You can upgrade a portion of your Compute nodes to RHEL 9.2 while the rest of your Compute nodes remain on RHEL 8.4. This upgrade process involves the following fundamental steps:
- Plan which nodes you want to upgrade to RHEL 9.2, and which nodes you want to remain on RHEL 8.4. Choose a role name for each role that you are creating for each batch of nodes, for example, ComputeRHEL-9.2 and ComputeRHEL-8.4.
- Create roles that store the nodes that you want to upgrade to RHEL 9.2, or the nodes that you want to stay on RHEL 8.4. These roles can remain empty until you are ready to move your Compute nodes to a new role. You can create as many roles as you need and divide nodes among them any way you decide. For example:
  - If your environment uses a role called ComputeSRIOV and you need to run a canary test to upgrade to RHEL 9.2, you can create a new ComputeSRIOVRHEL9 role and move the canary node to the new role.
  - If your environment uses a role called ComputeOffload and you want to upgrade most nodes in that role to RHEL 9.2, but keep a few nodes on RHEL 8.4, you can create a new ComputeOffloadRHEL8 role to store the RHEL 8.4 nodes. You can then select the nodes in the original ComputeOffload role to upgrade to RHEL 9.2.
- Move the nodes from each Compute role to the new role.
- Upgrade the operating system on specific Compute nodes to RHEL 9.2. You can upgrade nodes in batches from the same role or multiple roles.
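Because nodes can be divided among any number of new roles, it can help to write the plan down as data and check it before moving anything. The sketch below is one possible way to do that: it maps each new role to the nodes planned for it and reports any node that appears under more than one role. The role names come from the examples above; the node names and the function name `find_duplicate_nodes` are hypothetical.

```python
from collections import Counter


def find_duplicate_nodes(plan):
    """Return a sorted list of node names that are assigned to more than one new role.

    `plan` maps a new role name to the list of nodes planned for that role.
    A role can map to an empty list until you are ready to move nodes into it.
    """
    counts = Counter(node for nodes in plan.values() for node in nodes)
    return sorted(node for node, seen in counts.items() if seen > 1)


# Illustrative plan: one canary node for SR-IOV, two Offload nodes kept on RHEL 8.4.
example_plan = {
    "ComputeSRIOVRHEL9": ["computesriov-0"],
    "ComputeOffloadRHEL8": ["computeoffload-3", "computeoffload-4"],
}
```

An empty result from `find_duplicate_nodes(example_plan)` means that every node in the plan has exactly one destination role.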
Note: In a Multi-RHEL environment, the deployment should continue to use the pc-i440fx machine type. Do not update the default to Q35. Migrating to the Q35 machine type is a separate, post-upgrade procedure to follow after all Compute nodes are upgraded to RHEL 9.2. For more information about migrating to the Q35 machine type, see Updating the default machine type for hosts after an upgrade to RHOSP 17.
Use the following procedures to upgrade Compute nodes to a Multi-RHEL environment:
14.3.1. Creating roles for Multi-RHEL Compute nodes
Create new roles to store the nodes that you are upgrading to RHEL 9.2 or that are staying on RHEL 8.4, and move the nodes into the new roles.
Procedure
- Create the relevant roles for your environment. In the roles_data.yaml file, copy the source Compute role to use for the new role. Repeat this step for each additional role required. Roles can remain empty until you are ready to move your Compute nodes to the new roles.
  - If you are creating a RHEL 8 role:

      name: <ComputeRHEL8>
      description: |
        Basic Compute Node role
      CountDefault: 1
      rhsm_enforce_multios: 8.4
      ...
      ServicesDefault:
      ...
        - OS::TripleO::Services::NovaLibvirtLegacy

    Note: Roles that contain nodes remaining on RHEL 8.4 must include the NovaLibvirtLegacy service.

    - Replace <ComputeRHEL8> with the name of your RHEL 8.4 role.
  - If you are creating a RHEL 9 role:

      name: <ComputeRHEL9>
      description: |
        Basic Compute Node role
      CountDefault: 1
      ...
      ServicesDefault:
      ...
        - OS::TripleO::Services::NovaLibvirt

    Note: Roles that contain nodes being upgraded to RHEL 9.2 must include the NovaLibvirt service. Replace OS::TripleO::Services::NovaLibvirtLegacy with OS::TripleO::Services::NovaLibvirt.

    - Replace <ComputeRHEL9> with the name of your RHEL 9.2 role.
- Copy the overcloud_upgrade_prepare.sh file to the copy_role_Compute_param.sh file:

    $ cp overcloud_upgrade_prepare.sh copy_role_Compute_param.sh

- Edit the copy_role_Compute_param.sh file to include the copy_role_params.py script. This script generates the environment file that contains the additional parameters and resources for the new role. For example:

    /usr/share/openstack-tripleo-heat-templates/tools/copy_role_params.py --rolename-src <Compute_source_role> --rolename-dst <Compute_destination_role> \
    -o <Compute_new_role_params.yaml> \
    -e /home/stack/templates/internal.yaml \
    -e /home/stack/templates/network/network-environment.yaml \
    -e /home/stack/templates/inject-trust-anchor.yaml \
    -e /home/stack/templates/hostnames.yml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e /home/stack/templates/nodes_data.yaml \
    -e /home/stack/templates/debug.yaml \
    -e /home/stack/templates/firstboot.yaml \
    -e /home/stack/overcloud-params.yaml \
    -e /home/stack/overcloud-deploy/overcloud/overcloud-network-environment.yaml \
    -e /home/stack/templates/baremetal-deployment.yaml \
    -e /home/stack/templates/generated-networks-deployed.yaml \
    -e /home/stack/templates/generated-vip-deployed.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/nova-hw-machine-type-upgrade.yaml \
    -e ~/containers-prepare-parameter.yaml

  - Replace <Compute_source_role> with the name of your source Compute role that you are copying.
  - Replace <Compute_destination_role> with the name of your new role.
  - Use the -o option to define the name of the output file that includes all the non-default values of the source Compute role for the new role. Replace <Compute_new_role_params.yaml> with the name of your output file.
- Run the copy_role_Compute_param.sh script:

    $ sh /home/stack/copy_role_Compute_param.sh

- Move the Compute nodes from the source role to the new role:

    python3 /usr/share/openstack-tripleo-heat-templates/tools/baremetal_transition.py --baremetal-deployment /home/stack/tripleo-<stack>-baremetal-deployment.yaml --src-role <Compute_source_role> --dst-role <Compute_destination_role> <Compute-0> <Compute-1> <Compute-2>

  Note: This tool includes the original /home/stack/tripleo-<stack>-baremetal-deployment.yaml file that you exported during the undercloud upgrade. The tool copies and renames the source role definition in the /home/stack/tripleo-<stack>-baremetal-deployment.yaml file. Then, it changes the hostname_format to prevent a conflict with the newly created destination role. The tool then moves the node from the source role to the destination role and changes the count values.

  - Replace <stack> with the name of your stack.
  - Replace <Compute_source_role> with the name of the source Compute role that contains the nodes that you are moving to your new role.
  - Replace <Compute_destination_role> with the name of your new role.
  - Replace <Compute-0> <Compute-1> <Compute-2> with the names of the nodes that you are moving to your new role.
- Reprovision the nodes to update the environment files in the stack with the new role location:

    $ openstack overcloud node provision --stack <stack> --output /home/stack/templates/baremetal-deployment.yaml /home/stack/tripleo-<stack>-baremetal-deployment.yaml

  Note: The output baremetal-deployment.yaml file is the same file that is used in the overcloud_upgrade_prepare.sh file during overcloud adoption.

- Include any Compute roles that are remaining on RHEL 8.4 in the COMPUTE_ROLES parameter, and run the following script. For example, if you have a role called ComputeRHEL8 that contains the nodes that are remaining on RHEL 8.4, COMPUTE_ROLES = --role ComputeRHEL8.

    python3 /usr/share/openstack-tripleo-heat-templates/tools/multi-rhel-container-image-prepare.py \
    ${COMPUTE_ROLES} \
    --enable-multi-rhel \
    --excludes collectd \
    --excludes nova-libvirt \
    --minor-override "{${EL8_TAGS}${EL8_NAMESPACE}${CEPH_OVERRIDE}${NEUTRON_DRIVER}\"no_tag\":\"not_used\"}" \
    --major-override "{${EL9_TAGS}${NAMESPACE}${CEPH_OVERRIDE}${NEUTRON_DRIVER}\"no_tag\":\"not_used\"}" \
    --output-env-file \
    /home/stack/containers-prepare-parameter.yaml

- Repeat this procedure to create additional roles and to move additional Compute nodes to those new roles.
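The service requirements stated in the role-creation step of this procedure are easy to get wrong when a role is copied by hand. The following sketch checks a single role definition against them. It assumes that the role is already available as a Python dictionary (for example, one entry loaded from the roles file with a YAML parser, which is not shown here); the function name `check_role` and the sample roles are illustrative only.

```python
LEGACY_SERVICE = "OS::TripleO::Services::NovaLibvirtLegacy"
LIBVIRT_SERVICE = "OS::TripleO::Services::NovaLibvirt"


def check_role(role, target_rhel):
    """Return a list of problems found in `role` for the given RHEL major version.

    Roles whose nodes remain on RHEL 8.4 must include NovaLibvirtLegacy.
    Roles whose nodes are upgraded to RHEL 9.2 must include NovaLibvirt,
    which replaces NovaLibvirtLegacy.
    """
    name = role.get("name", "<unnamed>")
    services = role.get("ServicesDefault", [])
    problems = []
    if target_rhel == 8:
        if LEGACY_SERVICE not in services:
            problems.append(f"{name}: missing {LEGACY_SERVICE}")
    elif target_rhel == 9:
        if LIBVIRT_SERVICE not in services:
            problems.append(f"{name}: missing {LIBVIRT_SERVICE}")
        if LEGACY_SERVICE in services:
            problems.append(f"{name}: {LEGACY_SERVICE} must be replaced")
    else:
        raise ValueError("target_rhel must be 8 or 9")
    return problems


# Illustrative role definitions; real roles list many more services.
rhel8_role = {"name": "ComputeRHEL8", "ServicesDefault": [LEGACY_SERVICE]}
rhel9_role = {"name": "ComputeRHEL9", "ServicesDefault": [LIBVIRT_SERVICE]}
```

An empty list means that the role satisfies the service requirement for its target RHEL version. The check covers only the libvirt service entries and does not validate any other part of the role.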
14.3.2. Upgrading the Compute node operating system
Upgrade the operating system on selected Compute nodes to RHEL 9.2. You can upgrade multiple nodes from different roles at the same time.
Prerequisites
- Ensure that you have created the necessary roles for your environment. For more information about creating roles for a Multi-RHEL environment, see Creating roles for Multi-RHEL Compute nodes.
- If you plan to migrate your instances from your Compute nodes, review the supported migration scenarios. For more information, see Selecting Compute nodes for upgrade testing.
If you are using RHOSP version 17.1.3 or earlier, before you run the system upgrade, ensure that no guests are running on the Compute hosts. Any guests that are running go into an error state. To avoid this issue, either live migrate your workloads or shut them down. For more information about live migration, see Live migrating an instance in Configuring the Compute service for instance creation.
Procedure
- In the skip_rhel_release.yaml file, set the SkipRhelEnforcement parameter to false:

    parameter_defaults:
      SkipRhelEnforcement: false

- Include the -e system_upgrade.yaml argument and the other required -e environment file arguments in the overcloud_upgrade_prepare.sh script as shown in the following example:

    $ openstack overcloud upgrade prepare --yes \
    ...
    -e /home/stack/system_upgrade.yaml \
    -e /home/stack/<Compute_new_role_params.yaml> \
    ...

  - Include the system_upgrade.yaml file with the upgrade-specific parameters (-e).
  - Include the environment file that contains the parameters needed for the new role (-e). Replace <Compute_new_role_params.yaml> with the name of the environment file you created for your new role.
  - If you are upgrading nodes from multiple roles at the same time, include the environment file for each new role that you created.
- Optional: Migrate your instances. For more information on migration strategies, see Migrating virtual machines between Compute nodes and Preparing to migrate.
- Run the overcloud_upgrade_prepare.sh script.
- Upgrade the operating system on specific Compute nodes. Use the --limit option with a comma-separated list of nodes that you want to upgrade. The following example upgrades the computerhel9-0, computerhel9-1, computerhel9-2, and computesriov-42 nodes from the ComputeRHEL9 and ComputeSRIOV roles.

    $ openstack overcloud upgrade run --yes --tags system_upgrade --stack <stack> --limit computerhel9-0,computerhel9-1,computerhel9-2,computesriov-42

  - Replace <stack> with the name of your stack.
- Upgrade the containers on the Compute nodes to RHEL 9.2. Use the --limit option with a comma-separated list of nodes that you want to upgrade. The following example upgrades the computerhel9-0, computerhel9-1, computerhel9-2, and computesriov-42 nodes from the ComputeRHEL9 and ComputeSRIOV roles.

    $ openstack overcloud upgrade run --yes --stack <stack> --limit computerhel9-0,computerhel9-1,computerhel9-2,computesriov-42
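When nodes from several new roles are upgraded together, the prepare step in this procedure needs one -e argument for each role environment file in addition to system_upgrade.yaml. The sketch below assembles that part of the argument list. The function name `role_environment_args` and the role file names in the example are hypothetical; the remaining -e arguments of your own overcloud_upgrade_prepare.sh script are not reproduced here.

```python
def role_environment_args(role_param_files, system_upgrade="/home/stack/system_upgrade.yaml"):
    """Return the -e arguments for system_upgrade.yaml followed by
    one -e argument per new-role environment file."""
    args = ["-e", system_upgrade]
    for path in role_param_files:
        args += ["-e", path]
    return args


# Hypothetical environment files generated for two new roles.
example_args = role_environment_args([
    "/home/stack/ComputeRHEL9_params.yaml",
    "/home/stack/ComputeSRIOVRHEL9_params.yaml",
])
```

The resulting list can be spliced into the existing `openstack overcloud upgrade prepare` argument list alongside the other required environment files.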