Chapter 20. Speeding up an overcloud upgrade
To speed up the overcloud upgrade process, you can upgrade 1/3 of the control plane at a time, starting with the bootstrap nodes.
After the upgrade of the first 1/3 of the control plane is complete, you can move your environment into mixed mode, where the control plane APIs are running and the cloud is operational. High availability operation can resume only after the entire control plane has been upgraded.
When you upgrade a large number of Compute nodes, to improve performance, you can run the openstack overcloud upgrade run command with the --limit Compute option in parallel on groups of 20 nodes. You can run multiple upgrade tasks in the background, where each task upgrades a separate group of 20 nodes.
This scenario contains an example upgrade process for an overcloud environment that includes the following node types with composable roles:
- Three Controller nodes
- Three Database nodes
- Three Networker nodes
- Three Ceph Storage nodes
- Multiple Compute nodes
20.1. Running the overcloud upgrade preparation
The upgrade requires running the openstack overcloud upgrade prepare command, which performs the following tasks:
- Updates the overcloud plan to OpenStack Platform 16.1
- Prepares the nodes for the upgrade
If you are not using the default stack name (overcloud), set your stack name with the --stack STACK NAME option replacing STACK NAME with the name of your stack.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the upgrade preparation command (openstack overcloud upgrade prepare). Include the following options relevant to your environment:
  - The environment file (upgrades-environment.yaml) with the upgrade-specific parameters (-e).
  - The environment file (rhsm.yaml) with the registration and subscription parameters (-e).
  - The environment file (containers-prepare-parameter.yaml) with your new container image locations (-e). In most cases, this is the same environment file that the undercloud uses.
  - The environment file (neutron-ovs.yaml) to maintain OVS compatibility.
  - Any custom configuration environment files (-e) relevant to your deployment.
  - If applicable, your custom roles (roles_data) file using --roles-file.
  - If applicable, your composable network (network_data) file using --networks-file.
  - If you use a custom stack name, pass the name with the --stack option.
- Wait until the upgrade preparation completes.
- Download the container images:
  $ openstack overcloud external-upgrade run --stack STACK NAME --tags container_image_prepare
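As a rough illustration of how the options in this procedure fit together, the following sketch assembles the upgrade preparation command and prints it for review instead of running it. It uses only the options named in this section. The `build_prepare_command` helper and the `STACK_NAME`, `ROLES_FILE`, and `NETWORKS_FILE` variables are hypothetical names introduced for this example, and the environment files are given as bare file names that you would replace with the full paths used in your deployment. Any other options that your deployment normally passes are not shown.

```shell
# Sketch only: print (do not run) the upgrade preparation command.
# STACK_NAME, ROLES_FILE and NETWORKS_FILE are hypothetical variables.
STACK_NAME="${STACK_NAME:-overcloud}"

# Arguments: the environment files to pass with -e, in order.
build_prepare_command() {
  cmd="openstack overcloud upgrade prepare --stack $STACK_NAME"
  for f in "$@"; do
    cmd="$cmd -e $f"
  done
  # Custom roles and composable network files are added only when set.
  if [ -n "${ROLES_FILE:-}" ]; then
    cmd="$cmd --roles-file $ROLES_FILE"
  fi
  if [ -n "${NETWORKS_FILE:-}" ]; then
    cmd="$cmd --networks-file $NETWORKS_FILE"
  fi
  printf '%s\n' "$cmd"
}

# Bare file names are placeholders; substitute the real paths.
build_prepare_command \
  upgrades-environment.yaml \
  rhsm.yaml \
  containers-prepare-parameter.yaml \
  neutron-ovs.yaml
```

Printing the command first makes it easier to confirm that every required environment file is present before you start the preparation.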
20.2. Upgrading the control plane nodes
To upgrade the control plane nodes in your environment to OpenStack Platform 16.1, you must upgrade 1/3 of your control plane nodes at a time, starting with the bootstrap nodes.
During the bootstrap Controller node upgrade process, a new Pacemaker cluster is created and new Red Hat OpenStack 16.1 containers are started on the node, while the remaining Controller nodes continue to run on Red Hat OpenStack 13.
In this example, the control plane nodes are named using the default overcloud-ROLE-NODEID convention. This includes the following node types with composable roles:
- overcloud-controller-0
- overcloud-controller-1
- overcloud-controller-2
- overcloud-database-0
- overcloud-database-1
- overcloud-database-2
- overcloud-networker-0
- overcloud-networker-1
- overcloud-networker-2
- overcloud-ceph-0
- overcloud-ceph-1
- overcloud-ceph-2
Substitute these values for your own node names where applicable.
After you upgrade the overcloud-controller-0, overcloud-database-0, overcloud-networker-0, and overcloud-ceph-0 bootstrap nodes, which comprise the first 1/3 of your control plane nodes, you must upgrade each additional 1/3 of the nodes with Pacemaker services and ensure that each node joins the new Pacemaker cluster started with the bootstrap node. Therefore, you must upgrade overcloud-controller-1, overcloud-database-1, overcloud-networker-1, and overcloud-ceph-1 before you upgrade overcloud-controller-2, overcloud-database-2, overcloud-networker-2, and overcloud-ceph-2.
If you are not using the default stack name overcloud, use the --stack STACK NAME option to set your stack name and replace STACK NAME with the name of your stack.
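The node lists used in this section follow a regular pattern: the tagged commands for each 1/3 take the nodes that share one index, and the command with no tags takes every node upgraded so far. The following sketch derives both lists for the example node names. The `third_limit` and `cumulative_limit` helpers and the `ROLES` variable are hypothetical names for illustration, and the sketch assumes the default overcloud-ROLE-NODEID naming.

```shell
# Sketch only: derive the --limit lists for each 1/3 of the control plane.
# ROLES lists the role part of the example node names in this section.
ROLES="controller database networker ceph"

# Nodes with index $1 in every role, for the tagged commands.
third_limit() {
  list=""
  for role in $ROLES; do
    list="${list:+$list,}overcloud-$role-$1"
  done
  printf '%s\n' "$list"
}

# Nodes with index 0 through $1 in every role, for the run with no tags.
cumulative_limit() {
  list=""
  for role in $ROLES; do
    i=0
    while [ "$i" -le "$1" ]; do
      list="${list:+$list,}overcloud-$role-$i"
      i=$((i + 1))
    done
  done
  printf '%s\n' "$list"
}

# Lists for the second 1/3 of the control plane.
third_limit 1
cumulative_limit 1
```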
Procedure
- Log in to the undercloud host as the stack user.
- Source the stackrc file:
  $ source ~/stackrc
- On the undercloud node, run the following command to identify the bootstrap Controller node:
  $ tripleo-ansible-inventory --list --stack overcloud | jq '.overcloud_Controller.hosts[0]'
- Upgrade the overcloud-controller-0, overcloud-database-0, overcloud-networker-0, and overcloud-ceph-0 control plane nodes:
  - Run the external upgrade command with the ceph_systemd tag:
    $ openstack overcloud external-upgrade run --stack <stack_name> --tags ceph_systemd -e ceph_ansible_limit=overcloud-controller-0,overcloud-database-0,overcloud-networker-0,overcloud-ceph-0
    Replace <stack_name> with the name of your stack.
    This command performs the following actions:
    - Changes the systemd units that control the Ceph Storage containers to use Podman management.
    - Limits actions to the selected nodes using the ceph_ansible_limit variable.
    This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
  - Run the upgrade command with the system_upgrade tag:
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-controller-0 &
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-database-0 &
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-networker-0 &
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-ceph-0 &
    This command performs the following actions:
    - Performs a Leapp upgrade of the operating system.
    - Performs a reboot as a part of the Leapp upgrade.
    Important: The next command causes an outage on the control plane. You cannot perform any standard operations on the overcloud during the next few steps.
  - Run the external upgrade command with the system_upgrade_transfer_data tag:
    $ openstack overcloud external-upgrade run --stack STACK NAME --tags system_upgrade_transfer_data
    This command copies the latest version of the database from an existing node to the bootstrap node.
  - Run the upgrade command with the nova_hybrid_state tag and run only the upgrade_steps_playbook.yaml playbook:
    $ openstack overcloud upgrade run --stack STACK NAME --playbook upgrade_steps_playbook.yaml --tags nova_hybrid_state --limit all
    This command launches temporary 16.1 containers on Compute nodes to facilitate workload migration when you upgrade Compute nodes at a later step.
  - Run the upgrade command with no tags:
    $ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-controller-0,overcloud-database-0,overcloud-networker-0,overcloud-ceph-0 --playbook all
    This command performs the Red Hat OpenStack Platform upgrade.
    Important: The control plane becomes active when this command finishes. You can perform standard operations on the overcloud again.
  - Optional: On the bootstrap Controller node, verify that after the upgrade, the new Pacemaker cluster is started and that the control plane services such as galera, rabbit, haproxy, and redis are running:
    $ sudo pcs status
- Upgrade the overcloud-controller-1, overcloud-database-1, overcloud-networker-1, and overcloud-ceph-1 control plane nodes:
  - Log in to the overcloud-controller-1 node and verify that the old cluster is no longer running:
    $ sudo pcs status
    An error similar to the following is displayed when the cluster is not running:
    Error: cluster is not currently running on this node
  - Run the external upgrade command with the ceph_systemd tag:
    $ openstack overcloud external-upgrade run --stack STACK NAME --tags ceph_systemd -e ceph_ansible_limit=overcloud-controller-1,overcloud-database-1,overcloud-networker-1,overcloud-ceph-1
    This command performs the following actions:
    - Changes the systemd units that control the Ceph Storage containers to use Podman management.
    - Limits actions to the selected nodes using the ceph_ansible_limit variable.
    This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
  - Run the upgrade command with the system_upgrade tag:
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-controller-1,overcloud-database-1,overcloud-networker-1,overcloud-ceph-1
    This command performs the following actions:
    - Performs a Leapp upgrade of the operating system.
    - Performs a reboot as a part of the Leapp upgrade.
  - Run the upgrade command with no tags:
    $ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-controller-0,overcloud-controller-1,overcloud-database-0,overcloud-database-1,overcloud-networker-0,overcloud-networker-1,overcloud-ceph-0,overcloud-ceph-1
    This command performs the Red Hat OpenStack Platform upgrade. In addition to these nodes, include the previously upgraded bootstrap nodes in the --limit option.
- Upgrade the overcloud-controller-2, overcloud-database-2, overcloud-networker-2, and overcloud-ceph-2 control plane nodes:
  - Log in to the overcloud-controller-2 node and verify that the old cluster is no longer running:
    $ sudo pcs status
    An error similar to the following is displayed when the cluster is not running:
    Error: cluster is not currently running on this node
  - Run the external upgrade command with the ceph_systemd tag:
    $ openstack overcloud external-upgrade run --stack STACK NAME --tags ceph_systemd -e ceph_ansible_limit=overcloud-controller-2,overcloud-database-2,overcloud-networker-2,overcloud-ceph-2
    This command performs the following actions:
    - Changes the systemd units that control the Ceph Storage containers to use Podman management.
    - Limits actions to the selected nodes using the ceph_ansible_limit variable.
    This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
  - Run the upgrade command with the system_upgrade tag:
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-controller-2,overcloud-database-2,overcloud-networker-2,overcloud-ceph-2
    This command performs the following actions:
    - Performs a Leapp upgrade of the operating system.
    - Performs a reboot as a part of the Leapp upgrade.
  - Run the upgrade command with no tags:
    $ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-controller-0,overcloud-controller-1,overcloud-controller-2,overcloud-database-0,overcloud-database-1,overcloud-database-2,overcloud-networker-0,overcloud-networker-1,overcloud-networker-2,overcloud-ceph-0,overcloud-ceph-1,overcloud-ceph-2
    This command performs the Red Hat OpenStack Platform upgrade. Include all control plane nodes in the --limit option.
20.3. Upgrading Compute nodes in parallel
To upgrade a large number of Compute nodes to OpenStack Platform 16.1, you can run the openstack overcloud upgrade run command with the --limit Compute option in parallel on groups of 20 nodes.
You can run multiple upgrade tasks in the background, where each task upgrades a separate group of 20 nodes. When you use this method to upgrade Compute nodes in parallel, you cannot select which nodes you upgrade. The selection of nodes is based on the inventory file that you generate when you run the tripleo-ansible-inventory command. For example, if you have 80 Compute nodes in your deployment, you can run the following commands to upgrade the Compute nodes in parallel:
$ openstack overcloud upgrade run -y --limit 'Compute[0:19]' > upgrade-compute-00-19.log 2>&1 &
$ openstack overcloud upgrade run -y --limit 'Compute[20:39]' > upgrade-compute-20-39.log 2>&1 &
$ openstack overcloud upgrade run -y --limit 'Compute[40:59]' > upgrade-compute-40-59.log 2>&1 &
$ openstack overcloud upgrade run -y --limit 'Compute[60:79]' > upgrade-compute-60-79.log 2>&1 &
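The group boundaries in the example above are simple arithmetic on the node count and the group size. The following sketch prints one --limit expression per group. The `compute_batches` helper is a hypothetical name for illustration, and the sketch assumes that the range in the expression is inclusive at both ends, as in the 80-node example.

```shell
# Sketch only: print one --limit expression per group of Compute nodes.
# $1 = total number of Compute nodes, $2 = group size (20 in this chapter).
compute_batches() {
  total=$1
  size=$2
  start=0
  while [ "$start" -lt "$total" ]; do
    end=$((start + size - 1))
    # The last group can be smaller than the group size.
    if [ "$end" -ge "$total" ]; then
      end=$((total - 1))
    fi
    printf 'Compute[%d:%d]\n' "$start" "$end"
    start=$((end + 1))
  done
}

# Groups for a deployment with 80 Compute nodes.
compute_batches 80 20
```

When the node count is not a multiple of the group size, the final expression covers only the remaining nodes.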
To upgrade specific Compute nodes, use a comma-separated list of nodes:
$ openstack overcloud upgrade run --limit <Compute0>,<Compute1>,<Compute2>,<Compute3>
If you are not using the default stack name overcloud, use the --stack STACK NAME option and replace STACK NAME with the name of your stack.
Procedure
- Log in to the undercloud host as the stack user.
- Source the stackrc file:
  $ source ~/stackrc
- Migrate your instances. For more information on migration strategies, see Migrating virtual machines between Compute nodes.
- Run the upgrade command with the system_upgrade tag:
  $ openstack overcloud upgrade run -y --stack STACK NAME --tags system_upgrade --limit 'Compute[0:19]' > upgrade-compute-00-19.log 2>&1 &
  $ openstack overcloud upgrade run -y --stack STACK NAME --tags system_upgrade --limit 'Compute[20:39]' > upgrade-compute-20-39.log 2>&1 &
  $ openstack overcloud upgrade run -y --stack STACK NAME --tags system_upgrade --limit 'Compute[40:59]' > upgrade-compute-40-59.log 2>&1 &
  $ openstack overcloud upgrade run -y --stack STACK NAME --tags system_upgrade --limit 'Compute[60:79]' > upgrade-compute-60-79.log 2>&1 &
  This command performs the following actions:
  - Performs a Leapp upgrade of the operating system.
  - Performs a reboot as a part of the Leapp upgrade.
- Run the upgrade command with no tags:
  $ openstack overcloud upgrade run -y --stack STACK NAME --limit 'Compute[0:19]' > upgrade-compute-00-19.log 2>&1 &
  $ openstack overcloud upgrade run -y --stack STACK NAME --limit 'Compute[20:39]' > upgrade-compute-20-39.log 2>&1 &
  $ openstack overcloud upgrade run -y --stack STACK NAME --limit 'Compute[40:59]' > upgrade-compute-40-59.log 2>&1 &
  $ openstack overcloud upgrade run -y --stack STACK NAME --limit 'Compute[60:79]' > upgrade-compute-60-79.log 2>&1 &
  This command performs the Red Hat OpenStack Platform upgrade.
- Optional: To upgrade selected Compute nodes, use the --limit option with a comma-separated list of nodes that you want to upgrade. The following example upgrades the overcloud-compute-0, overcloud-compute-1, and overcloud-compute-2 nodes in parallel.
  - Run the upgrade command with the system_upgrade tag:
    $ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-compute-0,overcloud-compute-1,overcloud-compute-2
  - Run the upgrade command with no tags:
    $ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-compute-0,overcloud-compute-1,overcloud-compute-2
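Because the parallel commands in this procedure run in the background, it helps to confirm that every group has finished before you start the next step. The following sketch shows one way to do that with the shell `wait` builtin. The `run_group` and `run_groups_and_wait` functions are hypothetical names for illustration, and `run_group` is a stand-in that only prints a message. In a real run it would call the upgrade command for the given --limit value and redirect the output to a log file.

```shell
# Sketch only: start one background job per group, then wait for all of them.
# run_group is a stand-in for the real upgrade command, for example:
#   openstack overcloud upgrade run -y --stack STACK NAME --limit "$1"
run_group() {
  echo "upgrading $1"
}

# Arguments: one --limit expression per group.
# Returns 0 only if every background job exits successfully.
run_groups_and_wait() {
  pids=""
  for limit in "$@"; do
    run_group "$limit" &
    pids="$pids $!"
  done
  failed=0
  for pid in $pids; do
    # wait returns the exit status of the given background job.
    wait "$pid" || failed=$((failed + 1))
  done
  [ "$failed" -eq 0 ]
}

if run_groups_and_wait 'Compute[0:19]' 'Compute[20:39]'; then
  echo "all groups finished"
else
  echo "at least one group failed; check the logs"
fi
```

Waiting on each process ID separately keeps the exit status of every group, so a failure in one group is not hidden by the success of another.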
20.4. Synchronizing the overcloud stack
The upgrade requires an update to the overcloud stack to ensure that the stack resource structure and parameters align with a fresh deployment of OpenStack Platform 16.1.
If you are not using the default stack name (overcloud), set your stack name with the --stack STACK NAME option replacing STACK NAME with the name of your stack.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Edit the containers-prepare-parameter.yaml file and remove the following parameters and their values:
  - ceph3_namespace
  - ceph3_tag
  - ceph3_image
  - name_prefix_stein
  - name_suffix_stein
  - namespace_stein
  - tag_stein
- To re-enable fencing in your overcloud, set the EnableFencing parameter to true in the fencing.yaml environment file.
- Run the upgrade finalization command. Include the following options relevant to your environment:
  - The environment file (upgrades-environment.yaml) with the upgrade-specific parameters (-e).
  - The environment file (fencing.yaml) with the EnableFencing parameter set to true.
  - The environment file (rhsm.yaml) with the registration and subscription parameters (-e).
  - The environment file (containers-prepare-parameter.yaml) with your new container image locations (-e). In most cases, this is the same environment file that the undercloud uses.
  - The environment file (neutron-ovs.yaml) to maintain OVS compatibility.
  - Any custom configuration environment files (-e) relevant to your deployment.
  - If applicable, your custom roles (roles_data) file using --roles-file.
  - If applicable, your composable network (network_data) file using --networks-file.
  - If you use a custom stack name, pass the name with the --stack option.
- Wait until the stack synchronization completes.
You do not need the upgrades-environment.yaml file for any further deployment operations.
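Before you run the finalization command, one way to confirm that the parameters listed in this procedure have been removed from containers-prepare-parameter.yaml is a simple text search. The following sketch is an illustration only. The `leftover_parameters` helper is a hypothetical name, the sample file content is made up for the demonstration, and the search assumes that each parameter appears as a key at the start of its own line.

```shell
# Sketch only: list any parameters from the procedure that are still present.
# $1 = path to the containers-prepare-parameter.yaml file.
# Prints matching lines with line numbers; returns 0 if any are found.
leftover_parameters() {
  grep -nE '^[[:space:]]*(ceph3_namespace|ceph3_tag|ceph3_image|name_prefix_stein|name_suffix_stein|namespace_stein|tag_stein)[[:space:]]*:' "$1"
}

# Demonstration with a made-up sample file.
sample=$(mktemp)
printf '      namespace: registry.example.com/rhosp\n      ceph3_tag: example\n' > "$sample"
if leftover_parameters "$sample"; then
  echo "Remove the parameters listed above before you continue."
fi
rm -f "$sample"
```

The pattern matches whole key names only, so parameters such as namespace and tag, which you keep, are not reported.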