Chapter 3. Updating the overcloud
After you update the undercloud, you can update the overcloud by running the overcloud and container image preparation commands, updating your nodes, and running the overcloud update converge command. The control plane API is fully available during a minor update.
Prerequisites
- You have updated the undercloud node to the latest version. For more information, see Chapter 2, Updating the undercloud.
- If you use a local set of core templates in your stack user home directory, ensure that you update the templates and use the recommended workflow in Using Customized Core Heat Templates in the Advanced Overcloud Customization guide. You must update the local copy before you update the overcloud.
Procedure
To update the overcloud, you must complete the following procedures:
- Section 3.1, “Running the overcloud update preparation”
- Section 3.2, “Running the container image preparation”
- Section 3.3, “Optional: Updating the ovn-controller container on all overcloud servers”
- Section 3.4, “Updating all Controller nodes”
- Section 3.5, “Updating all Compute nodes”
- Section 3.6, “Updating all HCI Compute nodes”
- Section 3.7, “Updating all DistributedComputeHCI nodes”
- Section 3.8, “Updating all Ceph Storage nodes”
- Section 3.9, “Performing online database updates”
- Section 3.10, “Finalizing the update”
3.1. Running the overcloud update preparation
To prepare the overcloud for the update process, you must run the openstack overcloud update prepare command, which updates the overcloud plan to Red Hat OpenStack Platform (RHOSP) 16.2 and prepares the nodes for the update.
Prerequisites
- If you use a Ceph subscription and have configured director to use the overcloud-minimal image for Ceph storage nodes, you must ensure that in the roles_data.yaml role definition file, the rhsm_enforce parameter is set to False.
- If you rendered custom NIC templates, you must regenerate the templates with the updated version of the openstack-tripleo-heat-templates collection to avoid incompatibility with the overcloud version. For more information about custom NIC templates, see Rendering default network interface templates for customization in the Advanced Overcloud Customization guide.
For distributed compute node (edge) architectures with OVN deployments, you must complete this procedure for each stack with Compute, DistributedCompute, or DistributedComputeHCI nodes before proceeding with section Updating the ovn-controller container on all overcloud servers.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the update preparation command:
  $ openstack overcloud update prepare \
      --templates \
      --stack <stack_name> \
      -r <roles_data_file> \
      -n <network_data_file> \
      -e <environment_file> \
      -e <environment_file> \
      ...
  Include the following options relevant to your environment:
  - If the name of your overcloud stack is different from the default name overcloud, include the --stack option in the update preparation command and replace <stack_name> with the name of your stack.
  - If you use your own custom roles, include your custom roles (<roles_data_file>) file (-r).
  - If you use custom networks, include your composable network (<network_data_file>) file (-n).
  - If you deploy a high availability cluster, include the --ntp-server option in the update preparation command, or include the NtpServer parameter and value in your environment file.
  - Any custom configuration environment files (-e).
- Wait until the update preparation process completes.
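The option rules above can be sketched as a small dry-run helper that assembles the command line and prints it for review instead of running it. This is a minimal sketch, not part of the product: the function name build_prepare_cmd and the variables STACK_NAME, ROLES_FILE, NETWORK_FILE, and ENV_FILES are hypothetical names chosen for this example, and file names that contain spaces are not handled.

```shell
# Sketch: assemble the "openstack overcloud update prepare" command line.
# build_prepare_cmd and the STACK_NAME, ROLES_FILE, NETWORK_FILE, and
# ENV_FILES variables are hypothetical names used only in this example.
build_prepare_cmd() {
    # Use the function's positional parameters as an argument list.
    set -- openstack overcloud update prepare --templates

    # Add --stack only when the stack name differs from the default.
    if [ -n "${STACK_NAME:-}" ] && [ "${STACK_NAME}" != "overcloud" ]; then
        set -- "$@" --stack "${STACK_NAME}"
    fi
    # Custom roles and composable network files are optional.
    if [ -n "${ROLES_FILE:-}" ]; then
        set -- "$@" -r "${ROLES_FILE}"
    fi
    if [ -n "${NETWORK_FILE:-}" ]; then
        set -- "$@" -n "${NETWORK_FILE}"
    fi
    # ENV_FILES is a space-separated list of environment files.
    for env_file in ${ENV_FILES:-}; do
        set -- "$@" -e "${env_file}"
    done

    # Print the command instead of running it.
    echo "$*"
}
```

For example, setting STACK_NAME=edge0 and ENV_FILES="a.yaml b.yaml" and then calling build_prepare_cmd prints a command that includes --stack edge0 followed by one -e option per file. Review the printed command before you run it.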
3.2. Running the container image preparation
Before you can update the overcloud, you must prepare all container image configurations that are required for your environment and pull the latest RHOSP 16.2 container images to your undercloud.
To complete the container image preparation, you must run the openstack overcloud external-update run command against tasks that have the container_image_prepare tag.
If you are not using the default stack name, which is overcloud, set your stack name with the --stack <stack_name> option and replace <stack_name> with the name of your stack.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the openstack overcloud external-update run command against tasks that have the container_image_prepare tag:
  $ openstack overcloud external-update run --stack <stack_name> --tags container_image_prepare
3.3. Optional: Updating the ovn-controller container on all overcloud servers
If you deployed your overcloud with the Modular Layer 2 Open Virtual Network mechanism driver (ML2/OVN), update the ovn-controller container to the latest RHOSP 16.2 version. The update occurs on every overcloud server that runs the ovn-controller container.
The following procedure updates the ovn-controller containers on servers that are assigned the Compute role before it updates the ovn-northd service on servers that are assigned the Controller role.
If you accidentally updated the ovn-northd service before following this procedure, you might not be able to reach your virtual machines or create new virtual machines or virtual networks. The following procedure restores connectivity.
For distributed compute node (edge) architectures, you must complete this procedure for each stack with Compute, DistributedCompute, or DistributedComputeHCI nodes before proceeding with section Updating all Controller nodes.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:
  $ source ~/stackrc
- Run the openstack overcloud external-update run command against the tasks that have the ovn tag:
  $ openstack overcloud external-update run --stack <stack_name> --tags ovn
  - If the name of your overcloud stack is different from the default stack name overcloud, set your stack name with the --stack option and replace <stack_name> with the name of your stack.
- Wait until the ovn-controller container update completes.
3.4. Updating all Controller nodes
Update all the Controller nodes to the latest RHOSP 16.2 version. Run the openstack overcloud update run command and include the --limit Controller option to restrict operations to the Controller nodes only. The control plane API is fully available during the minor update.
Until BZ#1872404 is resolved, for nodes based on composable roles, you must update the Database role first, before you can update Controller, Messaging, Compute, Ceph, and other roles.
If you are not using the default stack name, which is overcloud, set your stack name with the --stack <stack_name> option and replace <stack_name> with the name of your stack.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the update command:
  $ openstack overcloud update run --stack <stack_name> --limit Controller
- Wait until the Controller node update completes.
3.5. Updating all Compute nodes
Update all Compute nodes to the latest RHOSP 16.2 version. To update Compute nodes, run the openstack overcloud update run command and include the --limit Compute option to restrict operations to the Compute nodes only.
- Parallelization considerations
When you update a large number of Compute nodes, to improve performance, you can run multiple update tasks in the background and configure each task to update a separate group of 20 nodes. For example, if you have 80 Compute nodes in your deployment, you can run the following commands to update the Compute nodes in parallel:
$ openstack overcloud update run -y --limit 'Compute[0:19]' > update-compute-0-19.log 2>&1 &
$ openstack overcloud update run -y --limit 'Compute[20:39]' > update-compute-20-39.log 2>&1 &
$ openstack overcloud update run -y --limit 'Compute[40:59]' > update-compute-40-59.log 2>&1 &
$ openstack overcloud update run -y --limit 'Compute[60:79]' > update-compute-60-79.log 2>&1 &
This method of partitioning the node space is random, and you do not have control over which nodes are updated. The selection of nodes is based on the inventory file that you generate when you run the tripleo-ansible-inventory command.
To update specific Compute nodes, list the nodes that you want to update in a batch separated by a comma:
$ openstack overcloud update run --limit <Compute0>,<Compute1>,<Compute2>,<Compute3>
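For deployments where the node count is not a multiple of 20, the batch ranges can be computed rather than typed by hand. The following is a minimal sketch that only prints the commands so that you can review them first; the function name compute_batches is hypothetical, and the ranges follow the same inclusive form as the Compute[0:19] example.

```shell
# Sketch: print "first last" index pairs, one pair per batch of nodes.
# compute_batches is a hypothetical helper name used only in this example.
compute_batches() {
    total=$1    # total number of Compute nodes
    batch=$2    # nodes per batch
    start=0
    while [ "$start" -lt "$total" ]; do
        end=$((start + batch - 1))
        # Clamp the last batch to the final node index.
        if [ "$end" -ge "$total" ]; then
            end=$((total - 1))
        fi
        echo "${start} ${end}"
        start=$((end + 1))
    done
}

# Dry run: print one background update command per batch of 20 nodes.
compute_batches 80 20 | while read -r first last; do
    echo "openstack overcloud update run -y --limit 'Compute[${first}:${last}]' > update-compute-${first}-${last}.log 2>&1 &"
done
```

With 80 nodes and a batch size of 20, the dry run prints the same four commands as the example. With a count such as 45, the last batch is shortened to end at the final node index.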
If you are not using the default stack name, which is overcloud, set your stack name with the --stack <stack_name> option and replace <stack_name> with the name of your stack.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the update command:
  $ openstack overcloud update run --stack <stack_name> --limit Compute
- Wait until the Compute node update completes.
3.6. Updating all HCI Compute nodes
Update the Hyperconverged Infrastructure (HCI) Compute nodes to the latest RHOSP 16.2 version. To update the HCI Compute nodes, run the openstack overcloud update run command and include the --limit ComputeHCI option to restrict operations to only the HCI nodes. You must also run the openstack overcloud external-update run --tags ceph command to perform an update to a containerized Red Hat Ceph Storage 4 cluster.
Prerequisites
- On a Ceph Monitor or Controller node that is running the ceph-mon service, check that the Red Hat Ceph Storage cluster status is healthy and the pg status is active+clean:
  $ sudo podman exec -it ceph-mon-controller-0 ceph -s
  If the Ceph cluster is healthy, it returns a status of HEALTH_OK.
  If the Ceph cluster status is unhealthy, it returns a status of HEALTH_WARN or HEALTH_ERR. For troubleshooting guidance, see the Red Hat Ceph Storage 4 Troubleshooting Guide.
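If you script the update, the health check can act as a gate before the update commands run. The following is a minimal sketch that reads ceph -s output on standard input and succeeds only when the status is HEALTH_OK. The function names ceph_health and require_healthy are hypothetical, and the sketch does not inspect the pg status, which you must still confirm is active+clean.

```shell
# Sketch: gate a scripted update on the Ceph health status.
# ceph_health and require_healthy are hypothetical helper names.

# Print the first HEALTH_* token found in the status text on stdin.
ceph_health() {
    grep -o 'HEALTH_[A-Z]*' | head -n 1
}

# Return 0 only when the status text on stdin reports HEALTH_OK.
require_healthy() {
    status=$(ceph_health)
    if [ "$status" = "HEALTH_OK" ]; then
        echo "Ceph is healthy; continuing."
    else
        echo "Ceph reports ${status:-no status}; stopping." >&2
        return 1
    fi
}
```

One way to use the helper is to pipe the status command into it, for example sudo podman exec ceph-mon-controller-0 ceph -s | require_healthy. This usage drops the -it options on the assumption that an interactive terminal is not needed when the output is piped.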
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the update command:
  $ openstack overcloud update run --stack <stack_name> --limit ComputeHCI
  - Replace <stack_name> with the name of your stack. If not specified, the default is overcloud.
- Wait until the node update completes.
- Run the Ceph Storage update command:
  $ openstack overcloud external-update run --stack <stack_name> --tags ceph
- Wait until the Compute HCI node update completes.
3.7. Updating all DistributedComputeHCI nodes
Update roles specific to distributed compute node architecture. When you update distributed compute nodes, update DistributedComputeHCI nodes first, and then update DistributedComputeHCIScaleOut nodes.
If you are not using the default stack name, which is overcloud, set your stack name with the --stack <stack_name> option and replace <stack_name> with the name of your stack.
Prerequisites
- On a Ceph Monitor or Controller node that is running the ceph-mon service, check that the Red Hat Ceph Storage cluster status is healthy and the pg status is active+clean:
  $ sudo podman exec -it ceph-mon-controller-0 ceph -s
  If the Ceph cluster is healthy, it returns a status of HEALTH_OK.
  If the Ceph cluster status is unhealthy, it returns a status of HEALTH_WARN or HEALTH_ERR. For troubleshooting guidance, see the Red Hat Ceph Storage 4 Troubleshooting Guide.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the update command:
  $ openstack overcloud update run --stack <stack_name> --limit DistributedComputeHCI
- Wait until the DistributedComputeHCI node update completes.
- Run the Ceph Storage update command:
  $ openstack overcloud external-update run --stack <stack_name> --tags ceph
- Wait until the DistributedComputeHCI node update completes.
- Use the same process to update DistributedComputeHCIScaleOut nodes.
3.8. Updating all Ceph Storage nodes
Update the Red Hat Ceph Storage nodes to the latest RHOSP 16.2 version.
RHOSP 16.2 is supported on RHEL 8.4. However, hosts that are mapped to the Ceph Storage role update to the latest major RHEL release. For more information, see Red Hat Ceph Storage: Supported configurations.
Prerequisites
- On a Ceph Monitor or Controller node that is running the ceph-mon service, check that the Red Hat Ceph Storage cluster status is healthy and the pg status is active+clean:
  $ sudo podman exec -it ceph-mon-controller-0 ceph -s
  If the Ceph cluster is healthy, it returns a status of HEALTH_OK.
  If the Ceph cluster status is unhealthy, it returns a status of HEALTH_WARN or HEALTH_ERR. For troubleshooting guidance, see the Red Hat Ceph Storage 4 Troubleshooting Guide.
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Update group nodes.
  To update all nodes in a group:
  $ openstack overcloud update run --limit <GROUP_NAME>
  To update a single node in a group:
  $ openstack overcloud update run --limit '<GROUP_NAME>[NODE_INDEX]'
  Note: Ensure that you update all nodes if you choose to update nodes individually.
  The index of the first node in a group is zero (0). For example, to update the first node in a group named CephStorage:
  $ openstack overcloud update run --limit 'CephStorage[0]'
- Wait until the node update completes.
- Run the Ceph Storage container update command to run ceph-ansible as an external process and update the Red Hat Ceph Storage 4 containers:
  $ openstack overcloud external-update run --stack <stack_name> --tags ceph
  - Replace <stack_name> with the name of your stack. If not specified, the default is overcloud.
- Wait until the Ceph Storage container update completes.
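If you choose to update nodes individually, listing one command per node index helps to confirm that no node is skipped. The following is a minimal sketch that prints the commands without running them; the function name node_update_cmds is hypothetical, and the node count of 3 is an example value, not a value from your deployment.

```shell
# Sketch: print one update command per node index in a group.
# node_update_cmds is a hypothetical helper name used only in this example.
node_update_cmds() {
    group=$1    # group name, for example CephStorage
    count=$2    # number of nodes in the group
    index=0     # the index of the first node in a group is zero
    while [ "$index" -lt "$count" ]; do
        # Quote the limit so that the shell does not treat [ ] as a glob.
        echo "openstack overcloud update run --limit '${group}[${index}]'"
        index=$((index + 1))
    done
}

# Dry run for a group of three Ceph Storage nodes.
node_update_cmds CephStorage 3
```

Run the printed commands one at a time, and wait for each node update to complete before you start the next.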
3.9. Performing online database updates
Some overcloud components require an online update or migration of their database tables. To perform online database updates, run the openstack overcloud external-update run command against tasks that have the online_upgrade tag.
Online database updates apply to the following components:
- OpenStack Block Storage (cinder)
- OpenStack Compute (nova)
Procedure
- Source the stackrc file:
  $ source ~/stackrc
- Run the openstack overcloud external-update run command against tasks that use the online_upgrade tag:
  $ openstack overcloud external-update run --stack <stack_name> --tags online_upgrade
  - Replace <stack_name> with the name of your stack. If not specified, the default is overcloud.
3.10. Finalizing the update
You are no longer required to run the openstack overcloud update converge command. However, if you disabled fencing and plan to skip the converge step, you must manually re-enable fencing.
You can update the overcloud stack to the latest RHOSP 16.2 version. This ensures that the stack resource structure aligns with a standard deployment of RHOSP 16.2 and that you can perform regular openstack overcloud deploy functions in the future.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:
  $ source ~/stackrc
- If fencing is disabled, and you do not run openstack overcloud update converge, then you must re-enable fencing:
  - Log in to a Controller node and run the Pacemaker command to re-enable fencing:
    $ ssh tripleo-admin@<controller_ip> "sudo pcs property set stonith-enabled=true"
    - Replace <controller_ip> with the IP address of a Controller node. You can find the IP addresses of your Controller nodes with the openstack server list command.
  - In the fencing.yaml environment file, set the value of the EnableFencing parameter to true.
- Optional: Run the update finalization command:
  $ openstack overcloud update converge \
      --templates \
      --stack <stack_name> \
      -r <roles_data_file> \
      -n <network_data_file> \
      -e <environment_file> \
      -e <environment_file> \
      ...
  Include the following options that are relevant to your environment:
  - The fencing.yaml environment file, with the EnableFencing parameter set to true.
  - If the name of your overcloud stack is different from the default name overcloud, include the --stack option in the update finalization command and replace <stack_name> with the name of your stack.
  - If you use custom roles, include your custom roles (<roles_data_file>) file (-r).
  - If you use custom networks, include your composable network (<network_data_file>) file (-n).
  - Any custom configuration environment files (-e).
- Wait until the update finalization completes.