Chapter 18. Rebooting nodes
You might need to reboot the nodes in the undercloud and overcloud. Use the following procedures to understand how to reboot different node types.
- If you reboot all nodes in one role, it is advisable to reboot each node individually. If you reboot all nodes in a role simultaneously, service downtime can occur during the reboot operation.
- If you reboot all nodes in your OpenStack Platform environment, reboot the nodes in the following sequential order:
Recommended node reboot order
- Reboot the undercloud node.
- Reboot Controller and other composable nodes.
- Reboot standalone Ceph MON nodes.
- Reboot Ceph Storage nodes.
- Reboot Object Storage service (swift) nodes.
- Reboot Compute nodes.
18.1. Rebooting the undercloud node
Complete the following steps to reboot the undercloud node.
Procedure
- Log in to the undercloud as the stack user.
Reboot the undercloud:
$ sudo reboot
- Wait until the node boots.
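The "wait until the node boots" step can be scripted from another machine. The following is a minimal sketch, not part of the documented procedure: it assumes that you have key-based SSH access to the node, and the function name `wait_for_node` and the example host name are hypothetical.

```shell
# Poll a node over SSH until it accepts logins again, or give up.
# Usage: wait_for_node <user@host> [max_attempts] [delay_seconds]
wait_for_node() {
    local target="$1" attempts="${2:-60}" delay="${3:-10}" i=1
    while [ "$i" -le "$attempts" ]; do
        # BatchMode avoids hanging on a password prompt;
        # "true" is a no-op command that only proves a login works.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" true 2>/dev/null; then
            echo "$target is reachable after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "$target did not come back after $attempts attempts" >&2
    return 1
}
```

For example, `wait_for_node stack@undercloud.example.com` retries every 10 seconds for up to 60 attempts. A successful SSH login shows only that the operating system is up, so the services on the node might still need time to start.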
18.2. Rebooting Controller and composable nodes
Use this procedure to reboot Controller nodes and standalone nodes based on composable roles. This procedure does not apply to Compute nodes or Ceph Storage nodes.
Procedure
- Log in to the node that you want to reboot.
Optional: If the node uses Pacemaker resources, stop the cluster:
[heat-admin@overcloud-controller-0 ~]$ sudo pcs cluster stop
Reboot the node:
[heat-admin@overcloud-controller-0 ~]$ sudo reboot
- Wait until the node boots.
Verification
Verify that the services are enabled.
If the node uses Pacemaker services, check that the node has rejoined the cluster:
[heat-admin@overcloud-controller-0 ~]$ sudo pcs status
If the node uses systemd services, check that all services are enabled:
[heat-admin@overcloud-controller-0 ~]$ sudo systemctl status
If the node uses containerized services, check that all containers on the node are active:
[heat-admin@overcloud-controller-0 ~]$ sudo podman ps
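The systemd and container checks above can be combined into one helper that you run on the rebooted node. This is a sketch only: the function name `check_node_services` is hypothetical, and it lists exited containers for manual review because some containers run once and then exit normally.

```shell
# Report failed systemd units and exited containers on the local node.
# Returns 1 if any systemd unit is in the failed state, 0 otherwise.
check_node_services() {
    local failed exited status=0
    # --failed selects units in the failed state; --no-legend drops the
    # header and footer lines so that empty output means "nothing failed".
    failed=$(sudo systemctl list-units --failed --no-legend)
    if [ -n "$failed" ]; then
        echo "Failed systemd units:"
        echo "$failed"
        status=1
    else
        echo "No failed systemd units"
    fi
    # podman ps hides stopped containers by default, so ask for them explicitly.
    exited=$(sudo podman ps --all --filter status=exited --format '{{.Names}}')
    if [ -n "$exited" ]; then
        # Informational only: this list does not change the return value.
        echo "Exited containers (review manually):"
        echo "$exited"
    fi
    return "$status"
}
```

The helper does not replace `sudo pcs status` on nodes that use Pacemaker. Run that check separately.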
18.3. Rebooting standalone Ceph MON nodes
Complete the following steps to reboot standalone Ceph MON nodes.
Procedure
- Log in to a Ceph MON node.
Reboot the node:
$ sudo reboot
- Wait until the node boots and rejoins the MON cluster.
Repeat these steps for each MON node in the cluster.
18.4. Rebooting a Ceph Storage (OSD) cluster
Complete the following steps to reboot a cluster of Ceph Storage (OSD) nodes.
Procedure
Log in to a Ceph MON or Controller node and disable Ceph Storage cluster rebalancing temporarily:
$ sudo podman exec -it ceph-mon-controller-0 ceph osd set noout
$ sudo podman exec -it ceph-mon-controller-0 ceph osd set norebalance
Note: If you have a multistack or distributed compute node (DCN) architecture, you must specify the cluster name when you set the noout and norebalance flags. For example:
$ sudo podman exec -it ceph-mon-controller-0 ceph osd set noout --cluster <cluster_name>
- Select the first Ceph Storage node that you want to reboot and log in to the node.
Reboot the node:
$ sudo reboot
- Wait until the node boots.
Log in to the node and check the cluster status:
$ sudo podman exec -it ceph-mon-controller-0 ceph status
Check that the pgmap reports all pgs as normal (active+clean).
- Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.
When complete, log in to a Ceph MON or Controller node and re-enable cluster rebalancing:
$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset noout
$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset norebalance
Note: If you have a multistack or distributed compute node (DCN) architecture, you must specify the cluster name when you unset the noout and norebalance flags. For example:
$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset noout --cluster <cluster_name>
Perform a final status check to verify that the cluster reports HEALTH_OK:
$ sudo podman exec -it ceph-mon-controller-0 ceph status
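The flag handling and the final health check in this procedure can be wrapped in small helpers. The following is a sketch under stated assumptions: the function names and the `MON_CONTAINER` variable are hypothetical, the container name comes from the commands above, and the `-it` options are dropped because a script does not need an interactive terminal. It uses `ceph health`, which prints only the health summary, in place of the full `ceph status` output.

```shell
# Name of the MON container, taken from the commands above; adjust as needed.
MON_CONTAINER="${MON_CONTAINER:-ceph-mon-controller-0}"

# Run a ceph subcommand inside the MON container.
ceph_cmd() {
    sudo podman exec "$MON_CONTAINER" ceph "$@"
}

# Set or unset both rebalancing flags.
# Usage: ceph_rebalance_flags set|unset [cluster_name]
ceph_rebalance_flags() {
    local action="$1" cluster="${2:-}" flag
    for flag in noout norebalance; do
        if [ -n "$cluster" ]; then
            # Multistack or DCN architectures need the cluster name.
            ceph_cmd osd "$action" "$flag" --cluster "$cluster"
        else
            ceph_cmd osd "$action" "$flag"
        fi
    done
}

# Poll "ceph health" until it reports HEALTH_OK.
# Usage: wait_for_health_ok [max_attempts] [delay_seconds]
wait_for_health_ok() {
    local attempts="${1:-30}" delay="${2:-10}" i=1 health
    while [ "$i" -le "$attempts" ]; do
        health=$(ceph_cmd health 2>/dev/null)
        case "$health" in
            HEALTH_OK*) echo "$health"; return 0 ;;
        esac
        echo "attempt $i: ${health:-no response}" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

A typical sequence is `ceph_rebalance_flags set`, then the node reboots, then `ceph_rebalance_flags unset` followed by `wait_for_health_ok`. While the flags are set, the cluster is expected to report a warning state, so run the health wait only after you unset them.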
18.5. Rebooting Compute nodes
To ensure minimal downtime of instances in your Red Hat OpenStack Platform environment, the Migrating instances workflow outlines the steps you must complete to migrate instances from the Compute node that you want to reboot.
If you do not migrate the instances from the source Compute node to another Compute node, the instances might be restarted on the source Compute node, which might cause the upgrade to fail. This is related to the known issue around changes to Podman and the libvirt service.
Migrating instances workflow
- Decide whether to migrate instances to another Compute node before rebooting the node.
- Select and disable the Compute node that you want to reboot so that it does not provision new instances.
- Migrate the instances to another Compute node.
- Reboot the empty Compute node.
- Enable the empty Compute node.
Prerequisites
Before you reboot the Compute node, you must decide whether to migrate instances to another Compute node while the node is rebooting.
Review the list of migration constraints that you might encounter when you migrate virtual machine instances between Compute nodes. For more information, see Migration constraints in Configuring the Compute Service for Instance Creation.
If you cannot migrate the instances, you can set the following core template parameters to control the state of the instances after the Compute node reboots:
NovaResumeGuestsStateOnHostBoot
Determines whether to return instances to the same state on the Compute node after reboot. When set to False, the instances remain down and you must start them manually. The default value is False.
NovaResumeGuestsShutdownTimeout
Number of seconds to wait for an instance to shut down before rebooting. It is not recommended to set this value to 0. The default value is 300.
For more information about overcloud parameters and their usage, see Overcloud Parameters.
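As an illustration, the two parameters can be placed in a heat environment file under `parameter_defaults`. The sketch below writes such a file from the shell. The function name and the default file path are hypothetical, and the values shown (`true` and the default `300`) are examples, not recommendations.

```shell
# Write a heat environment file that sets the two parameters described above.
# Usage: write_resume_guests_env [output_path]
write_resume_guests_env() {
    # The default path is a hypothetical example; pass any path you prefer.
    local out="${1:-$HOME/templates/resume-guests.yaml}"
    cat > "$out" <<'EOF'
parameter_defaults:
  # Return instances to their previous state after the Compute node boots.
  NovaResumeGuestsStateOnHostBoot: true
  # Seconds to wait for an instance to shut down before rebooting (default).
  NovaResumeGuestsShutdownTimeout: 300
EOF
    echo "$out"
}
```

The parameters take effect only after you include the file in your overcloud deployment, typically with the `-e` option of your existing deployment command.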
Procedure
- Log in to the undercloud as the stack user.
List all Compute nodes and their UUIDs:
$ source ~/stackrc
(undercloud) $ openstack server list --name compute
Identify the UUID of the Compute node that you want to reboot.
From the undercloud, select a Compute node and disable it:
$ source ~/overcloudrc
(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set <hostname> nova-compute --disable
List all instances on the Compute node:
(overcloud) $ openstack server list --host <hostname> --all-projects
Optional: If you decide to migrate the instances to another Compute node, use one of the following commands:
Migrate the instance to a specific host:
(overcloud) $ openstack server migrate <instance_id> --live <target_host> --wait
Let nova-scheduler automatically select the target host:
(overcloud) $ nova live-migration <instance_id>
Live migrate all instances at once:
$ nova host-evacuate-live <hostname>
Note: The nova command might cause some deprecation warnings, which are safe to ignore.
- Wait until migration completes.
Confirm that the migration was successful:
(overcloud) $ openstack server list --host <hostname> --all-projects
- Continue to migrate instances until none remain on the Compute node.
Log in to the Compute node and reboot the node:
[heat-admin@overcloud-compute-0 ~]$ sudo reboot
- Wait until the node boots.
Re-enable the Compute node:
$ source ~/overcloudrc
(overcloud) $ openstack compute service set <hostname> nova-compute --enable
Check that the Compute node is enabled:
(overcloud) $ openstack compute service list
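The "continue to migrate instances until none remain" step can be checked with a polling helper before you reboot the node. This is a minimal sketch that assumes you have already sourced `~/overcloudrc`. The function names are hypothetical, and the `-f value -c ID` options only restrict the `openstack server list` output to bare instance IDs.

```shell
# Print the IDs of all instances still hosted on a Compute node, one per line.
instances_on_host() {
    openstack server list --host "$1" --all-projects -f value -c ID
}

# Poll until no instances remain on the host.
# Usage: wait_until_empty <hostname> [max_attempts] [delay_seconds]
# Returns 0 when the host is empty, 1 on timeout, 2 if the listing fails.
wait_until_empty() {
    local host="$1" attempts="${2:-60}" delay="${3:-10}" i=1 ids count
    while [ "$i" -le "$attempts" ]; do
        # Treat a failed listing as an error, not as an empty host.
        if ! ids=$(instances_on_host "$host"); then
            echo "openstack server list failed for $host" >&2
            return 2
        fi
        if [ -z "$ids" ]; then
            echo "No instances remain on $host"
            return 0
        fi
        count=$(printf '%s\n' "$ids" | wc -l | tr -d ' ')
        echo "attempt $i: $count instance(s) still on $host" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, run `wait_until_empty <hostname>` after you start the migrations, and reboot the Compute node only when the function returns 0.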