Chapter 19. Shutting down and starting up the undercloud and overcloud
If you must perform maintenance on the undercloud and overcloud, shut down and start up the undercloud and overcloud nodes in a specific order to minimize issues when you start your overcloud.
Prerequisites
- A running undercloud and overcloud
19.1. Undercloud and overcloud shutdown order
To shut down the Red Hat OpenStack Platform environment, you must shut down the overcloud and undercloud in the following order:
- Shut down instances on overcloud Compute nodes
- Shut down Compute nodes
- Stop all high availability and OpenStack Platform services on Controller nodes
- Shut down Ceph Storage nodes
- Shut down Controller nodes
- Shut down the undercloud
19.2. Shutting down instances on overcloud Compute nodes
As a part of shutting down the Red Hat OpenStack Platform environment, shut down all instances on Compute nodes before shutting down the Compute nodes.
Prerequisites
- An overcloud with active Compute services
Procedure
- Log in to the undercloud as the stack user.
- Source the credentials file for your overcloud:
$ source ~/overcloudrc
- View running instances in the overcloud:
$ openstack server list --all-projects
- Stop each instance in the overcloud:
$ openstack server stop <INSTANCE>
Repeat this step for each instance until you stop all instances in the overcloud.
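When the overcloud hosts many instances, repeating the stop command by hand is error-prone. The following is a minimal sketch of the same steps as a loop, assuming you have already sourced overcloudrc. The function name stop_active_instances and the state file path are hypothetical. Recording the instance IDs is an illustrative convenience, not part of the documented procedure.

```shell
# Stop every ACTIVE instance and record its ID, one per line, in a state file.
# The default state file path is a hypothetical choice; pass your own as $1.
stop_active_instances() {
    local state_file="${1:-$HOME/stopped-instances.txt}"
    local id
    # "-f value -c ID" prints only the instance ID column, one ID per line.
    openstack server list --all-projects --status ACTIVE -f value -c ID \
        > "$state_file" || return 1
    while read -r id; do
        [ -n "$id" ] || continue
        echo "Stopping instance $id"
        # Redirect stdin so the command cannot consume the ID list.
        openstack server stop "$id" < /dev/null \
            || echo "Failed to stop $id" >&2
    done < "$state_file"
}
```

Run `stop_active_instances` and then repeat `openstack server list --all-projects` to confirm that no instance remains in the ACTIVE state.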
19.3. Shutting down Compute nodes
As a part of shutting down the Red Hat OpenStack Platform environment, log in to and shut down each Compute node.
Prerequisites
- Shut down all instances on the Compute nodes
Procedure
- Log in as the root user to a Compute node.
- Shut down the node:
# shutdown -h now
- Perform these steps for each Compute node until you shut down all Compute nodes.
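If you can reach the nodes over SSH from one host, the per-node repetition can be sketched as a loop. This is an illustration only: the function name shutdown_nodes is hypothetical, and it assumes an SSH account with sudo rights on each node. The account name and hostnames depend on your deployment.

```shell
# Shut down each listed node over SSH.
# Usage: shutdown_nodes <ssh-user> <host> [<host> ...]
shutdown_nodes() {
    if [ "$#" -lt 2 ]; then
        echo "usage: shutdown_nodes <ssh-user> <host> [<host> ...]" >&2
        return 2
    fi
    local ssh_user="$1"
    shift
    local node
    for node in "$@"; do
        echo "Shutting down $node"
        # The SSH session can drop while the node halts, so a non-zero exit
        # status is reported but does not abort the loop.
        ssh "$ssh_user@$node" sudo shutdown -h now \
            || echo "ssh to $node exited with status $?; confirm that the node powered off" >&2
    done
}
```

For example, `shutdown_nodes <user> <compute_node_1> <compute_node_2>`, with the placeholders replaced by values from your environment.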
19.4. Stopping services on Controller nodes
As a part of shutting down the Red Hat OpenStack Platform environment, stop services on the Controller nodes before shutting down the nodes. This includes Pacemaker and systemd services.
Prerequisites
- An overcloud with active Pacemaker services
Procedure
- Log in as the root user to a Controller node.
- Stop the Pacemaker cluster:
# pcs cluster stop --all
This command stops the cluster on all nodes.
- Wait until the Pacemaker services stop, and then check that the services stopped:
  - Check the Pacemaker status:
    # pcs status
  - Check that no Pacemaker services are running in Podman:
    # podman ps --filter "name=.*-bundle.*"
- Stop the Red Hat OpenStack Platform services:
# systemctl stop 'tripleo_*'
- Wait until the services stop, and then check that services are no longer running in Podman:
# podman ps
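The final wait can be sketched as a polling loop instead of rerunning podman ps by hand. The function name wait_for_no_containers and the default timeout and interval values are assumptions for illustration. Run it as root on the Controller node.

```shell
# Poll "podman ps -q" until it prints no container IDs or the timeout expires.
# Usage: wait_for_no_containers [timeout_seconds] [interval_seconds]
wait_for_no_containers() {
    local timeout="${1:-300}"
    local interval="${2:-5}"
    local waited=0
    while [ -n "$(podman ps -q)" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Containers still running after ${timeout}s:" >&2
            podman ps --format '{{.Names}}' >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "No containers are running"
}
```

A non-zero return status means that some containers did not stop within the timeout. The function lists their names on standard error so that you can investigate before shutting down the node.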
19.5. Shutting down Ceph Storage nodes
As a part of shutting down the Red Hat OpenStack Platform environment, disable Ceph Storage services then log in to and shut down each Ceph Storage node.
Prerequisites
- A healthy Ceph Storage cluster
- Ceph MON services are running on standalone Ceph MON nodes or on Controller nodes
Procedure
- Log in as the root user to a node that runs Ceph MON services, such as a Controller node or a standalone Ceph MON node.
- Check the health of the cluster. In the following example, the podman command runs a status check within a Ceph MON container on a Controller node:
# sudo podman exec -it ceph-mon-controller-0 ceph status
Ensure that the status is HEALTH_OK.
- Set the noout, norecover, norebalance, nobackfill, nodown, and pause flags for the cluster. In the following example, the podman commands set these flags through a Ceph MON container on a Controller node:
# sudo podman exec -it ceph-mon-controller-0 ceph osd set noout
# sudo podman exec -it ceph-mon-controller-0 ceph osd set norecover
# sudo podman exec -it ceph-mon-controller-0 ceph osd set norebalance
# sudo podman exec -it ceph-mon-controller-0 ceph osd set nobackfill
# sudo podman exec -it ceph-mon-controller-0 ceph osd set nodown
# sudo podman exec -it ceph-mon-controller-0 ceph osd set pause
- Shut down each Ceph Storage node:
  - Log in as the root user to a Ceph Storage node.
  - Shut down the node:
    # shutdown -h now
  - Perform these steps for each Ceph Storage node until you shut down all Ceph Storage nodes.
- Shut down any standalone Ceph MON nodes:
  - Log in as the root user to a standalone Ceph MON node.
  - Shut down the node:
    # shutdown -h now
  - Perform these steps for each standalone Ceph MON node until you shut down all standalone Ceph MON nodes.
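The flag-setting step in this procedure repeats one command six times, so it can be sketched as a loop. The function name ceph_maintenance_flags is hypothetical. The default container name is taken from the example in this procedure and might differ in your environment. The sketch assumes that you run it as root on a node that hosts a Ceph MON container, and it omits the -it options because a script does not need an interactive terminal.

```shell
# Set or unset the six maintenance flags on the Ceph cluster.
# Usage: ceph_maintenance_flags set|unset [mon-container-name]
ceph_maintenance_flags() {
    local action="$1"
    local mon="${2:-ceph-mon-controller-0}"
    local flag
    case "$action" in
        set|unset) ;;
        *)
            echo "usage: ceph_maintenance_flags set|unset [mon-container-name]" >&2
            return 2
            ;;
    esac
    for flag in noout norecover norebalance nobackfill nodown pause; do
        # Stop at the first failure so that a partially applied state is noticed.
        podman exec "$mon" ceph osd "$action" "$flag" || return 1
    done
}
```

Run `ceph_maintenance_flags set` before you shut down the Ceph Storage nodes. Passing `unset` instead issues the corresponding `ceph osd unset` commands.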
19.6. Shutting down Controller nodes
As a part of shutting down the Red Hat OpenStack Platform environment, log in to and shut down each Controller node.
Prerequisites
- Stop the Pacemaker cluster
- Stop all Red Hat OpenStack Platform services on the Controller nodes
Procedure
- Log in as the root user to a Controller node.
- Shut down the node:
# shutdown -h now
- Perform these steps for each Controller node until you shut down all Controller nodes.
19.7. Shutting down the undercloud
As a part of shutting down the Red Hat OpenStack Platform environment, log in to the undercloud node and shut down the undercloud.
Prerequisites
- A running undercloud
Procedure
- Log in to the undercloud as the stack user.
- Shut down the undercloud:
$ sudo shutdown -h now
19.8. Performing system maintenance
After you completely shut down the undercloud and overcloud, perform any maintenance to the systems in your environment and then start up the undercloud and overcloud.
19.9. Undercloud and overcloud startup order
To start the Red Hat OpenStack Platform environment, you must start the undercloud and overcloud in the following order:
- Start the undercloud.
- Start Controller nodes.
- Start Ceph Storage nodes.
- Start Compute nodes.
- Start instances on overcloud Compute nodes.
19.10. Starting the undercloud
As a part of starting the Red Hat OpenStack Platform environment, power on the undercloud node, log in to the undercloud, and check the undercloud services.
Prerequisites
- The undercloud is powered down.
Procedure
- Power on the undercloud and wait until the undercloud boots.
Verification
- Log in to the undercloud host as the stack user.
- Source the stackrc undercloud credentials file:
$ source ~/stackrc
- Check the services on the undercloud:
$ systemctl list-units 'tripleo_*'
- Create and validate a static inventory file named inventory.yaml:
$ tripleo-ansible-inventory --static-yaml-inventory inventory.yaml
$ openstack tripleo validator run --group pre-introspection \
  -i inventory.yaml
- Check that all services and containers are active and healthy:
$ openstack tripleo validator run --validation service-status \
  --limit undercloud -i inventory.yaml
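The unit listing in this verification can be long, so it can help to filter it down to failed units. The following sketch does that with standard systemctl options. The function name check_tripleo_units is hypothetical, and the check supplements the validator runs rather than replacing them.

```shell
# Report any tripleo_* systemd units that are in the failed state.
check_tripleo_units() {
    local failed
    # --no-legend and --plain strip headers and markers so that empty output
    # means that no unit matched.
    failed="$(systemctl list-units 'tripleo_*' --state=failed --no-legend --plain)"
    if [ -n "$failed" ]; then
        echo "Failed units:" >&2
        echo "$failed" >&2
        return 1
    fi
    echo "No failed tripleo_* units"
}
```

A non-zero return status indicates that at least one unit needs attention before you continue with the Controller nodes.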
19.11. Starting Controller nodes
As a part of starting the Red Hat OpenStack Platform environment, power on each Controller node and check the non-Pacemaker services on the node.
Prerequisites
- The Controller nodes are powered down.
Procedure
- Power on each Controller node.
Verification
- Log in to each Controller node as the root user.
- Check the services on the Controller node:
$ systemctl -t service
Only non-Pacemaker based services are running.
- Wait until the Pacemaker services start, and then check that the services started:
$ pcs status
Note: If your environment uses Instance HA, the Pacemaker resources do not start until you start the Compute nodes or perform a manual unfence operation with the pcs stonith confirm <compute_node> command. You must run this command on each Compute node that uses Instance HA.
19.12. Starting Ceph Storage nodes
As a part of starting the Red Hat OpenStack Platform environment, power on the Ceph MON and Ceph Storage nodes and enable Ceph Storage services.
Prerequisites
- A powered down Ceph Storage cluster
- Ceph MON services are enabled on powered down standalone Ceph MON nodes or on powered on Controller nodes
Procedure
- If your environment has standalone Ceph MON nodes, power on each Ceph MON node.
- Power on each Ceph Storage node.
- Log in as the root user to a node that runs Ceph MON services, such as a Controller node or a standalone Ceph MON node.
- Check the status of the cluster nodes. In the following example, the podman command runs a status check within a Ceph MON container on a Controller node:
# sudo podman exec -it ceph-mon-controller-0 ceph status
Ensure that each node is powered on and connected.
- Unset the noout, norecover, norebalance, nobackfill, nodown, and pause flags for the cluster. In the following example, the podman commands unset these flags through a Ceph MON container on a Controller node:
# sudo podman exec -it ceph-mon-controller-0 ceph osd unset noout
# sudo podman exec -it ceph-mon-controller-0 ceph osd unset norecover
# sudo podman exec -it ceph-mon-controller-0 ceph osd unset norebalance
# sudo podman exec -it ceph-mon-controller-0 ceph osd unset nobackfill
# sudo podman exec -it ceph-mon-controller-0 ceph osd unset nodown
# sudo podman exec -it ceph-mon-controller-0 ceph osd unset pause
Verification
- Check the health of the cluster. In the following example, the podman command runs a status check within a Ceph MON container on a Controller node:
# sudo podman exec -it ceph-mon-controller-0 ceph status
Ensure that the status is HEALTH_OK.
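After you unset the flags, the cluster can take some time to report HEALTH_OK. The following sketch polls the one-line `ceph health` summary until it does. The function name wait_for_ceph_health_ok and the default timeout and interval are assumptions, and the default container name is the one used in the examples in this procedure.

```shell
# Poll "ceph health" inside a Ceph MON container until it reports HEALTH_OK.
# Usage: wait_for_ceph_health_ok [mon-container] [timeout_seconds] [interval_seconds]
wait_for_ceph_health_ok() {
    local mon="${1:-ceph-mon-controller-0}"
    local timeout="${2:-600}"
    local interval="${3:-10}"
    local waited=0
    local status
    while :; do
        # "ceph health" prints a summary that begins with the health state.
        status="$(podman exec "$mon" ceph health 2>/dev/null)"
        case "$status" in
            HEALTH_OK*)
                echo "Ceph cluster is healthy"
                return 0
                ;;
        esac
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out; last status: ${status:-unavailable}" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

If the function times out, run `ceph status` through the same container for the detailed view before you start the Compute nodes.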
19.13. Starting Compute nodes
As a part of starting the Red Hat OpenStack Platform environment, power on each Compute node and check the services on the node.
Prerequisites
- Powered down Compute nodes
Procedure
- Power on each Compute node.
Verification
- Log in to each Compute node as the root user.
- Check the services on the Compute node:
$ systemctl -t service
19.14. Starting instances on overcloud Compute nodes
As a part of starting the Red Hat OpenStack Platform environment, start the instances on Compute nodes.
Prerequisites
- An active overcloud with active nodes
Procedure
- Log in to the undercloud as the stack user.
- Source the credentials file for your overcloud:
$ source ~/overcloudrc
- View the instances in the overcloud:
$ openstack server list --all-projects
- Start an instance in the overcloud:
$ openstack server start <INSTANCE>
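To start many instances, the start command can be wrapped in a loop. The following sketch assumes that, before maintenance, you saved the IDs of the instances to restart in a text file, one ID per line. That file and the function name start_recorded_instances are hypothetical conventions, not part of the documented procedure. Starting only recorded IDs avoids powering on instances that were intentionally shut off before the maintenance window.

```shell
# Start every instance whose ID is listed in a state file, one ID per line.
# The default state file path is a hypothetical choice; pass your own as $1.
start_recorded_instances() {
    local state_file="${1:-$HOME/stopped-instances.txt}"
    local id
    if [ ! -r "$state_file" ]; then
        echo "Cannot read $state_file" >&2
        return 1
    fi
    while read -r id; do
        [ -n "$id" ] || continue
        echo "Starting instance $id"
        # Redirect stdin so the command cannot consume the ID list.
        openstack server start "$id" < /dev/null \
            || echo "Failed to start $id" >&2
    done < "$state_file"
}
```

Afterward, run `openstack server list --all-projects` again to confirm that the instances report the ACTIVE status.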