Chapter 1. Red Hat OpenStack Platform high availability overview and planning
Red Hat OpenStack Platform (RHOSP) high availability (HA) is a collection of services that orchestrate failover and recovery for your deployment. When you plan your HA deployment, ensure that you review the considerations for different aspects of the environment, such as hardware assignments and network configuration.
1.1. Red Hat OpenStack Platform high availability services
Red Hat OpenStack Platform (RHOSP) employs several technologies to provide the services required to implement high availability (HA). These services include Galera, RabbitMQ, Redis, HAProxy, individual services that Pacemaker manages, and Systemd and plain container services that Podman manages.
1.1.1. Service types
- Core container
Core container services are Galera, RabbitMQ, Redis, and HAProxy. These services run on all Controller nodes and require specific management and constraints for the start, stop and restart actions. You use Pacemaker to launch, manage, and troubleshoot core container services.
Note: RHOSP uses the MariaDB Galera Cluster to manage database replication.
- Active-passive
- Active-passive services run on one Controller node at a time, and include services such as openstack-cinder-volume. To move an active-passive service, you must use Pacemaker to ensure that the correct stop-start sequence is followed.
- Systemd and plain container
- Systemd and plain container services are independent services that can withstand a service interruption. Therefore, if you restart a high availability service such as Galera, you do not need to manually restart any other service, such as nova-api. You can use systemd or Podman to directly manage systemd and plain container services.
When orchestrating your HA deployment, director uses templates and Puppet modules to ensure that all services are configured and launched correctly. In addition, when troubleshooting HA issues, you must interact with services in the HA framework using the podman command or the systemctl command.
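For example, when you troubleshoot HA issues on a Controller node, you can inspect each service type with the tool that manages it. The following commands are a minimal sketch; the tripleo_nova_api service name and the galera name filter are illustrative assumptions and depend on your RHOSP version and deployment.
# Run on a Controller node. Service and container names are examples only.
$ sudo pcs status                          # core container and active-passive services that Pacemaker manages
$ sudo systemctl status tripleo_nova_api   # a systemd-managed service (name is an assumption)
$ sudo podman ps --filter name=galera      # a plain container check through Podman (filter is an assumption)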
1.1.2. Service modes
HA services can run in one of the following modes:
- Active-active
Pacemaker runs the same service on multiple Controller nodes, and HAProxy distributes traffic across the nodes or to a specific Controller node with a single IP address. In some cases, HAProxy distributes traffic to active-active services with Round Robin scheduling. You can add more Controller nodes to improve performance.
Important: Active-active mode is supported only in distributed compute node (DCN) architecture at Edge sites.
- Active-passive
- Services that are unable to run in active-active mode must run in active-passive mode. In this mode, only one instance of the service is active at a time. For example, HAProxy uses stick-table options to direct incoming Galera database connection requests to a single back-end service. This helps prevent too many simultaneous connections to the same data from multiple Galera nodes.
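To see which Controller node currently hosts an active-passive service, you can query Pacemaker. The following is a minimal sketch that reuses the openstack-cinder-volume example; the exact resource name can differ in your deployment.
# Run on any Controller node in the Pacemaker cluster.
$ sudo pcs status | grep -A 2 openstack-cinder-volume   # shows the single node that runs the active instance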
1.2. Planning high availability hardware assignments
When you plan hardware assignments, consider the number of nodes that you want to run in your deployment, as well as the number of virtual machine (VM) instances that you plan to run on Compute nodes.
- Controller nodes
- Most non-storage services run on Controller nodes. All services are replicated across all three nodes and are configured as active-active or active-passive services. A high availability (HA) environment requires a minimum of three Controller nodes.
- Red Hat Ceph Storage nodes
- Storage services run on these nodes and provide Red Hat Ceph Storage pools to the Compute nodes. A minimum of three nodes is required.
- Compute nodes
- Virtual machine (VM) instances run on Compute nodes. You can deploy as many Compute nodes as you need to meet your capacity requirements and to support migration and reboot operations. You must connect Compute nodes to the storage network and to the project network to ensure that VMs can access storage nodes, VMs on other Compute nodes, and public networks.
- STONITH
- You must configure a STONITH device for each node that is a part of the Pacemaker cluster in a highly available overcloud. Deploying a highly available overcloud without STONITH is not supported. For more information on STONITH and Pacemaker, see Fencing in a Red Hat High Availability Cluster and Support Policies for RHEL High Availability Clusters.
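After you deploy the overcloud, you can verify that fencing is enabled and that a STONITH device is configured for each cluster node. This is a minimal sketch; the exact output and device names depend on the fence agents and the RHOSP version that you use.
# Run on a Controller node that is part of the Pacemaker cluster.
$ sudo pcs property show stonith-enabled   # expected value: stonith-enabled: true
$ sudo pcs stonith status                  # lists the configured STONITH devices and their state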
1.3. Planning high availability networking
When you plan the virtual and physical networks, consider the provisioning network switch configuration and the external network switch configuration.
In addition to the network configuration, you must deploy the following components:
- Provisioning network switch
- This switch must be able to connect the undercloud to all the physical computers in the overcloud.
- The NIC on each overcloud node that is connected to this switch must be able to PXE boot from the undercloud.
- The portfast parameter must be enabled.
- Controller/External network switch
- This switch must be configured to perform VLAN tagging for the other VLANs in the deployment.
- Allow only VLAN 100 traffic to external networks.
- Networking hardware and keystone endpoint
To prevent a Controller node network card or network switch failure from disrupting the availability of overcloud services, ensure that the keystone admin endpoint is located on a network that uses bonded network cards or networking hardware redundancy.
If you move the keystone endpoint to a different network, such as internal_api, ensure that the undercloud can reach the VLAN or subnet. For more information, see the Red Hat Knowledgebase solution How to migrate Keystone Admin Endpoint to internal_api network.
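For example, you can place the keystone admin endpoint on the internal_api network by overriding the ServiceNetMap parameter in a custom environment file. The following heredoc is a minimal sketch that assumes the default TripleO network names and a hypothetical ~/templates path; verify the parameter against your RHOSP version and the Knowledgebase solution before you use it.
# Create a hypothetical environment file that moves the keystone admin endpoint.
$ cat > ~/templates/keystone-admin-internal-api.yaml <<'EOF'
parameter_defaults:
  ServiceNetMap:
    KeystoneAdminApiNetwork: internal_api
EOF
# Include the file with the -e option when you run the openstack overcloud deploy command.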
1.4. Accessing the high availability environment
To investigate high availability (HA) nodes, log in to the undercloud as the stack user and run the openstack server list command to view the status and details of the overcloud nodes. You can then log in to individual overcloud nodes as the heat-admin user.
Prerequisites
- High availability is deployed and running.
Procedure
- In a running HA environment, log in to the undercloud as the stack user.
- Identify the IP addresses of your overcloud nodes:
$ source ~/stackrc
(undercloud) $ openstack server list
+-------+------------------------+---+----------------------+---+
| ID    | Name                   |...| Networks             |...|
+-------+------------------------+---+----------------------+---+
| d1... | overcloud-controller-0 |...| ctlplane=10.200.0.11 |...|
...
- To log in to one of the overcloud nodes, run the following command:
(undercloud) $ ssh heat-admin@<node_ip>
Replace <node_ip> with the IP address of the node that you want to log in to.
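After you log in, you can quickly confirm that the HA containers on the node are running. This is a minimal sketch; the --format template prints only container names and status, and the exact container names vary by deployment.
# Run on the overcloud node that you logged in to.
$ sudo podman ps --format "{{.Names}} {{.Status}}"   # lists running containers, such as the Galera and HAProxy containers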