High Availability Deployment and Usage
Planning, deploying, and managing high availability in Red Hat OpenStack Platform
Abstract
To keep your OpenStack environment up and running efficiently, use the Red Hat OpenStack Platform director to create configurations that offer high availability and load balancing across all major services in OpenStack.
Making open source more inclusive
Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.
Chapter 1. Red Hat OpenStack Platform high availability overview and planning
Red Hat OpenStack Platform (RHOSP) high availability (HA) is a collection of services that orchestrate failover and recovery for your deployment. When you plan your HA deployment, ensure that you review the considerations for different aspects of the environment, such as hardware assignments and network configuration.
1.1. Red Hat OpenStack Platform high availability services
Red Hat OpenStack Platform (RHOSP) employs several technologies to provide the services required to implement high availability (HA). These services include Galera, RabbitMQ, Redis, HAProxy, individual services that Pacemaker manages, and Systemd and plain container services that Podman manages.
1.1.1. Service types
- Core container
Core container services are Galera, RabbitMQ, Redis, and HAProxy. These services run on all Controller nodes and require specific management and constraints for the start, stop, and restart actions. You use Pacemaker to launch, manage, and troubleshoot core container services.
Note: RHOSP uses the MariaDB Galera Cluster to manage database replication.
- Active-passive
Active-passive services run on one Controller node at a time, and include services such as openstack-cinder-volume. To move an active-passive service, you must use Pacemaker to ensure that the correct stop-start sequence is followed.
- Systemd and plain container
Systemd and plain container services are independent services that can withstand a service interruption. Therefore, if you restart a high availability service such as Galera, you do not need to manually restart any other service, such as nova-api. You can use systemd or Podman to directly manage systemd and plain container services.
When director orchestrates your HA deployment, it uses templates and Puppet modules to ensure that all services are configured and launched correctly. In addition, when you troubleshoot HA issues, you must interact with services in the HA framework by using the podman command or the systemctl command, as shown in the following example.
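The following minimal sketch contrasts how you interact with the two service families on a Controller node. The tripleo_nova_api unit name and the nova_api container name are assumptions for illustration and might differ in your deployment:
# Systemd and plain container services: inspect and manage them directly.
sudo systemctl status tripleo_nova_api
sudo podman ps --filter name=nova_api
# Core container and active-passive services: you can check their status
# directly, but perform start, stop, and restart actions only through Pacemaker.
sudo pcs status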
1.1.2. Service modes
HA services can run in one of the following modes:
- Active-active
Pacemaker runs the same service on multiple Controller nodes, and uses HAProxy to distribute traffic across the nodes or to a specific Controller node with a single IP address. In some cases, HAProxy distributes traffic to active-active services with round-robin scheduling. You can add more Controller nodes to improve performance.
Important: Active-active mode is supported only in distributed compute node (DCN) architecture at Edge sites.
- Active-passive
- Services that cannot run in active-active mode must run in active-passive mode. In this mode, only one instance of the service is active at a time. For example, HAProxy uses stick-table options to direct incoming Galera database connection requests to a single back-end service, as shown in the following fragment. This helps prevent too many simultaneous connections to the same data from multiple Galera nodes.
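The following trimmed haproxy.cfg fragment sketches this stick-table pattern. It mirrors the generated mysql listener shown in Section 2.4, “Example high availability overcloud configuration files”, and the IP addresses are placeholders from that example:
listen mysql
  bind 172.16.0.10:3306 transparent
  option tcpka
  option httpchk
  stick on dst
  stick-table type ip size 1000
  # All Galera back ends are declared as "backup": HAProxy directs traffic to
  # one node at a time and fails over only when its health check reports it down.
  server overcloud-controller-0 172.16.0.13:3306 backup check inter 1s on-marked-down shutdown-sessions port 9200
  server overcloud-controller-1 172.16.0.14:3306 backup check inter 1s on-marked-down shutdown-sessions port 9200
  server overcloud-controller-2 172.16.0.15:3306 backup check inter 1s on-marked-down shutdown-sessions port 9200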
1.2. Planning high availability hardware assignments
When you plan hardware assignments, consider the number of nodes that you want to run in your deployment, as well as the number of virtual machine (VM) instances that you plan to run on Compute nodes.
- Controller nodes
- Most non-storage services run on Controller nodes. All services are replicated across the three nodes and are configured as active-active or active-passive services. A high availability (HA) environment requires a minimum of three nodes.
- Red Hat Ceph Storage nodes
- Storage services run on these nodes and provide pools of Red Hat Ceph Storage areas to the Compute nodes. A minimum of three nodes are required.
- Compute nodes
- Virtual machine (VM) instances run on Compute nodes. You can deploy as many Compute nodes as you need to meet your capacity, migration, and reboot requirements. You must connect Compute nodes to the storage network and to the project network to ensure that VMs can access storage nodes, VMs on other Compute nodes, and public networks.
- STONITH
- You must configure a STONITH device for each node that is a part of the Pacemaker cluster in a highly available overcloud. Deploying a highly available overcloud without STONITH is not supported. For more information on STONITH and Pacemaker, see Fencing in a Red Hat High Availability Cluster and Support Policies for RHEL High Availability Clusters.
1.3. Planning high availability networking
When you plan the virtual and physical networks, consider the provisioning network switch configuration and the external network switch configuration.
In addition to the network configuration, you must deploy the following components:
- Provisioning network switch
- This switch must be able to connect the undercloud to all the physical computers in the overcloud.
- The NIC on each overcloud node that is connected to this switch must be able to PXE boot from the undercloud.
- The portfast parameter must be enabled.
- Controller/External network switch
- This switch must be configured to perform VLAN tagging for the other VLANs in the deployment.
- Allow only VLAN 100 traffic to external networks.
- Networking hardware and keystone endpoint
- To prevent a Controller node network card or network switch failure from disrupting overcloud service availability, ensure that the keystone admin endpoint is located on a network that uses bonded network cards or networking hardware redundancy.
If you move the keystone endpoint to a different network, such as internal_api, ensure that the undercloud can reach the VLAN or subnet. For more information, see the Red Hat Knowledgebase solution How to migrate Keystone Admin Endpoint to internal_api network.
1.4. Accessing the high availability environment
To investigate high availability (HA) nodes, log in to the undercloud as the stack user, run the openstack server list command to view the status and details of the nodes, and then log in to the overcloud nodes.
Prerequisites
- High availability is deployed and running.
Procedure
- In a running HA environment, log in to the undercloud as the stack user.
- Identify the IP addresses of your overcloud nodes:
$ source ~/stackrc
(undercloud) $ openstack server list
+-------+------------------------+---+----------------------+---+
| ID    | Name                   |...| Networks             |...|
+-------+------------------------+---+----------------------+---+
| d1... | overcloud-controller-0 |...| ctlplane=10.200.0.11 |...|
...
- Log in to one of the overcloud nodes:
(undercloud) $ ssh tripleo-admin@<node_ip>
Replace <node_ip> with the IP address of the node that you want to log in to.
1.5. Additional resources
Chapter 2. Example deployment: High availability cluster with Compute and Ceph
This example scenario shows the architecture, hardware and network specifications, and the undercloud and overcloud configuration files for a high availability deployment with the OpenStack Compute service and Red Hat Ceph Storage.
This deployment is intended for use as a reference for test environments and is not supported for production environments.
Figure 2.1. Example high availability deployment architecture
2.1. Example high availability hardware specifications
The example HA deployment uses a specific hardware configuration. You can adjust the CPU, memory, storage, or NICs as needed in your own test deployment.
| Number of Computers | Purpose | CPUs | Memory | Disk Space | Power Management | NICs |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | undercloud node | 4 | 24 GB | 40 GB | IPMI | 2 (1 external; 1 on provisioning) + 1 IPMI |
| 3 | Controller nodes | 4 | 24 GB | 40 GB | IPMI | 3 (2 bonded on overcloud; 1 on provisioning) + 1 IPMI |
| 3 | Ceph Storage nodes | 4 | 24 GB | 40 GB | IPMI | 3 (2 bonded on overcloud; 1 on provisioning) + 1 IPMI |
| 2 | Compute nodes (add more as needed) | 4 | 24 GB | 40 GB | IPMI | 3 (2 bonded on overcloud; 1 on provisioning) + 1 IPMI |
2.2. Example high availability network specifications
The example HA deployment uses a specific virtual and physical network configuration. You can adjust the configuration as needed in your own test deployment.
This example does not include hardware redundancy for the control plane and the provisioning network where the overcloud keystone admin endpoint is configured. For information about planning your high availability networking, see Section 1.3, “Planning high availability networking”.
| Physical NICs | Purpose | VLANs | Description |
| --- | --- | --- | --- |
| eth0 | Provisioning network (undercloud) | N/A | Manages all nodes from director (undercloud) |
| eth1 and eth2 | Controller/External (overcloud) | N/A | Bonded NICs with VLANs |
| | External network | VLAN 100 | Allows access from outside the environment to the project networks, internal API, and OpenStack Horizon Dashboard |
| | Internal API | VLAN 201 | Provides access to the internal API between Compute nodes and Controller nodes |
| | Storage access | VLAN 202 | Connects Compute nodes to storage media |
| | Storage management | VLAN 203 | Manages storage media |
| | Project network | VLAN 204 | Provides project network services to RHOSP |
2.3. Example high availability undercloud configuration files
The example HA deployment uses the undercloud configuration files instackenv.json, undercloud.conf, and network-environment.yaml.
instackenv.json
{ "nodes": [ { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.11", "mac": [ "2c:c2:60:3b:b3:94" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "1", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.12", "mac": [ "2c:c2:60:51:b7:fb" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "1", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.13", "mac": [ "2c:c2:60:76:ce:a5" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "1", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.51", "mac": [ "2c:c2:60:08:b1:e2" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "1", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.52", "mac": [ "2c:c2:60:20:a1:9e" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "1", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.53", "mac": [ "2c:c2:60:58:10:33" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "1", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.101", "mac": [ "2c:c2:60:31:a9:55" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "2", "pm_user": "admin" }, { "pm_password": "testpass", "memory": "24", "pm_addr": "10.100.0.102", "mac": [ "2c:c2:60:0d:e7:d1" ], "pm_type": "ipmi", "disk": "40", "arch": "x86_64", "cpu": "2", "pm_user": "admin" } ], "overcloud": {"password": "7adbbbeedc5b7a07ba1917e1b3b228334f9a2d4e", "endpoint": "http://192.168.1.150:5000/v2.0/" } }
undercloud.conf
[DEFAULT]
image_path = /home/stack/images
local_ip = 10.200.0.1/24
undercloud_public_vip = 10.200.0.2
undercloud_admin_vip = 10.200.0.3
undercloud_service_certificate = /etc/pki/instack-certs/undercloud.pem
local_interface = eth0
masquerade_network = 10.200.0.0/24
dhcp_start = 10.200.0.5
dhcp_end = 10.200.0.24
network_cidr = 10.200.0.0/24
network_gateway = 10.200.0.1
#discovery_interface = br-ctlplane
discovery_iprange = 10.200.0.150,10.200.0.200
discovery_runbench = 1
undercloud_admin_password = testpass
...
network-environment.yaml
resource_registry:
  OS::TripleO::BlockStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/cinder-storage.yaml
  OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
  OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/swift-storage.yaml
  OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/ceph-storage.yaml

parameter_defaults:
  InternalApiNetCidr: 172.16.0.0/24
  TenantNetCidr: 172.17.0.0/24
  StorageNetCidr: 172.18.0.0/24
  StorageMgmtNetCidr: 172.19.0.0/24
  ExternalNetCidr: 192.168.1.0/24
  InternalApiAllocationPools: [{start: 172.16.0.10, end: 172.16.0.200}]
  TenantAllocationPools: [{start: 172.17.0.10, end: 172.17.0.200}]
  StorageAllocationPools: [{start: 172.18.0.10, end: 172.18.0.200}]
  StorageMgmtAllocationPools: [{start: 172.19.0.10, end: 172.19.0.200}]
  # Leave room for floating IPs in the External allocation pool
  ExternalAllocationPools: [{start: 192.168.1.150, end: 192.168.1.199}]
  InternalApiNetworkVlanID: 201
  StorageNetworkVlanID: 202
  StorageMgmtNetworkVlanID: 203
  TenantNetworkVlanID: 204
  ExternalNetworkVlanID: 100
  # Set to the router gateway on the external network
  ExternalInterfaceDefaultRoute: 192.168.1.1
  # Set to "br-ex" if using floating IPs on native VLAN on bridge br-ex
  NeutronExternalNetworkBridge: "''"
  # Customize bonding options if required
  BondInterfaceOvsOptions: "bond_mode=active-backup lacp=off other_config:bond-miimon-interval=100"
2.4. Example high availability overcloud configuration files
The example HA deployment uses the overcloud configuration files haproxy.cfg, corosync.cfg, and ceph.cfg.
/var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg (Controller nodes)
This file identifies the services that HAProxy manages. It contains the settings for the services that HAProxy monitors. This file is identical on all Controller nodes.
# This file is managed by Puppet global daemon group haproxy log /dev/log local0 maxconn 20480 pidfile /var/run/haproxy.pid ssl-default-bind-ciphers !SSLv2:kEECDH:kRSA:kEDH:kPSK:+3DES:!aNULL:!eNULL:!MD5:!EXP:!RC4:!SEED:!IDEA:!DES ssl-default-bind-options no-sslv3 stats socket /var/lib/haproxy/stats mode 600 level user stats timeout 2m user haproxy defaults log global maxconn 4096 mode tcp retries 3 timeout http-request 10s timeout queue 2m timeout connect 10s timeout client 2m timeout server 2m timeout check 10s listen aodh bind 192.168.1.150:8042 transparent bind 172.16.0.10:8042 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8042 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8042 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8042 check fall 5 inter 2000 rise 2 listen cinder bind 192.168.1.150:8776 transparent bind 172.16.0.10:8776 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8776 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8776 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8776 check fall 5 inter 2000 rise 2 listen glance_api bind 192.168.1.150:9292 transparent bind 172.18.0.10:9292 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk GET /healthcheck server overcloud-controller-0.internalapi.localdomain 172.18.0.17:9292 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.18.0.15:9292 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.18.0.16:9292 check fall 5 inter 2000 rise 2 listen gnocchi bind 192.168.1.150:8041 transparent bind 172.16.0.10:8041 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8041 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8041 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8041 check fall 5 inter 2000 rise 2 listen haproxy.stats bind 10.200.0.6:1993 transparent mode http stats enable stats uri / stats auth admin:PnDD32EzdVCf73CpjHhFGHZdV listen heat_api bind 192.168.1.150:8004 transparent bind 172.16.0.10:8004 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk timeout client 10m timeout server 10m server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8004 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8004 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8004 check fall 5 inter 2000 rise 2 listen heat_cfn bind 192.168.1.150:8000 transparent bind 172.16.0.10:8000 transparent mode http http-request set-header X-Forwarded-Proto https if { 
ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk timeout client 10m timeout server 10m server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8000 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8000 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8000 check fall 5 inter 2000 rise 2 listen horizon bind 192.168.1.150:80 transparent bind 172.16.0.10:80 transparent mode http cookie SERVERID insert indirect nocache option forwardfor option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:80 check cookie overcloud-controller-0 fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:80 check cookie overcloud-controller-0 fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:80 check cookie overcloud-controller-0 fall 5 inter 2000 rise 2 listen keystone_admin bind 192.168.24.15:35357 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk GET /v3 server overcloud-controller-0.ctlplane.localdomain 192.168.24.9:35357 check fall 5 inter 2000 rise 2 server overcloud-controller-1.ctlplane.localdomain 192.168.24.8:35357 check fall 5 inter 2000 rise 2 server overcloud-controller-2.ctlplane.localdomain 192.168.24.18:35357 check fall 5 inter 2000 rise 2 listen keystone_public bind 192.168.1.150:5000 transparent bind 172.16.0.10:5000 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk GET /v3 server overcloud-controller-0.internalapi.localdomain 172.16.0.13:5000 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:5000 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:5000 check fall 5 inter 2000 rise 2 listen mysql bind 172.16.0.10:3306 transparent option tcpka option httpchk stick on dst stick-table type ip size 1000 timeout client 90m timeout server 90m server overcloud-controller-0.internalapi.localdomain 172.16.0.13:3306 backup check inter 1s on-marked-down shutdown-sessions port 9200 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:3306 backup check inter 1s on-marked-down shutdown-sessions port 9200 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:3306 backup check inter 1s on-marked-down shutdown-sessions port 9200 listen neutron bind 192.168.1.150:9696 transparent bind 172.16.0.10:9696 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:9696 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:9696 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:9696 check fall 5 inter 2000 rise 2 listen nova_metadata bind 172.16.0.10:8775 transparent option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8775 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8775 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8775 check fall 5 inter 2000 rise 2 listen 
nova_novncproxy bind 192.168.1.150:6080 transparent bind 172.16.0.10:6080 transparent balance source http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option tcpka timeout tunnel 1h server overcloud-controller-0.internalapi.localdomain 172.16.0.13:6080 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:6080 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:6080 check fall 5 inter 2000 rise 2 listen nova_osapi bind 192.168.1.150:8774 transparent bind 172.16.0.10:8774 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8774 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8774 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8774 check fall 5 inter 2000 rise 2 listen nova_placement bind 192.168.1.150:8778 transparent bind 172.16.0.10:8778 transparent mode http http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8778 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8778 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8778 check fall 5 inter 2000 rise 2 listen panko bind 192.168.1.150:8977 transparent bind 172.16.0.10:8977 transparent http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } option httpchk server overcloud-controller-0.internalapi.localdomain 172.16.0.13:8977 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:8977 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:8977 check fall 5 inter 2000 rise 2 listen redis bind 172.16.0.13:6379 transparent balance first option tcp-check tcp-check send AUTH\ V2EgUh2pvkr8VzU6yuE4XHsr9\r\n tcp-check send PING\r\n tcp-check expect string +PONG tcp-check send info\ replication\r\n tcp-check expect string role:master tcp-check send QUIT\r\n tcp-check expect string +OK server overcloud-controller-0.internalapi.localdomain 172.16.0.13:6379 check fall 5 inter 2000 rise 2 server overcloud-controller-1.internalapi.localdomain 172.16.0.14:6379 check fall 5 inter 2000 rise 2 server overcloud-controller-2.internalapi.localdomain 172.16.0.15:6379 check fall 5 inter 2000 rise 2 listen swift_proxy_server bind 192.168.1.150:8080 transparent bind 172.18.0.10:8080 transparent option httpchk GET /healthcheck timeout client 2m timeout server 2m server overcloud-controller-0.storage.localdomain 172.18.0.17:8080 check fall 5 inter 2000 rise 2 server overcloud-controller-1.storage.localdomain 172.18.0.15:8080 check fall 5 inter 2000 rise 2 server overcloud-controller-2.storage.localdomain 172.18.0.16:8080 check fall 5 inter 2000 rise 2
/etc/corosync/corosync.conf file (Controller nodes)
This file defines the cluster infrastructure, and is available on all Controller nodes.
totem {
  version: 2
  cluster_name: tripleo_cluster
  transport: udpu
  token: 10000
}

nodelist {
  node {
    ring0_addr: overcloud-controller-0
    nodeid: 1
  }
  node {
    ring0_addr: overcloud-controller-1
    nodeid: 2
  }
  node {
    ring0_addr: overcloud-controller-2
    nodeid: 3
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  to_logfile: yes
  logfile: /var/log/cluster/corosync.log
  to_syslog: yes
}
/etc/ceph/ceph.conf (Ceph nodes)
This file contains Ceph high availability settings, including the hostnames and IP addresses of the monitoring hosts.
[global]
osd_pool_default_pgp_num = 128
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
fsid = 8c835acc-6838-11e5-bb96-2cc260178a92
cluster_network = 172.19.0.11/24
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 172.18.0.17,172.18.0.15,172.18.0.16
auth_client_required = cephx
osd_pool_default_size = 3
osd_pool_default_pg_num = 128
public_network = 172.18.0.17/24
2.5. Additional resources
Chapter 3. Managing high availability services with Pacemaker
The Pacemaker service manages core container and active-passive services, such as Galera, RabbitMQ, Redis, and HAProxy. You use Pacemaker to view and manage general information about the managed services, virtual IP addresses, power management, and fencing.
3.1. Pacemaker resource bundles and containers
Pacemaker manages Red Hat OpenStack Platform (RHOSP) services as Bundle Set resources, or bundles. Most of these services are active-active services that start in the same way and always run on each Controller node.
Pacemaker manages the following resource types:
- Bundle
- A bundle resource configures and replicates the same container on all Controller nodes, maps the necessary storage paths to the container directories, and sets specific attributes related to the resource itself.
- Container
- A container can run different kinds of resources, from simple systemd services like HAProxy to complex services like Galera, which requires specific resource agents that control and set the state of the service on the different nodes.
- You cannot use podman or systemctl to manage bundles or containers. You can use these commands to check the status of the services, but you must use Pacemaker to perform actions on these services, as shown in the sketch after this list.
- Podman containers that Pacemaker controls have a RestartPolicy set to no by Podman. This ensures that Pacemaker, and not Podman, controls the container start and stop actions.
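For example, on a Controller node you might inspect a bundle container with Podman but restart it only through Pacemaker. This is a minimal sketch that uses the haproxy-bundle resource from the examples in this chapter:
# Status checks with Podman are safe:
sudo podman ps --filter name=haproxy-bundle
# Actions must go through Pacemaker, for example a restart of the bundle:
sudo pcs resource restart haproxy-bundle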
3.1.1. Simple Bundle Set resources (simple bundles)
A simple Bundle Set resource, or simple bundle, is a set of containers that each include the same Pacemaker services that you want to deploy across the Controller nodes.
The following example shows a list of simple bundles from the output of the pcs status command:
Podman container set: haproxy-bundle [192.168.24.1:8787/rhosp-rhel9/openstack-haproxy:pcmklatest]
  haproxy-bundle-podman-0 (ocf::heartbeat:podman): Started overcloud-controller-0
  haproxy-bundle-podman-1 (ocf::heartbeat:podman): Started overcloud-controller-1
  haproxy-bundle-podman-2 (ocf::heartbeat:podman): Started overcloud-controller-2
For each bundle, you can see the following details:
- The name that Pacemaker assigns to the service.
- The reference to the container that is associated with the bundle.
- The list and status of replicas that are running on the different Controller nodes.
The following example shows the settings for the haproxy-bundle simple bundle:
$ sudo pcs resource show haproxy-bundle
Bundle: haproxy-bundle
Podman: image=192.168.24.1:8787/rhosp-rhel9/openstack-haproxy:pcmklatest network=host options="--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" replicas=3 run-command="/bin/bash /usr/local/bin/kolla_start"
Storage Mapping:
options=ro source-dir=/var/lib/kolla/config_files/haproxy.json target-dir=/var/lib/kolla/config_files/config.json (haproxy-cfg-files)
options=ro source-dir=/var/lib/config-data/puppet-generated/haproxy/ target-dir=/var/lib/kolla/config_files/src (haproxy-cfg-data)
options=ro source-dir=/etc/hosts target-dir=/etc/hosts (haproxy-hosts)
options=ro source-dir=/etc/localtime target-dir=/etc/localtime (haproxy-localtime)
options=ro source-dir=/etc/pki/ca-trust/extracted target-dir=/etc/pki/ca-trust/extracted (haproxy-pki-extracted)
options=ro source-dir=/etc/pki/tls/certs/ca-bundle.crt target-dir=/etc/pki/tls/certs/ca-bundle.crt (haproxy-pki-ca-bundle-crt)
options=ro source-dir=/etc/pki/tls/certs/ca-bundle.trust.crt target-dir=/etc/pki/tls/certs/ca-bundle.trust.crt (haproxy-pki-ca-bundle-trust-crt)
options=ro source-dir=/etc/pki/tls/cert.pem target-dir=/etc/pki/tls/cert.pem (haproxy-pki-cert)
options=rw source-dir=/dev/log target-dir=/dev/log (haproxy-dev-log)
The example shows the following information about the containers in the bundle:
- image: Image used by the container, which refers to the local registry of the undercloud.
- network: Container network type, which is "host" in the example.
- options: Specific options for the container.
- replicas: Indicates how many copies of the container must run in the cluster. Each bundle includes three containers, one for each Controller node.
- run-command: System command used to spawn the container.
- Storage Mapping: Mapping of the local path on each host to the container. The haproxy configuration is located in the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file instead of the /etc/haproxy/haproxy.cfg file.
Although HAProxy provides high availability services by load balancing traffic to selected services, you make HAProxy itself highly available by managing it as a Pacemaker bundle service.
3.1.2. Complex Bundle Set resources (complex bundles)
Complex Bundle Set resources, or complex bundles, are Pacemaker services that specify a resource configuration in addition to the basic container configuration that is included in simple bundles.
This configuration is needed to manage multi-state resources, which are services that can have different states depending on the Controller node they run on.
This example shows a list of complex bundles from the output of the pcs status command:
Podman container set: rabbitmq-bundle [192.168.24.1:8787/rhosp-rhel9/openstack-rabbitmq:pcmklatest]
  rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-0
  rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-1
  rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-2
Podman container set: galera-bundle [192.168.24.1:8787/rhosp-rhel9/openstack-mariadb:pcmklatest]
  galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-controller-0
  galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-controller-1
  galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-controller-2
Podman container set: redis-bundle [192.168.24.1:8787/rhosp-rhel9/openstack-redis:pcmklatest]
  redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0
  redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1
  redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2
This output shows the following information about each complex bundle:
- RabbitMQ: All three Controller nodes run a standalone instance of the service, similar to a simple bundle.
- Galera: All three Controller nodes are running as Galera masters under the same constraints.
- Redis: The overcloud-controller-0 container is running as the master, while the other two Controller nodes are running as slaves. Each container type might run under different constraints.
The following example shows the settings for the galera-bundle complex bundle:
[...]
Bundle: galera-bundle
Podman: image=192.168.24.1:8787/rhosp-rhel9/openstack-mariadb:pcmklatest masters=3 network=host options="--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" replicas=3 run-command="/bin/bash /usr/local/bin/kolla_start"
Network: control-port=3123
Storage Mapping:
options=ro source-dir=/var/lib/kolla/config_files/mysql.json target-dir=/var/lib/kolla/config_files/config.json (mysql-cfg-files)
options=ro source-dir=/var/lib/config-data/puppet-generated/mysql/ target-dir=/var/lib/kolla/config_files/src (mysql-cfg-data)
options=ro source-dir=/etc/hosts target-dir=/etc/hosts (mysql-hosts)
options=ro source-dir=/etc/localtime target-dir=/etc/localtime (mysql-localtime)
options=rw source-dir=/var/lib/mysql target-dir=/var/lib/mysql (mysql-lib)
options=rw source-dir=/var/log/mariadb target-dir=/var/log/mariadb (mysql-log-mariadb)
options=rw source-dir=/dev/log target-dir=/dev/log (mysql-dev-log)
Resource: galera (class=ocf provider=heartbeat type=galera)
Attributes: additional_parameters=--open-files-limit=16384 cluster_host_map=overcloud-controller-0:overcloud-controller-0.internalapi.localdomain;overcloud-controller-1:overcloud-controller-1.internalapi.localdomain;overcloud-controller-2:overcloud-controller-2.internalapi.localdomain enable_creation=true wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
Meta Attrs: container-attribute-target=host master-max=3 ordered=true
Operations: demote interval=0s timeout=120 (galera-demote-interval-0s)
monitor interval=20 timeout=30 (galera-monitor-interval-20)
monitor interval=10 role=Master timeout=30 (galera-monitor-interval-10)
monitor interval=30 role=Slave timeout=30 (galera-monitor-interval-30)
promote interval=0s on-fail=block timeout=300s (galera-promote-interval-0s)
start interval=0s timeout=120 (galera-start-interval-0s)
stop interval=0s timeout=120 (galera-stop-interval-0s)
[...]
This output shows that, unlike in a simple bundle, the galera-bundle resource includes explicit resource configuration that determines all aspects of the multi-state resource.
Although a service can run on multiple Controller nodes at the same time, the Controller node itself might not be listening at the IP address that is required to reach those services. For information about how to check the IP address of a service, see Section 3.4, “Viewing resource information for virtual IPs in a high availability cluster”.
3.2. Checking Pacemaker cluster status
You can check the status of the Pacemaker cluster in any node where Pacemaker is running, and view information about the number of resources that are active and running.
Prerequisites
- High availability is deployed and running.
Procedure
Log in to any Controller node as the tripleo-admin user:
$ ssh tripleo-admin@overcloud-controller-0
Run the pcs status command:
[tripleo-admin@overcloud-controller-0 ~] $ sudo pcs status
Example output:
Cluster name: tripleo_cluster
Stack: corosync
Current DC: overcloud-controller-1 (version 2.0.1-4.el9-0eb7991564) - partition with quorum
Last updated: Thu Feb  8 14:29:21 2018
Last change: Sat Feb  3 11:37:17 2018 by root via cibadmin on overcloud-controller-2

12 nodes configured
37 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
GuestOnline: [ galera-bundle-0@overcloud-controller-0 galera-bundle-1@overcloud-controller-1 galera-bundle-2@overcloud-controller-2 rabbitmq-bundle-0@overcloud-controller-0 rabbitmq-bundle-1@overcloud-controller-1 rabbitmq-bundle-2@overcloud-controller-2 redis-bundle-0@overcloud-controller-0 redis-bundle-1@overcloud-controller-1 redis-bundle-2@overcloud-controller-2 ]

Full list of resources:
[...]
The main sections of the output show the following information about the cluster:
- Cluster name: Name of the cluster.
- [NUM] nodes configured: Number of nodes that are configured for the cluster.
- [NUM] resources configured: Number of resources that are configured for the cluster.
- Online: Names of the Controller nodes that are currently online.
- GuestOnline: Names of the guest nodes that are currently online. Each guest node consists of a complex Bundle Set resource. For more information about bundle sets, see Section 3.1, “Pacemaker resource bundles and containers”.
3.3. Checking bundle status in a high availability cluster
You can check the status of a bundle from an undercloud node or log in to one of the Controller nodes to check the bundle status directly.
Prerequisites
- High availability is deployed and running.
Procedure
Use one of the following options:
Log in to an undercloud node and check the bundle status, in this example haproxy-bundle:
$ sudo podman exec -it haproxy-bundle-podman-0 ps -efww | grep haproxy*
Example output:
root 7 1 0 06:08 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
haproxy 11 7 0 06:08 ? 00:00:17 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
The output shows that the haproxy process is running inside the container.
Log in to a Controller node and check the bundle status, in this example haproxy:
$ ps -ef | grep haproxy*
Example output:
root 17774 17729 0 06:08 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
42454 17819 17774 0 06:08 ? 00:00:21 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
root 288508 237714 0 07:04 pts/0 00:00:00 grep --color=auto haproxy*
[root@controller-0 ~]# ps -ef | grep -e 17774 -e 17819
root 17774 17729 0 06:08 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
42454 17819 17774 0 06:08 ? 00:00:22 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
root 301950 237714 0 07:07 pts/0 00:00:00 grep --color=auto -e 17774 -e 17819
3.4. Viewing resource information for virtual IPs in a high availability cluster
To check the status of all virtual IPs (VIPs) or a specific VIP, run the pcs resource show command with the relevant options. Each IPaddr2 resource sets a virtual IP address that clients use to request access to a service. If the Controller node with that IP address fails, the IPaddr2 resource reassigns the IP address to a different Controller node.
Prerequisites
- High availability is deployed and running.
Procedure
Log in to any Controller node as the tripleo-admin user:
$ ssh tripleo-admin@overcloud-controller-0
Use one of the following options:
Show all resources that use virtual IPs by running the pcs resource show command with the --full option:
$ sudo pcs resource show --full
Example output:
ip-10.200.0.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
ip-192.168.1.150 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
ip-172.16.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
ip-172.16.0.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
ip-172.18.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
ip-172.19.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
Each IP address is initially attached to a specific Controller node. For example, 192.168.1.150 is started on overcloud-controller-0. However, if that Controller node fails, the IP address is reassigned to other Controller nodes in the cluster. The following table describes the IP addresses in the example output and shows the original allocation of each IP address.
Table 3.1. IP address description and allocation source

| IP Address | Description | Allocated From |
| --- | --- | --- |
| 10.200.0.6 | Controller virtual IP address | Part of the dhcp_start and dhcp_end range set to 10.200.0.5-10.200.0.24 in the undercloud.conf file |
| 192.168.1.150 | Public IP address | ExternalAllocationPools attribute in the network-environment.yaml file |
| 172.16.0.10 | Provides access to OpenStack API services on a Controller node | InternalApiAllocationPools attribute in the network-environment.yaml file |
| 172.16.0.11 | Provides access to the Redis service on a Controller node | InternalApiAllocationPools attribute in the network-environment.yaml file |
| 172.18.0.10 | Storage virtual IP address that provides access to the Glance API and to Swift Proxy services | StorageAllocationPools attribute in the network-environment.yaml file |
| 172.19.0.10 | Provides access to storage management | StorageMgmtAllocationPools attribute in the network-environment.yaml file |

View a specific VIP address by running the pcs resource show command with the name of the resource that uses that VIP, in this example ip-192.168.1.150:
$ sudo pcs resource show ip-192.168.1.150
Example output:
Resource: ip-192.168.1.150 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.1.150 cidr_netmask=32
  Operations: start interval=0s timeout=20s (ip-192.168.1.150-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.1.150-stop-timeout-20s)
              monitor interval=10s timeout=20s (ip-192.168.1.150-monitor-interval-10s)
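If you need to relocate a VIP manually, for example before maintenance on the Controller node that currently holds it, you can use standard Pacemaker commands. This is a hedged sketch that is not part of the generated configuration; it assumes the resource name from the previous output and a recent pcs release:
# Move the VIP to another Controller node, then clear the temporary
# location constraint that the move command creates:
sudo pcs resource move ip-192.168.1.150 overcloud-controller-1
sudo pcs resource clear ip-192.168.1.150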
3.5. Viewing network information for virtual IPs in a high availability cluster
You can view the network interface information for a Controller node that is assigned to a specific virtual IP (VIP), and view port number assignments for a specific service.
Prerequisites
- High availability is deployed and running.
Procedure
Log in to the Controller node that is assigned to the IP address you want to view and run the ip addr show command on the network interface, in this example vlan100:
$ ip addr show vlan100
Example output:
9: vlan100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether be:ab:aa:37:34:e7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.151/24 brd 192.168.1.255 scope global vlan100
       valid_lft forever preferred_lft forever
    inet 192.168.1.150/32 brd 192.168.1.255 scope global vlan100
       valid_lft forever preferred_lft forever
Run the netstat command to show all processes that listen to the IP address, in this example 192.168.1.150.haproxy:
$ sudo netstat -tupln | grep "192.168.1.150.haproxy"
Example output:
tcp 0 0 192.168.1.150:8778 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8042 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:9292 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8080 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:80 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8977 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:6080 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:9696 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8000 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8004 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8774 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:5000 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8776 0.0.0.0:* LISTEN 61029/haproxy
tcp 0 0 192.168.1.150:8041 0.0.0.0:* LISTEN 61029/haproxy
Note: Processes that are listening to all local addresses, such as 0.0.0.0, are also available through 192.168.1.150. These processes include sshd, mysqld, dhclient, and ntpd.
View the default port number assignments and the services they listen to by opening the configuration file for the HA service, in this example /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg:
- TCP port 6080: nova_novncproxy
- TCP port 9696: neutron
- TCP port 8000: heat_cfn
- TCP port 80: horizon
- TCP port 8776: cinder
In this example, most services that are defined in the haproxy.cfg file listen to the 192.168.1.150 IP address on all three Controller nodes. However, only the controller-0 node is listening externally to the 192.168.1.150 IP address. Therefore, if the controller-0 node fails, HAProxy only needs to reassign 192.168.1.150 to another Controller node and all other services will already be running on the fallback Controller node.
3.6. Checking fencing agent and Pacemaker daemon status
You can check the status of the fencing agent and the status of the Pacemaker daemons in any node where Pacemaker is running, and view information about the number of Controller nodes that are active and running.
Prerequisites
- High availability is deployed and running.
Procedure
Log in to any Controller node as the tripleo-admin user:
$ ssh tripleo-admin@overcloud-controller-0
Run the pcs status command:
[tripleo-admin@overcloud-controller-0 ~] $ sudo pcs status
Example output:
my-ipmilan-for-controller-0 (stonith:fence_ipmilan): Started my-ipmilan-for-controller-0
my-ipmilan-for-controller-1 (stonith:fence_ipmilan): Started my-ipmilan-for-controller-1
my-ipmilan-for-controller-2 (stonith:fence_ipmilan): Started my-ipmilan-for-controller-2

PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  openstack-cinder-volume (systemd:openstack-cinder-volume): Started overcloud-controller-0
  pcsd: active/enabled
The output shows the following sections of the pcs status command output:
- my-ipmilan-for-controller: Shows the type of fencing for each Controller node (stonith:fence_ipmilan) and whether the IPMI service is stopped or running.
- PCSD Status: Shows that all three Controller nodes are currently online.
- Daemon Status: Shows the status of the three Pacemaker daemons: corosync, pacemaker, and pcsd. In the example, all three services are active and enabled.
3.7. Additional resources
Chapter 4. Fencing Controller nodes with STONITH
Fencing is the process of isolating a failed node to protect the cluster and the cluster resources. Without fencing, a failed node might result in data corruption in a cluster. Director uses Pacemaker to provide a highly available cluster of Controller nodes.
Pacemaker uses a process called STONITH to fence failed nodes. STONITH is an acronym for "Shoot the other node in the head". STONITH is disabled by default and requires manual configuration so that Pacemaker can control the power management of each node in the cluster.
If a Controller node fails a health check, the Controller node that acts as the Pacemaker designated coordinator (DC) uses the Pacemaker stonith service to fence the impacted Controller node.
Deploying a highly available overcloud without STONITH is not supported. You must configure a STONITH device for each node that is a part of the Pacemaker cluster in a highly available overcloud. For more information on STONITH and Pacemaker, see Fencing in a Red Hat High Availability Cluster and Support Policies for RHEL High Availability Clusters.
4.1. Supported fencing agents
When you deploy a high availability environment with fencing, you can choose the fencing agents based on your environment needs. To change the fencing agent, you must configure additional parameters in the fencing.yaml file.
Red Hat OpenStack Platform (RHOSP) supports the following fencing agents:
- Intelligent Platform Management Interface (IPMI)
- Default fencing mechanism that Red Hat OpenStack Platform (RHOSP) uses to manage fencing.
- STONITH Block Device (SBD)
The SBD (Storage-Based Death) daemon integrates with Pacemaker and a watchdog device to arrange for nodes to reliably shut down when fencing is triggered and in cases where traditional fencing mechanisms are not available.
Important:
- SBD fencing is not supported in clusters with remote bare metal or virtual machine nodes that use pacemaker_remote, so it is not supported if your deployment uses Instance HA.
- fence_sbd and sbd poison-pill fencing with block storage devices are not supported.
- SBD fencing is only supported with compatible watchdog devices. For more information, see Support Policies for RHEL High Availability Clusters - sbd and fence_sbd.
- fence_kdump
Use in deployments with the kdump crash recovery service. If you choose this agent, ensure that you have enough disk space to store the dump files.
You can configure this agent as a secondary mechanism in addition to the IPMI, fence_rhevm, or Redfish fencing agents. If you configure multiple fencing agents, make sure that you allocate enough time for the first agent to complete the task before the second agent starts the next task.
Important:
- RHOSP director supports only the configuration of the fence_kdump STONITH agent, and not the configuration of the full kdump service that the fencing agent depends on. For information about configuring the kdump service, see the article How do I configure fence_kdump in a Red Hat Pacemaker cluster.
- fence_kdump is not supported if the Pacemaker network traffic interface uses the ovs_bridges or ovs_bonds network device. To enable fence_kdump, you must change the network device to linux_bond or linux_bridge. For more information about network interface configuration, see Network interface reference.
- Redfish
Use in deployments with servers that support the DMTF Redfish APIs. To specify this agent, change the value of the agent parameter to fence_redfish in the fencing.yaml file. For more information about Redfish, see the DMTF documentation.
- fence_rhevm for Red Hat Virtualization (RHV)
Use to configure fencing for Controller nodes that run in RHV environments. You can generate the fencing.yaml file in the same way as you do for IPMI fencing, but you must define the pm_type parameter in the nodes.json file to use RHV.
By default, the ssl_insecure parameter is set to accept self-signed certificates. You can change the parameter value based on your security requirements.
Important: Ensure that you use a role that has permissions to create and launch virtual machines in RHV, such as UserVMManager.
- Multi-layered fencing
You can configure multiple fencing agents to support complex fencing use cases. For example, you can configure IPMI fencing together with fence_kdump. The order of the fencing agents determines the order in which Pacemaker triggers each mechanism.
4.2. Deploying fencing on the overcloud
To deploy fencing on the overcloud, first review the state of STONITH and Pacemaker and configure the fencing.yaml file. Then, deploy the overcloud and configure additional parameters. Finally, test that fencing is deployed correctly on the overcloud.
Prerequisites
- Choose the correct fencing agent for your deployment. For the list of supported fencing agents, see Section 4.1, “Supported fencing agents”.
- Ensure that you can access the nodes.json file that you created when you registered your nodes in director. This file is a required input for the fencing.yaml file that you generate during deployment.
- The nodes.json file must contain the MAC address of one of the network interfaces (NICs) on the node. For more information, see Registering Nodes for the Overcloud.
- If you use the Red Hat Virtualization (RHV) fencing agent, use a role that has permissions to manage virtual machines, such as UserVMManager.
Procedure
- Log in to each Controller node as the tripleo-admin user.
- Verify that the cluster is running:
$ sudo pcs status
Example output:
Cluster name: openstackHA
Last updated: Wed Jun 24 12:40:27 2015
Last change: Wed Jun 24 11:36:18 2015
Stack: corosync
Current DC: lb-c1a2 (2) - partition with quorum
Version: 1.1.12-a14efad
3 Nodes configured
141 Resources configured
Verify that STONITH is disabled:
$ sudo pcs property show
Example output:
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: openstackHA
 dc-version: 1.1.12-a14efad
 have-watchdog: false
 stonith-enabled: false
Depending on the fencing agent that you want to use, choose one of the following options:
- If you use the IPMI or RHV fencing agent, generate the fencing.yaml environment file:
(undercloud) $ openstack overcloud generate fencing --output fencing.yaml nodes.json
Note: This command converts ilo and drac power management details to IPMI equivalents.
- If you use a different fencing agent, such as STONITH Block Device (SBD), fence_kdump, or Redfish, or if you use pre-provisioned nodes, create the fencing.yaml file manually.
SBD fencing only: Add the following parameter to the fencing.yaml file:
parameter_defaults:
  ExtraConfig:
    pacemaker::corosync::enable_sbd: true
Note: This step is applicable to initial overcloud deployments only. For more information about how to enable SBD fencing on an existing overcloud, see Enabling sbd fencing in RHEL 7 and 8.
Multi-layered fencing only: Add the level-specific parameters to the generated fencing.yaml file:
parameter_defaults:
  EnableFencing: true
  FencingConfig:
    devices:
      level1:
      - agent: [VALUE]
        host_mac: aa:bb:cc:dd:ee:ff
        params:
          <parameter>: <value>
      level2:
      - agent: fence_agent2
        host_mac: aa:bb:cc:dd:ee:ff
        params:
          <parameter>: <value>
Replace <parameter> and <value> with the actual parameters and values that the fencing agent requires.
Run the overcloud deploy command and include the fencing.yaml file and any other environment files that are relevant for your deployment:
openstack overcloud deploy --templates \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ~/templates/network-environment.yaml \
-e ~/templates/storage-environment.yaml \
--ntp-server pool.ntp.org \
--neutron-network-type vxlan --neutron-tunnel-types vxlan \
-e fencing.yaml
SBD fencing only: Set the watchdog timer device interval and check that the interval is set correctly.
# pcs property set stonith-watchdog-timeout=<interval>
# pcs property show
Verification
Log in to the overcloud as the tripleo-admin user and ensure that Pacemaker is configured as the resource manager:
$ source stackrc
$ openstack server list | grep controller
$ ssh tripleo-admin@<controller-x_ip>
$ sudo pcs status | grep fence
stonith-overcloud-controller-x (stonith:fence_ipmilan): Started overcloud-controller-y
In this example, Pacemaker is configured to use a STONITH resource for each of the Controller nodes that are specified in the fencing.yaml file.
Note: You must not configure the fence-resource process on the same node that it controls.
Check the fencing resource attributes. The STONITH attribute values must match the values in the fencing.yaml file:
$ sudo pcs stonith show <stonith-resource-controller-x>
4.3. Testing fencing on the overcloud
To test that fencing works correctly, trigger fencing by closing all ports on a Controller node and restarting the server.
This procedure deliberately drops all connections to the Controller node, which causes the node to restart.
Prerequisites
- Fencing is deployed and running on the overcloud. For information on how to deploy fencing, see Section 4.2, “Deploying fencing on the overcloud”.
- Controller node is available for a restart.
Procedure
Log in to the undercloud as the stack user, source the credentials file, and then log in to a Controller node:
$ source stackrc
$ openstack server list | grep controller
$ ssh tripleo-admin@<controller-x_ip>
Change to the root user and close all connections to the Controller node:
$ sudo -i
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT && iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT && iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 5016 -j ACCEPT && iptables -A INPUT -p udp -m state --state NEW -m udp --dport 5016 -j ACCEPT && iptables -A INPUT ! -i lo -j REJECT --reject-with icmp-host-prohibited && iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT && iptables -A OUTPUT -p tcp --sport 5016 -j ACCEPT && iptables -A OUTPUT -p udp --sport 5016 -j ACCEPT && iptables -A OUTPUT ! -o lo -j REJECT --reject-with icmp-host-prohibited
From a different Controller node, locate the fencing event in the Pacemaker log file:
$ ssh tripleo-admin@<controller-x_ip>
$ less /var/log/cluster/corosync.log
(less): /fenc*
If the STONITH service performed the fencing action on the Controller, the log file shows a fencing event.
Wait a few minutes and then verify that the restarted Controller node is running in the cluster again by running the pcs status command. If you can see the Controller node that you restarted in the output, fencing functions correctly.
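You can also review the fencing actions that Pacemaker recorded. This is a hedged addition; the pcs stonith history command is available in recent pcs releases:
# On a surviving Controller node, list recent fencing events:
sudo pcs stonith history show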
4.4. Viewing STONITH device information
To see how STONITH configures your fencing devices, run the pcs stonith show --full command from the overcloud.
Prerequisites
- Fencing is deployed and running on the overcloud. For information on how to deploy fencing, see Section 4.2, “Deploying fencing on the overcloud”.
Procedure
Show the list of Controller nodes and the status of their STONITH devices:
$ sudo pcs stonith show --full
 Resource: my-ipmilan-for-controller-0 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=overcloud-controller-0 ipaddr=10.100.0.51 login=admin passwd=abc lanplus=1 cipher=3
  Operations: monitor interval=60s (my-ipmilan-for-controller-0-monitor-interval-60s)
 Resource: my-ipmilan-for-controller-1 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=overcloud-controller-1 ipaddr=10.100.0.52 login=admin passwd=abc lanplus=1 cipher=3
  Operations: monitor interval=60s (my-ipmilan-for-controller-1-monitor-interval-60s)
 Resource: my-ipmilan-for-controller-2 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=overcloud-controller-2 ipaddr=10.100.0.53 login=admin passwd=abc lanplus=1 cipher=3
  Operations: monitor interval=60s (my-ipmilan-for-controller-2-monitor-interval-60s)
This output shows the following information for each resource:
- IPMI power management service that the fencing device uses to turn the machines on and off as needed, such as fence_ipmilan.
- IP address of the IPMI interface, such as 10.100.0.51.
- User name to log in with, such as admin.
- Password to use to log in to the node, such as abc.
- Interval in seconds at which each host is monitored, such as 60s.
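If you only want the entry for a single node, you can filter the full listing. A minimal sketch using the example resource names above; adjust the resource name for your deployment:
$ sudo pcs stonith show --full | grep -A 2 "my-ipmilan-for-controller-0"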
4.5. Fencing parameters
When you deploy fencing on the overcloud, you generate the fencing.yaml file with the required parameters to configure fencing.
The following example shows the structure of the fencing.yaml environment file:
parameter_defaults:
  EnableFencing: true
  FencingConfig:
    devices:
    - agent: fence_ipmilan
      host_mac: 11:11:11:11:11:11
      params:
        ipaddr: 10.0.0.101
        lanplus: true
        login: admin
        passwd: InsertComplexPasswordHere
        pcmk_host_list: host04
        privlvl: administrator
This file contains the following parameters:
- EnableFencing: Enables the fencing functionality for Pacemaker-managed nodes.
- FencingConfig: Lists the fencing devices and the parameters for each device:
  - agent: Fencing agent name.
  - host_mac: The MAC address, in lowercase, of the provisioning interface or any other network interface on the server. You can use this as a unique identifier for the fencing device.
    Important: Do not use the MAC address of the IPMI interface.
  - params: List of fencing device parameters.
- Fencing device parameters: Lists the fencing device parameters. This example shows the parameters for the IPMI fencing agent:
  - auth: IPMI authentication type (md5, password, or none).
  - ipaddr: IPMI IP address.
  - ipport: IPMI port.
  - login: Username for the IPMI device.
  - passwd: Password for the IPMI device.
  - lanplus: Use lanplus to improve the security of the connection.
  - privlvl: Privilege level on the IPMI device.
  - pcmk_host_list: List of Pacemaker hosts.
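Because host_mac must identify a network interface on the server and not the IPMI interface, it can be worth confirming the value on the node itself before you add it to fencing.yaml. A minimal sketch, assuming the chosen interface on the overcloud node is named eth0; adjust the interface name for your hardware:
$ # print the MAC address of the eth0 interface on the node
$ ssh tripleo-admin@<controller-x_ip> "cat /sys/class/net/eth0/address"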
4.6. Additional resources
Chapter 5. Load balancing traffic with HAProxy
The HAProxy service provides load balancing of traffic to Controller nodes in the high availability cluster, as well as logging and sample configurations. The haproxy package contains the haproxy daemon, which corresponds to the systemd service of the same name. Pacemaker manages the HAProxy service as a highly available service called haproxy-bundle.
5.1. How HAProxy works
Director can configure most Red Hat OpenStack Platform services to use the HAProxy service. Director configures those services in the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file, which instructs HAProxy to run in a dedicated container on each overcloud node.
The following table shows the list of services that HAProxy manages:
aodh | cinder | glance_api | gnocchi |
haproxy.stats | heat_api | heat_cfn | horizon |
keystone_admin | keystone_public | mysql | neutron |
nova_metadata | nova_novncproxy | nova_osapi | nova_placement |
For each service in the haproxy.cfg file, you can see the following properties:
- listen: The name of the service that is listening for requests.
- bind: The IP address and TCP port number on which the service is listening.
- server: The name of each Controller node server that uses HAProxy, the IP address and listening port, and additional information about the server.
The following example shows the OpenStack Block Storage (cinder) service configuration in the haproxy.cfg file:
listen cinder
  bind 172.16.0.10:8776
  bind 192.168.1.150:8776
  mode http
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  option httpchk
  server overcloud-controller-0 172.16.0.13:8777 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 172.16.0.14:8777 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 172.16.0.15:8777 check fall 5 inter 2000 rise 2
This example output shows the following information about the OpenStack Block Storage (cinder) service:
- 172.16.0.10:8776: Virtual IP address and port on the Internal API network (VLAN201) to use within the overcloud.
- 192.168.1.150:8776: Virtual IP address and port on the External network (VLAN100) that provides access to the API network from outside the overcloud.
- 8776: Port number on which the OpenStack Block Storage (cinder) service is listening.
- server: Controller node names and IP addresses. HAProxy can direct requests made to those IP addresses to one of the Controller nodes listed in the server output.
- httpchk: Enables health checks on the Controller node servers.
- fall 5: Number of failed health checks to determine that the service is offline.
- inter 2000: Interval between two consecutive health checks in milliseconds.
- rise 2: Number of successful health checks to determine that the service is running.
For more information about settings you can use in the haproxy.cfg file, see the /usr/share/doc/haproxy-[VERSION]/configuration.txt file on any node where the haproxy package is installed.
5.2. Viewing HAProxy Stats
By default, the director also enables HAProxy Stats, or statistics, on all HA deployments. With this feature, you can view detailed information about data transfer, connections, and server states on the HAProxy Stats page.
The director also sets the IP:Port address that you use to reach the HAProxy Stats page and stores the information in the haproxy.cfg file.
Prerequisites
- High availability is deployed and running.
Procedure
- Open the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file on any Controller node where HAProxy is installed, and locate the listen haproxy.stats section:
  listen haproxy.stats
    bind 10.200.0.6:1993
    mode http
    stats enable
    stats uri /
    stats auth admin:<haproxy-stats-password>
- In a Web browser, navigate to 10.200.0.6:1993 and enter the credentials from the stats auth row to view the HAProxy Stats page.
5.3. Additional resources
Chapter 6. Managing database replication with Galera
Red Hat OpenStack Platform uses the MariaDB Galera Cluster to manage database replication. Pacemaker runs the Galera service as a bundle set resource that manages the database master/slave status. You can use Galera to test and verify different aspects of the database cluster, such as hostname resolution, cluster integrity, node integrity, and database replication performance.
When you investigate database cluster integrity, each node must meet the following criteria:
- The node is a part of the correct cluster.
- The node can write to the cluster.
- The node can receive queries and write commands from the cluster.
- The node is connected to other nodes in the cluster.
- The node is replicating write-sets to tables in the local database.
6.1. Verifying hostname resolution in a MariaDB cluster
To troubleshoot the MariaDB Galera cluster, first eliminate any hostname resolution problems and then check the write-set replication status on the database of each Controller node. To access the MySQL database, use the password set by director during the overcloud deployment.
By default, director binds the Galera resource to a hostname instead of an IP address. Therefore, any problems that prevent hostname resolution, such as misconfigured or failed DNS, might cause Pacemaker to incorrectly manage the Galera resource.
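For example, a quick way to confirm that each Controller hostname resolves is to query the resolver from one node. This is a minimal sketch that assumes the default overcloud-controller-N hostnames; substitute the names used in your deployment:
$ # each line should print the IP address and hostname of a Controller node
$ for h in overcloud-controller-0 overcloud-controller-1 overcloud-controller-2; do getent hosts $h; done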
Procedure
From a Controller node, get the MariaDB database root password by running the hiera command:
$ sudo hiera -c /etc/puppet/hiera.yaml "mysql::server::root_password"
*[MYSQL-HIERA-PASSWORD]*
Get the name of the MariaDB container that runs on the node:
$ sudo podman ps | grep -i galera
a403d96c5026  undercloud.ctlplane.localdomain:8787/rhosp-rhel9/openstack-mariadb:16.0-106  /bin/bash /usr/lo...  3 hours ago  Up 3 hours ago  galera-bundle-podman-0
Get the write-set replication information from the MariaDB database on each node:
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW GLOBAL STATUS LIKE 'wsrep_%';"
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| wsrep_applier_thread_count | 1        |
| wsrep_apply_oooe           | 0.018672 |
| wsrep_apply_oool           | 0.000630 |
| wsrep_apply_window         | 1.021942 |
| ...                        | ...      |
+----------------------------+----------+
Each relevant variable uses the prefix wsrep.
- Verify the health and integrity of the MariaDB Galera cluster by checking that the cluster is reporting the correct number of nodes.
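For example, a minimal sketch that checks the reported cluster size on the local node, using the same container name and password placeholder as the commands above:
$ # the value of wsrep_cluster_size should equal the number of Controller nodes
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"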
6.2. Checking MariaDB cluster integrity
To investigate problems with the MariaDB Galera Cluster, check the integrity of the whole cluster by checking specific wsrep database variables on each Controller node.
Procedure
Run the following command and replace <variable> with the wsrep database variable that you want to check:
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW GLOBAL STATUS LIKE <variable>;"
The following example shows how to view the cluster state UUID of the node:
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_state_uuid';"
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | e2c9a15e-5485-11e0-0800-6bbb637e7211 |
+--------------------------+--------------------------------------+
The following table lists the wsrep database variables that you can use to check cluster integrity.
Variable | Summary | Description |
---|---|---|
wsrep_cluster_state_uuid | Cluster state UUID | ID of the cluster to which the node belongs. All nodes must have an identical cluster ID. A node with a different ID is not connected to the cluster. |
wsrep_cluster_size | Number of nodes in the cluster | You can check this on any node. If the value is less than the actual number of nodes, then some nodes either failed or lost connectivity. |
wsrep_cluster_conf_id | Total number of cluster changes | Determines whether the cluster was split into several components, or partitions. Partitioning is usually caused by a network failure. All nodes must have an identical value. If some nodes report a different wsrep_cluster_conf_id, check the wsrep_cluster_status value to see whether the node can still write to the cluster. |
wsrep_cluster_status | Primary component status | Determines whether the node can write to the cluster. If the node can write to the cluster, the wsrep_cluster_status value is Primary. |
6.3. Checking database node integrity in a MariaDB cluster
To investigate problems with a specific Controller node in the MariaDB Galera Cluster, check the integrity of the node by checking specific wsrep database variables.
Procedure
Run the following command and replace <variable> with the wsrep database variable that you want to check:
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW GLOBAL STATUS LIKE <variable>;"
The following table lists the wsrep database variables that you can use to check node integrity.
Variable | Summary | Description |
---|---|---|
wsrep_ready | Node ability to accept queries | States whether the node can accept write-sets from the cluster. If so, then wsrep_ready is ON. |
wsrep_connected | Node network connectivity | States whether the node can connect to other nodes on the network. If so, then wsrep_connected is ON. |
wsrep_local_state_comment | Node state | Summarizes the node state. If the node can write to the cluster, then typical values for wsrep_local_state_comment are Joining, Waiting on SST, Joined, Synced, or Donor. If the node is part of a non-operational component, then the value of wsrep_local_state_comment is Initialized. |
- The wsrep_connected value can be ON even if the node is connected only to a subset of nodes in the cluster. For example, in case of a cluster partition, the node might be part of a component that cannot write to the cluster. For more information about checking cluster integrity, see Section 6.2, “Checking MariaDB cluster integrity”.
- If the wsrep_connected value is OFF, then the node is not connected to any cluster components.
6.4. Testing database replication performance in a MariaDB cluster
To check the performance of the MariaDB Galera Cluster, run benchmark tests on the replication throughput of the cluster by checking specific wsrep database variables.
Every time you query one of these variables, a FLUSH STATUS command resets the variable value. To run benchmark tests, you must run multiple queries and analyze the variances. These variances can help you determine how much Flow Control is affecting the cluster performance.
Flow Control is a mechanism that the cluster uses to manage replication. When the local receive queue exceeds a certain threshold, Flow Control pauses the replication until the queue size goes down. For more information about Flow Control, see Flow Control on the Galera Cluster website.
Procedure
Run the following command and replace <variable> with the wsrep database variable that you want to check:
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW STATUS LIKE <variable>;"
The following table lists the wsrep database variables that you can use to test database replication performance.
Variable | Summary | Usage |
---|---|---|
wsrep_local_recv_queue_avg | Average size of the local received write-set queue after the last query. | A value higher than 0.0 indicates that the node cannot apply write-sets as quickly as it receives write-sets, which triggers replication throttling. Check the wsrep_local_recv_queue_min and wsrep_local_recv_queue_max values for a more detailed reference. |
wsrep_local_send_queue_avg | Average send queue length after the last query. | A value higher than 0.0 indicates a higher likelihood of replication throttling and network throughput problems. |
wsrep_local_recv_queue_min and wsrep_local_recv_queue_max | Minimum and maximum size of the local receive queue after the last query. | If the value of wsrep_local_recv_queue_avg is higher than 0.0, you can check these values to determine the scope of the queue growth. |
wsrep_flow_control_paused | Fraction of the time that Flow Control paused the node after the last query. | A value higher than 0.0 indicates that Flow Control paused the node. To determine the duration of the pause, multiply the wsrep_flow_control_paused value by the number of seconds between the queries. For example, if wsrep_flow_control_paused is 0.50 and you query the variable one minute after the previous query, node replication was paused for approximately 30 seconds. |
wsrep_cert_deps_distance | Average difference between the lowest and highest sequence number (seqno) values that can be applied in parallel. | In case of throttling and pausing, this variable indicates how many write-sets on average can be applied in parallel. Compare the value with the wsrep_slave_threads variable to see how many write-sets can actually be applied in parallel. |
wsrep_slave_threads | Number of threads that can be applied simultaneously. | You can increase the value of this variable to apply more threads simultaneously, which also increases the value of wsrep_cert_deps_distance. The value of wsrep_slave_threads must not be higher than the number of CPU cores of the node. If a problematic node already has an optimal wsrep_slave_threads value, you can exclude the node from the cluster while you investigate possible connectivity issues. |
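As a rough illustration of the Flow Control arithmetic, the following sketch resets the differential counters, waits 60 seconds, and converts the reported fraction into seconds of pause. The container name, password placeholder, and the 60-second window are assumptions carried over from the earlier examples; adjust them for your environment:
$ # reset the differential wsrep counters
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "FLUSH STATUS;"
$ sleep 60
$ # convert the reported fraction of the 60-second window into seconds of pause
$ sudo podman exec galera-bundle-podman-0 sudo mysql -B --password="[MYSQL-HIERA-PASSWORD]" -e "SHOW STATUS LIKE 'wsrep_flow_control_paused';" | awk 'NR==2 {printf "Flow Control paused for about %.1f seconds out of the last 60\n", $2 * 60}'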
6.5. Additional resources
Chapter 7. Troubleshooting high availability resources
In case of resource failure, you must investigate the cause and location of the problem, fix the failed resource, and optionally clean up the resource. There are many possible causes of resource failures depending on your deployment, and you must investigate the resource to determine how to fix the problem.
For example, you can check the resource constraints to ensure that the resources are not interrupting each other, and that the resources can connect to each other. You can also examine a Controller node that is fenced more often than other Controller nodes to identify possible communication problems.
Depending on the location of the resource problem, you choose one of the following options:
- Controller node problems
- If health checks to a Controller node are failing, this can indicate a communication problem between Controller nodes. To investigate, log in to the Controller node and check if the services can start correctly.
- Individual resource problems
- If most services on a Controller node are running correctly, you can run the pcs status command and check the output for information about a specific Pacemaker resource failure, or run the systemctl command to investigate a non-Pacemaker resource failure.
7.1. Viewing resource constraints in a high availability cluster
Before you investigate resource problems, you can view constraints on how services are launched, including constraints related to where each resource is located, the order in which the resource starts, and whether the resource must be colocated with another resource.
Procedure
Use one of the following options:
To view all resource constraints, log in to any Controller node and run the pcs constraint show command:
$ sudo pcs constraint show
The following example shows a truncated output from the pcs constraint show command on a Controller node:
Location Constraints:
  Resource: galera-bundle
    Constraint: location-galera-bundle (resource-discovery=exclusive)
      Rule: score=0
        Expression: galera-role eq true
  [...]
  Resource: ip-192.168.24.15
    Constraint: location-ip-192.168.24.15 (resource-discovery=exclusive)
      Rule: score=0
        Expression: haproxy-role eq true
  [...]
  Resource: my-ipmilan-for-controller-0
    Disabled on: overcloud-controller-0 (score:-INFINITY)
  Resource: my-ipmilan-for-controller-1
    Disabled on: overcloud-controller-1 (score:-INFINITY)
  Resource: my-ipmilan-for-controller-2
    Disabled on: overcloud-controller-2 (score:-INFINITY)
Ordering Constraints:
  start ip-172.16.0.10 then start haproxy-bundle (kind:Optional)
  start ip-10.200.0.6 then start haproxy-bundle (kind:Optional)
  start ip-172.19.0.10 then start haproxy-bundle (kind:Optional)
  start ip-192.168.1.150 then start haproxy-bundle (kind:Optional)
  start ip-172.16.0.11 then start haproxy-bundle (kind:Optional)
  start ip-172.18.0.10 then start haproxy-bundle (kind:Optional)
Colocation Constraints:
  ip-172.16.0.10 with haproxy-bundle (score:INFINITY)
  ip-172.18.0.10 with haproxy-bundle (score:INFINITY)
  ip-10.200.0.6 with haproxy-bundle (score:INFINITY)
  ip-172.19.0.10 with haproxy-bundle (score:INFINITY)
  ip-172.16.0.11 with haproxy-bundle (score:INFINITY)
  ip-192.168.1.150 with haproxy-bundle (score:INFINITY)
This output displays the following main constraint types:
- Location Constraints
  Lists the locations to which resources can be assigned:
  - The first constraint defines a rule that sets the galera-bundle resource to run on nodes with the galera-role attribute set to true.
  - The second location constraint specifies that the IP resource ip-192.168.24.15 runs only on nodes with the haproxy-role attribute set to true. This means that the cluster associates the IP address with the haproxy service, which is necessary to make the services reachable.
  - The third location constraint shows that the ipmilan resource is disabled on each of the Controller nodes.
- Ordering Constraints
  Lists the order in which resources can launch. This example shows a constraint that sets the virtual IP address resources IPaddr2 to start before the HAProxy service.
  Note: Ordering constraints only apply to IP address resources and to HAProxy. Systemd manages all other resources, because services such as Compute are expected to withstand an interruption of a dependent service, such as Galera.
- Colocation Constraints
  Lists which resources must be located together. All virtual IP addresses are linked to the haproxy-bundle resource.
To view the cluster properties and the node attributes that the location constraints reference, log in to any Controller node and run the pcs property show command:
$ sudo pcs property show
Example output:
Cluster Properties:
  cluster-infrastructure: corosync
  cluster-name: tripleo_cluster
  dc-version: 2.0.1-4.el9-0eb7991564
  have-watchdog: false
  redis_REPL_INFO: overcloud-controller-0
  stonith-enabled: false
Node Attributes:
  overcloud-controller-0: cinder-volume-role=true galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@overcloud-controller-0
  overcloud-controller-1: cinder-volume-role=true galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@overcloud-controller-1
  overcloud-controller-2: cinder-volume-role=true galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@overcloud-controller-2
In this output, you can verify that the resource constraints are set correctly. For example, the galera-role attribute is true for all Controller nodes, which means that the galera-bundle resource runs only on these nodes.
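If you only need one type of constraint, you can usually ask pcs for it directly instead of reading the full listing. A minimal sketch; the exact subcommand syntax can vary slightly between pcs releases:
$ sudo pcs constraint location
$ sudo pcs constraint order
$ sudo pcs constraint colocation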
7.2. Investigating Pacemaker resource problems
To investigate failed resources that Pacemaker manages, log in to the Controller node on which the resource is failing and check the status and log events for the resource. For example, investigate the status and log events for the openstack-cinder-volume resource.
Prerequisites
- A Controller node with Pacemaker services
- Root user permissions to view log events
Procedure
- Log in to the Controller node on which the resource is failing.
Run the pcs status command with the grep option to get the status of the service:
# sudo pcs status | grep cinder
  Podman container: openstack-cinder-volume [192.168.24.1:8787/rh-osbs/rhosp161-openstack-cinder-volume:pcmklatest]
    openstack-cinder-volume-podman-0 (ocf::heartbeat:podman): Started controller-1
View the log events for the resource:
# sudo less /var/log/containers/stdouts/openstack-cinder-volume.log
[...]
2021-04-12T12:32:17.607179705+00:00 stderr F ++ cat /run_command
2021-04-12T12:32:17.609648533+00:00 stderr F + CMD='/usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf'
2021-04-12T12:32:17.609648533+00:00 stderr F + ARGS=
2021-04-12T12:32:17.609648533+00:00 stderr F + [[ ! -n '' ]]
2021-04-12T12:32:17.609648533+00:00 stderr F + . kolla_extend_start
2021-04-12T12:32:17.611214130+00:00 stderr F +++ stat -c %U:%G /var/lib/cinder
2021-04-12T12:32:17.616637578+00:00 stderr F ++ [[ cinder:kolla != \c\i\n\d\e\r\:\k\o\l\l\a ]]
2021-04-12T12:32:17.616722778+00:00 stderr F + echo 'Running command: '\''/usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf'\'''
2021-04-12T12:32:17.616751172+00:00 stdout F Running command: '/usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf'
2021-04-12T12:32:17.616775368+00:00 stderr F + exec /usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf
- Correct the failed resource based on the information from the output and from the logs.
Run the pcs resource cleanup command to reset the status and the fail count of the resource:
$ sudo pcs resource cleanup openstack-cinder-volume
 Resource: openstack-cinder-volume successfully cleaned up
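To confirm that the cleanup reset the failure history for the resource, you can check the fail count afterwards. A minimal sketch; the failcount subcommand output format depends on your pcs release:
$ sudo pcs resource failcount show openstack-cinder-volume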
7.3. Investigating systemd resource problems
To investigate failed resources that systemd manages, log in to the Controller node on which the resource is failing and check the status and log events for the resource. For example, investigate the status and log events for the tripleo_nova_conductor resource.
Prerequisites
- A Controller node with systemd services
- Root user permissions to view log events
Procedure
Run the systemctl status command to show the resource status and recent log events:
[tripleo-admin@controller-0 ~]$ sudo systemctl status tripleo_nova_conductor
● tripleo_nova_conductor.service - nova_conductor container
   Loaded: loaded (/etc/systemd/system/tripleo_nova_conductor.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-04-12 10:54:46 UTC; 1h 38min ago
 Main PID: 5125 (conmon)
    Tasks: 2 (limit: 126564)
   Memory: 1.2M
   CGroup: /system.slice/tripleo_nova_conductor.service
           └─5125 /usr/bin/conmon --api-version 1 -c cc3c63b54e0864c94ac54a5789be96aea1dd60b2f3216b37c3e020c76e7887d4 -u cc3c63b54e0864c94ac54a5789be96aea1dd60b2f3216b37c3e020c76e7887d4 -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/cc3c63b54e0864c94ac54a5789be96aea1dd60b2f3216b37c3e02>

Apr 12 10:54:42 controller-0.redhat.local systemd[1]: Starting nova_conductor container...
Apr 12 10:54:46 controller-0.redhat.local podman[2855]: nova_conductor
Apr 12 10:54:46 controller-0.redhat.local systemd[1]: Started nova_conductor container.
View the log events for the resource:
# sudo less /var/log/containers/tripleo_nova_conductor.log
- Correct the failed resource based on the information from the output and from the logs.
Restart the resource and check the status of the service:
# systemctl restart tripleo_nova_conductor
# systemctl status tripleo_nova_conductor
● tripleo_nova_conductor.service - nova_conductor container
   Loaded: loaded (/etc/systemd/system/tripleo_nova_conductor.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-04-22 14:28:35 UTC; 7s ago
  Process: 518937 ExecStopPost=/usr/bin/podman stop -t 10 nova_conductor (code=exited, status=0/SUCCESS)
  Process: 518653 ExecStop=/usr/bin/podman stop -t 10 nova_conductor (code=exited, status=0/SUCCESS)
  Process: 519063 ExecStart=/usr/bin/podman start nova_conductor (code=exited, status=0/SUCCESS)
 Main PID: 519198 (conmon)
    Tasks: 2 (limit: 126564)
   Memory: 1.1M
   CGroup: /system.slice/tripleo_nova_conductor.service
           └─519198 /usr/bin/conmon --api-version 1 -c 0d6583beb20508e6bacccd5fea169a2fe949471207cb7d4650fec5f3638c2ce6 -u 0d6583beb20508e6bacccd5fea169a2fe949471207cb7d4650fec5f3638c2ce6 -r /usr/bin/runc -b /var/lib/containe>

Apr 22 14:28:34 controller-0.redhat.local systemd[1]: Starting nova_conductor container...
Apr 22 14:28:35 controller-0.redhat.local podman[519063]: nova_conductor
Apr 22 14:28:35 controller-0.redhat.local systemd[1]: Started nova_conductor container.
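If the container log does not contain enough detail, the systemd journal for the unit is another place to look. A minimal sketch; adjust the time window as needed:
# journalctl -u tripleo_nova_conductor --since "1 hour ago"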
Chapter 8. Monitoring a high availability Red Hat Ceph Storage cluster
When you deploy an overcloud with Red Hat Ceph Storage, Red Hat OpenStack Platform uses the ceph-mon monitor daemon to manage the Ceph cluster. Director deploys the daemon on all Controller nodes.
8.1. Checking Red Hat Ceph monitoring service status
To check the status of the Red Hat Ceph Storage monitoring service, log in to a Controller node and run the service ceph status command.
Procedure
Log in to a Controller node and check that the Ceph Monitoring service is running:
$ sudo service ceph status
=== mon.overcloud-controller-0 ===
mon.overcloud-controller-0: running {"version":"0.94.1"}
8.2. Checking Red Hat Ceph monitoring configuration
To check the configuration of the Red Hat Ceph Storage monitoring service, log in to a Controller node or a Red Hat Ceph node and open the /etc/ceph/ceph.conf file.
Procedure
Log in to a Controller node or a Ceph node and open the /etc/ceph/ceph.conf file to view the monitoring configuration parameters:
[global]
osd_pool_default_pgp_num = 128
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
fsid = 8c835acc-6838-11e5-bb96-2cc260178a92
cluster_network = 172.19.0.11/24
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 172.18.0.17,172.18.0.15,172.18.0.16
auth_client_required = cephx
osd_pool_default_size = 3
osd_pool_default_pg_num = 128
public_network = 172.18.0.17/24
This example shows the following information:
- All three Controller nodes are configured to monitor the Red Hat Ceph Storage cluster with the mon_initial_members parameter.
- The 172.19.0.11/24 network is configured to provide a communication path between the Controller nodes and the Red Hat Ceph Storage nodes.
- The Red Hat Ceph Storage nodes are assigned to a separate network from the Controller nodes, and the IP addresses for the monitoring Controller nodes are 172.18.0.15, 172.18.0.16, and 172.18.0.17.
8.3. Checking Red Hat Ceph node status
To check the status of a specific Red Hat Ceph Storage node, log in to the node and run the ceph -s command.
Procedure
Log in to the Ceph node and run the ceph -s command:
# ceph -s
    cluster 8c835acc-6838-11e5-bb96-2cc260178a92
     health HEALTH_OK
     monmap e1: 3 mons at {overcloud-controller-0=172.18.0.17:6789/0,overcloud-controller-1=172.18.0.15:6789/0,overcloud-controller-2=172.18.0.16:6789/0}
            election epoch 152, quorum 0,1,2 overcloud-controller-1,overcloud-controller-2,overcloud-controller-0
     osdmap e543: 6 osds: 6 up, 6 in
      pgmap v1736: 256 pgs, 4 pools, 0 bytes data, 0 objects
            267 MB used, 119 GB / 119 GB avail
                 256 active+clean
This example output shows that the health parameter value is HEALTH_OK, which indicates that the Ceph node is active and healthy. The output also shows three Ceph monitor services that are running on the three overcloud-controller nodes and the IP addresses and ports of the services.
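If the health value is anything other than HEALTH_OK, you can ask Ceph for specifics before you investigate further. A minimal sketch, run on the same Ceph node:
# ceph health detail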