Chapter 1. High Availability Add-On overview
The High Availability Add-On is a clustered system that provides reliability, scalability, and availability to critical production services.
High availability clusters, sometimes called failover clusters, provide highly available services by eliminating single points of failure and by failing over services from one cluster node to another in case a node becomes inoperative. Typically, services in a high availability cluster read and write data by means of read-write mounted file systems. A high availability cluster must maintain data integrity as one cluster node takes over control of a service from another cluster node. Node failures in a high availability cluster are not visible from clients outside the cluster. The High Availability Add-On provides high availability clustering through its high availability service management component, Pacemaker.
Pacemaker is the cluster resource manager for the High Availability Add-On. It achieves maximum availability for your cluster services and resources by making use of the cluster infrastructure’s messaging and membership capabilities to detect and recover from node-level and resource-level failures.
Red Hat provides a variety of documentation for planning, configuring, and maintaining a Red Hat high availability cluster. For a listing of articles that provide guided indexes to the various areas of Red Hat cluster documentation, see the Red Hat Knowledgebase article Red Hat High Availability Add-On Documentation Guide.
1.1. Pacemaker architecture components
A cluster configured with Pacemaker comprises separate component daemons that monitor cluster membership, scripts that manage the services, and resource management subsystems that monitor the disparate resources.
The following components form the Pacemaker architecture:
- Cluster Information Base (CIB)
  The Pacemaker information daemon. It uses XML internally to distribute and synchronize current configuration and status information from the Designated Coordinator (DC) to all other cluster nodes. The DC is a node assigned by Pacemaker to store and distribute cluster state and actions by means of the CIB.
- Cluster Resource Management Daemon (CRMd)
  Pacemaker cluster resource actions are routed through this daemon. Resources managed by CRMd can be queried by client systems, moved, instantiated, and changed when needed.
  Each cluster node also includes a local resource manager daemon (LRMd) that acts as an interface between CRMd and resources. LRMd passes commands from CRMd to agents, such as start and stop requests, and relays status information back.
- Shoot the Other Node in the Head (STONITH)
  STONITH is the Pacemaker fencing implementation. It acts as a cluster resource in Pacemaker that processes fence requests, forcefully shutting down nodes and removing them from the cluster to ensure data integrity. STONITH is configured in the CIB and can be monitored as a normal cluster resource.
- corosync
  corosync is the component, and a daemon of the same name, that serves the core membership and member-communication needs for high availability clusters. It is required for the High Availability Add-On to function.
  In addition to those membership and messaging functions, corosync also:
  - Manages quorum rules and determination.
  - Provides messaging capabilities for applications that coordinate or operate across multiple members of the cluster and thus must communicate stateful or other information between instances.
  - Uses the kronosnet library as its network transport to provide multiple redundant links and automatic failover.
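The quorum determination mentioned above can be illustrated with a simple majority rule. The following Python sketch is not corosync code: it assumes one vote per node and ignores optional quorum behaviors such as two-node mode, and the function names are hypothetical.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Return the smallest vote count that is a strict majority."""
    if expected_votes < 1:
        raise ValueError("expected_votes must be at least 1")
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """Return True if the visible members hold a strict majority of votes."""
    return votes_present >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # Assumption: a five-node cluster in which every node carries one vote.
    expected = 5
    for present in range(expected + 1):
        state = "quorate" if is_quorate(present, expected) else "not quorate"
        print(f"{present} of {expected} nodes visible: {state}")
```

Under these assumptions, a cluster with an even number of nodes that splits exactly in half leaves neither side with a majority, which is one reason fencing and additional quorum options matter in practice.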
1.2. Pacemaker configuration and management tools
The High Availability Add-On features three configuration tools for cluster deployment, monitoring, and management.
- pcs command-line interface
  The pcs command-line interface controls and configures Pacemaker and the corosync heartbeat daemon. A command-line based program, pcs can perform the following cluster management tasks:
  - Create and configure a Pacemaker cluster
  - Modify configuration of the cluster while it is running
  - Start, stop, and display status information of the cluster
- HA Cluster Management RHEL web console add-on
  The HA Cluster Management RHEL web console add-on is a graphical user interface to create and configure Pacemaker clusters. The HA Cluster Management RHEL web console add-on is available through the RHEL web console when the cockpit-ha-cluster package is installed. For information about the RHEL web console, see Getting started with the HA Cluster Management add-on for the RHEL web console.
- ha_cluster RHEL system role
  With the ha_cluster RHEL system role, you can configure and manage a high-availability cluster that uses the Pacemaker high availability cluster resource manager. For information about using RHEL system roles, see Automating system administration by using RHEL system roles.
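As a rough illustration of the pcs tasks listed above, the following Python sketch assembles the corresponding command lines without executing them. It assumes the pcs syntax used in RHEL 8 and later and nodes that are already authenticated to each other. The cluster name, the node names, and the helper function are hypothetical.

```python
import shlex


def pcs_commands(cluster_name, nodes):
    """Map common cluster management tasks to pcs argument lists."""
    return {
        # Create and configure a cluster from the given nodes.
        "create": ["pcs", "cluster", "setup", cluster_name, *nodes],
        # Start cluster services on all nodes.
        "start": ["pcs", "cluster", "start", "--all"],
        # Display the current cluster status.
        "status": ["pcs", "status"],
        # Stop cluster services on all nodes.
        "stop": ["pcs", "cluster", "stop", "--all"],
    }


if __name__ == "__main__":
    # Hypothetical cluster and node names, for illustration only.
    commands = pcs_commands(
        "my_cluster", ["node1.example.com", "node2.example.com"]
    )
    for task, argv in commands.items():
        print(f"{task}: {shlex.join(argv)}")
```

Printing the commands rather than running them keeps the sketch safe to try on a machine that is not a cluster node. On a real node, each list could be passed to subprocess.run.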
1.3. The cluster and Pacemaker configuration files
The configuration files for the Red Hat High Availability Add-On are corosync.conf and cib.xml.
- The corosync.conf file provides the cluster parameters used by corosync, the cluster manager that Pacemaker is built on. In general, you should not edit the corosync.conf file directly but, instead, use the pcs interface, the HA Cluster Management RHEL web console add-on, or the ha_cluster RHEL system role.
- The cib.xml file is an XML file that represents both the cluster’s configuration and the current state of all resources in the cluster. This file is used by Pacemaker’s Cluster Information Base (CIB). The contents of the CIB are automatically kept in sync across the entire cluster. Do not edit the cib.xml file directly; use the pcs interface, the HA Cluster Management RHEL web console add-on, or the ha_cluster RHEL system role.
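To make the split between configuration and state in the CIB more concrete, the following Python sketch reads a small CIB-like XML document and summarizes it. The sample XML is handwritten and heavily trimmed, not taken from a real cluster. The node names, the resource ID, and the helper function are hypothetical. The sketch only reads the XML, in keeping with the advice not to edit cib.xml directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB document for illustration only.
SAMPLE_CIB = """\
<cib>
  <configuration>
    <nodes>
      <node id="1" uname="node1.example.com"/>
      <node id="2" uname="node2.example.com"/>
    </nodes>
    <resources>
      <primitive id="VirtualIP" class="ocf" provider="heartbeat" type="IPaddr2"/>
    </resources>
  </configuration>
  <status>
    <node_state id="1" uname="node1.example.com"/>
    <node_state id="2" uname="node2.example.com"/>
  </status>
</cib>
"""


def summarize_cib(xml_text):
    """Return node names, resource IDs, and whether a status section exists."""
    root = ET.fromstring(xml_text)
    nodes = [n.get("uname") for n in root.findall("./configuration/nodes/node")]
    resources = [
        p.get("id") for p in root.findall("./configuration/resources/primitive")
    ]
    return {
        "nodes": nodes,
        "resources": resources,
        "has_status": root.find("./status") is not None,
    }


if __name__ == "__main__":
    print(summarize_cib(SAMPLE_CIB))
```

The configuration section holds what the cluster should look like, while the status section records what Pacemaker currently observes. A real CIB contains many more elements and attributes than this sample shows.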