Chapter 32. Configuring disaster recovery clusters
One method of providing disaster recovery for a high availability cluster is to configure two clusters. You can then configure one cluster as your primary site cluster, and the second cluster as your disaster recovery cluster.
In normal circumstances, the primary cluster is running resources in production mode. The disaster recovery cluster has all the resources configured as well and is either running them in demoted mode or not at all. For example, there may be a database running in the primary cluster in promoted mode and running in the disaster recovery cluster in demoted mode. The database in this setup would be configured so that data is synchronized from the primary site to the disaster recovery site. This is done through the database configuration itself rather than through the pcs command interface.
When the primary cluster goes down, users can use the pcs command interface to manually fail the resources over to the disaster recovery site. They can then log in to the disaster recovery site and promote and start the resources there. Once the primary cluster has recovered, users can use the pcs command interface to manually move resources back to the primary site.
You can use the pcs command to display the status of both the primary and the disaster recovery site cluster from a single node on either site.
32.1. Considerations for disaster recovery clusters
When planning and configuring a disaster recovery site that you will manage and monitor with the pcs command interface, note the following considerations.
- The disaster recovery site must be a cluster. This makes it possible to configure it with the same tools and similar procedures as the primary site.
- The primary and disaster recovery clusters are created by independent pcs cluster setup commands.
- The clusters and their resources must be configured so that the data is synchronized and failover is possible.
- The cluster nodes in the recovery site cannot have the same names as the nodes in the primary site.
- The pcs user hacluster must be authenticated for each node in both clusters on the node from which you will be running pcs commands.
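The requirement that node names differ between the two sites can be checked before any pcs command is run. The following is a minimal sketch in plain shell; the function name check_distinct_nodes and its two space-separated list arguments are illustrative assumptions and are not part of pcs.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs): report node names that appear in
# both the primary and the recovery node lists.
#   $1 = space-separated primary site nodes
#   $2 = space-separated recovery site nodes
check_distinct_nodes() {
    dup=""
    for p in $1; do
        for r in $2; do
            if [ "$p" = "$r" ]; then
                dup="$dup $p"
            fi
        done
    done
    if [ -n "$dup" ]; then
        echo "duplicate node names:$dup"
        return 1
    fi
    echo "node names are distinct"
    return 0
}

# Example with two sites whose node names do not overlap.
check_distinct_nodes "z1.example.com z2.example.com" "z3.example.com z4.example.com"
```

The function returns a non-zero status when a name is shared, so it can gate a setup script that should stop before creating the second cluster.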
32.2. Displaying status of recovery clusters
To configure a primary and a disaster recovery cluster so that you can display the status of both clusters, perform the following procedure.
Setting up a disaster recovery cluster does not automatically configure resources or replicate data. Those items must be configured manually by the user.
In this example:
- The primary cluster will be named PrimarySite and will consist of the nodes z1.example.com and z2.example.com.
- The disaster recovery site cluster will be named DRSite and will consist of the nodes z3.example.com and z4.example.com.
This example sets up a basic cluster with no resources or fencing configured.
Procedure
Authenticate all of the nodes that will be used for both clusters.
[root@z1 ~]# pcs host auth z1.example.com z2.example.com z3.example.com z4.example.com -u hacluster -p password
z1.example.com: Authorized
z2.example.com: Authorized
z3.example.com: Authorized
z4.example.com: Authorized
Create the cluster that will be used as the primary cluster and start cluster services for the cluster.
[root@z1 ~]# pcs cluster setup PrimarySite z1.example.com z2.example.com --start
{...}
Cluster has been successfully set up.
Starting cluster on hosts: 'z1.example.com', 'z2.example.com'...
Create the cluster that will be used as the disaster recovery cluster and start cluster services for the cluster.
[root@z1 ~]# pcs cluster setup DRSite z3.example.com z4.example.com --start
{...}
Cluster has been successfully set up.
Starting cluster on hosts: 'z3.example.com', 'z4.example.com'...
From a node in the primary cluster, set up the second cluster as the recovery site. The recovery site is defined by a name of one of its nodes.
[root@z1 ~]# pcs dr set-recovery-site z3.example.com
Sending 'disaster-recovery config' to 'z3.example.com', 'z4.example.com'
z3.example.com: successful distribution of the file 'disaster-recovery config'
z4.example.com: successful distribution of the file 'disaster-recovery config'
Sending 'disaster-recovery config' to 'z1.example.com', 'z2.example.com'
z1.example.com: successful distribution of the file 'disaster-recovery config'
z2.example.com: successful distribution of the file 'disaster-recovery config'
Check the disaster recovery configuration.
[root@z1 ~]# pcs dr config
Local site:
  Role: Primary
Remote site:
  Role: Recovery
  Nodes:
    z3.example.com
    z4.example.com
Check the status of the primary cluster and the disaster recovery cluster from a node in the primary cluster.
[root@z1 ~]# pcs dr status
--- Local cluster - Primary site ---
Cluster name: PrimarySite

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: z2.example.com (version 2.0.3-2.el8-2c9cea563e) - partition with quorum
  * Last updated: Mon Dec 9 04:10:31 2019
  * Last change: Mon Dec 9 04:06:10 2019 by hacluster via crmd on z2.example.com
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ z1.example.com z2.example.com ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

--- Remote cluster - Recovery site ---
Cluster name: DRSite

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: z4.example.com (version 2.0.3-2.el8-2c9cea563e) - partition with quorum
  * Last updated: Mon Dec 9 04:10:34 2019
  * Last change: Mon Dec 9 04:09:55 2019 by hacluster via crmd on z4.example.com
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ z3.example.com z4.example.com ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
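Because the combined status output is long, it can help to reduce it to one line per site. The following is a rough sketch that assumes the output layout shown in this example (site header lines wrapped in "---", a "Cluster name:" line, and an "Online: [ ... ]" line); the function name summarize_dr_status is hypothetical, and the layout may differ between pcs versions.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs dr status` output on stdin and print
# "<site> | <cluster name> | <online nodes>" for each site.
summarize_dr_status() {
    awk '
        # Site header, for example: --- Local cluster - Primary site ---
        /^--- .* ---$/   { site = $0; gsub(/^--- | ---$/, "", site) }
        # Cluster name is the third whitespace-separated field.
        /^Cluster name:/ { name = $3 }
        # Keep only the text between the square brackets.
        /\* Online: \[/  {
            nodes = $0
            sub(/.*\[ */, "", nodes)
            sub(/ *\].*/, "", nodes)
            print site " | " name " | " nodes
        }
    '
}

# Demonstration on an excerpt of the example output above.
summarize_dr_status <<'EOF'
--- Local cluster - Primary site ---
Cluster name: PrimarySite
Node List:
  * Online: [ z1.example.com z2.example.com ]
--- Remote cluster - Recovery site ---
Cluster name: DRSite
Node List:
  * Online: [ z3.example.com z4.example.com ]
EOF
```

On a cluster node, the same function could be fed the live output by piping pcs dr status into summarize_dr_status.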
For additional display options for a disaster recovery configuration, see the help screen for the pcs dr command.
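The procedure in this section can be collected into a single script. The sketch below uses only the pcs commands shown in the example; the DRY_RUN variable and the run and setup_dr_clusters functions are assumptions of this sketch, not pcs features. DRY_RUN defaults to 1, so the commands are printed rather than executed unless you set DRY_RUN=0 on a node where pcs is installed.

```shell
#!/bin/sh
# Sketch only: replays the example procedure. DRY_RUN is a convention of
# this script (not of pcs); when it is 1, commands are printed, not run.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_dr_clusters() {
    # Authenticate hacluster for every node of both clusters.
    # "password" is the placeholder used in the example above.
    run pcs host auth z1.example.com z2.example.com z3.example.com z4.example.com -u hacluster -p password
    # Create and start the primary cluster, then the recovery cluster.
    run pcs cluster setup PrimarySite z1.example.com z2.example.com --start
    run pcs cluster setup DRSite z3.example.com z4.example.com --start
    # Declare the recovery site by naming one of its nodes.
    run pcs dr set-recovery-site z3.example.com
    # Review the configuration and the status of both sites.
    run pcs dr config
    run pcs dr status
}

setup_dr_clusters
```

As in the example, this creates basic clusters only; resources, fencing, and data replication still have to be configured separately.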