8.4. Configuring Failover Domains
A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
- Unrestricted — Allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
- Restricted — Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
- Unordered — When a cluster service is assigned to an unordered failover domain, the member on which the cluster service runs is chosen from the available failover domain members with no priority ordering.
- Ordered — Allows you to specify a preference order among the members of a failover domain. Ordered failover domains select the node with the lowest priority number first. That is, the node in a failover domain with a priority number of "1" specifies the highest priority, and therefore is the most preferred node in a failover domain. After that node, the next preferred node would be the node with the next highest priority number, and so on.
- Failback — Allows you to specify whether a service in the failover domain should fail back to the node that it was originally running on before that node failed. Configuring this characteristic is useful in circumstances where a node repeatedly fails and is part of an ordered failover domain. In that circumstance, if a node is the preferred node in a failover domain, it is possible for a service to fail over and fail back repeatedly between the preferred node and another node, causing severe impact on performance.
Note
The failback characteristic is applicable only if ordered failover is configured.
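As an illustration of the ordered and failback characteristics, the following fragment sketches an ordered failover domain in which failback is disabled. The domain name example_nofailback and the node names are hypothetical; the attribute names are those used in Example 8.8. With nofailback="1", a service that has failed over away from node-01.example.com is expected to stay on its current node when node-01.example.com rejoins the cluster, which avoids the repeated fail-over and fail-back described above.

```xml
<!-- Sketch only: the domain name and node names are hypothetical. -->
<failoverdomains>
    <!-- ordered="1": priority values decide preference (1 = most preferred). -->
    <!-- nofailback="1": do not move the service back when a preferred node returns. -->
    <failoverdomain name="example_nofailback" nofailback="1" ordered="1" restricted="0">
        <failoverdomainnode name="node-01.example.com" priority="1"/>
        <failoverdomainnode name="node-02.example.com" priority="2"/>
    </failoverdomain>
</failoverdomains>
```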
Note
Changing a failover domain configuration has no effect on currently running services.
Note
Failover domains are not required for operation.
By default, failover domains are unrestricted and unordered.
In a cluster with several members, using a restricted failover domain can minimize the work to set up the cluster to run a cluster service (such as httpd), which requires you to set up the configuration identically on all members that run the cluster service. Instead of setting up the entire cluster to run the cluster service, you can set up only the members in the restricted failover domain that you associate with the cluster service.
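A minimal sketch of such a restricted domain follows, assuming a three-node cluster in which only two nodes have httpd configured. The domain name httpd_domain and the node names are hypothetical. Associating a service with the domain is covered in Section 8.5, “Configuring HA Services”.

```xml
<!-- Sketch only: the domain name and node names are hypothetical. -->
<failoverdomains>
    <!-- restricted="1": the service may run only on the members listed here. -->
    <!-- ordered="0": no preference among the listed members, so the
         priority values are not applicable. -->
    <failoverdomain name="httpd_domain" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="node-01.example.com" priority="1"/>
        <failoverdomainnode name="node-02.example.com" priority="1"/>
    </failoverdomain>
</failoverdomains>
```

In this sketch node-03.example.com is deliberately left out of the domain. If both listed members are unavailable, the service cannot be started, as described for restricted domains above.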
Note
To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members.
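The preferred-member arrangement described in this note might look like the following fragment. It is a sketch under stated assumptions: the domain name prefer_node_01 and the node name are hypothetical, and the attributes are the ones shown in Example 8.8.

```xml
<!-- Sketch only: the domain name and node name are hypothetical. -->
<failoverdomains>
    <!-- restricted="0": the single listed member is preferred, but the
         service can still fail over to any other cluster member. -->
    <failoverdomain name="prefer_node_01" nofailback="0" ordered="0" restricted="0">
        <failoverdomainnode name="node-01.example.com" priority="1"/>
    </failoverdomain>
</failoverdomains>
```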
To configure a failover domain, use the following procedure:
- Open /etc/cluster/cluster.conf at any node in the cluster.
- Add the following skeleton section within the rm element for each failover domain to be used:
    <failoverdomains>
        <failoverdomain name="" nofailback="" ordered="" restricted="">
            <failoverdomainnode name="" priority=""/>
            <failoverdomainnode name="" priority=""/>
            <failoverdomainnode name="" priority=""/>
        </failoverdomain>
    </failoverdomains>
  Note
  The number of failoverdomainnode elements depends on the number of nodes in the failover domain. The skeleton failoverdomain section in the preceding text shows three failoverdomainnode elements (with no node names specified), signifying that there are three nodes in the failover domain.
- In the failoverdomain section, provide the values for the elements and attributes. For descriptions of the elements and attributes, see the failoverdomain section of the annotated cluster schema. The annotated cluster schema is available at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example, /usr/share/doc/cman-3.0.12/cluster_conf.html) on any of the cluster nodes. For an example of a failoverdomains section, see Example 8.8, “A Failover Domain Added to cluster.conf”.
- Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
- Save /etc/cluster/cluster.conf.
- (Optional) Validate the file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
- Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
- Proceed to Section 8.5, “Configuring HA Services”.
Example 8.8, “A Failover Domain Added to cluster.conf” shows an example of a configuration with an ordered, unrestricted failover domain.
Example 8.8. A Failover Domain Added to cluster.conf
    <cluster name="mycluster" config_version="3">
        <clusternodes>
            <clusternode name="node-01.example.com" nodeid="1">
                <fence>
                    <method name="APC">
                        <device name="apc" port="1"/>
                    </method>
                </fence>
            </clusternode>
            <clusternode name="node-02.example.com" nodeid="2">
                <fence>
                    <method name="APC">
                        <device name="apc" port="2"/>
                    </method>
                </fence>
            </clusternode>
            <clusternode name="node-03.example.com" nodeid="3">
                <fence>
                    <method name="APC">
                        <device name="apc" port="3"/>
                    </method>
                </fence>
            </clusternode>
        </clusternodes>
        <fencedevices>
            <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>
        </fencedevices>
        <rm>
            <failoverdomains>
                <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">
                    <failoverdomainnode name="node-01.example.com" priority="1"/>
                    <failoverdomainnode name="node-02.example.com" priority="2"/>
                    <failoverdomainnode name="node-03.example.com" priority="3"/>
                </failoverdomain>
            </failoverdomains>
        </rm>
    </cluster>
The failoverdomains section contains a failoverdomain section for each failover domain in the cluster. This example has one failover domain. In the failoverdomain line, the name (name) is specified as example_pri. In addition, it specifies that resources using this domain should fail back to nodes with a lower priority number (that is, more preferred nodes) when possible (nofailback="0"), that failover is ordered (ordered="1"), and that the failover domain is unrestricted (restricted="0").
Note
The priority value is applicable only if ordered failover is configured.