11.11. Configuring a high availability cluster with SBD node fencing by using the ha_cluster_node_options variable
You can use the ha_cluster RHEL system role to configure high availability clusters with STONITH Block Device (SBD) fencing in an automated fashion.
You must configure a Red Hat high availability cluster with at least one fencing device to ensure the cluster-provided services remain available when a node in the cluster encounters a problem. If your environment does not allow for a remotely accessible power switch to fence a cluster node, you can configure fencing by using a STONITH Block Device (SBD). This device provides a node fencing mechanism for Pacemaker-based clusters through the exchange of messages by means of shared block storage. SBD integrates with Pacemaker, a watchdog device and, optionally, shared storage to arrange for nodes to reliably self-terminate when fencing is required.
With the ha_cluster RHEL system role, you can configure watchdog and SBD devices on a per-node basis by using one of two variables:
- ha_cluster_node_options: This is a single variable you define in a playbook file. It is a list of dictionaries where each dictionary defines options for one node.
- ha_cluster: A dictionary that defines options for one node only. You configure the ha_cluster variable in an inventory file. To set different values for each node, you define the variable separately for each node.
If both the ha_cluster_node_options and ha_cluster variables contain SBD options, the options in ha_cluster_node_options take precedence.
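For contrast with the playbook-based approach, the following inventory fragment is a minimal sketch of the per-node ha_cluster variable. It assumes that the SBD keys accepted under ha_cluster in the inventory have the same names as those used in ha_cluster_node_options, and it reuses the host names, watchdog module, and device path from this section purely as placeholders. Check the role README before relying on it.

```yaml
# Hypothetical inventory sketch: one ha_cluster dictionary per host.
# Key names are assumed to match the SBD keys of ha_cluster_node_options.
all:
  hosts:
    node1:
      ha_cluster:
        sbd_watchdog_modules:
          - iTCO_wdt
        sbd_watchdog: /dev/watchdog1
        sbd_devices:
          - /dev/disk/by-id/000001
    node2:
      ha_cluster:
        sbd_watchdog_modules:
          - iTCO_wdt
        sbd_watchdog: /dev/watchdog1
        sbd_devices:
          - /dev/disk/by-id/000001
```

Because each host carries its own dictionary, shared values such as the device list are repeated once per node in this form.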
This example procedure uses the ha_cluster_node_options variable in a playbook file to configure node addresses and SBD options on a per-node basis. For an example procedure that uses the ha_cluster variable in an inventory file, see Configuring a high availability cluster with SBD node fencing by using the ha_cluster variable.
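The entries of ha_cluster_node_options can also carry node addresses next to the SBD options. The fragment below is a sketch only: the pcs_address and corosync_addresses keys are taken from the role README as understood here and should be treated as an assumption for your installed role version, and the addresses shown are placeholders.

```yaml
# Sketch: per-node addresses and SBD options in one list entry.
# pcs_address and corosync_addresses are assumed key names; the
# IP addresses and host name below are placeholders.
ha_cluster_node_options:
  - node_name: node1
    pcs_address: node1.example.com
    corosync_addresses:
      - 192.0.2.11
    sbd_watchdog: /dev/watchdog1
    sbd_devices:
      - /dev/disk/by-id/000001
```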
The ha_cluster RHEL system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the playbook will be lost.
Prerequisites
- You have prepared the control node and the managed nodes.
- You are logged in to the control node as a user who can run playbooks on the managed nodes.
- The account you use to connect to the managed nodes has sudo permissions for these nodes.
- The systems that you will use as your cluster members have active subscription coverage for RHEL and the RHEL High Availability Add-On.
- The inventory file specifies the cluster nodes as described in Specifying an inventory for the ha_cluster RHEL system role. For general information about creating an inventory file, see Preparing a control node on RHEL 10.
Procedure
Store your sensitive variables in an encrypted file:
Create the vault:
    $ ansible-vault create ~/vault.yml
    New Vault password: <vault_password>
    Confirm New Vault password: <vault_password>

After the ansible-vault create command opens an editor, enter the sensitive data in the <key>: <value> format:

    cluster_password: <cluster_password>

Save the changes, and close the editor. Ansible encrypts the data in the vault.
Create a playbook file, for example, ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster
      hosts: node1 node2
      vars_files:
        - ~/vault.yml
      tasks:
        - name: Configure a cluster with SBD fencing
          ansible.builtin.include_role:
            name: redhat.rhel_system_roles.ha_cluster
          vars:
            my_sbd_devices:
              # This variable is indirectly used by various variables of the ha_cluster RHEL system role.
              # Its purpose is to define SBD devices once so they do not need
              # to be repeated several times in the role variables.
              - /dev/disk/by-id/000001
              - /dev/disk/by-id/000002
              - /dev/disk/by-id/000003
            ha_cluster_cluster_name: my-new-cluster
            ha_cluster_hacluster_password: "{{ cluster_password }}"
            ha_cluster_manage_firewall: true
            ha_cluster_manage_selinux: true
            ha_cluster_sbd_enabled: true
            ha_cluster_sbd_options:
              - name: delay-start
                value: 'no'
              - name: startmode
                value: always
              - name: timeout-action
                value: 'flush,reboot'
              - name: watchdog-timeout
                value: 30
            ha_cluster_node_options:
              - node_name: node1
                sbd_watchdog_modules:
                  - iTCO_wdt
                sbd_watchdog_modules_blocklist:
                  - ipmi_watchdog
                sbd_watchdog: /dev/watchdog1
                sbd_devices: "{{ my_sbd_devices }}"
              - node_name: node2
                sbd_watchdog_modules:
                  - iTCO_wdt
                sbd_watchdog_modules_blocklist:
                  - ipmi_watchdog
                sbd_watchdog: /dev/watchdog1
                sbd_devices: "{{ my_sbd_devices }}"
            # Best practice for setting SBD timeouts:
            # watchdog-timeout * 2 = msgwait-timeout (set automatically)
            # msgwait-timeout * 1.2 = stonith-timeout
            ha_cluster_cluster_properties:
              - attrs:
                  - name: stonith-timeout
                    value: 72
            ha_cluster_resource_primitives:
              - id: fence_sbd
                agent: 'stonith:fence_sbd'
                instance_attrs:
                  - attrs:
                      - name: devices
                        value: "{{ my_sbd_devices | join(',') }}"
                      - name: pcmk_delay_base
                        value: 30

The settings specified in the example playbook include the following:
ha_cluster_cluster_name: <cluster_name>
  The name of the cluster you are creating.

ha_cluster_hacluster_password: <password>
  The password of the hacluster user. The hacluster user has full access to a cluster.

ha_cluster_manage_firewall: true
  A variable that determines whether the ha_cluster RHEL system role manages the firewall.

ha_cluster_manage_selinux: true
  A variable that determines whether the ha_cluster RHEL system role manages the ports of the firewall high availability service using the selinux RHEL system role.

ha_cluster_sbd_enabled: true
  A variable that determines whether the cluster can use the SBD node fencing mechanism.

ha_cluster_sbd_options: <sbd options>
  A list of name-value dictionaries specifying SBD options. For information about these options, see the Configuration via environment section of the sbd(8) man page on your system.

ha_cluster_node_options: <node options>
  A variable that defines settings which vary from one cluster node to another. You can configure the following SBD and watchdog items:

  - sbd_watchdog_modules - Modules to be loaded, which create /dev/watchdog* devices.
  - sbd_watchdog_modules_blocklist - Watchdog kernel modules to be unloaded and blocked.
  - sbd_watchdog - Watchdog device to be used by SBD.
  - sbd_devices - Devices to use for exchanging SBD messages and for monitoring. Always refer to the devices using the long, stable device name (/dev/disk/by-id/).

ha_cluster_cluster_properties: <cluster properties>
  A list of sets of cluster properties for Pacemaker cluster-wide configuration.

ha_cluster_resource_primitives: <cluster resources>
  A list of resource definitions for the Pacemaker resources configured by the ha_cluster RHEL system role, including fencing resources.
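The timeout comment in the example playbook encodes a small calculation: msgwait-timeout is twice watchdog-timeout, and stonith-timeout is 1.2 times msgwait-timeout. The standalone playbook below is a sketch for checking that arithmetic before you choose values. The variable names watchdog_timeout, msgwait_timeout, and stonith_timeout are hypothetical helpers for this illustration and are not variables of the ha_cluster RHEL system role.

```yaml
---
# Sketch only: derive the timeouts from a single watchdog timeout value.
# All variable names here are illustrative, not role variables.
- name: Derive SBD-related timeouts from the watchdog timeout
  hosts: localhost
  gather_facts: false
  vars:
    watchdog_timeout: 30
    # msgwait-timeout = watchdog-timeout * 2
    msgwait_timeout: "{{ (watchdog_timeout | int) * 2 }}"
    # stonith-timeout = msgwait-timeout * 1.2, rounded to a whole number
    stonith_timeout: "{{ ((msgwait_timeout | int) * 1.2) | round | int }}"
  tasks:
    - name: Show the derived values
      ansible.builtin.debug:
        msg: "msgwait-timeout={{ msgwait_timeout }} stonith-timeout={{ stonith_timeout }}"
```

With watchdog_timeout set to 30, the derived values are 60 and 72, which matches the stonith-timeout value of 72 used in the example playbook.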
For details about all variables used in the playbook, see the /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file on the control node.

Validate the playbook syntax:

    $ ansible-playbook --syntax-check --ask-vault-pass ~/playbook.yml

Note that this command only validates the syntax and does not protect against a wrong but valid configuration.
Run the playbook:
    $ ansible-playbook --ask-vault-pass ~/playbook.yml
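After the run completes, you might want to confirm that SBD is active on each node. The following playbook is one possible check and is not part of the documented procedure: it assumes that the pcs command is available on the managed nodes once the role has configured the cluster, that the query requires root privileges, and that the host names match those used in this section.

```yaml
---
# Sketch of a read-only check; it does not change the cluster.
- name: Check SBD status after the cluster is configured
  hosts: node1 node2
  become: true
  tasks:
    - name: Query SBD status through pcs
      ansible.builtin.command:
        cmd: pcs stonith sbd status
      register: sbd_status
      changed_when: false

    - name: Display the reported status
      ansible.builtin.debug:
        var: sbd_status.stdout_lines
```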