Chapter 4. Configuring a Red Hat High Availability cluster on Microsoft Azure

To create a cluster where Red Hat Enterprise Linux (RHEL) nodes automatically redistribute their workloads if a node failure occurs, use the Red Hat High Availability Add-On. You can choose public cloud platforms, such as Microsoft Azure, to host high availability (HA) clusters.

Configure a Red Hat HA cluster on Azure by using Azure virtual machine (VM) instances as cluster nodes. Creating RHEL HA clusters on Azure is similar to creating HA clusters in non-cloud environments, with some Azure-specific differences.

4.1. Benefits of using high-availability clusters on public cloud platforms

A high-availability (HA) cluster is a set of computers, also known as nodes, linked together to run a specific workload. The purpose of HA clusters is to offer redundancy in case of a hardware or software failure. If a node in the HA cluster fails, the Pacemaker cluster resource manager distributes the workload to other nodes. No noticeable downtime occurs in the services that are running on the cluster.

You can also run HA clusters on public cloud platforms. In this case, you would use virtual machine (VM) instances in the cloud as the individual cluster nodes. Using HA clusters on a public cloud platform has the following benefits:

Improved availability: In case of a VM failure, the workload is quickly redistributed to other nodes, so running services are not disrupted.
Scalability: You can start additional nodes when demand is high and stop them when demand is low.
Cost-effectiveness: With the pay-as-you-go pricing, you pay only for nodes that are running.
Simplified management: Some public cloud platforms offer management interfaces to make configuring HA clusters easier.

To enable HA on your Red Hat Enterprise Linux (RHEL) systems, Red Hat offers a High Availability Add-On. The High Availability Add-On provides all necessary components for creating HA clusters on RHEL systems. The components include high availability service management and cluster administration tools.

4.2. Creating resources in Azure

Complete the following procedure to create a resource group, storage account, virtual network, and availability set in an Azure region. You need these resources to set up a cluster on Microsoft Azure.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Authenticate your system with Azure and log in.

$ az login

Note

If a browser is available in your environment, the CLI opens your browser to the Azure sign-in page.

Example:

[clouduser@localhost]$ az login

To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code FDMSCMETZ to authenticate.
  [
    {
      "cloudName": "AzureCloud",
      "id": "Subscription ID",
      "isDefault": true,
      "name": "MySubscriptionName",
      "state": "Enabled",
      "tenantId": "Tenant ID",
      "user": {
        "name": "clouduser@company.com",
        "type": "user"
      }
    }
  ]

Create a resource group in an Azure region.

$ az group create --name resource-group --location azure-region

Example:

[clouduser@localhost]$ az group create --name azrhelclirsgrp --location southcentralus

{
  "id": "/subscriptions//resourceGroups/azrhelclirsgrp",
  "location": "southcentralus",
  "managedBy": null,
  "name": "azrhelclirsgrp",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null
}

Create a storage account.

$ az storage account create -l azure-region -n storage-account-name -g resource-group --sku sku_type --kind StorageV2

Example:

[clouduser@localhost]$ az storage account create -l southcentralus -n azrhelclistact -g azrhelclirsgrp --sku Standard_LRS --kind StorageV2

{
  "accessTier": null,
  "creationTime": "2017-04-05T19:10:29.855470+00:00",
  "customDomain": null,
  "encryption": null,
  "id": "/subscriptions//resourceGroups/azrhelclirsgrp/providers/Microsoft.Storage/storageAccounts/azrhelclistact",
  "kind": "StorageV2",
  "lastGeoFailoverTime": null,
  "location": "southcentralus",
  "name": "azrhelclistact",
  "primaryEndpoints": {
    "blob": "https://azrhelclistact.blob.core.windows.net/",
    "file": "https://azrhelclistact.file.core.windows.net/",
    "queue": "https://azrhelclistact.queue.core.windows.net/",
    "table": "https://azrhelclistact.table.core.windows.net/"
  },
  "primaryLocation": "southcentralus",
  "provisioningState": "Succeeded",
  "resourceGroup": "azrhelclirsgrp",
  "secondaryEndpoints": null,
  "secondaryLocation": null,
  "sku": {
    "name": "Standard_LRS",
    "tier": "Standard"
  },
  "statusOfPrimary": "available",
  "statusOfSecondary": null,
  "tags": {},
  "type": "Microsoft.Storage/storageAccounts"
}

Get the storage account connection string.

$ az storage account show-connection-string -n storage-account-name -g resource-group

Example:

[clouduser@localhost]$ az storage account show-connection-string -n azrhelclistact -g azrhelclirsgrp
{
  "connectionString": "DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=azrhelclistact;AccountKey=NreGk...=="
}

Export the connection string by copying the connection string and pasting it into the following command. This string connects your system to the storage account.

$ export AZURE_STORAGE_CONNECTION_STRING="storage-connection-string"

Example:

[clouduser@localhost]$ export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=azrhelclistact;AccountKey=NreGk...=="
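The two steps above (showing the connection string, then exporting it) can be combined so that nothing is copied by hand. The following is a minimal sketch, not part of the original procedure: the function name `export_storage_connection_string` is hypothetical, and it assumes the Azure CLI global options `--query` and `-o tsv` to print only the `connectionString` field.

```shell
# Capture the connection string directly instead of copying it by hand.
# Arguments: <storage-account-name> <resource-group>
# The function name is illustrative; it is not an Azure CLI command.
export_storage_connection_string() {
    conn="$(az storage account show-connection-string \
        -n "$1" -g "$2" --query connectionString -o tsv)" || return 1
    # Refuse to export an empty value.
    [ -n "$conn" ] || return 1
    AZURE_STORAGE_CONNECTION_STRING="$conn"
    export AZURE_STORAGE_CONNECTION_STRING
}

# Usage with the example names from this section:
#   export_storage_connection_string azrhelclistact azrhelclirsgrp
```

Call the function in your current shell rather than in a subshell, otherwise the exported variable is not visible to the commands that follow.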

Create the storage container.

$ az storage container create -n container-name

Example:

[clouduser@localhost]$ az storage container create -n azrhelclistcont

{
  "created": true
}

Create a virtual network. All cluster nodes must be in the same virtual network.

$ az network vnet create -g resource-group --name vnet-name --subnet-name subnet-name

Example:

[clouduser@localhost]$ az network vnet create --resource-group azrhelclirsgrp --name azrhelclivnet1 --subnet-name azrhelclisubnet1
{
  "newVNet": {
    "addressSpace": {
      "addressPrefixes": [
      "10.0.0.0/16"
      ]
  },
  "dhcpOptions": {
    "dnsServers": []
  },
  "etag": "W/\"\"",
  "id": "/subscriptions//resourceGroups/azrhelclirsgrp/providers/Microsoft.Network/virtualNetworks/azrhelclivnet1",
  "location": "southcentralus",
  "name": "azrhelclivnet1",
  "provisioningState": "Succeeded",
  "resourceGroup": "azrhelclirsgrp",
  "resourceGuid": "0f25efee-e2a6-4abe-a4e9-817061ee1e79",
  "subnets": [
    {
      "addressPrefix": "10.0.0.0/24",
      "etag": "W/\"\"",
      "id": "/subscriptions//resourceGroups/azrhelclirsgrp/providers/Microsoft.Network/virtualNetworks/azrhelclivnet1/subnets/azrhelclisubnet1",
      "ipConfigurations": null,
      "name": "azrhelclisubnet1",
      "networkSecurityGroup": null,
      "provisioningState": "Succeeded",
      "resourceGroup": "azrhelclirsgrp",
      "resourceNavigationLinks": null,
      "routeTable": null
    }
  ],
  "tags": {},
  "type": "Microsoft.Network/virtualNetworks",
  "virtualNetworkPeerings": null
  }
}

Create an availability set. All cluster nodes must be in the same availability set.

$ az vm availability-set create --name MyAvailabilitySet --resource-group MyResourceGroup

Example:

[clouduser@localhost]$ az vm availability-set create --name rhelha-avset1 --resource-group azrhelclirsgrp
{
  "additionalProperties": {},
  "id": "/subscriptions/.../resourceGroups/azrhelclirsgrp/providers/Microsoft.Compute/availabilitySets/rhelha-avset1",
  "location": "southcentralus",
  "name": "rhelha-avset1",
  "platformFaultDomainCount": 2,
  "platformUpdateDomainCount": 5,

[omitted]

4.3. Required system packages for High Availability

The procedure assumes you are creating a VM image for Azure HA that uses Red Hat Enterprise Linux. To successfully complete the procedure, the following packages must be installed.


Table 4.1. System packages
Package	Repository	Description
libvirt	rhel-9-for-x86_64-appstream-rpms	Open source API, daemon, and management tool for managing platform virtualization
virt-install	rhel-9-for-x86_64-appstream-rpms	A command-line utility for building VMs
libguestfs	rhel-9-for-x86_64-appstream-rpms	A library for accessing and modifying VM file systems
guestfs-tools	rhel-9-for-x86_64-appstream-rpms	System administration tools for VMs; includes the `virt-customize` utility

4.4. Azure VM configuration settings

Azure virtual machines (VMs) must have the following configuration settings. Some of these settings are enabled during the initial VM creation. Other settings are set when provisioning the VM image for Azure. Keep these settings in mind as you move through the procedures. Refer to them as necessary.


Table 4.2. VM configuration settings
Setting	Recommendation
SSH	SSH must be enabled to provide remote access to your Azure VMs.
dhcp	The primary virtual adapter should be configured for dhcp (IPv4 only).
swap space	Do not create a dedicated swap file or `swap` partition on the operating system (OS) disk or storage disk during installation. Configure the `cloud-init` utility to automatically create a `swap` partition on an ephemeral disk of the VM. The ephemeral disk is local storage of the VM, while the resource disk is storage mounted on the VM itself. Both storage types store data only temporarily.
NIC	Choose `virtio` for the primary virtual network adapter.
encryption	For custom images, use Network Bound Disk Encryption (NBDE) for full disk encryption on Azure.

4.5. Installing Hyper-V device drivers

Microsoft Azure provides network and storage device drivers as part of their Linux Integration Services (LIS) for Hyper-V package. You need to install Hyper-V device drivers on the VM image prior to provisioning it as an Azure virtual machine (VM). Use the lsinitrd | grep hv command to verify that the drivers are installed.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Enter the following grep command to determine if the required Hyper-V device drivers are installed.

# lsinitrd | grep hv

In the example below, all required drivers are installed.

# lsinitrd | grep hv
drwxr-xr-x   2 root     root            0 Aug 12 14:21 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/hv
-rw-r--r--   1 root     root        31272 Aug 11 08:45 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/hv/hv_vmbus.ko.xz
-rw-r--r--   1 root     root        25132 Aug 11 08:46 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/net/hyperv/hv_netvsc.ko.xz

If not all of the drivers are installed, complete the remaining steps.

Note

An hv_vmbus driver may exist in the environment. Even if this driver is present, complete the following steps.

Create a file named hv.conf in /etc/dracut.conf.d.
Add the following driver parameters to the hv.conf file.
```
add_drivers+=" hv_vmbus "
add_drivers+=" hv_netvsc "
add_drivers+=" hv_storvsc "
add_drivers+=" nvme "
```
Note
Note the spaces before and after the quotes, for example, add_drivers+=" hv_vmbus ". This ensures that unique drivers are loaded in the event that other Hyper-V drivers already exist in the environment.
Regenerate the initramfs image.
```
# dracut -f -v --regenerate-all
```

Verification

Reboot the machine.
Run the lsinitrd | grep hv command to verify that the drivers are installed.
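The verification above relies on reading the `lsinitrd | grep hv` output by eye. As a rough sketch, the check can be scripted so that it names whichever Hyper-V drivers are absent. The function name `missing_hv_drivers` is hypothetical, and the sketch assumes that each driver appears in the listing as a path ending in `<driver>.ko` or `<driver>.ko.xz`, as in the sample output in this section. It checks only the three `hv_` drivers from the hv.conf file, not `nvme`.

```shell
# Print every required Hyper-V driver that is absent from the listing on stdin.
# The function name is illustrative only.
missing_hv_drivers() {
    listing="$(cat)"
    for drv in hv_vmbus hv_netvsc hv_storvsc; do
        case "$listing" in
            *"/${drv}.ko"*) ;;      # driver module found in the initramfs listing
            *) echo "$drv" ;;       # driver module not found
        esac
    done
}

# Usage on the VM, as root:
#   lsinitrd | missing_hv_drivers
```

Empty output means that all three drivers are present in the initramfs image.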

4.6. Making configuration changes required for a Microsoft Azure deployment

Before you deploy a custom base image to Azure, perform additional configuration changes to ensure that the virtual machine (VM) can properly operate in Azure.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Register the VM with Red Hat Subscription Manager.

# subscription-manager register
Installed Product Current Status:
Product Name: Red Hat Enterprise Linux for x86_64
Status: Subscribed

Ensure that the cloud-init and hyperv-daemons packages are installed.
```
# dnf install cloud-init hyperv-daemons -y
```
Create cloud-init configuration files that are needed for integration with Azure services:
1. To enable logging to the Hyper-V Data Exchange Service (KVP), create the /etc/cloud/cloud.cfg.d/10-azure-kvp.cfg configuration file and add the following lines to that file.
  reporting:
      logging:
          type: log
      telemetry:
          type: hyperv
2. To add Azure as a datasource, create the /etc/cloud/cloud.cfg.d/91-azure_datasource.cfg configuration file, and add the following lines to that file.
  datasource_list: [ Azure ]
  datasource:
      Azure:
          apply_network_config: False
3. To configure swap space on the ephemeral disk, create the /etc/cloud/cloud.cfg.d/00-azure-swap.cfg configuration file and add the following lines.
  Important
  The ephemeral disk is temporary storage. Therefore, data stored on it, including swap space, is lost when the VM is deallocated or moved. Use the ephemeral disk only for temporary data such as swap space.
  #cloud-config
  disk_setup:
      ephemeral0:
          table_type: gpt
          layout: [66, [33,82]]
          overwrite: true
  fs_setup:
      - device: ephemeral0.1
        filesystem: ext4
      - device: ephemeral0.2
        filesystem: swap
  mounts:
      - ["ephemeral0.1", "/mnt"]
      - ["ephemeral0.2", "none", "swap", "sw,nofail,x-systemd.requires=cloud-init.service", "0", "0"]
To ensure that specific kernel modules are blocked from loading automatically, edit or create the /etc/modprobe.d/blocklist.conf file and add the following lines to that file.
```
blacklist nouveau
blacklist lbm-nouveau
blacklist floppy
blacklist amdgpu
blacklist skx_edac
blacklist intel_cstate
```
Modify udev network device rules:
1. Remove the following persistent network device rules if present.
  # rm -f /etc/udev/rules.d/70-persistent-net.rules
  # rm -f /etc/udev/rules.d/75-persistent-net-generator.rules
  # rm -f /etc/udev/rules.d/80-net-name-slot.rules
2. To ensure that Accelerated Networking on Azure works as intended, create a new network device rule /etc/udev/rules.d/68-azure-sriov-nm-unmanaged.rules and add the following line to it.
  SUBSYSTEM=="net", DRIVERS=="hv_pci", ACTION=="add", ENV{NM_UNMANAGED}="1"

Set the sshd service to start automatically.

# systemctl enable sshd
# systemctl is-enabled sshd

Modify kernel boot parameters:
1. Open the /etc/default/grub file, and ensure the GRUB_TIMEOUT line has the following value.
  GRUB_TIMEOUT=10
2. Remove the following options from the end of the GRUB_CMDLINE_LINUX line if present.
  rhgb quiet
3. Ensure the /etc/default/grub file contains the following lines with all the specified options.
  GRUB_CMDLINE_LINUX="loglevel=3 crashkernel=auto console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300"
  GRUB_TIMEOUT_STYLE=countdown
  GRUB_TERMINAL="serial console"
  GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
  Note
  If you are not running workloads on HDDs, add elevator=none to the end of the GRUB_CMDLINE_LINUX line. This sets the I/O scheduler to none, which improves I/O performance on SSD-based systems.
4. Regenerate the grub.cfg file.
  - On a BIOS-based machine:
    In RHEL 9.2 and earlier:
    
    # grub2-mkconfig -o /boot/grub2/grub.cfg
    
    In RHEL 9.3 and later:
    
    # grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline
  - On a UEFI-based machine:
    In RHEL 9.2 and earlier:
    
    # grub2-mkconfig -o /boot/grub2/grub.cfg
    
    In RHEL 9.3 and later:
    
    # grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline
    
    Warning
    The command to rebuild grub.cfg is the same for both BIOS-based and UEFI-based machines. The actual grub.cfg file is present only at the BIOS path. The UEFI path contains a stub file that you must not modify or re-create by using the grub2-mkconfig command.
    If your system uses a non-default location for grub.cfg, adjust the command accordingly.
Configure the Windows Azure Linux Agent (WALinuxAgent):
1. Install and enable the WALinuxAgent package.
  # dnf install WALinuxAgent -y
  # systemctl enable waagent
2. To disable swap configuration in WALinuxAgent (required when using cloud-init to manage swap), edit the following lines in the /etc/waagent.conf file.
  Provisioning.DeleteRootPassword=y
  ResourceDisk.Format=n
  ResourceDisk.EnableSwap=n
  ResourceDisk.SwapSizeMB=0
  Note
  By disabling swap in WALinuxAgent, you enable cloud-init to manage the swap configuration on the ephemeral disk.
Prepare the VM for Azure provisioning:
1. Unregister the VM from Red Hat Subscription Manager.
  # subscription-manager unregister
2. Clean up the existing provisioning details.
  # waagent -force -deprovision
  Note
  This command generates warnings, which are expected because Azure handles the provisioning of VMs automatically.
3. Clean the shell history and shut down the VM.
  # export HISTSIZE=0
  # poweroff
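Before you run the final cleanup and shutdown steps, it can be useful to confirm that the kernel boot parameters match the requirements in this procedure. The following is a minimal sketch under stated assumptions: the function name `check_grub_cmdline` is hypothetical, and it assumes that `GRUB_CMDLINE_LINUX` is defined on a single double-quoted line in /etc/default/grub. It reports required options that are missing and the `rhgb` and `quiet` options that the procedure asks you to remove.

```shell
# Report required kernel options missing from GRUB_CMDLINE_LINUX, and
# options that should have been removed. The function name is illustrative.
# Argument: [path to grub defaults file], default /etc/default/grub
check_grub_cmdline() {
    file="${1:-/etc/default/grub}"
    # Extract the value between the double quotes of GRUB_CMDLINE_LINUX.
    cmdline="$(sed -n 's/^GRUB_CMDLINE_LINUX="\(.*\)"$/\1/p' "$file")"
    rc=0
    for opt in loglevel=3 crashkernel=auto console=tty1 console=ttyS0 \
               earlyprintk=ttyS0 rootdelay=300; do
        case " $cmdline " in
            *" $opt "*) ;;
            *) echo "missing: $opt"; rc=1 ;;
        esac
    done
    for opt in rhgb quiet; do
        case " $cmdline " in
            *" $opt "*) echo "remove: $opt"; rc=1 ;;
        esac
    done
    return $rc
}

# Usage on the VM:
#   check_grub_cmdline && echo "GRUB_CMDLINE_LINUX looks correct"
```

The check only reads the file. It does not regenerate grub.cfg, so you still need to run the grub2-mkconfig step after any change.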

4.7. Creating an Azure Active Directory application

Complete the following procedure to create an Azure Active Directory (AD) application. The Azure AD application authorizes and automates access for HA operations for all nodes in the cluster.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account. Use this authorization to create an Azure Active Directory (AD) application.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

On any node in the HA cluster, log in to your Azure account.
```
$ az login
```

Create a JSON configuration file for a custom role for the Azure fence agent, for example azure-fence-role.json. Use the following configuration, but replace <subscription_id> with your subscription ID.

{
      "Name": "Linux Fence Agent Role",
      "description": "Allows to power-off and start virtual machines",
      "assignableScopes": [
        "/subscriptions/<subscription_id>"
      ],
      "actions": [
        "Microsoft.Compute/*/read",
        "Microsoft.Compute/virtualMachines/powerOff/action",
        "Microsoft.Compute/virtualMachines/start/action"
      ],
      "notActions": [],
      "dataActions": [],
      "notDataActions": []
}
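If you prefer not to edit the subscription ID by hand, the role file can be generated from a template. This is a sketch only: the function name `write_fence_role` is hypothetical, the default file name azure-fence-role.json is taken from the command in the next step, and the JSON body repeats the configuration shown above.

```shell
# Write the custom role definition for a given subscription ID.
# Arguments: <subscription-id> [output-file]
# The function name is illustrative; the JSON mirrors the configuration above.
write_fence_role() {
    sub_id="$1"
    out="${2:-azure-fence-role.json}"
    cat > "$out" <<EOF
{
  "Name": "Linux Fence Agent Role",
  "description": "Allows to power-off and start virtual machines",
  "assignableScopes": [
    "/subscriptions/${sub_id}"
  ],
  "actions": [
    "Microsoft.Compute/*/read",
    "Microsoft.Compute/virtualMachines/powerOff/action",
    "Microsoft.Compute/virtualMachines/start/action"
  ],
  "notActions": [],
  "dataActions": [],
  "notDataActions": []
}
EOF
}

# Usage, assuming you are logged in and want the default subscription:
#   write_fence_role "$(az account show --query id -o tsv)"
```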

Define the custom role for the Azure fence agent. Use the JSON file that you created in the previous step.

$ az role definition create --role-definition azure-fence-role.json

{
  "assignableScopes": [
    "/subscriptions/<my_subscription_id>"
  ],
  "description": "Allows to power-off and start virtual machines",
  "id": "/subscriptions/<my_subscription_id>/providers/Microsoft.Authorization/roleDefinitions/<role_id>",
  "name": "<role_id>",
  "permissions": [
    {
      "actions": [
        "Microsoft.Compute/*/read",
        "Microsoft.Compute/virtualMachines/powerOff/action",
        "Microsoft.Compute/virtualMachines/start/action"
      ],
      "dataActions": [],
      "notActions": [],
      "notDataActions": []
    }
  ],
  "roleName": "Linux Fence Agent Role",
  "roleType": "CustomRole",
  "type": "Microsoft.Authorization/roleDefinitions"
}

In the Azure web console interface, select Virtual Machine, and then click Identity in the left-side menu.
Select On, click Save, and then click Yes to confirm.
Click Azure role assignments, and then click Add role assignment.
Select the Scope required for the role, for example Resource Group.
Select the required Resource Group.
Optional: Change the Subscription if necessary.
Select the Linux Fence Agent Role role.
Click Save.

Verification

Display nodes visible to Azure AD.
```
# fence_azure_arm --msi -o list
node1,
node2,
[...]
```
If this command outputs all nodes in your cluster, you have configured the AD application successfully.
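The comparison between the listed nodes and the nodes you expect can also be scripted. The sketch below is illustrative: the function name `missing_nodes` is hypothetical, and it assumes that the list output has one node per line followed by a comma, as in the example above.

```shell
# Print each expected node that does not appear in the fence agent list on stdin.
# Arguments: <expected-node>...
# The function name is illustrative only.
missing_nodes() {
    # Drop the trailing comma (and anything after it) from every line.
    listed="$(sed 's/,.*$//')"
    rc=0
    for node in "$@"; do
        if ! printf '%s\n' "$listed" | grep -qxF "$node"; then
            echo "$node"
            rc=1
        fi
    done
    return $rc
}

# Usage on a cluster node, as root:
#   fence_azure_arm --msi -o list | missing_nodes node1 node2
```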

4.8. Converting the image to a fixed VHD format
Copy link

All Microsoft Azure VM images must be in a fixed VHD format. The image must be aligned on a 1 MB boundary before it is converted to VHD. To convert the image from qcow2 to a fixed VHD format and align the image, see the following procedure. Once you have converted the image, you can upload it to Azure.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Convert the image from qcow2 to raw format.

$ qemu-img convert -f qcow2 -O raw <image-name>.qcow2 <image-name>.raw

Create a shell script with the following content.

#!/bin/bash
MB=$((1024 * 1024))
size=$(qemu-img info -f raw --output json "$1" | gawk 'match($0, /"virtual-size": ([0-9]+),/, val) {print val[1]}')
rounded_size=$((($size/$MB + 1) * $MB))
if [ $(($size % $MB)) -eq  0 ]
then
 echo "Your image is already aligned. You do not need to resize."
 exit 1
fi
echo "rounded size = $rounded_size"
export rounded_size

Run the script. This example uses the name align.sh.
```
$ sh align.sh <image-xxx>.raw
```
- If the message "Your image is already aligned. You do not need to resize." displays, proceed to the following step.
- If a value displays, your image is not aligned.
Use the following command to convert the file to a fixed VHD format.
The sample uses qemu-img version 2.12.0.
```
$ qemu-img convert -f raw -o subformat=fixed,force_size -O vpc <image-xxx>.raw <image-xxx>.vhd
```
Once converted, the VHD file is ready to upload to Azure.
If the raw image is not aligned, complete the following steps to align it.
1. Resize the raw file by using the rounded value displayed when you ran the verification script.
  $ qemu-img resize -f raw <image-xxx>.raw <rounded-value>
2. Convert the raw image file to a VHD format.
  The sample uses qemu-img version 2.12.0.
  $ qemu-img convert -f raw -o subformat=fixed,force_size -O vpc <image-xxx>.raw <image-xxx>.vhd
  Once converted, the VHD file is ready to upload to Azure.
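The alignment rule itself is simple arithmetic: round the virtual size in bytes up to the next multiple of 1 MB (1024 * 1024 bytes). The following sketch isolates that calculation. The function name `round_up_mib` is hypothetical. Unlike the script above, it returns an already aligned size unchanged instead of exiting.

```shell
# Round a size in bytes up to the next multiple of 1 MB (1024 * 1024 bytes).
# Argument: <size-in-bytes>
# The function name is illustrative only.
round_up_mib() {
    mb=$((1024 * 1024))
    echo $(( ($1 + mb - 1) / mb * mb ))
}

# Example values:
#   round_up_mib 1048576   prints 1048576 (already aligned)
#   round_up_mib 1048577   prints 2097152 (rounded up to 2 MB)
```

The result can be passed as the size argument of the qemu-img resize command shown in this procedure.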

4.9. Uploading and creating an Azure image

To deploy a RHEL virtual machine (VM) in Microsoft Azure with your custom configuration, you need to upload a RHEL virtual hard drive (VHD) file to an Azure storage container and create a custom Azure image from that VHD file.

Note

The exported storage connection string does not persist after a system reboot. If any of the commands in the following steps fail, export the connection string again.

Procedure

Upload the VHD file to the storage container. To get a list of storage containers, enter the az storage container list command.

$ az storage blob upload \
    --account-name <storage_account_name> --container-name <container_name> \
    --type page --file <path_to_vhd> --name <image_name>.vhd

Example:

[clouduser@localhost]$ az storage blob upload \
--account-name azrhelclistact --container-name azrhelclistcont \
--type page --file rhel-image-<ProductNumber>.vhd --name rhel-image-<ProductNumber>.vhd

Percent complete: %100.0

Get the URL for the uploaded VHD file to use in the following step.

$ az storage blob url -c <container_name> -n <image_name>.vhd

Example:

$ az storage blob url -c azrhelclistcont -n rhel-image-<ProductNumber>.vhd
"https://azrhelclistact.blob.core.windows.net/azrhelclistcont/rhel-image-<ProductNumber>.vhd"

Create the Azure custom image.
```
$ az image create -n <image_name> -g <resource_group> -l <azure_region> --source <URL> --os-type linux
```
Note
The default hypervisor generation of the VM is V1. You can optionally specify a V2 hypervisor generation by including the option --hyper-v-generation V2. Generation 2 VMs use a UEFI-based boot architecture. See Support for generation 2 VMs on Azure for information about generation 2 VMs.
The command might return the error "Only blobs formatted as VHDs can be imported." This error might mean that the image was not aligned to the nearest 1 MB boundary before it was converted to VHD format.
Example:
```
$ az image create -n rhel<ProductNumber> -g azrhelclirsgrp2 -l southcentralus --source https://azrhelclistact.blob.core.windows.net/azrhelclistcont/rhel-image-<ProductNumber>.vhd --os-type linux
```
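The last two steps (getting the blob URL and creating the image) can be chained so that the URL is not copied manually. This is a sketch under assumptions, not the documented procedure: the function name `create_image_from_blob` is hypothetical, it assumes that the storage connection string is still exported, and it relies on the Azure CLI global option `-o tsv` to print the URL without quotes.

```shell
# Look up the URL of an uploaded VHD blob and create a custom image from it.
# Arguments: <container> <blob-name> <image-name> <resource-group> <azure-region>
# The function name is illustrative; it wraps the two az commands shown above.
create_image_from_blob() {
    container="$1"; blob="$2"; image="$3"; group="$4"; region="$5"
    url="$(az storage blob url -c "$container" -n "$blob" -o tsv)" || return 1
    az image create -n "$image" -g "$group" -l "$region" \
        --source "$url" --os-type linux
}

# Usage (all five values are placeholders that you replace with your own):
#   create_image_from_blob <container_name> <image_name>.vhd <image_name> <resource_group> <azure_region>
```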

4.10. Installing Red Hat HA packages and agents

Complete the following steps on all nodes.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Launch an SSH terminal session and connect to the VM by using the administrator name and public IP address.

$ ssh administrator@PublicIP

To get the public IP address for an Azure VM, open the VM properties in the Azure Portal or enter the following Azure CLI command.

$ az vm list -g <resource_group> -d --output table

Example:

[clouduser@localhost ~] $ az vm list -g azrhelclirsgrp -d --output table
Name    ResourceGroup           PowerState      PublicIps        Location
------  ----------------------  --------------  -------------    --------------
node01  azrhelclirsgrp          VM running      192.98.152.251    southcentralus

Switch to the root user and register the VM with Red Hat Subscription Manager.

$ sudo -i
# subscription-manager register

Disable all repositories.

# subscription-manager repos --disable=*

Enable the RHEL 9 Server HA repositories.

# subscription-manager repos --enable=rhel-9-for-x86_64-highavailability-rpms

Update all packages.
```
# dnf update -y
```
Install the Red Hat High Availability Add-On software packages, along with the Azure fencing agent from the High Availability channel.
```
# dnf install pcs pacemaker fence-agents-azure-arm
```
The user hacluster was created during the pcs and pacemaker installation in the previous step. Create a password for hacluster on all cluster nodes. Use the same password for all nodes.
```
# passwd hacluster
```

Add the high availability service to the RHEL Firewall if firewalld.service is installed.

# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --reload

Start the pcsd service and enable it to start on boot.

# systemctl start pcsd.service
# systemctl enable pcsd.service

Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.

Verification

Ensure the pcsd service is running.

# systemctl status pcsd.service
pcsd.service - PCS GUI and remote configuration interface
Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2018-02-23 11:00:58 EST; 1min 23s ago
Docs: man:pcsd(8)
          man:pcs(8)
Main PID: 46235 (pcsd)
  CGroup: /system.slice/pcsd.service
          └─46235 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &

4.11. Creating a cluster

Create a Red Hat High Availability cluster on a public cloud platform by configuring and initializing the cluster nodes.

Procedure

On one of the nodes, enter the following command to authenticate the pcs user hacluster. In the command, specify the name of each node in the cluster.

# pcs host auth <hostname1> <hostname2> <hostname3>

Example:

[root@node01 clouduser]# pcs host auth node01 node02 node03
Username: hacluster
Password:
node01: Authorized
node02: Authorized
node03: Authorized

Create the cluster.

# pcs cluster setup <cluster_name> <hostname1> <hostname2> <hostname3>

Example:

[root@node01 clouduser]# pcs cluster setup new_cluster node01 node02 node03

[...]

Synchronizing pcsd certificates on nodes node01, node02, node03...
node02: Success
node03: Success
node01: Success
Restarting pcsd on the nodes in order to reload the certificates...
node02: Success
node03: Success
node01: Success

Verification

Enable the cluster.

[root@node01 clouduser]# pcs cluster enable --all
node02: Cluster Enabled
node03: Cluster Enabled
node01: Cluster Enabled

Start the cluster.

[root@node01 clouduser]# pcs cluster start --all
node02: Starting Cluster...
node03: Starting Cluster...
node01: Starting Cluster...

4.12. Overview of fencing in high availability clusters

When a node in the cluster fails to connect to the rest of the cluster, the other nodes must restrict or release the failed node's access to shared resources. This ensures that resources do not remain allocated to the failed node.

Because the failed node is unresponsive, you cannot communicate with it to confirm that it has released its resources. Instead, you fence the failed node so that the data remains safe. Use Shoot The Other Node In The Head (STONITH), a fencing mechanism that protects the data from corruption by rogue nodes or concurrent access. STONITH ensures that rogue or unresponsive nodes are offline before another node takes over the resources of the failed node.

4.13. Creating a fencing device

Complete the following steps to configure fencing. Run these commands from any node in the cluster.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).
You have set the cluster property stonith-enabled to true.

Procedure

Identify the Azure node name for each RHEL VM. You use the Azure node names to configure the fence device.

# fence_azure_arm \
    -l <AD-Application-ID> -p <AD-Password> \
    --resourceGroup <MyResourceGroup> --tenantId <Tenant-ID> \
    --subscriptionId <Subscription-ID> -o list

Example:

[root@node01 clouduser]# fence_azure_arm \
-l e04a6a49-9f00-xxxx-xxxx-a8bdda4af447 -p z/a05AwCN0IzAjVwXXXXXXXEWIoeVp0xg7QT//JE= \
--resourceGroup azrhelclirsgrp --tenantId 77ecefb6-cff0-XXXX-XXXX-757XXXX9485 \
--subscriptionId XXXXXXXX-38b4-4527-XXXX-012d49dfc02c -o list

node01,
node02,
node03,

View the options for the Azure ARM STONITH agent.
```
# pcs stonith describe fence_azure_arm
```
Example:
```
# pcs stonith describe fence_apc
Stonith options:
password: Authentication key
password_script: Script to run to retrieve password
```
Warning
For fence agents that offer a method option, do not specify a value of cycle as it is not supported and can cause data corruption.
Some fence devices can fence only a single node, while other devices can fence many nodes. The parameters you specify when you create a fencing device depend on what your fencing device supports and requires.
You can use the pcmk_host_list parameter when creating a fencing device to specify all machines that the fencing device controls.
You can use the pcmk_host_map parameter when creating a fencing device to map host names to the specifications that the fence device understands.
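As an illustration of the pcmk_host_map parameter, the sketch below joins pairs of cluster host name and Azure node name into a single value. It assumes the Pacemaker convention of `host:device_name` pairs separated by semicolons. The function name `build_host_map` and the Azure node names in the usage comment are hypothetical.

```shell
# Join "<cluster-host>:<azure-node-name>" pairs into one pcmk_host_map value.
# Arguments: <host:azure-name>...
# The function name is illustrative only.
build_host_map() {
    map=""
    for pair in "$@"; do
        # Prefix a semicolon separator for every pair after the first.
        map="${map:+$map;}$pair"
    done
    printf '%s\n' "$map"
}

# Usage (hypothetical Azure node names azvm01 and azvm02):
#   build_host_map node01:azvm01 node02:azvm02
#   prints node01:azvm01;node02:azvm02
```

The resulting string is a candidate value for pcmk_host_map when you create the fence device. Quote it on the command line, because the shell otherwise treats the semicolon as a command separator.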

Create a fence device.

# pcs stonith create clusterfence fence_azure_arm

To ensure immediate and complete fencing, disable ACPI Soft-Off on all cluster nodes. For information about disabling ACPI Soft-Off, see Disabling ACPI for use with integrated fence device.

Verification

Test the fencing agent for one of the other nodes:

# pcs stonith fence azurenodename

Example:

[root@node01 clouduser]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: node01 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Feb 23 11:44:35 2018
Last change: Fri Feb 23 11:21:01 2018 by root via cibadmin on node01

3 nodes configured
1 resource configured

Online: [ node01 node03 ]
OFFLINE: [ node02 ]

Full list of resources:

  clusterfence  (stonith:fence_azure_arm):  Started node01

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Start the node that was fenced in the previous step:
```
# pcs cluster start <hostname>
```

Check the status to verify the node started:

# pcs status

Example:

[root@node01 clouduser]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: node01 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Feb 23 11:34:59 2018
Last change: Fri Feb 23 11:21:01 2018 by root via cibadmin on node01

3 nodes configured
1 resource configured

Online: [ node01 node02 node03 ]

Full list of resources:

clusterfence    (stonith:fence_azure_arm):  Started node01

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
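
The two status checks above can be scripted. The following is a minimal sketch, assuming the pcs status output format shown in the examples in this section; other pcs versions may format the node lines differently. The nodes_in_state helper is hypothetical and is not part of pcs.

```shell
# Hypothetical helper: read `pcs status` output on stdin and print the nodes
# listed on the line that starts with the given label (Online or OFFLINE).
nodes_in_state() {
  awk -v label="$1:" '$1 == label {
    # Fields 3..NF-1 are the node names between "[" and "]"
    for (i = 3; i < NF; i++) printf "%s%s", $i, (i < NF - 1 ? " " : "")
    print ""
  }'
}

# On a cluster node you would run: pcs status | nodes_in_state OFFLINE
# Here the helper is fed two lines taken from the example output instead.
nodes_in_state OFFLINE <<'EOF'
Online: [ node01 node03 ]
OFFLINE: [ node02 ]
EOF
```

After fencing, the helper should list the fenced node for OFFLINE. After you restart the node, the same call should print nothing.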

4.14. Creating an Azure internal load balancer

To remove cluster nodes that do not respond to health probe requests, create an Azure internal load balancer.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Create a Basic load balancer. Select Internal load balancer, the Basic SKU, and Dynamic for the type of IP address assignment.
Create a back-end address pool. Associate the back-end pool with the availability set that you created while creating Azure resources for HA. Do not set any target network IP configurations.
Create a health probe. For the health probe, select TCP and enter port 61000. You can use any TCP port number that does not interfere with another service. For certain HA product applications (for example, SAP HANA and SQL Server), you may need to work with Microsoft to identify the correct port to use.
Create a load balancer rule. When you create the load balancing rule, the default values are prepopulated. Ensure that you set Floating IP (direct server return) to Enabled.
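
The steps above describe the Azure portal. As a rough sketch of a command-line equivalent, the following function prints Azure CLI commands for the load balancer, the health probe, and the rule instead of running them, so that you can review them first. All resource names are hypothetical, the rule ports are left as placeholders, and the sketch does not cover associating the back-end pool with the availability set. Treat it as an assumption about how the portal choices map to the CLI, not as a verified procedure.

```shell
# Sketch only: prints the commands instead of running them.
# $1 = resource group, $2 = load balancer name, $3 = virtual network, $4 = subnet
print_lb_commands() {
  rg=$1; lb=$2; vnet=$3; subnet=$4
  # Passing a virtual network and subnet, with no public IP, requests an internal front end
  echo "az network lb create -g $rg -n $lb --sku Basic --vnet-name $vnet --subnet $subnet --frontend-ip-name ${lb}-fe --backend-pool-name ${lb}-pool"
  # TCP health probe on port 61000, as in the procedure above
  echo "az network lb probe create -g $rg --lb-name $lb -n ${lb}-probe --protocol tcp --port 61000"
  # Floating IP (direct server return) enabled on the rule
  echo "az network lb rule create -g $rg --lb-name $lb -n ${lb}-rule --protocol tcp --frontend-port <frontend_port> --backend-port <backend_port> --frontend-ip-name ${lb}-fe --backend-pool-name ${lb}-pool --probe-name ${lb}-probe --floating-ip true"
}

# Hypothetical names; replace them with your own.
print_lb_commands my-rg my-ilb my-vnet my-subnet
```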

4.15. Configuring the load balancer resource agent

To ensure that the resource-agent-based service answers health probe requests from the Azure load balancer, and that the load balancer removes cluster nodes that do not answer, configure the load balancer resource agent after you create the health probe.

Prerequisites

You have created a Red Hat Customer Portal account.
You have administrator privileges for your Microsoft Azure account.
You have installed the Azure command-line interface (CLI). For more information, see Azure Command Line Interface (CLI).

Procedure

Install the nmap-ncat and resource-agents-cloud packages on all nodes.
```
# dnf install nmap-ncat resource-agents-cloud
```
Perform the following steps on a single node.

Create the pcs resources and group. Use your load balancer FrontendIP for the IPaddr2 address.

# pcs resource create resource-name IPaddr2 ip="10.0.0.7" --group cluster-resources-group

Configure the load balancer resource agent.

# pcs resource create resource-loadbalancer-name azure-lb port=port-number --group cluster-resources-group

Verification

Run pcs status to see the results.

[root@node01 clouduser]# pcs status

Example output:

Cluster name: clusterfence01
Stack: corosync
Current DC: node02 (version 1.1.16-12.el7_4.7-94ff4df) - partition with quorum
Last updated: Tue Jan 30 12:42:35 2018
Last change: Tue Jan 30 12:26:42 2018 by root via cibadmin on node01

3 nodes configured
3 resources configured

Online: [ node01 node02 node03 ]

Full list of resources:

clusterfence (stonith:fence_azure_arm):      Started node01
Resource Group: g_azure
    vip_azure  (ocf::heartbeat:IPaddr2):       Started node02
    lb_azure   (ocf::heartbeat:azure-lb):      Started node02

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
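
Resources in the same group run together on one node, so the IPaddr2 and azure-lb resources should report the same node, as they do in the example output. The following is a minimal sketch of that check, assuming the pcs status output format shown above. Both helper functions are hypothetical and are not part of pcs.

```shell
# Hypothetical helper: read `pcs status` output on stdin and print
# "<resource> <node>" for every resource reported as Started.
started_resources() {
  awk 'NF >= 2 && $(NF-1) == "Started" { print $1, $NF }'
}

# Hypothetical helper: succeed only if both named resources are Started
# on the same node.
same_node() {
  started_resources | awk -v a="$1" -v b="$2" '
    $1 == a { na = $2 }
    $1 == b { nb = $2 }
    END { exit !(na != "" && na == nb) }'
}

# On a cluster node you would run: pcs status | same_node vip_azure lb_azure
# Here the check is fed two lines taken from the example output instead.
same_node vip_azure lb_azure <<'EOF' && echo "same node"
    vip_azure  (ocf::heartbeat:IPaddr2):       Started node02
    lb_azure   (ocf::heartbeat:azure-lb):      Started node02
EOF
```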

4.16. Configuring shared block storage

To ensure data consistency and failover capabilities in a Red Hat High Availability cluster deployed in Microsoft Azure, you can use Azure shared block storage. Shared block storage enables multiple virtual machines in a cluster to access the same storage volume simultaneously.

Prerequisites

You have an allocation of three Azure VMs (a three-node cluster) with a 1 TB shared disk.
You do not have an existing cluster.
You have installed the Azure CLI on your host system and created your SSH keys.
You have created your cluster environment in Azure, which includes creating the following resources. Links are to the Microsoft Azure documentation.

Procedure

Create a shared block volume by using the Azure command az disk create.

$ az disk create -g <resource_group> -n <shared_block_volume_name> --size-gb <disk_size> --max-shares <number_vms> -l <location>

For example, the following command creates a shared block volume named shared-block-volume.vhd in the resource group sharedblock-rg within the Azure region westcentralus.

$ az disk create -g sharedblock-rg -n shared-block-volume.vhd --size-gb 1024 --max-shares 3 -l westcentralus

{
  "creationData": {
    "createOption": "Empty",
    "galleryImageReference": null,
    "imageReference": null,
    "sourceResourceId": null,
    "sourceUniqueId": null,
    "sourceUri": null,
    "storageAccountId": null,
    "uploadSizeBytes": null
  },
  "diskAccessId": null,
  "diskIopsReadOnly": null,
  "diskIopsReadWrite": 5000,
  "diskMbpsReadOnly": null,
  "diskMbpsReadWrite": 200,
  "diskSizeBytes": 1099511627776,
  "diskSizeGb": 1024,
  "diskState": "Unattached",
  "encryption": {
    "diskEncryptionSetId": null,
    "type": "EncryptionAtRestWithPlatformKey"
  },
  "encryptionSettingsCollection": null,
  "hyperVgeneration": "V1",
  "id": "/subscriptions/12345678910-12345678910/resourceGroups/sharedblock-rg/providers/Microsoft.Compute/disks/shared-block-volume.vhd",
  "location": "westcentralus",
  "managedBy": null,
  "managedByExtended": null,
  "maxShares": 3,
  "name": "shared-block-volume.vhd",
  "networkAccessPolicy": "AllowAll",
  "osType": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "sharedblock-rg",
  "shareInfo": null,
  "sku": {
    "name": "Premium_LRS",
    "tier": "Premium"
  },
  "tags": {},
  "timeCreated": "2020-08-27T15:36:56.263382+00:00",
  "type": "Microsoft.Compute/disks",
  "uniqueId": "cd8b0a25-6fbe-4779-9312-8d9cbb89b6f2",
  "zones": null
}

Verify that you have created the shared block volume by using the Azure command az disk show.

$ az disk show -g <resource_group> -n <shared_block_volume_name>

For example, the following command shows details for the shared block volume shared-block-volume.vhd within the resource group sharedblock-rg.

$ az disk show -g sharedblock-rg -n shared-block-volume.vhd

{
  "creationData": {
    "createOption": "Empty",
    "galleryImageReference": null,
    "imageReference": null,
    "sourceResourceId": null,
    "sourceUniqueId": null,
    "sourceUri": null,
    "storageAccountId": null,
    "uploadSizeBytes": null
  },
  "diskAccessId": null,
  "diskIopsReadOnly": null,
  "diskIopsReadWrite": 5000,
  "diskMbpsReadOnly": null,
  "diskMbpsReadWrite": 200,
  "diskSizeBytes": 1099511627776,
  "diskSizeGb": 1024,
  "diskState": "Unattached",
  "encryption": {
    "diskEncryptionSetId": null,
    "type": "EncryptionAtRestWithPlatformKey"
  },
  "encryptionSettingsCollection": null,
  "hyperVgeneration": "V1",
  "id": "/subscriptions/12345678910-12345678910/resourceGroups/sharedblock-rg/providers/Microsoft.Compute/disks/shared-block-volume.vhd",
  "location": "westcentralus",
  "managedBy": null,
  "managedByExtended": null,
  "maxShares": 3,
  "name": "shared-block-volume.vhd",
  "networkAccessPolicy": "AllowAll",
  "osType": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "sharedblock-rg",
  "shareInfo": null,
  "sku": {
    "name": "Premium_LRS",
    "tier": "Premium"
  },
  "tags": {},
  "timeCreated": "2020-08-27T15:36:56.263382+00:00",
  "type": "Microsoft.Compute/disks",
  "uniqueId": "cd8b0a25-6fbe-4779-9312-8d9cbb89b6f2",
  "zones": null
}
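
When you check this output, the fields of most interest for shared storage are maxShares, diskState, and provisioningState. The following is a minimal sketch of a filter that pulls one field out of the JSON, assuming the one-field-per-line layout shown above. The disk_field helper is hypothetical. On a system with the Azure CLI, the global --query option could likely serve the same purpose.

```shell
# Hypothetical helper: read az JSON output on stdin and print the value of
# the first field whose name matches $1 (quotes and trailing comma removed).
disk_field() {
  awk -F': ' -v key="\"$1\"" '$1 ~ key { gsub(/[",]/, "", $2); print $2; exit }'
}

# With the Azure CLI you would run:
#   az disk show -g <resource_group> -n <shared_block_volume_name> | disk_field maxShares
# Here the helper is fed a few lines taken from the example output instead.
disk_field maxShares <<'EOF'
{
  "diskState": "Unattached",
  "maxShares": 3,
  "provisioningState": "Succeeded"
}
EOF
```

The value printed for maxShares should match the number of VMs that you plan to attach to the volume.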

Create three network interfaces by using the Azure command az network nic create. Run the following command three times, using a different <nic_name> each time.

$ az network nic create \
-g <resource_group> -n <nic_name> --subnet <subnet_name> \
--vnet-name <virtual_network> --location <location> \
--network-security-group <network_security_group> --private-ip-address-version IPv4

For example, the following command creates a network interface with the name sharedblock-nodea-vm-nic-protected.

$ az network nic create \
-g sharedblock-rg -n sharedblock-nodea-vm-nic-protected --subnet sharedblock-subnet-protected \
--vnet-name sharedblock-vn --location westcentralus \
--network-security-group sharedblock-nsg --private-ip-address-version IPv4
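
Because the command is run once per node, a loop can generate the three invocations. The following sketch prints the commands instead of running them, so that you can review them first. The print_nic_commands function is a hypothetical helper, and the nodeb and nodec names are assumptions that follow the nodea naming pattern of the example.

```shell
# Sketch only: prints one `az network nic create` command per node name.
print_nic_commands() {
  for node in "$@"; do
    printf '%s %s %s %s\n' \
      "az network nic create -g sharedblock-rg -n sharedblock-${node}-vm-nic-protected" \
      "--subnet sharedblock-subnet-protected --vnet-name sharedblock-vn" \
      "--location westcentralus --network-security-group sharedblock-nsg" \
      "--private-ip-address-version IPv4"
  done
}

# nodeb and nodec are hypothetical names.
print_nic_commands nodea nodeb nodec
```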

Create three VMs and attach the shared block volume by using the Azure command az vm create. Option values are the same for each VM except that each VM has its own <vm_name>, <example_vm_disk_name>, and <nic_name>.

$ az vm create \
-n <vm_name> -g <resource_group> --attach-data-disks <shared_block_volume_name> \
--data-disk-caching None --os-disk-caching ReadWrite --os-disk-name <example_vm_disk_name> \
--os-disk-size-gb <disk_size> --location <location> --size <virtual_machine_size> \
--image <image_name> --admin-username <vm_username> --authentication-type ssh \
--ssh-key-values <ssh_key> --nics <nic_name> --availability-set <availability_set> --ppg <proximity_placement_group>

For example, the following command creates a VM named sharedblock-nodea-vm.

$ az vm create \
-n sharedblock-nodea-vm -g sharedblock-rg --attach-data-disks shared-block-volume.vhd \
--data-disk-caching None --os-disk-caching ReadWrite --os-disk-name sharedblock-nodea-vm.vhd \
--os-disk-size-gb 64 --location westcentralus --size Standard_D2s_v3 \
--image /subscriptions/12345678910-12345678910/resourceGroups/sample-azureimagesgroupwestcentralus/providers/Microsoft.Compute/images/sample-azure-rhel-9.3.0-20200713.n.0.x86_64 --admin-username sharedblock-user --authentication-type ssh \
--ssh-key-values @sharedblock-key.pub --nics sharedblock-nodea-vm-nic-protected --availability-set sharedblock-as --ppg sharedblock-ppg

{
  "fqdns": "",
  "id": "/subscriptions/12345678910-12345678910/resourceGroups/sharedblock-rg/providers/Microsoft.Compute/virtualMachines/sharedblock-nodea-vm",
  "location": "westcentralus",
  "macAddress": "00-22-48-5D-EE-FB",
  "powerState": "VM running",
  "privateIpAddress": "198.51.100.3",
  "publicIpAddress": "",
  "resourceGroup": "sharedblock-rg",
  "zones": ""
}

Verification

For each VM in your cluster, verify that the block device is available by using the ssh command with the IP address of the VM.
```
# ssh <ip_address> "hostname ; lsblk -d | grep ' 1T '"
```
For example, the following command lists details including the hostname and block device for the VM IP 198.51.100.3.
```
# ssh 198.51.100.3 "hostname ; lsblk -d | grep ' 1T '"

nodea
sdb    8:16   0    1T  0 disk
```

Use the ssh command to verify that each VM in your cluster uses the same shared disk.

# ssh <ip_address> "hostname ; lsblk -d | grep ' 1T ' | awk '{print \$1}' | xargs -i udevadm info --query=all --name=/dev/{} | grep '^E: ID_SERIAL='"

For example, the following command lists details including the hostname and shared disk volume ID for the instance IP address 198.51.100.3.

# ssh 198.51.100.3 "hostname ; lsblk -d | grep ' 1T ' | awk '{print \$1}' | xargs -i udevadm info --query=all --name=/dev/{} | grep '^E: ID_SERIAL='"

nodea
E: ID_SERIAL=3600224808dd8eb102f6ffc5822c41d89
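
To compare the results across all VMs, you can count the distinct ID_SERIAL values in the combined output. The following is a minimal sketch, assuming the output format shown above. The distinct_serials helper is hypothetical, and the nodeb and nodec lines in the sample input are illustrative rather than captured output.

```shell
# Hypothetical helper: read the combined verification output of all VMs on
# stdin and print how many different ID_SERIAL values it contains.
distinct_serials() {
  grep '^E: ID_SERIAL=' | sort -u | wc -l | tr -d ' '
}

# In practice you would pipe the output of the ssh command above, run once
# per VM, into distinct_serials. Sample input is used here instead.
distinct_serials <<'EOF'
nodea
E: ID_SERIAL=3600224808dd8eb102f6ffc5822c41d89
nodeb
E: ID_SERIAL=3600224808dd8eb102f6ffc5822c41d89
nodec
E: ID_SERIAL=3600224808dd8eb102f6ffc5822c41d89
EOF
```

A result of 1 indicates that every VM reports the same shared disk. Any larger number means that at least one VM sees a different device.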

After verifying that the shared disk is attached to each VM, you can configure resilient storage for the cluster.
