Chapter 4. Configuring a Red Hat High Availability Cluster on Google Cloud Platform
To create a cluster where RHEL nodes automatically redistribute their workloads if a node failure occurs, use the Red Hat High Availability Add-On. Such high availability (HA) clusters can also be hosted on public cloud platforms, including Google Cloud Platform (GCP). Creating RHEL HA clusters on GCP is similar to creating HA clusters in non-cloud environments, with some platform-specific differences.
To configure a Red Hat HA cluster on Google Cloud Platform (GCP) using Google Compute Engine (GCE) virtual machine (VM) instances as cluster nodes, see the following sections.
These provide information on:
- Prerequisite procedures for setting up your environment for GCP. Once you have set up your environment, you can create and configure VM instances.
- Procedures specific to the creation of HA clusters, which transform individual nodes into a cluster of HA nodes on GCP. These include procedures for installing the High Availability packages and agents on each cluster node, configuring fencing, and installing network resource agents.
Prerequisites
- Access to the following repositories:
  - Red Hat Enterprise Linux 8 Server: rhel-8-server-rpms/8Server/x86_64
  - Red Hat Enterprise Linux 8 Server (High Availability): rhel-8-server-ha-rpms/8Server/x86_64
- You must belong to an active GCP project and have sufficient permissions to create resources in the project.
- Your project should have a service account that belongs to a VM instance and not an individual user. See Using the Compute Engine Default Service Account for information about using the default service account instead of creating a separate service account.
If you or your project administrator create a custom service account, the service account should be configured for the following roles.
- Cloud Trace Agent
- Compute Admin
- Compute Network Admin
- Cloud Datastore User
- Logging Admin
- Monitoring Editor
- Monitoring Metric Writer
- Service Account Administrator
- Storage Admin
4.1. The benefits of using high-availability clusters on public cloud platforms
A high-availability (HA) cluster is a set of computers (called nodes) that are linked together to run a specific workload. The purpose of HA clusters is to provide redundancy in case of a hardware or software failure. If a node in the HA cluster fails, the Pacemaker cluster resource manager distributes the workload to other nodes and no noticeable downtime occurs in the services that are running on the cluster.
You can also run HA clusters on public cloud platforms. In this case, you would use virtual machine (VM) instances in the cloud as the individual cluster nodes. Using HA clusters on a public cloud platform has the following benefits:
- Improved availability: In case of a VM failure, the workload is quickly redistributed to other nodes, so running services are not disrupted.
- Scalability: Additional nodes can be started when demand is high and stopped when demand is low.
- Cost-effectiveness: With the pay-as-you-go pricing, you pay only for nodes that are running.
- Simplified management: Some public cloud platforms offer management interfaces to make configuring HA clusters easier.
To enable HA on your Red Hat Enterprise Linux (RHEL) systems, Red Hat offers a High Availability Add-On. The High Availability Add-On provides all necessary components for creating HA clusters on RHEL systems. The components include high availability service management and cluster administration tools.
4.2. Required system packages
To create and configure a base image of RHEL, your host system must have the following packages installed.
| Package | Repository | Description |
|---|---|---|
| libvirt | rhel-8-for-x86_64-appstream-rpms | Open source API, daemon, and management tool for managing platform virtualization |
| virt-install | rhel-8-for-x86_64-appstream-rpms | A command-line utility for building VMs |
| libguestfs | rhel-8-for-x86_64-appstream-rpms | A library for accessing and modifying VM file systems |
| libguestfs-tools | rhel-8-for-x86_64-appstream-rpms | System administration tools for VMs |
4.3. Red Hat Enterprise Linux image options on GCP
You can use multiple types of images for deploying RHEL 8 on Google Cloud Platform. Based on your requirements, consider which option is optimal for your use case.
| Image option | Subscriptions | Sample scenario | Considerations |
|---|---|---|---|
| Deploy a Red Hat Gold Image. | Use your existing Red Hat subscriptions. | Select a Red Hat Gold Image on Google Cloud Platform. For details on Gold Images and how to access them on Google Cloud Platform, see the Red Hat Cloud Access Reference Guide. | The subscription includes the Red Hat product cost; you pay Google for all other instance costs. Red Hat provides support directly for custom RHEL images. |
| Deploy a custom image that you move to GCP. | Use your existing Red Hat subscriptions. | Upload your custom image and attach your subscriptions. | The subscription includes the Red Hat product cost; you pay all other instance costs. Red Hat provides support directly for custom RHEL images. |
| Deploy an existing GCP image that includes RHEL. | The GCP images include a Red Hat product. | Choose a RHEL image when you launch an instance on the GCP Compute Engine, or choose an image from the Google Cloud Platform Marketplace. | You pay GCP hourly on a pay-as-you-go model. Such images are called "on-demand" images. GCP offers support for on-demand images through a support agreement. |
You can create a custom image for GCP by using Red Hat Image Builder. See Composing a Customized RHEL System Image for more information.
You cannot convert an on-demand instance to a custom RHEL instance. To change from an on-demand image to a custom RHEL bring-your-own-subscription (BYOS) image:
- Create a new custom RHEL instance and migrate data from your on-demand instance.
- Cancel your on-demand instance after you migrate your data to avoid double billing.
4.4. Installing the Google Cloud SDK
Many of the procedures to manage HA clusters on Google Cloud Platform (GCP) require the tools in the Google Cloud SDK.
Procedure
- Follow the GCP instructions for downloading and extracting the Google Cloud SDK archive. See the GCP document Quickstart for Linux for details.
Follow the same instructions for initializing the Google Cloud SDK.
Note: Once you have initialized the Google Cloud SDK, you can use the gcloud CLI commands to perform tasks and obtain information about your project and instances. For example, you can display project information with the gcloud compute project-info describe --project <project-name> command.
4.5. Creating a GCP image bucket
The following procedure covers the minimum requirements for creating a multi-regional bucket in your default location.
Prerequisites
- GCP storage utility (gsutil)
Procedure
If you are not already logged in to Google Cloud Platform, log in with the following command.
    # gcloud auth login
Create a storage bucket.
    $ gsutil mb gs://BucketName
Example:
    $ gsutil mb gs://rhel-ha-bucket
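Bucket creation fails if the name does not meet the Cloud Storage naming rules. The following is a minimal sketch of a pre-check, assuming only the basic rules (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit). The valid_bucket_name function is a hypothetical helper, not part of gsutil, and it does not cover every restriction that GCP enforces.

```shell
#!/bin/sh
# Hypothetical helper (not part of gsutil): check a bucket name against a
# subset of the Cloud Storage naming rules before running `gsutil mb`.
valid_bucket_name() {
    name=$1
    len=${#name}
    # Length must be between 3 and 63 characters.
    [ "$len" -ge 3 ] && [ "$len" -le 63 ] || return 1
    # Lowercase letters, digits, dashes, underscores, and dots only,
    # starting and ending with a letter or digit.
    printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9._-]*[a-z0-9]$'
}

# Dry run: print the command only if the name passes the check.
if valid_bucket_name "rhel-ha-bucket"; then
    echo "gsutil mb gs://rhel-ha-bucket"
fi
```

The sketch prints the command instead of running it, so you can review the bucket name before creating anything in your project.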
4.6. Creating a custom virtual private cloud network and subnet
A cluster configured for high availability (HA) requires a custom virtual private cloud (VPC) network and subnet.
Procedure
- Launch the GCP Console.
- Select VPC networks under Networking in the left navigation pane.
- Click Create VPC Network.
- Enter a name for the VPC network.
- Under New subnet, create a Custom subnet in the region where you want to create the cluster.
- Click Create.
4.7. Preparing and importing a base GCP image
Before a local RHEL 8 image can be deployed in GCP, you must first convert and upload the image to your GCP Bucket.
Procedure
Convert the file. Images uploaded to GCP must be in raw format and named disk.raw.
    $ qemu-img convert -f qcow2 ImageName.qcow2 -O raw disk.raw
Compress the raw file. Images uploaded to GCP must be compressed.
    $ tar -Sczf ImageName.tar.gz disk.raw
Import the compressed image to the bucket created earlier.
    $ gsutil cp ImageName.tar.gz gs://BucketName
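The three steps above share one image name, so it can help to derive the archive name from the source file. The following dry-run sketch prints the commands for a given image and bucket without running them. The function name, the rhel-8.qcow2 file name, and the bucket name are illustrative assumptions.

```shell
#!/bin/sh
# Dry-run sketch: print the convert, compress, and upload commands for one
# image. The function name and its arguments are illustrative.
print_image_upload_steps() {
    image=$1    # local qcow2 file, for example rhel-8.qcow2
    bucket=$2   # existing bucket name, for example rhel-ha-bucket
    # Strip the .qcow2 suffix to derive the archive name.
    archive="${image%.qcow2}.tar.gz"
    echo "qemu-img convert -f qcow2 $image -O raw disk.raw"
    echo "tar -Sczf $archive disk.raw"
    echo "gsutil cp $archive gs://$bucket"
}

print_image_upload_steps rhel-8.qcow2 rhel-ha-bucket
```

Review the printed commands, then run them in order from the directory that contains the image.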
4.8. Creating and configuring a base GCP instance
To create and configure a Google Cloud Platform (GCP) instance that complies with GCP operating and security requirements, complete the following steps.
Procedure
Create an image from the compressed file in the bucket.
    $ gcloud compute images create BaseImageName --source-uri gs://BucketName/BaseImageName.tar.gz
Example:
    [admin@localhost ~] $ gcloud compute images create rhel-76-server --source-uri gs://user-rhelha/rhel-server-76.tar.gz
    Created [https://www.googleapis.com/compute/v1/projects/MyProject/global/images/rhel-server-76].
    NAME            PROJECT                 FAMILY  DEPRECATED  STATUS
    rhel-76-server  rhel-ha-testing-on-gcp                      READY
Create a template instance from the image. The minimum size required for a base RHEL instance is n1-standard-2. See gcloud compute instances create for additional configuration options.
    $ gcloud compute instances create BaseInstanceName --can-ip-forward --machine-type n1-standard-2 --image BaseImageName --service-account ServiceAccountEmail
Example:
    [admin@localhost ~] $ gcloud compute instances create rhel-76-server-base-instance --can-ip-forward --machine-type n1-standard-2 --image rhel-76-server --service-account account@project-name-on-gcp.iam.gserviceaccount.com
    Created [https://www.googleapis.com/compute/v1/projects/rhel-ha-testing-on-gcp/zones/us-east1-b/instances/rhel-76-server-base-instance].
    NAME                          ZONE        MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP     STATUS
    rhel-76-server-base-instance  us-east1-b  n1-standard-2               10.10.10.3   192.227.54.211  RUNNING
Connect to the instance with an SSH terminal session.
    $ ssh root@PublicIPaddress
Update the RHEL software.
- Register with Red Hat Subscription Manager (RHSM).
- Enable a Subscription Pool ID.
- Disable all repositories.
    # subscription-manager repos --disable=*
- Enable the following repository.
    # subscription-manager repos --enable=rhel-8-server-rpms
- Run the yum update command.
    # yum update -y
Install the GCP Linux Guest Environment on the running instance (in-place installation).
See Install the guest environment in-place for instructions.
- Select the CentOS/RHEL option.
- Copy the command script and paste it at the command prompt to run the script immediately.
Make the following configuration changes to the instance. These changes are based on GCP recommendations for custom images. See gcloud compute images list for more information.
- Edit the /etc/chrony.conf file and remove all NTP servers. Add the following line for the Google NTP server.
    server metadata.google.internal iburst
- Remove any persistent network device rules.
    # rm -f /etc/udev/rules.d/70-persistent-net.rules
    # rm -f /etc/udev/rules.d/75-persistent-net-generator.rules
- Set the network service to start automatically.
    # chkconfig network on
- Set the sshd service to start automatically.
    # systemctl enable sshd
    # systemctl is-enabled sshd
- Set the time zone to UTC.
    # ln -sf /usr/share/zoneinfo/UTC /etc/localtime
- Optional: Edit the /etc/ssh/ssh_config file and add the following lines to the end of the file. This keeps your SSH session active during longer periods of inactivity.
    # Server times out connections after several minutes of inactivity.
    # Keep alive ssh connections by sending a packet every 7 minutes.
    ServerAliveInterval 420
- Edit the /etc/ssh/sshd_config file and make the following changes, if necessary. The ClientAliveInterval 420 setting is optional; this keeps your SSH session active during longer periods of inactivity.
    PermitRootLogin no
    PasswordAuthentication no
    AllowTcpForwarding yes
    X11Forwarding no
    PermitTunnel no
    # Compute times out connections after 10 minutes of inactivity.
    # Keep ssh connections alive by sending a packet every 7 minutes.
    ClientAliveInterval 420
- Disable password access. Edit the cloud-init configuration and change ssh_pwauth from 1 to 0.
    ssh_pwauth: 0
  Important: Previously, you enabled password access to allow SSH session access to configure the instance. You must disable password access. All SSH session access must be passwordless.
Unregister the instance from the subscription manager.
    # subscription-manager unregister
Clean the shell history. Keep the instance running for the next procedure.
    # export HISTSIZE=0
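Before you take a snapshot of the instance, you might want to confirm that the sshd_config changes from this procedure are in place. The following is a minimal sketch of such a check. The check_sshd_settings function is a hypothetical helper, and it assumes that each setting appears on its own line, spelled exactly as in this procedure, with no leading whitespace.

```shell
#!/bin/sh
# Hypothetical check: report which of the settings from this procedure are
# missing from an sshd_config-style file. Pass the file path as the argument.
check_sshd_settings() {
    file=$1
    status=0
    for setting in "PermitRootLogin no" "PasswordAuthentication no" \
                   "AllowTcpForwarding yes" "X11Forwarding no" "PermitTunnel no"
    do
        # -x matches the whole line; -F treats the setting as a fixed string.
        if ! grep -qxF "$setting" "$file"; then
            echo "missing: $setting"
            status=1
        fi
    done
    return "$status"
}

# Usage on the instance (as root):
#   check_sshd_settings /etc/ssh/sshd_config
```

The function returns 0 only if every setting is present. Because the match is exact, a setting written with different capitalization or indentation is reported as missing even though sshd would accept it.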
4.9. Creating a snapshot image
To preserve the configuration and disk data of a GCP HA instance, create a snapshot of it.
Procedure
On the running instance, synchronize data to disk.
    # sync
On your host system, create the snapshot.
    $ gcloud compute disks snapshot InstanceName --snapshot-names SnapshotName
On your host system, create the configured image from the snapshot.
    $ gcloud compute images create ConfiguredImageFromSnapshot --source-snapshot SnapshotName
4.10. Creating an HA node template instance and HA nodes
After you have configured an image from the snapshot, you can create a node template. Then, you can use this template to create all HA nodes.
Procedure
Create an instance template.
    $ gcloud compute instance-templates create InstanceTemplateName --can-ip-forward --machine-type n1-standard-2 --image ConfiguredImageFromSnapshot --service-account ServiceAccountEmailAddress
Example:
    [admin@localhost ~] $ gcloud compute instance-templates create rhel-81-instance-template --can-ip-forward --machine-type n1-standard-2 --image rhel-81-gcp-image --service-account account@project-name-on-gcp.iam.gserviceaccount.com
    Created [https://www.googleapis.com/compute/v1/projects/project-name-on-gcp/global/instanceTemplates/rhel-81-instance-template].
    NAME                       MACHINE_TYPE   PREEMPTIBLE  CREATION_TIMESTAMP
    rhel-81-instance-template  n1-standard-2               2018-07-25T11:09:30.506-07:00
Create multiple nodes in one zone.
    # gcloud compute instances create NodeName01 NodeName02 --source-instance-template InstanceTemplateName --zone RegionZone --network=NetworkName --subnet=SubnetName
Example:
    [admin@localhost ~] $ gcloud compute instances create rhel81-node-01 rhel81-node-02 rhel81-node-03 --source-instance-template rhel-81-instance-template --zone us-west1-b --network=projectVPC --subnet=range0
    Created [https://www.googleapis.com/compute/v1/projects/project-name-on-gcp/zones/us-west1-b/instances/rhel81-node-01].
    Created [https://www.googleapis.com/compute/v1/projects/project-name-on-gcp/zones/us-west1-b/instances/rhel81-node-02].
    Created [https://www.googleapis.com/compute/v1/projects/project-name-on-gcp/zones/us-west1-b/instances/rhel81-node-03].
    NAME            ZONE        MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP     STATUS
    rhel81-node-01  us-west1-b  n1-standard-2               10.10.10.4   192.230.25.81   RUNNING
    rhel81-node-02  us-west1-b  n1-standard-2               10.10.10.5   192.230.81.253  RUNNING
    rhel81-node-03  us-west1-b  n1-standard-2               10.10.10.6   192.230.102.15  RUNNING
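When a cluster has more than a few nodes, typing each node name by hand invites mistakes. The following sketch generates zero-padded node names and prints the resulting instance creation command as a dry run. The node_names function is a hypothetical helper; the template, zone, network, and subnet values are taken from the example above.

```shell
#!/bin/sh
# Illustrative helper: build zero-padded node names such as rhel81-node-01.
node_names() {
    prefix=$1
    count=$2
    names=""
    i=1
    while [ "$i" -le "$count" ]; do
        names="$names $(printf '%s-%02d' "$prefix" "$i")"
        i=$((i + 1))
    done
    # Drop the leading space.
    echo "${names# }"
}

# Dry run: print the command instead of executing it.
echo "gcloud compute instances create $(node_names rhel81-node 3)" \
     "--source-instance-template rhel-81-instance-template" \
     "--zone us-west1-b --network=projectVPC --subnet=range0"
```

To create the instances, run the printed command after you confirm that the names and the zone are correct.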
4.11. Installing HA packages and agents
On each of your nodes, you need to install the High Availability packages and agents to be able to configure a Red Hat High Availability cluster on Google Cloud Platform (GCP).
Procedure
- In the Google Cloud Console, select Compute Engine and then select VM instances.
- Select the instance, click the arrow next to SSH, and select the View gcloud command option.
- Paste this command at a command prompt for passwordless access to the instance.
- Enable sudo account access and register with Red Hat Subscription Manager.
- Enable a Subscription Pool ID.
- Disable all repositories.
    # subscription-manager repos --disable=*
- Enable the following repositories.
    # subscription-manager repos --enable=rhel-8-server-rpms
    # subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
- Install pcs, pacemaker, the fence agents, and the resource agents.
    # yum install -y pcs pacemaker fence-agents-gce resource-agents-gcp
- Update all packages.
    # yum update -y
4.12. Configuring HA services
On each of your nodes, configure the HA services.
Procedure
The user hacluster was created during the pcs and pacemaker installation in the previous step. Create a password for the user hacluster on all cluster nodes. Use the same password for all nodes.
    # passwd hacluster
If the firewalld service is installed, add the HA service.
    # firewall-cmd --permanent --add-service=high-availability
    # firewall-cmd --reload
Start the pcsd service and enable it to start on boot.
    # systemctl start pcsd.service
    # systemctl enable pcsd.service
    Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
Verification
Ensure the pcsd service is running.
    # systemctl status pcsd.service
    pcsd.service - PCS GUI and remote configuration interface
       Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
       Active: active (running) since Mon 2018-06-25 19:21:42 UTC; 15s ago
         Docs: man:pcsd(8)
               man:pcs(8)
     Main PID: 5901 (pcsd)
       CGroup: /system.slice/pcsd.service
               └─5901 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &
Edit the /etc/hosts file. Add RHEL host names and internal IP addresses for all nodes.
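Each node needs the same /etc/hosts entries, so it can be convenient to generate them once. The following sketch converts lines of the form "hostname internal-ip" into /etc/hosts lines. The hosts_entries function is a hypothetical helper, and the host names and addresses are the sample values from the node creation example earlier in this chapter.

```shell
#!/bin/sh
# Illustrative helper: turn "hostname internal-ip" pairs on standard input
# into /etc/hosts lines ("ip hostname"). Blank lines are skipped.
hosts_entries() {
    while read -r name ip; do
        [ -n "$name" ] || continue
        printf '%s %s\n' "$ip" "$name"
    done
}

hosts_entries <<'EOF'
rhel81-node-01 10.10.10.4
rhel81-node-02 10.10.10.5
rhel81-node-03 10.10.10.6
EOF
```

The sketch only prints the entries. Review them and append them to /etc/hosts on each node yourself.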
4.13. Creating a cluster
To convert multiple nodes into a cluster, use the following steps.
Procedure
On one of the nodes, authenticate the pcs user. Specify the name of each node in the cluster in the command.
    # pcs host auth hostname1 hostname2 hostname3
    Username: hacluster
    Password:
    hostname1: Authorized
    hostname2: Authorized
    hostname3: Authorized
Create the cluster.
    # pcs cluster setup cluster-name hostname1 hostname2 hostname3
Verification
Run the following command to enable nodes to join the cluster automatically when started.
    # pcs cluster enable --all
Start the cluster.
    # pcs cluster start --all
4.14. Creating a fencing device
High Availability (HA) environments require a fencing device, which ensures that malfunctioning nodes are isolated and the cluster remains available if an outage occurs.
Note that for most default configurations, the GCP instance names and the RHEL host names are identical.
Procedure
Obtain GCP instance names. Note that the output of the following command also shows the internal ID for the instance.
    # fence_gce --zone us-west1-b --project=rhel-ha-on-gcp -o list
Example:
    [root@rhel81-node-01 ~]# fence_gce --zone us-west1-b --project=rhel-ha-testing-on-gcp -o list
    4435801234567893181,InstanceName-3
    4081901234567896811,InstanceName-1
    7173601234567893341,InstanceName-2
Create a fence device.
    # pcs stonith create FenceDeviceName fence_gce zone=Region-Zone project=MyProject
- To ensure immediate and complete fencing, disable ACPI Soft-Off on all cluster nodes. For information about disabling ACPI Soft-Off, see Disabling ACPI for use with integrated fence device.
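Because fencing relies on GCP instance names, it can be useful to extract just the names from the list output and compare them with your RHEL host names. The following sketch assumes the "id,name" line format shown in the example above. The instance_names function is a hypothetical helper, not part of fence_gce.

```shell
#!/bin/sh
# Illustrative helper: extract and sort instance names from "id,name" lines
# such as those shown in the fence_gce list example.
instance_names() {
    cut -d, -f2 | sort
}

instance_names <<'EOF'
4435801234567893181,InstanceName-3
4081901234567896811,InstanceName-1
7173601234567893341,InstanceName-2
EOF
```

On a cluster node, you could pipe the output of the list command into instance_names instead of using the sample lines.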
Verification
Verify that the fence devices started.
    # pcs status
Example:
    [root@rhel81-node-01 ~]# pcs status
    Cluster name: gcp-cluster
    Stack: corosync
    Current DC: rhel81-node-02 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
    Last updated: Fri Jul 27 12:53:25 2018
    Last change: Fri Jul 27 12:51:43 2018 by root via cibadmin on rhel81-node-01
    3 nodes configured
    3 resources configured
    Online: [ rhel81-node-01 rhel81-node-02 rhel81-node-03 ]
    Full list of resources:
    us-west1-b-fence (stonith:fence_gce): Started rhel81-node-01
    Daemon Status:
    corosync: active/enabled
    pacemaker: active/enabled
    pcsd: active/enabled
4.15. Configuring the gcp-vpc-move-vip resource agent
The gcp-vpc-move-vip resource agent attaches a secondary IP address (alias IP) to a running instance. This is a floating IP address that can be passed between different nodes in the cluster.
To show more information about this resource:
# pcs resource describe gcp-vpc-move-vip
You can configure the resource agent to use a primary subnet address range or a secondary subnet address range:
Primary subnet address range
Complete the following steps to configure the resource for the primary VPC subnet.
Procedure
Create the aliasip resource. Include an unused internal IP address. Include the CIDR block in the command.
    # pcs resource create aliasip gcp-vpc-move-vip alias_ip=UnusedIPaddress/CIDRblock
Example:
    [root@rhel81-node-01 ~]# pcs resource create aliasip gcp-vpc-move-vip alias_ip=10.10.10.200/32
Create an IPaddr2 resource for managing the IP on the node.
    # pcs resource create vip IPaddr2 nic=interface ip=AliasIPaddress cidr_netmask=32
Example:
    [root@rhel81-node-01 ~]# pcs resource create vip IPaddr2 nic=eth0 ip=10.10.10.200 cidr_netmask=32
Group the network resources under vipgrp.
    # pcs resource group add vipgrp aliasip vip
Verification
Verify that the resources have started and are grouped under vipgrp.
    # pcs status
Verify that the resource can move to a different node.
    # pcs resource move vip Node
Example:
    [root@rhel81-node-01 ~]# pcs resource move vip rhel81-node-03
Verify that the vip successfully started on a different node.
    # pcs status
Secondary subnet address range
Complete the following steps to configure the resource for a secondary subnet address range.
Prerequisites
- You have created a custom network and a subnet.
- Optional: You have installed Google Cloud SDK. For instructions, see Installing the Google Cloud SDK.
  Note, however, that you can use the gcloud commands in the following procedure in the terminal that you can activate in the Google Cloud web console.
Procedure
Create a secondary subnet address range.
    # gcloud compute networks subnets update SubnetName --region RegionName --add-secondary-ranges SecondarySubnetName=SecondarySubnetRange
Example:
    # gcloud compute networks subnets update range0 --region us-west1 --add-secondary-ranges range1=10.10.20.0/24
Create the aliasip resource. Include an unused internal IP address from the secondary subnet address range. Include the CIDR block in the command.
    # pcs resource create aliasip gcp-vpc-move-vip alias_ip=UnusedIPaddress/CIDRblock
Example:
    [root@rhel81-node-01 ~]# pcs resource create aliasip gcp-vpc-move-vip alias_ip=10.10.20.200/32
Create an IPaddr2 resource for managing the IP on the node.
    # pcs resource create vip IPaddr2 nic=interface ip=AliasIPaddress cidr_netmask=32
Example:
    [root@rhel81-node-01 ~]# pcs resource create vip IPaddr2 nic=eth0 ip=10.10.20.200 cidr_netmask=32
Group the network resources under vipgrp.
    # pcs resource group add vipgrp aliasip vip
Verification
Verify that the resources have started and are grouped under vipgrp.
    # pcs status
Verify that the resource can move to a different node.
    # pcs resource move vip Node
Example:
    [root@rhel81-node-01 ~]# pcs resource move vip rhel81-node-03
Verify that the vip successfully started on a different node.
    # pcs status
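The alias IP address must belong to the subnet range that you configure it for. The following sketch checks whether an IPv4 address falls inside a CIDR range, using the sample secondary range from this section. The ip_to_int and ip_in_cidr functions are hypothetical helpers. They assume well-formed dotted-decimal input and a shell with 64-bit arithmetic, and they do not check whether the address is already in use.

```shell
#!/bin/sh
# Illustrative helpers: check that an IPv4 address falls inside a CIDR range,
# for example before using it as alias_ip.

# Convert a dotted-decimal IPv4 address to an integer.
ip_to_int() (
    IFS=.
    # Split the address into its four octets.
    set -- $1
    echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
)

# Return 0 if address $1 is inside range $2 (network/prefix).
ip_in_cidr() (
    ip=$1
    network=${2%/*}
    prefix=${2#*/}
    # Build the netmask from the prefix length (0 to 32).
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$network") & mask )) ]
)

if ip_in_cidr 10.10.20.200 10.10.20.0/24; then
    echo "10.10.20.200 is inside 10.10.20.0/24"
fi
```

Both functions run in a subshell, so the IFS change and the helper variables do not leak into the calling shell.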