Chapter 41. Configuring an active/passive NFS server in a Red Hat High Availability cluster
The Red Hat High Availability Add-On provides support for running a highly available active/passive NFS server on a Red Hat Enterprise Linux High Availability Add-On cluster using shared storage. In the following example, you are configuring a two-node cluster in which clients access the NFS file system through a floating IP address. The NFS server runs on one of the two nodes in the cluster. If the node on which the NFS server is running becomes inoperative, the NFS server starts up again on the second node of the cluster with minimal service interruption.
This use case requires that your system include the following components:
- A two-node Red Hat High Availability cluster with power fencing configured for each node. We recommend but do not require a private network. This procedure uses the cluster example provided in Creating a Red Hat High-Availability cluster with Pacemaker.
- A public virtual IP address, required for the NFS server.
- Shared storage for the nodes in the cluster, using iSCSI, Fibre Channel, or other shared network block device.
Configuring a highly available active/passive NFS server on an existing two-node Red Hat Enterprise Linux High Availability cluster requires that you perform the following steps:
- Configure a file system on an LVM logical volume on the shared storage for the nodes in the cluster.
- Configure an NFS share on the shared storage on the LVM logical volume.
- Create the cluster resources.
- Test the NFS server you have configured.
41.1. Configuring an LVM volume with an XFS file system in a Pacemaker cluster
Create an LVM logical volume on storage that is shared between the nodes of the cluster with the following procedure.
LVM volumes and the corresponding partitions and devices used by cluster nodes must be connected to the cluster nodes only.
The following procedure creates an LVM logical volume and then creates an XFS file system on that volume for use in a Pacemaker cluster. In this example, the shared partition /dev/sdb1 is used to store the LVM physical volume from which the LVM logical volume will be created.
Procedure
On both nodes of the cluster, perform the following steps to set the value for the LVM system ID to the value of the uname identifier for the system. The LVM system ID will be used to ensure that only the cluster is capable of activating the volume group.
Set the system_id_source configuration option in the /etc/lvm/lvm.conf configuration file to uname.
# Configuration option global/system_id_source.
system_id_source = "uname"
Verify that the LVM system ID on the node matches the uname for the node.
# lvm systemid
  system ID: z1.example.com
# uname -n
  z1.example.com
Create the LVM volume and create an XFS file system on that volume. Since the /dev/sdb1 partition is storage that is shared, you perform this part of the procedure on one node only.
Note: If your LVM volume group contains one or more physical volumes that reside on remote block storage, such as an iSCSI target, Red Hat recommends that you ensure that the service starts before Pacemaker starts. For information about configuring startup order for a remote physical volume used by a Pacemaker cluster, see Configuring startup order for resource dependencies not managed by Pacemaker.
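As a minimal sketch of the approach described in the referenced procedure, assuming that procedure's resource-agents-deps.target mechanism and an iSCSI initiator service named iscsi.service on your nodes, you could add a systemd drop-in on each node so that the iSCSI service is started before cluster-managed resources:
# mkdir -p /etc/systemd/system/resource-agents-deps.target.d
# cat > /etc/systemd/system/resource-agents-deps.target.d/iscsi.conf <<'EOF'
[Unit]
Requires=iscsi.service
After=iscsi.service
EOF
# systemctl daemon-reload
The drop-in file name and the iscsi.service unit name are assumptions for illustration; follow the referenced procedure for the exact configuration that applies to your storage.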
Create an LVM physical volume on partition /dev/sdb1.
[root@z1 ~]# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created
Create the volume group my_vg that consists of the physical volume /dev/sdb1.
For RHEL 8.5 and later, specify the --setautoactivation n flag to ensure that volume groups managed by Pacemaker in a cluster will not be automatically activated on startup. If you are using an existing volume group for the LVM volume you are creating, you can reset this flag with the vgchange --setautoactivation n command for the volume group.
[root@z1 ~]# vgcreate --setautoactivation n my_vg /dev/sdb1
  Volume group "my_vg" successfully created
For RHEL 8.4 and earlier, create the volume group with the following command.
[root@z1 ~]# vgcreate my_vg /dev/sdb1
  Volume group "my_vg" successfully created
For information about ensuring that volume groups managed by Pacemaker in a cluster will not be automatically activated on startup for RHEL 8.4 and earlier, see Ensuring a volume group is not activated on multiple cluster nodes.
Verify that the new volume group has the system ID of the node on which you created it.
[root@z1 ~]# vgs -o+systemid
  VG     #PV #LV #SN Attr   VSize  VFree  System ID
  my_vg    1   0   0 wz--n- <1.82t <1.82t z1.example.com
Create a logical volume using the volume group my_vg.
[root@z1 ~]# lvcreate -L450 -n my_lv my_vg
  Rounding up size to full physical extent 452.00 MiB
  Logical volume "my_lv" created
You can use the lvs command to display the logical volume.
[root@z1 ~]# lvs
  LV      VG      Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  my_lv   my_vg   -wi-a---- 452.00m
...
Create an XFS file system on the logical volume my_lv.
[root@z1 ~]# mkfs.xfs /dev/my_vg/my_lv
meta-data=/dev/my_vg/my_lv       isize=512    agcount=4, agsize=28928 blks
         =                       sectsz=512   attr=2, projid32bit=1
...
(RHEL 8.5 and later) If you have enabled the use of a devices file by setting use_devicesfile = 1 in the lvm.conf file, add the shared device to the devices file on the second node in the cluster. By default, the use of a devices file is not enabled.
[root@z2 ~]# lvmdevices --adddev /dev/sdb1
41.2. Ensuring a volume group is not activated on multiple cluster nodes (RHEL 8.4 and earlier)
You can ensure that volume groups that are managed by Pacemaker in a cluster will not be automatically activated on startup with the following procedure. If a volume group is automatically activated on startup rather than by Pacemaker, there is a risk that the volume group will be active on multiple nodes at the same time, which could corrupt the volume group’s metadata.
For RHEL 8.5 and later, you can disable autoactivation for a volume group when you create the volume group by specifying the --setautoactivation n flag for the vgcreate command, as described in Configuring an LVM volume with an XFS file system in a Pacemaker cluster.
This procedure modifies the auto_activation_volume_list entry in the /etc/lvm/lvm.conf configuration file. The auto_activation_volume_list entry is used to limit autoactivation to specific logical volumes. Setting auto_activation_volume_list to an empty list disables autoactivation entirely.
Any local volumes that are not shared and are not managed by Pacemaker should be included in the auto_activation_volume_list entry, including volume groups related to the node’s local root and home directories. All volume groups managed by the cluster manager must be excluded from the auto_activation_volume_list entry.
Procedure
Perform the following procedure on each node in the cluster.
Determine which volume groups are currently configured on your local storage with the following command, which outputs a list of the configured volume groups. If you have space allocated in separate volume groups for root and for your home directory on this node, you will see those volume groups in the output, as in this example.
# vgs --noheadings -o vg_name
  my_vg
  rhel_home
  rhel_root
Add the volume groups other than my_vg (the volume group you have just defined for the cluster) as entries to auto_activation_volume_list in the /etc/lvm/lvm.conf configuration file.
For example, if you have space allocated in separate volume groups for root and for your home directory, you would uncomment the auto_activation_volume_list line of the lvm.conf file and add these volume groups as entries to auto_activation_volume_list as follows. Note that the volume group you have just defined for the cluster (my_vg in this example) is not in this list.
auto_activation_volume_list = [ "rhel_root", "rhel_home" ]
Note: If no local volume groups are present on a node to be activated outside of the cluster manager, you must still initialize the auto_activation_volume_list entry as auto_activation_volume_list = [].
Rebuild the initramfs boot image to guarantee that the boot image will not try to activate a volume group controlled by the cluster. Update the initramfs image with the following command. This command may take up to a minute to complete.
# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
Reboot the node.
Note: If you have installed a new Linux kernel since booting the node on which you created the boot image, the new initrd image will be for the kernel that was running when you created it and not for the new kernel that is running when you reboot the node. You can ensure that the correct initrd image is in use by running the uname -r command before and after the reboot to determine the kernel release that is running. If the releases are not the same, update the initrd file after rebooting with the new kernel and then reboot the node.
When the node has rebooted, check whether the cluster services have started up again on that node by executing the pcs cluster status command on that node. If this yields the message Error: cluster is not currently running on this node, then enter the following command.
# pcs cluster start
Alternatively, you can wait until you have rebooted each node in the cluster and then start cluster services on all of the nodes in the cluster with the following command.
# pcs cluster start --all
41.4. Configuring the resources and resource group for an NFS server in a cluster
Configure the cluster resources for an NFS server in a cluster with the following procedure.
If you have not configured a fencing device for your cluster, by default the resources do not start.
If you find that the resources you configured are not running, you can run the pcs resource debug-start resource command to test the resource configuration. This starts the service outside of the cluster’s control and knowledge. Once the configured resources are running again, run pcs resource cleanup resource to make the cluster aware of the updates.
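For example, with the resource names used later in this procedure, a quick test of the LVM resource might look like the following; substitute the name of whichever resource is failing to start.
[root@z1 ~]# pcs resource debug-start my_lvm
[root@z1 ~]# pcs resource cleanup my_lvm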
Procedure
The following procedure configures the system resources. To ensure that these resources all run on the same node, they are configured as part of the resource group nfsgroup. The resources start in the order in which you add them to the group and stop in the reverse order. Run this procedure from one node of the cluster only.
Create the LVM-activate resource named my_lvm. Because the resource group nfsgroup does not yet exist, this command creates the resource group.
Warning: Do not configure more than one LVM-activate resource that uses the same LVM volume group in an active/passive HA configuration, as this risks data corruption. Additionally, do not configure an LVM-activate resource as a clone resource in an active/passive HA configuration.
[root@z1 ~]# pcs resource create my_lvm ocf:heartbeat:LVM-activate vgname=my_vg vg_access_mode=system_id --group nfsgroup
Check the status of the cluster to verify that the resource is running.
[root@z1 ~]# pcs status
Cluster name: my_cluster
Last updated: Thu Jan  8 11:13:17 2015
Last change: Thu Jan  8 11:13:08 2015
Stack: corosync
Current DC: z2.example.com (2) - partition with quorum
Version: 1.1.12-a14efad
2 Nodes configured
3 Resources configured

Online: [ z1.example.com z2.example.com ]

Full list of resources:
 myapc  (stonith:fence_apc_snmp):      Started z1.example.com
 Resource Group: nfsgroup
     my_lvm     (ocf::heartbeat:LVM-activate):  Started z1.example.com

PCSD Status:
  z1.example.com: Online
  z2.example.com: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
Configure a Filesystem resource for the cluster.
The following command configures an XFS Filesystem resource named nfsshare as part of the nfsgroup resource group. This file system uses the LVM volume group and XFS file system you created in Configuring an LVM volume with an XFS file system and will be mounted on the /nfsshare directory you created in Configuring an NFS share.
[root@z1 ~]# pcs resource create nfsshare Filesystem device=/dev/my_vg/my_lv directory=/nfsshare fstype=xfs --group nfsgroup
You can specify mount options as part of the resource configuration for a Filesystem resource with the options=options parameter. Run the pcs resource describe Filesystem command for full configuration options. An illustration of the options parameter follows the verification step below.
Verify that the my_lvm and nfsshare resources are running.
[root@z1 ~]# pcs status
...
Full list of resources:
 myapc  (stonith:fence_apc_snmp):      Started z1.example.com
 Resource Group: nfsgroup
     my_lvm     (ocf::heartbeat:LVM-activate):  Started z1.example.com
     nfsshare   (ocf::heartbeat:Filesystem):    Started z1.example.com
...
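As an illustration of the options parameter described above, a command along the following lines would add a mount option to the existing nfsshare resource. The noatime value is an assumption chosen only for illustration; run pcs resource describe Filesystem to confirm which options apply to your deployment.
[root@z1 ~]# pcs resource update nfsshare options=noatime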
Create the nfsserver resource named nfs-daemon as part of the resource group nfsgroup.
Note: The nfsserver resource allows you to specify an nfs_shared_infodir parameter, which is a directory that NFS servers use to store NFS-related stateful information.
It is recommended that this attribute be set to a subdirectory of one of the Filesystem resources you created in this collection of exports. This ensures that the NFS servers are storing their stateful information on a device that will become available to another node if this resource group needs to relocate. In this example:
- /nfsshare is the shared-storage directory managed by the Filesystem resource
- /nfsshare/exports/export1 and /nfsshare/exports/export2 are the export directories
- /nfsshare/nfsinfo is the shared-information directory for the nfsserver resource
[root@z1 ~]# pcs resource create nfs-daemon nfsserver nfs_shared_infodir=/nfsshare/nfsinfo nfs_no_notify=true --group nfsgroup
[root@z1 ~]# pcs status
...
Add the exportfs resources to export the /nfsshare/exports directory. These resources are part of the resource group nfsgroup. This builds a virtual directory for NFSv4 clients. NFSv3 clients can access these exports as well.
Note: The fsid=0 option is required only if you want to create a virtual directory for NFSv4 clients. For more information, see the Red Hat Knowledgebase solution How do I configure the fsid option in an NFS server’s /etc/exports file?
[root@z1 ~]# pcs resource create nfs-root exportfs clientspec=192.168.122.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports fsid=0 --group nfsgroup
[root@z1 ~]# pcs resource create nfs-export1 exportfs clientspec=192.168.122.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports/export1 fsid=1 --group nfsgroup
[root@z1 ~]# pcs resource create nfs-export2 exportfs clientspec=192.168.122.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports/export2 fsid=2 --group nfsgroup
Add the floating IP address resource that NFS clients will use to access the NFS share. This resource is part of the resource group nfsgroup. For this example deployment, we are using 192.168.122.200 as the floating IP address.
[root@z1 ~]# pcs resource create nfs_ip IPaddr2 ip=192.168.122.200 cidr_netmask=24 --group nfsgroup
Add an nfsnotify resource for sending NFSv3 reboot notifications once the entire NFS deployment has initialized. This resource is part of the resource group nfsgroup.
Note: For the NFS notification to be processed correctly, the floating IP address must have a host name associated with it that is consistent on both the NFS servers and the NFS client.
[root@z1 ~]# pcs resource create nfs-notify nfsnotify source_host=192.168.122.200 --group nfsgroup
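One way to satisfy the host name requirement in the preceding note is to publish the same name for the floating IP address on both cluster nodes and on the NFS client, either in DNS or in /etc/hosts. The host name shown below is an assumption used only for illustration.
192.168.122.200    nfsserver.example.com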
After creating the resources, you can check the status of the cluster. Note that all resources are running on the same node.
[root@z1 ~]# pcs status
...
Full list of resources:
 myapc  (stonith:fence_apc_snmp):      Started z1.example.com
 Resource Group: nfsgroup
     my_lvm     (ocf::heartbeat:LVM-activate):  Started z1.example.com
     nfsshare   (ocf::heartbeat:Filesystem):    Started z1.example.com
     nfs-daemon (ocf::heartbeat:nfsserver):     Started z1.example.com
     nfs-root   (ocf::heartbeat:exportfs):      Started z1.example.com
     nfs-export1        (ocf::heartbeat:exportfs):      Started z1.example.com
     nfs-export2        (ocf::heartbeat:exportfs):      Started z1.example.com
     nfs_ip     (ocf::heartbeat:IPaddr2):       Started z1.example.com
     nfs-notify (ocf::heartbeat:nfsnotify):     Started z1.example.com
...
41.5. Testing the NFS resource configuration
You can validate your NFS resource configuration in a high availability cluster with the following procedures. You should be able to mount the exported file system with either NFSv3 or NFSv4.
41.5.1. Testing the NFS export
If you are running the firewalld daemon on your cluster nodes, ensure that the ports that your system requires for NFS access are enabled on all nodes. An example of the firewalld commands follows the showmount output below.
On a node outside of the cluster, residing in the same network as the deployment, verify that the NFS share can be seen by mounting the NFS share. For this example, we are using the 192.168.122.0/24 network.
# showmount -e 192.168.122.200
Export list for 192.168.122.200:
/nfsshare/exports/export1 192.168.122.0/255.255.255.0
/nfsshare/exports         192.168.122.0/255.255.255.0
/nfsshare/exports/export2 192.168.122.0/255.255.255.0
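As a minimal sketch of the firewall requirement in the first step, the following commands enable the standard firewalld service definitions commonly needed for NFS on each cluster node. Confirm that these services cover the ports your deployment actually uses before relying on them.
# firewall-cmd --permanent --add-service=nfs
# firewall-cmd --permanent --add-service=rpc-bind
# firewall-cmd --permanent --add-service=mountd
# firewall-cmd --reload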
To verify that you can mount the NFS share with NFSv4, mount the NFS share to a directory on the client node. After mounting, verify that the contents of the export directories are visible. Unmount the share after testing.
# mkdir nfsshare
# mount -o "vers=4" 192.168.122.200:export1 nfsshare
# ls nfsshare
clientdatafile1
# umount nfsshare
Verify that you can mount the NFS share with NFSv3. After mounting, verify that the test file clientdatafile2 is visible. Unlike NFSv4, NFSv3 does not use the virtual file system, so you must mount a specific export. Unmount the share after testing.
# mkdir nfsshare
# mount -o "vers=3" 192.168.122.200:/nfsshare/exports/export2 nfsshare
# ls nfsshare
clientdatafile2
# umount nfsshare
41.5.2. Testing for failover
On a node outside of the cluster, mount the NFS share and verify access to the clientdatafile1 file you created in Configuring an NFS share.
# mkdir nfsshare
# mount -o "vers=4" 192.168.122.200:export1 nfsshare
# ls nfsshare
clientdatafile1
From a node within the cluster, determine which node in the cluster is running nfsgroup. In this example, nfsgroup is running on z1.example.com.
[root@z1 ~]# pcs status
...
Full list of resources:
 myapc  (stonith:fence_apc_snmp):      Started z1.example.com
 Resource Group: nfsgroup
     my_lvm     (ocf::heartbeat:LVM-activate):  Started z1.example.com
     nfsshare   (ocf::heartbeat:Filesystem):    Started z1.example.com
     nfs-daemon (ocf::heartbeat:nfsserver):     Started z1.example.com
     nfs-root   (ocf::heartbeat:exportfs):      Started z1.example.com
     nfs-export1        (ocf::heartbeat:exportfs):      Started z1.example.com
     nfs-export2        (ocf::heartbeat:exportfs):      Started z1.example.com
     nfs_ip     (ocf::heartbeat:IPaddr2):       Started z1.example.com
     nfs-notify (ocf::heartbeat:nfsnotify):     Started z1.example.com
...
From a node within the cluster, put the node that is running nfsgroup in standby mode.
[root@z1 ~]# pcs node standby z1.example.com
Verify that nfsgroup successfully starts on the other cluster node.
[root@z1 ~]# pcs status
...
Full list of resources:
 Resource Group: nfsgroup
     my_lvm     (ocf::heartbeat:LVM-activate):  Started z2.example.com
     nfsshare   (ocf::heartbeat:Filesystem):    Started z2.example.com
     nfs-daemon (ocf::heartbeat:nfsserver):     Started z2.example.com
     nfs-root   (ocf::heartbeat:exportfs):      Started z2.example.com
     nfs-export1        (ocf::heartbeat:exportfs):      Started z2.example.com
     nfs-export2        (ocf::heartbeat:exportfs):      Started z2.example.com
     nfs_ip     (ocf::heartbeat:IPaddr2):       Started z2.example.com
     nfs-notify (ocf::heartbeat:nfsnotify):     Started z2.example.com
...
From the node outside the cluster on which you have mounted the NFS share, verify that this outside node continues to have access to the test file within the NFS mount.
# ls nfsshare
clientdatafile1
Service will be lost briefly for the client during the failover, but the client should recover it with no user intervention. By default, clients using NFSv4 may take up to 90 seconds to recover the mount; this 90 seconds represents the NFSv4 file lease grace period observed by the server on startup. NFSv3 clients should recover access to the mount within a few seconds.
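If you want to confirm the lease and grace timers behind the 90-second figure, you can inspect the NFS server's proc interface on the node currently running the nfsserver resource, assuming the nfsd module is loaded on that node; the file names shown are the standard Linux nfsd proc entries.
[root@z2 ~]# cat /proc/fs/nfsd/nfsv4leasetime
[root@z2 ~]# cat /proc/fs/nfsd/nfsv4gracetime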
From a node within the cluster, remove the node that was initially running nfsgroup from standby mode.
Note: Removing a node from standby mode does not in itself cause the resources to fail back over to that node. This will depend on the resource-stickiness value for the resources. For information about the resource-stickiness meta attribute, see Configuring a resource to prefer its current node.
[root@z1 ~]# pcs node unstandby z1.example.com
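As a minimal sketch of the resource-stickiness behavior described in the preceding note, a command like the following sets a cluster-wide default stickiness so that resources prefer to stay on their current node rather than failing back automatically. The value 100 is an assumption chosen only for illustration, and older pcs versions use the form pcs resource defaults resource-stickiness=100 instead.
[root@z1 ~]# pcs resource defaults update resource-stickiness=100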