Chapter 14. Fencing the Controller Nodes
Fencing is the process of isolating a failed node to protect a cluster and its resources. Without fencing, a failed node can result in data corruption in a cluster.
The director uses Pacemaker to provide a highly available cluster of Controller nodes. Pacemaker uses a process called STONITH to fence failed nodes. STONITH is disabled by default and requires manual configuration so that Pacemaker can control the power management of each node in the cluster.
14.1. Review the Prerequisites
To configure fencing in the overcloud, your overcloud must already have been deployed and be in a working state. The following steps review the state of Pacemaker and STONITH in your deployment:
Log in to each node as the heat-admin user from the stack user on the director. The overcloud creation automatically copies the stack user's SSH key to each node's heat-admin user.
Verify you have a running cluster:
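A minimal check, run as the heat-admin user on a Controller node (the pcs command-line tools are installed there as part of the Pacemaker cluster):

$ sudo pcs status

All Controller nodes should be listed as Online and the cluster resources should be started.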
Verify STONITH is disabled:
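One way to inspect the cluster-wide STONITH setting is through the Pacemaker cluster properties; a minimal sketch:

$ sudo pcs property show

The stonith-enabled property should report false at this point; the procedure in the next section enables it.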
14.2. Enable Fencing
Having confirmed your overcloud is deployed and working, you can then configure fencing:
Generate the fencing.yaml file:

$ openstack overcloud generate fencing --ipmi-lanplus --ipmi-level administrator --output fencing.yaml instackenv.json

Sample fencing.yaml file:
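The sketch below shows only the general structure of the generated file, with placeholder values for the device parameters (agent, MAC address, IPMI address, and credentials); the exact content depends on your nodes and your release:

parameter_defaults:
  EnableFencing: true
  FencingConfig:
    devices:
    - agent: fence_ipmilan
      host_mac: "11:11:11:11:11:11"    # placeholder: MAC of the node to fence
      params:
        ipaddr: 10.0.0.101             # placeholder: IPMI address of the node
        lanplus: true
        login: admin                   # placeholder: IPMI credentials
        passwd: "p@55w0rd!"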
Pass the resulting fencing.yaml file to the deploy command you previously used to deploy the overcloud. This re-runs the deployment procedure and configures fencing on the hosts:

$ openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e ~/templates/network-environment.yaml -e ~/templates/storage-environment.yaml --control-scale 3 --compute-scale 3 --ceph-storage-scale 3 --control-flavor control --compute-flavor compute --ceph-storage-flavor ceph-storage --ntp-server pool.ntp.org --neutron-network-type vxlan --neutron-tunnel-types vxlan -e fencing.yaml

The deployment command should complete without any errors or exceptions.
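To confirm that the re-run finished cleanly, you can also check the stack status from the undercloud. This is an optional extra check, assuming the default overcloud stack name:

$ source stackrc
$ openstack stack list

The overcloud stack should show an UPDATE_COMPLETE (or CREATE_COMPLETE) status.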
Log in to the overcloud and verify that fencing was configured for each of the controllers:

Check that the fencing resources are managed by Pacemaker:

$ source stackrc
$ nova list | grep controller
$ ssh heat-admin@<controller-x_ip>
$ sudo pcs status | grep fence
stonith-overcloud-controller-x (stonith:fence_ipmilan): Started overcloud-controller-y

You should see that Pacemaker is configured to use a STONITH resource for each of the controllers specified in fencing.yaml. The fence resource should not be configured on the same host that it controls.

Use pcs to verify the fence resource attributes:

$ sudo pcs stonith show <stonith-resource-controller-x>

The values used by STONITH should match those defined in fencing.yaml.
14.3. Test Fencing
This procedure tests whether fencing is working as expected.
Trigger a fencing action for each controller in the deployment:

Log in to a controller:

$ source stackrc
$ nova list | grep controller
$ ssh heat-admin@<controller-x_ip>

As root, trigger fencing by using iptables to close all ports (one possible rule set is sketched below). As a result, the connections should drop, and the server should be rebooted.
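The exact iptables rule set is environment-specific; the following is a minimal sketch that isolates the node by dropping all non-loopback traffic. Because it also cuts off your own SSH session, the two rules are chained into a single command:

$ sudo -i
# Drop everything except loopback traffic so the cluster loses contact with this node
iptables -A INPUT ! -i lo -j DROP && iptables -A OUTPUT ! -o lo -j DROP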
From another controller, locate the fencing event in the Pacemaker log file:

$ ssh heat-admin@<controller-x_ip>
$ less /var/log/cluster/corosync.log
(less): /fenc*

You should see that STONITH has issued a fence action against the controller, and that Pacemaker has raised an event in the log.
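If you prefer a non-interactive search, grep over the same log file works as well (a sketch using the log path shown above):

$ sudo grep -i fenc /var/log/cluster/corosync.log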
Verify that the rebooted controller has returned to the cluster:

From the second controller, wait a few minutes and run pcs status to see if the fenced controller has returned to the cluster. The duration can vary depending on your configuration.
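A minimal example of this check; <controller-y_ip> stands for the IP address of a surviving controller:

$ ssh heat-admin@<controller-y_ip>
$ sudo pcs status

Until the fenced node rejoins, pcs status lists it as OFFLINE; once it is back, it appears in the Online list again.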