Appendix B. Manual Procedure Automated by Ansible Playbooks
The Ansible-based solution provided by this document automates a manual, supported procedure for configuring Instance HA. For reference, this appendix lists the steps that the solution automates.
1. Begin by disabling libvirtd and all OpenStack services on the Compute nodes:
heat-admin@compute-n # sudo openstack-service stop
heat-admin@compute-n # sudo openstack-service disable
heat-admin@compute-n # sudo systemctl stop libvirtd
heat-admin@compute-n # sudo systemctl disable libvirtd
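To confirm that everything is stopped and disabled before continuing, a quick check (a sketch; the openstack-service helper used above is provided by the openstack-utils package):
heat-admin@compute-n # sudo openstack-service status
heat-admin@compute-n # sudo systemctl is-active libvirtd
heat-admin@compute-n # sudo systemctl is-enabled libvirtd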
2. Create an authentication key for use with pacemaker-remote.
Perform this step on one of the Compute nodes:
heat-admin@compute-1 # sudo mkdir -p /etc/pacemaker/
heat-admin@compute-1 # sudo dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
heat-admin@compute-1 # sudo cp /etc/pacemaker/authkey ./
heat-admin@compute-1 # sudo chown heat-admin:heat-admin authkey
3. Copy this key to the director node, and then to the remaining Compute and Controller nodes:
stack@director # scp authkey heat-admin@node-n:~/
heat-admin@node-n # sudo mkdir -p --mode=0750 /etc/pacemaker
heat-admin@node-n # sudo chgrp haclient /etc/pacemaker
heat-admin@node-n # sudo mv authkey /etc/pacemaker
heat-admin@node-n # sudo chown root:haclient /etc/pacemaker/authkey
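Because the same commands run on every remaining node, distribution can be scripted from the director node. A minimal sketch, assuming hypothetical node names (substitute the remaining Controller and Compute nodes in your environment):
stack@director # for node in overcloud-controller-1 overcloud-compute-1; do
    scp authkey heat-admin@$node:~/
    ssh heat-admin@$node 'sudo mkdir -p --mode=0750 /etc/pacemaker && \
        sudo chgrp haclient /etc/pacemaker && \
        sudo mv authkey /etc/pacemaker && \
        sudo chown root:haclient /etc/pacemaker/authkey'
done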
4. Enable pacemaker-remote on all Compute nodes:
heat-admin@compute-n # sudo systemctl enable pacemaker_remote
heat-admin@compute-n # sudo systemctl start pacemaker_remote
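To verify the agent on each node, systemd can report both states:
heat-admin@compute-n # sudo systemctl is-enabled pacemaker_remote
heat-admin@compute-n # sudo systemctl is-active pacemaker_remote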
5. Confirm that the required versions of the pacemaker (1.1.12-22.el7_1.4.x86_64) and resource-agents (3.9.5-40.el7_1.5.x86_64) packages are installed on the Controller and Compute nodes:
heat-admin@controller-n # sudo rpm -qa | egrep '(pacemaker|resource-agents)'
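The same check can also be run from the director node instead of logging in to each node individually. A minimal sketch, assuming hypothetical node names (substitute your own):
stack@director # for node in overcloud-controller-0 overcloud-compute-0; do
    echo "== $node =="
    ssh heat-admin@$node "sudo rpm -qa | egrep '(pacemaker|resource-agents)'"
done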
6. Apply the following constraint workarounds required for BZ#1257414.
This issue was addressed in RHSA-2015:1862, so these workarounds might not be required for your environment.
heat-admin@controller-1 # sudo pcs constraint order start openstack-nova-novncproxy-clone then openstack-nova-api-clone
heat-admin@controller-1 # sudo pcs constraint order start rabbitmq-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order promote galera-master then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order start haproxy-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order start memcached-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order promote redis-master then start openstack-ceilometer-central-clone require-all=false
heat-admin@controller-1 # sudo pcs resource defaults resource-stickiness=INFINITY
7. Create a NovaEvacuate active/passive resource, using the overcloudrc file to provide the auth_url, username, tenant, and password values:
stack@director # scp overcloudrc heat-admin@controller-1:~/
heat-admin@controller-1 # . ~/overcloudrc
heat-admin@controller-1 # sudo pcs resource create nova-evacuate ocf:openstack:NovaEvacuate auth_url=$OS_AUTH_URL username=$OS_USERNAME password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME
If you are not using shared storage, include the no_shared_storage=1 option. See Section 2.1, “Exceptions for Shared Storage” for more information.
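A quick check that the resource exists and has started on one of the controllers:
heat-admin@controller-1 # sudo pcs status | grep nova-evacuate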
8. Confirm that nova-evacuate is started after the floating IP resources and after the Image Service (glance), OpenStack Networking (neutron), and Compute (nova) services:
heat-admin@controller-1 # for i in $(sudo pcs status | grep IP | awk '{ print $1 }'); do sudo pcs constraint order start $i then nova-evacuate ; done
heat-admin@controller-1 # for i in openstack-glance-api-clone neutron-metadata-agent-clone openstack-nova-conductor-clone; do sudo pcs constraint order start $i then nova-evacuate require-all=false ; done
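To confirm that the ordering constraints were recorded:
heat-admin@controller-1 # sudo pcs constraint | grep nova-evacuate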
9. Disable all OpenStack resources across the control plane:
heat-admin@controller-1 # sudo pcs resource disable openstack-keystone --wait=540
The timeout value used here (--wait=540) is only an example. Depending on the time needed to stop the Identity Service (and on the power of your hardware), consider increasing the timeout period.
10. Create a list of the current controllers using cibadmin data:
heat-admin@controller-1 # controllers=$(sudo cibadmin -Q -o nodes | grep uname | sed s/.*uname..// | awk -F\" '{print $1}')
heat-admin@controller-1 # echo $controllers
11. Use this list to tag these nodes as controllers with the osprole=controller property:
heat-admin@controller-1 # for controller in ${controllers}; do sudo pcs property set --node ${controller} osprole=controller ; done
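The node attributes can be verified with crm_mon, which lists them in its Node Attributes section when run with -A:
heat-admin@controller-1 # sudo crm_mon -A -1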
12. Build a list of stonith devices already present in the environment:
heat-admin@controller-1 # stonithdevs=$(sudo pcs stonith | awk '{print $1}')
heat-admin@controller-1 # echo $stonithdevs
13. Tag the control plane services to make sure they only run on the controllers identified above, skipping any stonith devices listed:
heat-admin@controller-1 # for i in $(sudo cibadmin -Q --xpath //primitive --node-path | tr ' ' '\n' | awk -F "id='" '{print $2}' | awk -F "'" '{print $1}' | uniq); do
    found=0
    if [ -n "$stonithdevs" ]; then
        for x in $stonithdevs; do
            if [ $x = $i ]; then
                found=1
            fi
        done
    fi
    if [ $found = 0 ]; then
        sudo pcs constraint location $i rule resource-discovery=exclusive score=0 osprole eq controller
    fi
done
14. Begin to populate the Compute node resources within pacemaker, starting with neutron-openvswitch-agent:
heat-admin@controller-1 # sudo pcs resource create neutron-openvswitch-agent-compute systemd:neutron-openvswitch-agent op start timeout 200s stop timeout 200s --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location neutron-openvswitch-agent-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start neutron-server-clone then neutron-openvswitch-agent-compute-clone require-all=false
Then the compute libvirtd resource:
heat-admin@controller-1 # sudo pcs resource create libvirtd-compute systemd:libvirtd op start timeout 200s stop timeout 200s --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location libvirtd-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start neutron-openvswitch-agent-compute-clone then libvirtd-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add libvirtd-compute-clone with neutron-openvswitch-agent-compute-clone
Then the openstack-ceilometer-compute resource:
heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute op start timeout 200s stop timeout 200s --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location ceilometer-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start openstack-ceilometer-notification-clone then ceilometer-compute-clone require-all=false
heat-admin@controller-1 # sudo pcs constraint order start libvirtd-compute-clone then ceilometer-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add ceilometer-compute-clone with libvirtd-compute-clone
Then the nova-compute resource:
heat-admin@controller-1 # . /home/heat-admin/overcloudrc
heat-admin@controller-1 # sudo pcs resource create nova-compute-checkevacuate ocf:openstack:nova-compute-wait auth_url=$OS_AUTH_URL username=$OS_USERNAME password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME domain=localdomain op start timeout=300 --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location nova-compute-checkevacuate-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start openstack-nova-conductor-clone then nova-compute-checkevacuate-clone require-all=false
heat-admin@controller-1 # sudo pcs resource create nova-compute systemd:openstack-nova-compute --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location nova-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start nova-compute-checkevacuate-clone then nova-compute-clone require-all=true
heat-admin@controller-1 # sudo pcs constraint order start nova-compute-clone then nova-evacuate require-all=false
heat-admin@controller-1 # sudo pcs constraint order start libvirtd-compute-clone then nova-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add nova-compute-clone with libvirtd-compute-clone
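At this point all of the Compute-side resources exist but remain stopped, because they were created with --disabled; they are enabled together in step 19. To review their state (using the pcs syntax shipped with this release):
heat-admin@controller-1 # sudo pcs resource show | grep -E 'compute|evacuate'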
15. Add stonith devices for the Compute nodes. Run the following command for each Compute node:
heat-admin@controller-1 # sudo pcs stonith create ipmilan-overcloud-compute-N fence_ipmilan pcmk_host_list=overcloud-compute-N ipaddr=10.35.160.78 login=IPMILANUSER passwd=IPMILANPW lanplus=1 cipher=1 op monitor interval=60s
Where:
- N is the identifying number of each Compute node (for example, ipmilan-overcloud-compute-1, ipmilan-overcloud-compute-2, and so on).
- ipaddr is the IP address of the node's IPMI device (10.35.160.78 is only an example).
- IPMILANUSER and IPMILANPW are the username and password to the IPMI device.
A scripted variant of this command is sketched below.
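With the IPMI address of each node at hand, the per-node command can be looped. A minimal sketch, assuming three Compute nodes and hypothetical IPMI addresses kept in a Bash associative array:
heat-admin@controller-1 # declare -A ipmi=( [0]=10.35.160.78 [1]=10.35.160.79 [2]=10.35.160.80 )   # hypothetical addresses
heat-admin@controller-1 # for N in "${!ipmi[@]}"; do
    sudo pcs stonith create ipmilan-overcloud-compute-$N fence_ipmilan \
        pcmk_host_list=overcloud-compute-$N ipaddr=${ipmi[$N]} \
        login=IPMILANUSER passwd=IPMILANPW lanplus=1 cipher=1 op monitor interval=60s
done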
16. Create a separate fence-nova stonith device:
heat-admin@controller-1 # . overcloudrc
heat-admin@controller-1 # sudo pcs stonith create fence-nova fence_compute \
    auth-url=$OS_AUTH_URL \
    login=$OS_USERNAME \
    passwd=$OS_PASSWORD \
    tenant-name=$OS_TENANT_NAME \
    record-only=1 --force
17. Make certain the Compute nodes are able to recover after fencing:
heat-admin@controller-1 # sudo pcs property set cluster-recheck-interval=1min
18. Create Compute node resources and set the stonith level 1 to include both the node's physical fence device and fence-nova. Run the following commands for each Compute node:
heat-admin@controller-1 # sudo pcs resource create overcloud-compute-N ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
heat-admin@controller-1 # sudo pcs property set --node overcloud-compute-N osprole=compute
heat-admin@controller-1 # sudo pcs stonith level add 1 overcloud-compute-N ipmilan-overcloud-compute-N,fence-nova
heat-admin@controller-1 # sudo pcs stonith
Replace N with the identifying number of each Compute node (for example, overcloud-compute-1, overcloud-compute-2, and so on). Use these identifying numbers to match each Compute node with the stonith devices created earlier (for example, overcloud-compute-1 and ipmilan-overcloud-compute-1). A looped variant of these commands is sketched below.
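A minimal sketch of the loop, assuming three Compute nodes numbered 0 to 2 (adjust the list to your environment):
heat-admin@controller-1 # for N in 0 1 2; do
    sudo pcs resource create overcloud-compute-$N ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
    sudo pcs property set --node overcloud-compute-$N osprole=compute
    sudo pcs stonith level add 1 overcloud-compute-$N ipmilan-overcloud-compute-$N,fence-nova
done
heat-admin@controller-1 # sudo pcs stonith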
19. Enable the control plane and Compute services:
heat-admin@controller-1 # sudo pcs resource enable openstack-keystone
heat-admin@controller-1 # sudo pcs resource enable neutron-openvswitch-agent-compute
heat-admin@controller-1 # sudo pcs resource enable libvirtd-compute
heat-admin@controller-1 # sudo pcs resource enable ceilometer-compute
heat-admin@controller-1 # sudo pcs resource enable nova-compute-checkevacuate
heat-admin@controller-1 # sudo pcs resource enable nova-compute
20. Allow some time for the environment to settle before cleaning up any failed resources:
heat-admin@controller-1 # sleep 60
heat-admin@controller-1 # sudo pcs resource cleanup
heat-admin@controller-1 # sudo pcs status
heat-admin@controller-1 # sudo pcs property set stonith-enabled=true