Appendix A. Automated Evacuation Through Instance HA
With Instance HA, OpenStack automates the process of evacuating instances from a Compute node when that node fails. The following steps describe the sequence of events that a Compute node failure triggers.
-
When a Compute node fails, the IPMI agent performs first-level fencing and physically resets the node to ensure that it is powered off. Evacuating instances from online Compute nodes could result in data corruption or multiple identical instances running on the overcloud. Once the node is powered off, it is considered fenced. After the physical IPMI fencing, the fence-nova agent performs second-level fencing and marks the fenced node with the “evacuate=yes” cluster per-node attribute. To do this, the agent runs:
$ attrd_updater -n evacuate -A name="evacuate" host="FAILEDHOST" value="yes"
Where FAILEDHOST is the hostname of the failed Compute node.
-
The nova-evacuate agent constantly runs in the background, periodically checking the cluster for nodes with the “evacuate=yes” attribute. Once nova-evacuate detects that the fenced node has this attribute, the agent starts evacuating the node using the same process as described in Evacuate Instances.
-
Meanwhile, as the failed node boots up from the IPMI reset, the nova-compute-checkevacuate agent waits (by default, for 120 seconds) before checking whether nova-evacuate has finished the evacuation. If it has not, the agent checks again after the same time interval.
-
Once nova-compute-checkevacuate verifies that the instances are fully evacuated, it triggers another process to make the fenced node available again for hosting instances.
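
The wait-and-recheck behavior described in the last two steps can be pictured with a small shell sketch. This is only an illustration, not the actual nova-compute-checkevacuate agent: the function name wait_for_evacuation, its arguments, and the upper bound on the number of checks are all assumptions made for this example. The only value taken from the text above is the default interval of 120 seconds.

```shell
# Hypothetical sketch of the wait-and-recheck loop; NOT the real agent code.
#
# wait_for_evacuation CHECK_CMD [INTERVAL] [MAX_CHECKS]
#   CHECK_CMD   name of a command or function that returns 0 once the
#               evacuation has finished (assumed to be supplied by the caller)
#   INTERVAL    seconds to wait before each check (default: 120)
#   MAX_CHECKS  upper bound on the number of checks, added here only so
#               that the sketch always terminates (default: 10)
wait_for_evacuation() {
    wfe_check=$1
    wfe_interval=${2:-120}
    wfe_max=${3:-10}
    wfe_attempt=0

    while [ "$wfe_attempt" -lt "$wfe_max" ]; do
        # Wait first, then check, mirroring the order described in the text.
        sleep "$wfe_interval"
        wfe_attempt=$((wfe_attempt + 1))

        if "$wfe_check"; then
            echo "evacuation finished after $wfe_attempt check(s)"
            return 0
        fi
    done

    echo "evacuation still in progress after $wfe_max check(s)" >&2
    return 1
}
```

A caller would pass its own check, for example a function that tests whether the “evacuate” attribute is still set on the node. Only a zero return status from the sketch would correspond to the point at which the fenced node can be made available again.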