Chapter 6. Replacing a primary host using new bricks
6.1. Host replacement prerequisites
- Determine which node to use as the Ansible controller node (the node from which all Ansible playbooks are executed). Red Hat recommends using a healthy node in the same cluster as the failed node as the Ansible controller node.
- Power off all virtual machines in the cluster.
- Stop brick processes and unmount file systems on the failed host, to avoid file system inconsistency issues.

  # pkill glusterfsd
  # umount /gluster_bricks/{engine,vmstore,data}

- Check which operating system is running on your hyperconverged hosts by running the following command:

  $ nodectl info

- Install the same operating system on a replacement host.
6.2. Preparing the cluster for host replacement
Verify host state in the Administrator Portal.
- Log in to the Red Hat Virtualization Administrator Portal.
  The host is listed as NonResponsive in the Administrator Portal. Virtual machines that previously ran on this host are in the Unknown state.
- Click Compute → Hosts and click the Action menu (⋮).
- Click Confirm host has been rebooted and confirm the operation.
- Verify that the virtual machines are now listed with a state of Down.
Update the SSH fingerprint for the failed node.
- Log in to the Ansible controller node as the root user.
- Remove the existing SSH fingerprint for the failed node.

  # sed -i '/failed-host-frontend.example.com/d' /root/.ssh/known_hosts
  # sed -i '/failed-host-backend.example.com/d' /root/.ssh/known_hosts

- Copy the public key from the Ansible controller node to the freshly installed node.
  # ssh-copy-id root@new-host-backend.example.com
  # ssh-copy-id root@new-host-frontend.example.com

- Verify that you can log in to all hosts in the cluster, including the Ansible controller node, using key-based SSH authentication without a password. Test access using all network addresses. The following example assumes that the Ansible controller node is host1.
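  One way to test this, using illustrative host names that you should replace with the frontend and backend FQDNs of your own hosts and of the replacement host:

  # ssh root@host1-frontend.example.com
  # ssh root@host1-backend.example.com
  # ssh root@host2-frontend.example.com
  # ssh root@host2-backend.example.com
  # ssh root@new-host-frontend.example.com
  # ssh root@new-host-backend.example.com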
- Use ssh-copy-id to copy the public key to any host you cannot log into without a password using this method.

  # ssh-copy-id root@host-frontend.example.com
  # ssh-copy-id root@host-backend.example.com
6.3. Creating the node_prep_inventory.yml file
Define the replacement node in the node_prep_inventory.yml file.
Procedure
- Familiarize yourself with your Gluster configuration.
  The configuration that you define in your inventory file must match the existing Gluster volume configuration. Use gluster volume info to check where your bricks should be mounted for each Gluster volume, for example:
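  Abbreviated, illustrative output for a volume named engine is shown below; the Brick lines are what your inventory must match (the host FQDNs and brick paths here are placeholders):

  # gluster volume info engine

  Volume Name: engine
  Type: Replicate
  Status: Started
  Number of Bricks: 1 x 3 = 3
  Transport-type: tcp
  Bricks:
  Brick1: host1-backend.example.com:/gluster_bricks/engine/engine
  Brick2: host2-backend.example.com:/gluster_bricks/engine/engine
  Brick3: host3-backend.example.com:/gluster_bricks/engine/engine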
- Back up the node_prep_inventory.yml file.

  # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
  # cp node_prep_inventory.yml node_prep_inventory.yml.bk

- Edit the node_prep_inventory.yml file to define your node preparation.
  See Appendix B, Understanding the node_prep_inventory.yml file, for more information about this inventory file and its parameters.
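  For orientation only, a fragment of such an inventory might look like the sketch below. The hc_nodes group name, device names, and brick paths are assumptions for illustration; the variable names come from the gluster.infra role, and the complete, authoritative parameter set is described in Appendix B.

  hc_nodes:
    hosts:
      new-host-backend.example.com:
        # Illustrative backend_setup variables: one volume group on /dev/sdb
        # and a brick mount point that matches the existing volume layout.
        gluster_infra_volume_groups:
          - vgname: gluster_vg_sdb
            pvname: /dev/sdb
        gluster_infra_mount_devices:
          - path: /gluster_bricks/engine
            lvname: gluster_lv_engine
            vgname: gluster_vg_sdb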
6.4. Creating the node_replace_inventory.yml file
Define your cluster hosts by creating a node_replace_inventory.yml file.
Procedure
- Back up the node_replace_inventory.yml file.

  # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
  # cp node_replace_inventory.yml node_replace_inventory.yml.bk

- Edit the node_replace_inventory.yml file to define your cluster.
  See Appendix C, Understanding the node_replace_inventory.yml file, for more information about this inventory file and its parameters.
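  As a rough sketch only, such an inventory identifies the healthy cluster hosts and the old and new nodes by their backend FQDNs. The group name and host names below are illustrative assumptions; see Appendix C for the authoritative structure and the full list of parameters.

  cluster_node:
    hosts:
      host1-backend.example.com:   # healthy host (Ansible controller node)
      host2-backend.example.com:   # second healthy host
    vars:
      gluster_maintenance_old_node: failed-host-backend.example.com
      gluster_maintenance_new_node: new-host-backend.example.com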
6.5. Executing the replace_node.yml playbook file
The replace_node.yml playbook reconfigures a Red Hat Hyperconverged Infrastructure for Virtualization cluster to use a new node after an existing cluster node has failed.
Procedure
- Execute the playbook.

  # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/
  # ansible-playbook -i node_prep_inventory.yml -i node_replace_inventory.yml tasks/replace_node.yml
6.6. Updating the cluster for a new primary host
When you replace a failed host using a different FQDN, you need to update the cluster configuration to use the replacement host.
Procedure
- Change into the hc-ansible-deployment directory.

  # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/

- Make a copy of the reconfigure_he_storage_inventory.yml file.

  # cp reconfigure_he_storage_inventory.yml reconfigure_he_storage_inventory.yml.bk

- Edit the reconfigure_he_storage_inventory.yml file to identify the following:
  - hosts: Two active hosts in the cluster that have been configured to host the Hosted Engine virtual machine.
  - gluster_maintenance_old_node: The backend network FQDN of the failed node.
  - gluster_maintenance_new_node: The backend network FQDN of the replacement node.
  - ovirt_engine_hostname: The FQDN of the Hosted Engine virtual machine.
For example:
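  A minimal sketch, assuming a plain YAML inventory layout and illustrative FQDNs (adapt the group structure to the file's own comments; the variable names are those described above):

  all:
    hosts:
      host1-backend.example.com:   # active host able to run the Hosted Engine VM
      host2-backend.example.com:   # second active host
    vars:
      gluster_maintenance_old_node: failed-host-backend.example.com   # backend FQDN of the failed node
      gluster_maintenance_new_node: new-host-backend.example.com      # backend FQDN of the replacement node
      ovirt_engine_hostname: engine.example.com                       # FQDN of the Hosted Engine VM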
- Execute the reconfigure_he_storage.yml playbook with your updated inventory file.

  # ansible-playbook -i reconfigure_he_storage_inventory.yml tasks/reconfigure_he_storage.yml
6.7. Removing a failed host from the cluster
When a replacement host is ready, remove the existing failed host from the cluster.
Procedure
Remove the failed host.
- Log in to the Administrator Portal.
- Click Compute → Hosts.
  The failed host is in the NonResponsive state. Virtual machines running on the failed host are in the Unknown state.
- Select the failed host.
- Click the main Action menu (⋮) for the Hosts page and select Confirm host has been rebooted.
- Click OK to confirm the operation.
  Virtual machines move to the Down state.
- Select the failed host and click Management → Maintenance.
- Click the Action menu (⋮) beside the failed host and click Remove.
Update the storage domains.
For each storage domain:
- Click Storage → Domains.
- Click the storage domain name, then click Data Center → Maintenance and confirm the operation.
- Click Manage Domain.
- Edit the Path field to match the new FQDN.
- Click OK.
  Note: A dialog box with an Operation Cancelled error appears as a result of Bug 1853995, but the path is updated as expected.
- Click the Action menu (⋮) beside the storage domain and click Activate.
- Add the replacement host to the cluster.
- Attach the gluster logical network to the replacement host.
Restart all virtual machines.
For highly available virtual machines, disable and re-enable high-availability.
- Click Compute → Virtual Machines and select a virtual machine.
- Click Edit → High Availability, uncheck the High Availability check box, and click OK.
- Click Edit → High Availability, check the High Availability check box, and click OK.
Start all the virtual machines.
- Click Compute → Virtual Machines and select a virtual machine.
- Click the Action menu (⋮) → Start.
6.8. Verifying healing in progress
After replacing a failed host with a new host, verify that your storage is healing as expected.
Procedure
Verify that healing is in progress.
Run the following command on any hyperconverged host:
  # for vol in `gluster volume list`; do gluster volume heal $vol info summary; done

The output shows a summary of healing activity on each brick in each volume, for example:
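  Illustrative output for one volume follows; host names and brick paths are placeholders, and entry counts will vary while healing runs:

  Brick host1-backend.example.com:/gluster_bricks/engine/engine
  Status: Connected
  Total Number of entries: 0
  Number of entries in heal pending: 0
  Number of entries in split-brain: 0
  Number of entries possibly healing: 0

  Brick host2-backend.example.com:/gluster_bricks/engine/engine
  Status: Connected
  Total Number of entries: 4
  Number of entries in heal pending: 4
  Number of entries in split-brain: 0
  Number of entries possibly healing: 0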
Depending on brick size, volumes can take a long time to heal. You can still run and migrate virtual machines using this node while the underlying storage heals.