Chapter 3. Reusing bricks and restoring configuration from backups
3.1. Host replacement prerequisites
- Determine which node to use as the Ansible controller node (the node from which all Ansible playbooks are executed). Red Hat recommends using a healthy node in the same cluster as the failed node as the Ansible controller node.
- If possible, locate a recent backup or create a new backup of the important files (disk configuration or inventory files). See Backing up important files for details.
- Stop brick processes and unmount file systems on the failed host to avoid file system inconsistency issues.

# pkill glusterfsd
# umount /gluster_bricks/{engine,vmstore,data}

- Check which operating system is running on your hyperconverged hosts by running the following command:

$ nodectl info

- Reinstall the same operating system on the failed hyperconverged host.
3.2. Preparing the cluster for host replacement
Verify host state in the Administrator Portal.
- Log in to the Red Hat Virtualization Administrator Portal.
- The host is listed as NonResponsive in the Administrator Portal. Virtual machines that previously ran on this host are in the Unknown state.
- Click Compute → Hosts and click the Action menu (⋮).
- Click Confirm host has been rebooted and confirm the operation.
- Verify that the virtual machines are now listed with a state of Down.
Update the SSH fingerprint for the failed node.
- Log in to the Ansible controller node as the root user.
- Remove the existing SSH fingerprint for the failed node.

# sed -i '/failed-host-frontend.example.com/d' /root/.ssh/known_hosts
# sed -i '/failed-host-backend.example.com/d' /root/.ssh/known_hosts

- Copy the public key from the Ansible controller node to the freshly installed node.

# ssh-copy-id root@new-host-backend.example.com
# ssh-copy-id root@new-host-frontend.example.com

- Verify that you can log in to all hosts in the cluster, including the Ansible controller node, using key-based SSH authentication without a password. Test access using all network addresses. The following example assumes that the Ansible controller node is host1.
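A minimal sketch of this check, run from host1; the host names host2 and host3 and the example.com FQDNs are placeholders for your own cluster addresses:

# ssh root@host1-frontend.example.com
# ssh root@host1-backend.example.com
# ssh root@host2-frontend.example.com
# ssh root@host2-backend.example.com
# ssh root@host3-frontend.example.com
# ssh root@host3-backend.example.com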
- Use ssh-copy-id to copy the public key to any host you cannot log into without a password using this method.

# ssh-copy-id root@host-frontend.example.com
# ssh-copy-id root@host-backend.example.com
3.3. Restoring disk configuration from backups
Prerequisites
- This procedure assumes you have already performed the backup process in Chapter 2, Backing up important files, and that you know the location of your backup files and the address of the backup host.
Procedure
If the new host does not have multipath configuration, blacklist the devices.
Create an inventory file for the new host that defines the devices to blacklist.
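A minimal sketch of such an inventory, here named blacklist-inventory.yml; the backend FQDN and the device names sda and sdb are placeholders, and the hc_nodes group with the blacklist_mpath_devices variable is assumed from the gluster_deployment.yml playbook conventions:

hc_nodes:
  hosts:
    new-host-backend.example.com:
      blacklist_mpath_devices:
        - sda
        - sdb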
Run the gluster_deployment.yml playbook on this inventory file using the blacklistdevices tag.

# ansible-playbook -i blacklist-inventory.yml /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/tasks/gluster_deployment.yml --tags=blacklistdevices
Copy backed up configuration details to the new host.
# mkdir /rhhi-backup
# scp backup-host.example.com:/backups/rhvh-node-host1-backend.example.com-backup.tar.gz /rhhi-backup
# tar -xvf /rhhi-backup/rhvh-node-host1-backend.example.com-backup.tar.gz -C /rhhi-backup

Create an inventory file for host restoration.
Change into the hc-ansible-deployment directory and back up the default archive_config_inventory.yml file.

# cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
# cp archive_config_inventory.yml archive_config_inventory.yml.bk
Edit the archive_config_inventory.yml file with details of the cluster you want to restore.

- hosts: The backend FQDN of the host that you want to restore (this host).
- backup_dir: The directory in which to store extracted backup files.
- nbde_setup: If you use Network-Bound Disk Encryption, set this to true. Otherwise, set it to false.
- upgrade: Set to false.
For example:
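A minimal sketch, assuming the host being restored is host1-backend.example.com, the archive was extracted to /rhhi-backup, and NBDE is not in use; the group layout shown here is an assumption based on the parameters above:

all:
  hosts:
    host1-backend.example.com:
  vars:
    backup_dir: /rhhi-backup
    nbde_setup: false
    upgrade: false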
Execute the archive_config.yml playbook.

Run the archive_config.yml playbook using your updated inventory file with the restorefiles tag.

# ansible-playbook -i archive_config_inventory.yml archive_config.yml --tags=restorefiles

(Optional) Configure Network-Bound Disk Encryption (NBDE) on the root disk.
Create an inventory file for the new host that defines devices to encrypt.
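A rough sketch only; the variable names shown here (rootpassphrase, rootdevice, networkinterface, gluster_infra_tangservers) and all values are assumptions, so confirm them against Understanding the luks_tang_inventory.yml file before use:

hc_nodes:
  hosts:
    new-host-backend.example.com:
      rootpassphrase: <strong_passphrase>
      rootdevice: /dev/sda2
      networkinterface: ens3
  vars:
    ip_version: IPv4
    ip_config_method: dhcp
    gluster_infra_tangservers:
      - url: http://tang-server.example.com:80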
See Understanding the luks_tang_inventory.yml file for more information about these parameters.
Run the luks_tang_setup.yml playbook using your inventory file and the bindtang tag.

# ansible-playbook -i inventory.yml /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/tasks/luks_tang_setup.yml --tags=bindtang
3.4. Creating the node_replace_inventory.yml file
Define your cluster hosts by creating a node_replace_inventory.yml file.
Procedure
Back up the node_replace_inventory.yml file.

# cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
# cp node_replace_inventory.yml node_replace_inventory.yml.bk

Edit the node_replace_inventory.yml file to define your cluster.

See Appendix C, Understanding the node_replace_inventory.yml file for more information about this inventory file and its parameters.
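A minimal sketch of what this inventory might contain when the failed host is reinstalled with the same FQDN; the host names and the gluster_maintenance_* variable names are assumptions, so confirm them against Appendix C:

hc_nodes:
  hosts:
    host2-backend.example.com:
    host3-backend.example.com:
  vars:
    gluster_maintenance_old_node: host1-backend.example.com
    gluster_maintenance_new_node: host1-backend.example.com
    gluster_maintenance_cluster_node: host2-backend.example.com
    gluster_maintenance_cluster_node_2: host3-backend.example.com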
3.5. Executing the replace_node.yml playbook file
The replace_node.yml playbook reconfigures a Red Hat Hyperconverged Infrastructure for Virtualization cluster to use a new node after an existing cluster node has failed.
Procedure
Execute the playbook.
# cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/
# ansible-playbook -i node_replace_inventory.yml tasks/replace_node.yml --tags=restorepeer
3.6. Finalizing host replacement
After you have replaced a failed host with a new host, follow these steps to ensure that the cluster is connected to the new host and properly activated.
Procedure
Activate the host.
- Log in to the Red Hat Virtualization Administrator Portal.
- Click Compute → Hosts and observe that the replacement host is listed with a state of Maintenance.
- Select the host and click Management → Activate.
- Wait for the host to reach the Up state.
Attach the gluster network to the host.
- Click Compute → Hosts and select the host.
- Click Network Interfaces → Setup Host Networks.
- Drag and drop the newly created network to the correct interface.
- Ensure that the Verify connectivity between Host and Engine checkbox is checked.
- Ensure that the Save network configuration checkbox is checked.
- Click OK to save.
Verify the health of the network.
Click the Network Interfaces tab and check the state of the host’s network.
If the network interface enters an "Out of sync" state or does not have an IP Address, click Management → Refresh Capabilities.
3.7. Verifying healing in progress
After replacing a failed host with a new host, verify that your storage is healing as expected.
Procedure
Verify that healing is in progress.
Run the following command on any hyperconverged host:
# for vol in `gluster volume list`; do gluster volume heal $vol info summary; done

The output shows a summary of healing activity on each brick in each volume, for example:
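Representative output for one brick of one volume; the host name, brick path, and entry counts are placeholders:

Brick host1-backend.example.com:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 3
Number of entries in heal pending: 3
Number of entries in split-brain: 0
Number of entries possibly healing: 0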
Depending on brick size, volumes can take a long time to heal. You can still run and migrate virtual machines using this node while the underlying storage heals.