Chapter 14. Recovering from disaster
This chapter explains how to restore your cluster to a working state after a disk or server failure.
You must have configured disaster recovery options before you can follow the procedures in this chapter. See Configuring backup and recovery options for details.
14.1. Manually restoring data from a backup volume
This section covers how to restore data from a remote backup volume to a freshly installed replacement deployment of Red Hat Hyperconverged Infrastructure for Virtualization.
To do this, you must:
- Install and configure a replacement deployment according to the instructions in Deploying Red Hat Hyperconverged Infrastructure for Virtualization.
14.1.1. Restoring a volume from a geo-replicated backup
Install and configure a replacement Hyperconverged Infrastructure deployment
For instructions, refer to Deploying Red Hat Hyperconverged Infrastructure for Virtualization: https://access.redhat.com/documentation/en-us/red_hat_hyperconverged_infrastructure_for_virtualization/1.6/html/deploying_red_hat_hyperconverged_infrastructure_for_virtualization/.
Import the backup of the storage domain
From the new Hyperconverged Infrastructure deployment, in the Administration Portal:
- Click Storage → Domains.
- Click Import Domain. The Import Pre-Configured Domain window opens.
- In the Storage Type field, specify GlusterFS.
- In the Name field, specify a name for the new volume that will be created from the backup volume.
- In the Path field, specify the path to the backup volume.
- Click OK. The following warning appears, with any active data centers listed below:
  This operation might be unrecoverable and destructive! Storage Domain(s) are already attached to a Data Center. Approving this operation might cause data corruption if both Data Centers are active.
- Check the Approve operation checkbox and click OK.
Determine a list of virtual machines to import
Determine the imported domain’s identifier by running the following command:
# curl -v -k -X GET -u "admin@internal:password" -H "Accept: application/xml" https://$ENGINE_FQDN/ovirt-engine/api/storagedomains/
For example:
# curl -v -k -X GET -u "admin@example.com:mybadpassword" -H "Accept: application/xml" https://10.0.2.1/ovirt-engine/api/storagedomains/
Determine the list of unregistered virtual machines by running the following command:
# curl -v -k -X GET -u "admin@internal:password" -H "Accept: application/xml" "https://$ENGINE_FQDN/ovirt-engine/api/storagedomains/DOMAIN_ID/vms;unregistered"
For example:
# curl -v -k -X GET -u "admin@example.com:mybadpassword" -H "Accept: application/xml" "https://10.0.2.1/ovirt-engine/api/storagedomains/5e1a37cf-933d-424c-8e3d-eb9e40b690a7/vms;unregistered"
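The responses to the two listing requests above are XML documents, and the identifiers needed for the later steps are the id attributes of the storage_domain and vm elements. The following is a minimal sketch of pulling those identifiers out in the shell. It assumes that each opening tag carries its id attribute on a single line. The function name list_ids is hypothetical and is not part of any Red Hat tooling.

```shell
#!/usr/bin/env bash
# Sketch: print the id attribute of every <ELEMENT ...> opening tag found in
# an API response read from standard input.
# Assumption: the opening tag and its id="..." attribute are on one line.
list_ids() {
  local element="$1"   # for example: storage_domain or vm
  # grep -o isolates each opening tag up to the end of its id attribute;
  # sed then keeps only the attribute value.
  grep -o "<${element} [^>]*id=\"[^\"]*\"" | sed 's/.*id="\([^"]*\)".*/\1/'
}

# Usage (requires a reachable engine and valid credentials):
#   curl -s -k -u "admin@internal:password" -H "Accept: application/xml" \
#     "https://$ENGINE_FQDN/ovirt-engine/api/storagedomains/" | list_ids storage_domain
```

Drop the -v option from curl when piping the response, as shown in the usage comment, so that only the response body reaches the function.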
Perform a partial import of each virtual machine to the storage domain
Determine cluster identifier
The following command returns the cluster identifier.
# curl -v -k -X GET -u "admin@internal:password" -H "Accept: application/xml" https://$ENGINE_FQDN/ovirt-engine/api/clusters/
For example:
# curl -v -k -X GET -u "admin@example.com:mybadpassword" -H "Accept: application/xml" https://10.0.2.1/ovirt-engine/api/clusters/
Import the virtual machines
The following command imports a virtual machine without requiring all disks to be available in the storage domain.
# curl -v -k -u 'admin@internal:password' -H "Content-type: application/xml" -d '<action> <cluster id="CLUSTER_ID"></cluster> <allow_partial_import>true</allow_partial_import> </action>' "https://$ENGINE_FQDN/ovirt-engine/api/storagedomains/DOMAIN_ID/vms/VM_ID/register"
For example:
# curl -v -k -u 'admin@example.com:mybadpassword' -H "Content-type: application/xml" -d '<action> <cluster id="bf5a9e9e-5b52-4b0d-aeba-4ee4493f1072"></cluster> <allow_partial_import>true</allow_partial_import> </action>' "https://10.0.2.1/ovirt-engine/api/storagedomains/8d21980a-a50b-45e9-9f32-cd8d2424882e/vms/e164f8c6-769a-4cbd-ac2a-ef322c2c5f30/register"
For further information, see the Red Hat Virtualization REST API Guide: https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/rest_api_guide/.
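When many virtual machines need to be registered, the request above can be repeated in a loop. The following is a sketch only, assuming the virtual machine identifiers are supplied one per line on standard input. The function names and the ENGINE_CREDENTIALS variable are hypothetical. The request body and URL follow the command shown above.

```shell
#!/usr/bin/env bash
# Build the XML body for a partial-import registration request.
register_body() {
  local cluster_id="$1"
  printf '<action> <cluster id="%s"></cluster> <allow_partial_import>true</allow_partial_import> </action>' "$cluster_id"
}

# Build the registration URL for one virtual machine.
register_url() {
  local engine="$1" domain_id="$2" vm_id="$3"
  printf 'https://%s/ovirt-engine/api/storagedomains/%s/vms/%s/register' \
    "$engine" "$domain_id" "$vm_id"
}

# Register every VM ID read from standard input, one per line.
# ENGINE_CREDENTIALS is a hypothetical variable holding "user@domain:password".
register_all() {
  local engine="$1" domain_id="$2" cluster_id="$3" vm_id
  while IFS= read -r vm_id; do
    [ -n "$vm_id" ] || continue   # skip blank lines
    curl -k -u "${ENGINE_CREDENTIALS:?set to user:password}" \
      -H "Content-type: application/xml" \
      -d "$(register_body "$cluster_id")" \
      "$(register_url "$engine" "$domain_id" "$vm_id")"
  done
}

# Usage (requires a reachable engine):
#   ENGINE_CREDENTIALS='admin@internal:password' \
#     register_all "$ENGINE_FQDN" DOMAIN_ID CLUSTER_ID < vm_ids.txt
```

Keeping the body and URL construction in separate functions makes it possible to print and inspect a request before sending it.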
Migrate the partially imported disks to the new storage domain
In the Administration Portal, click Storage → Disks, and click the Move Disk option. Move the imported disks from the synced volume to the replacement cluster’s storage domain. For further information, see the Red Hat Virtualization Administration Guide.
Attach the restored disks to the new virtual machines
Follow the instructions in the Red Hat Virtualization Virtual Machine Management Guide to attach the replacement disks to each virtual machine.
14.2. Failing over to a secondary cluster
This section covers how to fail over from your primary cluster to a remote secondary cluster in the event of server failure.
- Configure failover to a remote cluster.
- Verify that the mapping file for the source and target clusters remains accurate.
- Run the failover playbook with the fail_over tag.
  # ansible-playbook dr-rhv-failover.yml --tags "fail_over"
14.3. Failing back to a primary cluster
This section covers how to fail back from your secondary cluster to the primary cluster after you have corrected the cause of a server failure.
- Prepare the primary cluster for failback by running the cleanup playbook with the clean_engine tag.
  # ansible-playbook dr-cleanup.yml --tags "clean_engine"
- Verify that the mapping file for the source and target clusters remains accurate.
- Execute failback by running the failback playbook with the fail_back tag.
  # ansible-playbook dr-rhv-failback.yml --tags "fail_back"
14.4. Stopping a geo-replication session using RHV Manager
Stop a geo-replication session when you want to prevent data from being replicated from an active source volume to a passive target volume.
Verify that data is not currently being synchronized
Click the Tasks icon at the top right of the Manager, and review the Tasks page.
Ensure that there are no ongoing tasks related to Data Synchronization.
If data synchronization tasks are present, wait until they are complete.
Stop the geo-replication session
- Click Storage → Volumes.
- Click the name of the volume that you want to prevent geo-replicating.
- Click the Geo-replication subtab.
- Select the session that you want to stop, then click Stop.
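The same session can also be inspected and stopped from the command line on a source node. The sketch below composes the gluster commands for the status and stop actions, using the MASTER_VOL SLAVE_HOST::SLAVE_VOL session form that appears later in this chapter. The function name georep_cmd is hypothetical, and the function only prints the command so that it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Compose a gluster geo-replication command for one session.
# Only the status and stop actions are accepted in this sketch.
georep_cmd() {
  local master_vol="$1" slave_host="$2" slave_vol="$3" action="$4"
  case "$action" in
    status|stop) ;;
    *) echo "unsupported action: $action" >&2; return 1 ;;
  esac
  printf 'gluster volume geo-replication %s %s::%s %s\n' \
    "$master_vol" "$slave_host" "$slave_vol" "$action"
}

# Usage: print the command, review it, then run it as root on a source node.
#   georep_cmd MASTER_VOL SLAVE_HOST SLAVE_VOL status
#   georep_cmd MASTER_VOL SLAVE_HOST SLAVE_VOL stop
```

Checking the status before stopping serves the same purpose as reviewing the Tasks page: it shows whether the session is still actively synchronizing.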
14.5. Turning off scheduled backups by deleting the geo-replication schedule
You can stop scheduled backups via geo-replication by deleting the geo-replication schedule.
- Log in to the Administration Portal on any source node.
- Click Storage → Domains.
- Click the name of the storage domain that you want to back up.
- Click the Remote Data Sync Setup subtab.
- Click Setup. The Setup Remote Data Synchronization window opens.
- In the Recurrence field, select a recurrence interval type of NONE and click OK.
(Optional) Remove the geo-replication session
Run the following command from the geo-replication master node:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL delete
You can also run this command with the reset-sync-time parameter. For further information about this parameter and deleting a geo-replication session, see Deleting a Geo-replication Session in the Red Hat Gluster Storage 3.4 Administration Guide.
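In the gluster command syntax, reset-sync-time is appended after delete. The sketch below composes the delete command with or without that parameter. The function name georep_delete_cmd and its yes/no argument are hypothetical, and the function prints the command rather than running it so that the volume and host names can be checked first.

```shell
#!/usr/bin/env bash
# Compose the command that deletes a geo-replication session.
# Pass "yes" as the fourth argument to append reset-sync-time.
georep_delete_cmd() {
  local master_vol="$1" slave_host="$2" slave_vol="$3" reset_sync="${4:-no}"
  local cmd="gluster volume geo-replication ${master_vol} ${slave_host}::${slave_vol} delete"
  if [ "$reset_sync" = "yes" ]; then
    cmd="${cmd} reset-sync-time"
  fi
  printf '%s\n' "$cmd"
}

# Usage: review the printed command, then run it as root on the master node.
#   georep_delete_cmd MASTER_VOL SLAVE_HOST SLAVE_VOL
#   georep_delete_cmd MASTER_VOL SLAVE_HOST SLAVE_VOL yes
```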