10.7. Disaster Recovery
Red Hat Gluster Storage provides geo-replication failover and failback capabilities for disaster recovery. If the master goes offline, you can perform a failover procedure so that a slave can replace the master. When this happens, all I/O operations, including reads and writes, are done on the slave, which now acts as the master. When the original master is back online, you can perform a failback procedure on the original slave so that it synchronizes the differences back to the original master.
10.7.1. Failover: Promoting a Slave to Master
If the master volume goes offline, you can promote a slave volume to be the master, and start using that volume for data access.
Run the following commands on the slave machine to promote it to be the master:
# gluster volume set VOLNAME geo-replication.indexing on
# gluster volume set VOLNAME changelog on
For example:
# gluster volume set slave-vol geo-replication.indexing on
volume set: success
# gluster volume set slave-vol changelog on
volume set: success
You can now configure applications to use the slave volume for I/O operations.
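The promotion step can be wrapped in a small shell helper. The following is a minimal sketch, not part of the product: the function name promote_slave and the DRY_RUN variable are hypothetical, and only the two gluster volume set commands are taken from the procedure above. With DRY_RUN=1 the helper prints the commands instead of running them, which may be useful for reviewing them before a real failover.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Gluster): promote a slave volume
# to master by enabling the two options named in this section.
# Usage: promote_slave VOLNAME
# Set DRY_RUN=1 to print the commands instead of executing them.
promote_slave() {
    vol=${1:-}
    if [ -z "$vol" ]; then
        echo "usage: promote_slave VOLNAME" >&2
        return 1
    fi
    for opt in geo-replication.indexing changelog; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "gluster volume set $vol $opt on"
        else
            # Stop at the first failure so a partial promotion is noticed.
            gluster volume set "$vol" "$opt" on || return 1
        fi
    done
}
```

For instance, running DRY_RUN=1 promote_slave slave-vol on the slave machine prints the two commands for slave-vol without changing any volume options.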
10.7.2. Failback: Resuming Master and Slave back to their Original State
When the original master is back online, you can perform the following procedure on the original slave so that it synchronizes the differences back to the original master:
- Stop the existing geo-replication session from the original master to the original slave using the following command:
# gluster volume geo-replication ORIGINAL_MASTER_VOL ORIGINAL_SLAVE_HOST::ORIGINAL_SLAVE_VOL stop force
For example:
# gluster volume geo-replication Volume1 storage.backup.com::slave-vol stop force
Stopping geo-replication session between Volume1 and storage.backup.com::slave-vol has been successful
- Create a new geo-replication session with the original slave as the new master and the original master as the new slave, using the force option. Detailed information on creating a geo-replication session is available at: .
- Start the special synchronization mode to speed up the recovery of data from the slave. This option makes geo-replication ignore the files created before the indexing option was enabled, so that it synchronizes only the files created after the slave volume was made the master volume.
# gluster volume geo-replication ORIGINAL_SLAVE_VOL ORIGINAL_MASTER_HOST::ORIGINAL_MASTER_VOL config special-sync-mode recover
For example:
# gluster volume geo-replication slave-vol master.com::Volume1 config special-sync-mode recover
geo-replication config updated successfully
- Start the new geo-replication session using the following command:
# gluster volume geo-replication ORIGINAL_SLAVE_VOL ORIGINAL_MASTER_HOST::ORIGINAL_MASTER_VOL start
For example:
# gluster volume geo-replication slave-vol master.com::Volume1 start
Starting geo-replication session between slave-vol and master.com::Volume1 has been successful
- Stop the I/O operations on the original slave and set the checkpoint. Setting a checkpoint makes synchronization information available on whether the data that was on the master at that point in time has been replicated to the slaves.
# gluster volume geo-replication ORIGINAL_SLAVE_VOL ORIGINAL_MASTER_HOST::ORIGINAL_MASTER_VOL config checkpoint now
For example:
# gluster volume geo-replication slave-vol master.com::Volume1 config checkpoint now
geo-replication config updated successfully
- Checkpoint completion ensures that the data from the original slave is restored back to the original master. However, because I/O was stopped on the slave before the checkpoint was set, you must touch the slave mount for the checkpoint to complete, and then check the session status:
# touch original_slave_mount
# gluster volume geo-replication ORIGINAL_SLAVE_VOL ORIGINAL_MASTER_HOST::ORIGINAL_MASTER_VOL status detail
For example:
# touch /mnt/gluster/slavevol
# gluster volume geo-replication slave-vol master.com::Volume1 status detail
- After the checkpoint is complete, stop and delete the current geo-replication session between the original slave and the original master:
# gluster volume geo-replication ORIGINAL_SLAVE_VOL ORIGINAL_MASTER_HOST::ORIGINAL_MASTER_VOL stop
# gluster volume geo-replication ORIGINAL_SLAVE_VOL ORIGINAL_MASTER_HOST::ORIGINAL_MASTER_VOL delete
- Reset the options that were set for promoting the slave volume as the master volume by running the following commands:
# gluster volume reset ORIGINAL_SLAVE_VOL geo-replication.indexing force
# gluster volume reset ORIGINAL_SLAVE_VOL changelog
- Resume the original roles by starting the geo-replication session from the original master using the following command:
# gluster volume geo-replication ORIGINAL_MASTER_VOL ORIGINAL_SLAVE_HOST::ORIGINAL_SLAVE_VOL start
For example:
# gluster volume geo-replication Volume1 storage.backup.com::slave-vol start
Starting geo-replication session between Volume1 and storage.backup.com::slave-vol has been successful
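Because the failback procedure switches direction twice, it can help to print the whole command sequence for a given pair of volumes before running anything. The following is a minimal sketch under stated assumptions: the function name failback_plan and its argument order are hypothetical, the helper only prints text and never calls gluster, and the session-creation step is left as a comment because this section does not give its command.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Gluster): print the failback
# commands from this section, in order, for review.
# Usage: failback_plan MASTER_VOL MASTER_HOST SLAVE_VOL SLAVE_HOST SLAVE_MOUNT
failback_plan() {
    if [ "$#" -ne 5 ]; then
        echo "usage: failback_plan MASTER_VOL MASTER_HOST SLAVE_VOL SLAVE_HOST SLAVE_MOUNT" >&2
        return 1
    fi
    mvol=$1 mhost=$2 svol=$3 shost=$4 smount=$5
    # Original direction: original master to original slave.
    fwd="gluster volume geo-replication ${mvol} ${shost}::${svol}"
    # Reversed direction: original slave to original master.
    rev="gluster volume geo-replication ${svol} ${mhost}::${mvol}"

    echo "$fwd stop force"
    echo "# (create the reversed session with the force option here)"
    echo "$rev config special-sync-mode recover"
    echo "$rev start"
    echo "# (stop I/O on the original slave before setting the checkpoint)"
    echo "$rev config checkpoint now"
    echo "touch ${smount}"
    echo "$rev status detail"
    echo "# (wait until the checkpoint is complete)"
    echo "$rev stop"
    echo "$rev delete"
    echo "gluster volume reset ${svol} geo-replication.indexing force"
    echo "gluster volume reset ${svol} changelog"
    echo "$fwd start"
}
```

Calling failback_plan Volume1 master.com slave-vol storage.backup.com /mnt/gluster/slavevol prints the commands used in the examples above. Printing rather than executing is a deliberate choice here, since several steps depend on a manual check, such as confirming that the checkpoint has completed.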