10.4. Starting Geo-replication
This section describes how to start geo-replication in your storage environment and verify that it is functioning correctly.
10.4.1. Starting a Geo-replication Session
Important
You must create the geo-replication session before starting geo-replication. For more information, see Section 10.3.4, “Setting Up your Environment for Geo-replication Session”.
To start geo-replication, use one of the following commands:
- To start the geo-replication session between the hosts:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start
For example:
# gluster volume geo-replication master-vol example.com::slave-vol start
Starting geo-replication session between master-vol & example.com::slave-vol has been successful
This command will start distributed geo-replication on all the nodes that are part of the master volume. If a node that is part of the master volume is down, the command will still be successful. In a replica pair, the geo-replication session will be active on one of the replica nodes and remain passive on the others.
After executing the command, it may take a few minutes for the session to initialize and become stable.
Note
If you attempt to create a geo-replication session and the slave already has data, the following error message will be displayed:
slave-node::slave is not empty. Please delete existing files in slave-node::slave and retry, or use force to continue without deleting the existing files.
geo-replication command failed
- To start the geo-replication session forcefully between the hosts:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start force
For example:
# gluster volume geo-replication master-vol example.com::slave-vol start force
Starting geo-replication session between master-vol & example.com::slave-vol has been successful
This command will force start geo-replication sessions on the nodes that are part of the master volume. If it is unable to successfully start the geo-replication session on any node which is online and part of the master volume, the command will still start the geo-replication sessions on as many nodes as it can. This command can also be used to re-start geo-replication sessions on the nodes where the session has died, or has not started.
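The two start variants described above can be wrapped in a small shell function. The following is a minimal sketch and not part of the product: the function name georep_start and its argument order are hypothetical, and the only gluster invocations it issues are the two shown in this section. It also rejects a slave argument that does not follow the SLAVE_HOST::SLAVE_VOL form.

```shell
# Hypothetical helper (not part of the gluster CLI): start a geo-replication
# session, optionally with "force".
# Usage: georep_start MASTER_VOL SLAVE_HOST::SLAVE_VOL [force]
georep_start() {
    local master="$1" slave="$2" mode="${3:-}"

    # The slave must be given as SLAVE_HOST::SLAVE_VOL.
    case $slave in
        ?*::?*) ;;
        *) echo "error: slave must be SLAVE_HOST::SLAVE_VOL" >&2; return 2 ;;
    esac

    if [ "$mode" = "force" ]; then
        gluster volume geo-replication "$master" "$slave" start force
    else
        gluster volume geo-replication "$master" "$slave" start
    fi
}
```

For example, georep_start master-vol example.com::slave-vol runs the plain start command, and adding force as a third argument runs the forced variant.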
10.4.2. Verifying a Successful Geo-replication Deployment
You can use the status command to verify the status of geo-replication in your environment:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL status
For example:
# gluster volume geo-replication master-vol example.com::slave-vol status
10.4.3. Displaying Geo-replication Status Information
The status command can be used to display information about a specific geo-replication master session, master-slave session, or all geo-replication sessions. The status output provides both node and brick level information.
- To display information on all geo-replication sessions from a particular master volume, use the following command:
# gluster volume geo-replication MASTER_VOL status
- To display information of a particular master-slave session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL status
- To display the details of a master-slave session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL status detail
Important
There will be a mismatch between the outputs of the df command (including -h and -k) and the inode counts of the master and slave volumes, even when the data is in full sync. This is due to the extra inode and size consumption by the changelog journaling data, which keeps track of the changes made to the file system on the master volume. To verify the status of synchronization, do not rely on the df command; use the following command instead:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL status detail
The status of a session can be one of the following:
- Initializing: This is the initial phase of the geo-replication session; it remains in this state for a minute to make sure no abnormalities are present.
- Not Started: The geo-replication session is created, but not started.
- Active: The gsyncd daemon in this node is active and syncing the data.
- Passive: A replica pair of the active node. The data synchronization is handled by the active node. Hence, this node does not sync any data.
- Faulty: The geo-replication session has experienced a problem, and the issue needs to be investigated further. For more information, see Section 10.10, “Troubleshooting Geo-replication”.
- Stopped: The geo-replication session has stopped, but has not been deleted.
- Crawl Status
- Changelog Crawl: The changelog translator has produced the changelog, which is being consumed by the gsyncd daemon to sync data.
- Hybrid Crawl: The gsyncd daemon is crawling the glusterFS file system and generating a pseudo changelog to sync data.
- Checkpoint Status: Displays the status of the checkpoint, if set. Otherwise, it displays as N/A.
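The session states listed above can be checked from a script. The following is a minimal sketch under one stated assumption: the state name Faulty appears verbatim in the text printed by the status command. The column layout of that output is not relied upon. The function name georep_check_faulty and its return codes are hypothetical.

```shell
# Hypothetical helper: print every status line that mentions Faulty.
# Assumption: the state names from this section appear verbatim in the
# output of the status command.
# Returns 0 if no line is Faulty, 1 if at least one is, 2 if status failed.
# Usage: georep_check_faulty MASTER_VOL SLAVE_HOST::SLAVE_VOL
georep_check_faulty() {
    local master="$1" slave="$2" output

    # Keep the assignment separate from "local" so the exit status survives.
    output=$(gluster volume geo-replication "$master" "$slave" status) || return 2

    if printf '%s\n' "$output" | grep 'Faulty'; then
        return 1
    fi
    return 0
}
```

A monitoring job could call this function periodically and raise an alert on a return code of 1, leaving the investigation itself to the troubleshooting section.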
10.4.4. Configuring a Geo-replication Session
To configure a geo-replication session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config [options]
For example, to view the list of all option/value pairs:
# gluster volume geo-replication Volume1 example.com::slave-vol config
To delete a setting for a geo-replication config option, prefix the option with ! (exclamation mark). For example, to reset log-level to the default value:
# gluster volume geo-replication Volume1 example.com::slave-vol config '!log-level'
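The reset form above can be wrapped in a function so that the exclamation mark is always quoted. This is a sketch only; the name georep_config_reset is hypothetical. The single quotes matter in an interactive bash session, where an unquoted ! can trigger history expansion.

```shell
# Hypothetical helper: reset one geo-replication option to its default value.
# Usage: georep_config_reset MASTER_VOL SLAVE_HOST::SLAVE_VOL OPTION
georep_config_reset() {
    local master="$1" slave="$2" option="${3:-}"

    [ -n "$option" ] || { echo "error: OPTION is required" >&2; return 2; }

    # The "!" stays inside single quotes so that an interactive bash
    # session does not treat it as a history expansion.
    gluster volume geo-replication "$master" "$slave" config '!'"$option"
}
```

For example, georep_config_reset Volume1 example.com::slave-vol log-level issues the same command as the example above.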
Configurable Options
The following table provides an overview of the configurable options for a geo-replication setting:
Option | Description |
---|---|
gluster-log-file LOGFILE | The path to the geo-replication glusterfs log file. |
gluster-log-level LOGFILELEVEL | The log level for glusterfs processes. |
log-file LOGFILE | The path to the geo-replication log file. |
log-level LOGFILELEVEL | The log level for geo-replication. |
ssh-command COMMAND | The SSH command to connect to the remote machine (the default is ssh). |
rsync-command COMMAND | The rsync command to use for synchronizing the files (the default is rsync). |
use-tarssh true | The use-tarssh option enables tar over the Secure Shell protocol. Use this option to handle workloads of files that have not undergone edits. |
volume_id=UID | The command to delete the existing master UID for the intermediate/slave node. |
timeout SECONDS | The timeout period in seconds. |
sync-jobs N | The number of simultaneous files/directories that can be synchronized. |
ignore-deletes | If this option is set to 1, a file deleted on the master will not trigger a delete operation on the slave. As a result, the slave will remain as a superset of the master and can be used to recover the master in the event of a crash and/or accidental delete. |
checkpoint [LABEL|now] | Sets a checkpoint with the given option LABEL. If the option is set as now, then the current time will be used as the label. |
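The numeric options in the table, timeout SECONDS and sync-jobs N, can be validated before they are passed to the config command. The following is a minimal sketch; the function name georep_config_set_number is hypothetical, and restricting it to these two options is a choice made for this example rather than a limit of the CLI.

```shell
# Hypothetical helper: set a numeric geo-replication option after checking
# that the value is a whole number.
# Usage: georep_config_set_number MASTER_VOL SLAVE_HOST::SLAVE_VOL OPTION VALUE
georep_config_set_number() {
    local master="$1" slave="$2" option="${3:-}" value="${4:-}"

    # Only the numeric options from the table are accepted by this sketch.
    case $option in
        timeout|sync-jobs) ;;
        *) echo "error: unsupported option: $option" >&2; return 2 ;;
    esac

    # Reject empty values and anything containing a non-digit.
    case $value in
        ''|*[!0-9]*) echo "error: $option needs a whole number" >&2; return 2 ;;
    esac

    gluster volume geo-replication "$master" "$slave" config "$option" "$value"
}
```

For example, georep_config_set_number Volume1 example.com::slave-vol sync-jobs 3 runs the config command with sync-jobs 3, while a value such as abc is rejected before gluster is called.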
10.4.4.1. Geo-replication Checkpoints
10.4.4.1.1. About Geo-replication Checkpoints
Geo-replication data synchronization is an asynchronous process, so changes made on the master may take time to be replicated to the slaves. Data replication to a slave may also be interrupted by various issues, such as network outages.
Red Hat Storage provides the ability to set geo-replication checkpoints. Once a checkpoint is set, you can determine whether the data that was on the master at that point in time has been replicated to the slaves.
10.4.4.1.2. Configuring and Viewing Geo-replication Checkpoint Information
- To set a checkpoint on a geo-replication session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config checkpoint [now|LABEL]
For example, to set a checkpoint between Volume1 and example.com::slave-vol:
# gluster volume geo-replication Volume1 example.com::slave-vol config checkpoint now
geo-replication config updated successfully
The label for a checkpoint can be set as the current time using now, or a particular label can be specified, as shown below:
# gluster volume geo-replication Volume1 example.com::slave-vol config checkpoint NEW_ACCOUNTS_CREATED
geo-replication config updated successfully.
- To display the status of a checkpoint for a geo-replication session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL status
- To delete checkpoints for a geo-replication session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config '!checkpoint'
For example, to delete the checkpoint set between Volume1 and example.com::slave-vol:
# gluster volume geo-replication Volume1 example.com::slave-vol config '!checkpoint'
geo-replication config updated successfully
- To view the history of checkpoints for a geo-replication session (including set, delete, and completion events), use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config log-file | xargs grep checkpoint
For example, to display the checkpoint history between Volume1 and example.com::slave-vol:
# gluster volume geo-replication Volume1 example.com::slave-vol config log-file | xargs grep checkpoint
[2013-11-12 12:40:03.436563] I [gsyncd(conf):359:main_i] <top>: checkpoint as of 2012-06-04 12:40:02 set
[2013-11-15 12:41:03.617508] I [master:448:checkpt_service] _GMaster: checkpoint as of 2013-11-12 12:40:02 completed
[2013-11-12 03:01:17.488917] I [gsyncd(conf):359:main_i] <top>: checkpoint as of 2013-06-22 03:01:12 set
[2013-11-15 03:02:29.10240] I [master:448:checkpt_service] _GMaster: checkpoint as of 2013-06-22 03:01:12 completed
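The checkpoint history can be reduced to the checkpoints that were set but never completed. The following is a minimal sketch, assuming the log lines keep the "checkpoint as of LABEL set" and "checkpoint as of LABEL completed" wording shown in the example output; the function name pending_checkpoints is hypothetical, and any other wording in the log is simply ignored or treated as a different event.

```shell
# Hypothetical helper: read checkpoint log lines on standard input and print
# the label of every checkpoint whose most recent event is "set".
# Assumption: lines end in "checkpoint as of <label> <event>".
pending_checkpoints() {
    awk '
    {
        start = index($0, "checkpoint as of ")
        if (start == 0) next                   # not a checkpoint line
        rest = substr($0, start + 17)          # text after "checkpoint as of "
        n = split(rest, word, " ")
        if (n < 2) next
        event = word[n]                        # last word, e.g. set or completed
        label = substr(rest, 1, length(rest) - length(event) - 1)
        if (!(label in state)) order[++count] = label
        state[label] = event                   # remember the latest event
    }
    END {
        # Print labels in the order they first appeared in the log.
        for (i = 1; i <= count; i++)
            if (state[order[i]] == "set") print order[i]
    }'
}
```

The function can be appended to the history command from this section, for example: gluster volume geo-replication Volume1 example.com::slave-vol config log-file | xargs grep checkpoint | pending_checkpoints.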
10.4.5. Stopping a Geo-replication Session
To stop a geo-replication session, use one of the following commands:
- To stop a geo-replication session between the hosts:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop
For example:
# gluster volume geo-replication master-vol example.com::slave-vol stop
Stopping geo-replication session between master-vol & example.com::slave-vol has been successful
Note
The stop command will fail if:
- any node that is a part of the volume is offline.
- it is unable to stop the geo-replication session on any particular node.
- the geo-replication session between the master and slave is not active.
- To stop a geo-replication session forcefully between the hosts:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force
For example:
# gluster volume geo-replication master-vol example.com::slave-vol stop force
Stopping geo-replication session between master-vol & example.com::slave-vol has been successful
Using force will stop the geo-replication session between the master and slave even if any node that is a part of the volume is offline. If it is unable to stop the geo-replication session on any particular node, the command will still stop the geo-replication sessions on as many nodes as it can. Using force will also stop inactive geo-replication sessions.
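Because a plain stop can fail for the reasons listed above while stop force proceeds on as many nodes as it can, a script may want to try the plain form first and use force only when explicitly allowed. The following is a sketch of that policy; the function name georep_stop and the allow-force keyword are hypothetical.

```shell
# Hypothetical helper: stop a session, retrying with "force" only on request.
# Usage: georep_stop MASTER_VOL SLAVE_HOST::SLAVE_VOL [allow-force]
georep_stop() {
    local master="$1" slave="$2" policy="${3:-}"

    # Try the plain stop first.
    if gluster volume geo-replication "$master" "$slave" stop; then
        return 0
    fi

    # Escalate to "stop force" only when the caller opted in.
    if [ "$policy" = "allow-force" ]; then
        echo "plain stop failed; retrying with force" >&2
        gluster volume geo-replication "$master" "$slave" stop force
        return $?
    fi

    echo "stop failed; pass allow-force to retry with stop force" >&2
    return 1
}
```

Keeping force behind an explicit keyword means an operator decides whether stopping the session on only some of the nodes is acceptable.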
10.4.6. Deleting a Geo-replication Session
Important
You must first stop a geo-replication session before it can be deleted. For more information, see Section 10.4.5, “Stopping a Geo-replication Session”.
To delete a geo-replication session, use the following command:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL delete
For example:
# gluster volume geo-replication master-vol example.com::slave-vol delete
geo-replication command executed successfully
Note
The delete command will fail if:
- any node that is a part of the volume is offline.
- it is unable to delete the geo-replication session on any particular node.
- the geo-replication session between the master and slave is still active.
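The stop-before-delete ordering can be enforced in a script by running delete only when stop has succeeded. The following is a minimal sketch; the function name georep_stop_and_delete is hypothetical. Because the stop command fails when the session is not active, a session that is already stopped should be removed by running the delete command directly instead of through this helper.

```shell
# Hypothetical helper: stop a session and delete it only if the stop succeeded.
# Usage: georep_stop_and_delete MASTER_VOL SLAVE_HOST::SLAVE_VOL
georep_stop_and_delete() {
    local master="$1" slave="$2"

    gluster volume geo-replication "$master" "$slave" stop || {
        echo "error: session was not stopped, skipping delete" >&2
        return 1
    }

    gluster volume geo-replication "$master" "$slave" delete
}
```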
Important
The SSH keys will not be removed from the master and slave nodes when the geo-replication session is deleted. You can manually remove the pem files which contain the SSH keys from the /var/lib/glusterd/geo-replication/ directory.
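Before removing any key files by hand, it can help to list them first. The following is a minimal sketch that only prints the pem files found in a directory and deletes nothing; the function name list_georep_pem_files is hypothetical, and the default path is the directory named above.

```shell
# Hypothetical helper: list the pem files in the geo-replication directory
# so they can be reviewed before manual removal. Nothing is deleted.
# Usage: list_georep_pem_files [DIRECTORY]
list_georep_pem_files() {
    local dir="${1:-/var/lib/glusterd/geo-replication}" f

    for f in "$dir"/*.pem; do
        # Skip the unexpanded pattern when no pem file exists.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done
    return 0
}
```

Run the function on each master and slave node, review the list, and remove only the files that belong to the deleted session.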