11.9. Rebalancing Volumes
If a volume has been expanded or shrunk using the add-brick or remove-brick commands, the data on the volume needs to be rebalanced among the servers.
Note
In a non-replicated volume, all bricks should be online to perform the rebalance operation using the start option. In a replicated volume, at least one of the bricks in the replica should be online.
To rebalance a volume, use the following command on any of the servers:
# gluster volume rebalance VOLNAME start
For example:
# gluster volume rebalance test-volume start
Starting rebalancing on volume test-volume has been successful
When run without the force option, the rebalance command attempts to balance the space utilized across nodes. Files whose migration would cause the target node to have less available space than the source node are skipped. This results in linkto files being retained, which may cause slower access when a large number of linkto files are present.
Enhancements made to the file rename and rebalance operations in Red Hat Gluster Storage 2.1 update 5 require that all clients connected to the cluster run that version or a later one. If any client runs an older version when a rebalance operation is started, the following warning message is displayed and the rebalance operation is not executed.
volume rebalance: VOLNAME: failed: Volume VOLNAME has one or more connected clients of a version lower than Red Hat Gluster Storage-2.1 update 5. Starting rebalance in this state could lead to data loss. Please disconnect those clients before attempting this command again.
Red Hat strongly recommends that you disconnect all older clients before executing the rebalance command, to avoid potential data loss.
Warning
The rebalance command can be executed with the force option even when older clients are connected to the cluster. However, this could lead to data loss.
A rebalance operation with force balances the data based on the layout, and hence optimizes or does away with the link files, but may leave storage space unevenly used across bricks. Use this option only when there are a large number of link files in the system.
To rebalance a volume forcefully, use the following command on any of the servers:
# gluster volume rebalance VOLNAME start force
For example:
# gluster volume rebalance test-volume start force
Starting rebalancing on volume test-volume has been successful
11.9.1. Rebalance Throttling
The rebalance process uses multiple threads to ensure good performance when migrating multiple files. Migrating many files at once can severely affect storage system performance, so a throttling mechanism is provided to manage the impact.
By default, rebalance throttling starts in normal mode. Configure the throttling mode to adjust the rate at which files are migrated:
# gluster volume set VOLNAME rebal-throttle lazy|normal|aggressive
For example:
# gluster volume set test-volume rebal-throttle lazy
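A mistyped mode name can be caught before the command reaches the cluster. The following is a minimal sketch, assuming a POSIX shell; the function names valid_throttle_mode and set_rebal_throttle are hypothetical helpers for illustration and are not part of the gluster CLI. Only the three modes named above are accepted.

```shell
# Hypothetical helpers; the names are illustrative, not part of the gluster CLI.

# Return 0 if $1 is one of the throttle modes named in this section.
valid_throttle_mode() {
    case "$1" in
        lazy|normal|aggressive) return 0 ;;
        *) return 1 ;;
    esac
}

# Validate the mode, then run the documented "volume set" command.
# Usage: set_rebal_throttle VOLNAME MODE
set_rebal_throttle() {
    volname=$1
    mode=$2
    if ! valid_throttle_mode "$mode"; then
        echo "unknown throttle mode: $mode (expected lazy, normal, or aggressive)" >&2
        return 2
    fi
    gluster volume set "$volname" rebal-throttle "$mode"
}
```

For example, set_rebal_throttle test-volume lazy runs the same command as the example above, while set_rebal_throttle test-volume fast prints an error and returns without contacting the cluster.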
11.9.2. Displaying Rebalance Progress
To display the status of a volume rebalance operation, use the following command:
# gluster volume rebalance VOLNAME status
For example:
# gluster volume rebalance test-volume status
Node           Rebalanced  size      scanned   failures  skipped  status       run time
               -files                                                          in h:m:s
-------------  ----------  --------  --------  --------  -------  -----------  --------
10.70.37.01    71962       70.3GB    380852    0         0        in progress  2:02:20
10.70.37.02    70489       68.8GB    502185    0         0        in progress  2:02:20
10.70.37.03    70704       69.0GB    507728    0         0        in progress  2:02:20
10.70.37.04    71819       70.1GB    435611    0         0        in progress  2:02:20
Estimated time left for rebalance to complete : 2:50:24
This displays the estimated time left for the rebalance to complete on all nodes. The estimated time to complete is displayed only after the rebalance operation has been running for 10 minutes. In cases where the remaining time is extremely large, the estimated time to completion is displayed as >2 months and the user is advised to check again later.
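When the status output is consumed by a script, the estimate can be pulled out of the final line. The following is a minimal sketch, assuming a POSIX shell and assuming the line keeps the label shown in the example above; rebalance_eta is a hypothetical helper name, not a gluster command.

```shell
# Hypothetical helper: reads "rebalance status" output on stdin and prints
# only the text after the "Estimated time left" label, for example "2:50:24".
# Prints nothing when the line is absent, as is the case before the
# rebalance has been running for 10 minutes.
rebalance_eta() {
    sed -n 's/^Estimated time left for rebalance to complete *: *//p'
}
```

It would typically be used as gluster volume rebalance test-volume status | rebalance_eta. An empty result means that no estimate was reported, not that the rebalance has finished.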
The time taken to complete a rebalance operation depends on the number of files estimated to be on the bricks and the rate at which files are being processed by the rebalance process. The estimate is recalculated every time the rebalance status command is executed, and it becomes more accurate the longer the rebalance has been running and the larger the data set. The calculation assumes that a file system partition contains a single brick.
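The relationship between file count, processing rate, and remaining time can be illustrated with simple arithmetic. The sketch below is an assumption made for illustration only: this section does not give the exact formula Gluster uses, and estimate_seconds_left and its three arguments are hypothetical. It divides the files still to be processed by the rate observed so far.

```shell
# Illustrative only; not Gluster's actual calculation.
# Usage: estimate_seconds_left TOTAL_FILES PROCESSED_FILES ELAPSED_SECONDS
estimate_seconds_left() {
    total=$1
    processed=$2
    elapsed=$3
    # No rate can be derived until some files have been processed.
    if [ "$processed" -le 0 ] || [ "$elapsed" -le 0 ]; then
        echo "unknown"
        return 1
    fi
    remaining=$(( total - processed ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    # remaining / (processed / elapsed), reordered to stay in integer arithmetic
    echo $(( remaining * elapsed / processed ))
}
```

For example, with an estimated 1000000 files, 400000 processed in 7200 seconds, estimate_seconds_left 1000000 400000 7200 prints 10800, which is 3 hours. A longer run gives a larger sample for the rate, which is consistent with the estimate improving over time.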
The rebalance status is shown as completed when the rebalance is complete. For example:
# gluster volume rebalance test-volume status
Node           Rebalanced  size      scanned   failures  skipped  status       run time
               -files                                                          in h:m:s
-------------  ----------  --------  --------  --------  -------  -----------  --------
10.70.37.01    118715      115.9GB   768835    0         30988    completed    3:52:44
10.70.37.02    148113      144.6GB   1242793   0         44258    completed    4:36:27
10.70.37.03    148226      144.8GB   1261041   0         44212    completed    4:36:27
10.70.37.04    119558      116.8GB   848517    0         28239    completed    3:49:35
volume rebalance: test-volume: success
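A script can decide whether a rebalance has finished by checking the status column of every node row. The following is a minimal sketch, assuming a POSIX shell with awk and the column layout shown in the examples above; rebalance_all_completed is a hypothetical helper name. It treats a line as a node row when its second field is a number, and reads the status from the field just before the run time.

```shell
# Hypothetical helper: reads "rebalance status" output on stdin.
# Exit status 0 if at least one node row exists and every node row
# reports "completed"; exit status 1 otherwise.
rebalance_all_completed() {
    awk '
        $2 ~ /^[0-9]+$/ {                        # node row: 2nd field is the file count
            rows++
            if ($(NF - 1) == "completed") finished++   # status precedes the run time
        }
        END { exit !(rows > 0 && rows == finished) }
    '
}

# Example (not run here): poll once a minute until every node is done.
# until gluster volume rebalance test-volume status | rebalance_all_completed; do
#     sleep 60
# done
```

The commented loop does not end if a node stays in another state, such as stopped, so a real script should also check for those states or give up after a time limit.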
11.9.3. Stopping a Rebalance Operation
To stop a rebalance operation, use the following command:
# gluster volume rebalance VOLNAME stop
For example:
# gluster volume rebalance test-volume stop
Node           Rebalanced  size      scanned   failures  skipped  status       run time
               -files                                                          in h:m:s
-------------  ----------  --------  --------  --------  -------  -----------  --------
10.70.37.01    106504      104.0GB   558111    0         0        stopped      3:02:24
10.70.37.02    102299      99.9GB    725239    0         0        stopped      3:02:24
10.70.37.03    102264      99.9GB    737364    0         0        stopped      3:02:24
10.70.37.04    106813      104.3GB   646581    0         0        stopped      3:02:24
Estimated time left for rebalance to complete : 2:06:38