11.5. Migrating Volumes
Note
Before performing a replace-brick operation, review the known issues related to the replace-brick operation in the Red Hat Gluster Storage 3.2 Release Notes.
11.5.1. Replacing a Subvolume on a Distribute or Distribute-replicate Volume
- Add the new bricks to the volume.
# gluster volume add-brick VOLNAME [replica <COUNT>] NEW-BRICK
Example 11.1. Adding a Brick to a Distribute Volume
# gluster volume add-brick test-volume server5:/rhgs/brick5
Add Brick successful
- Verify the volume information using the command:
# gluster volume info
Volume Name: test-volume
Type: Distribute
Status: Started
Number of Bricks: 5
Bricks:
Brick1: server1:/rhgs/brick1
Brick2: server2:/rhgs/brick2
Brick3: server3:/rhgs/brick3
Brick4: server4:/rhgs/brick4
Brick5: server5:/rhgs/brick5
Note
In case of a Distribute-replicate volume, you must specify the replica count in the add-brick command and provide the same number of bricks as the replica count to the add-brick command.
- Remove the bricks to be replaced from the subvolume.
- Start the remove-brick operation using the command:
# gluster volume remove-brick VOLNAME [replica <COUNT>] <BRICK> start
Example 11.2. Start a remove-brick operation on a distribute volume
# gluster volume remove-brick test-volume server2:/rhgs/brick2 start
Remove Brick start successful
- View the status of the remove-brick operation using the command:
# gluster volume remove-brick VOLNAME [replica <COUNT>] <BRICK> status
Example 11.3. View the Status of remove-brick Operation
# gluster volume remove-brick test-volume server2:/rhgs/brick2 status
Node      Rebalanced-files      size   scanned   failures        status
------------------------------------------------------------------
server2                 16  16777216        52          0   in progress
Keep monitoring the remove-brick operation status by running the above command. When the status field in the output of the remove-brick status command shows complete, proceed to the next step.
- Commit the remove-brick operation using the command:
# gluster volume remove-brick VOLNAME [replica <COUNT>] <BRICK> commit
Example 11.4. Commit the remove-brick Operation on a Distribute Volume
# gluster volume remove-brick test-volume server2:/rhgs/brick2 commit
- Verify the volume information using the command:
# gluster volume info
Volume Name: test-volume
Type: Distribute
Status: Started
Number of Bricks: 4
Bricks:
Brick1: server1:/rhgs/brick1
Brick3: server3:/rhgs/brick3
Brick4: server4:/rhgs/brick4
Brick5: server5:/rhgs/brick5
- Verify the content on the brick after committing the remove-brick operation on the volume. If any files are left over, copy them through a FUSE or NFS mount.
- Verify if there are any pending files on the bricks of the subvolume.
Along with files, all the application-specific extended attributes must be copied. glusterFS also uses extended attributes to store its internal data. The extended attributes used by glusterFS are of the form trusted.glusterfs.*, trusted.afr.*, and trusted.gfid. Any extended attributes other than the ones listed above must also be copied.
To copy the application-specific extended attributes and to achieve an effect similar to the one described above, use the following shell script:
Syntax:
# copy.sh <glusterfs-mount-point> <brick>
Example 11.5. Code Snippet Usage
If the mount point is /mnt/glusterfs and the brick path is /rhgs/brick1, then the script must be run as:
# copy.sh /mnt/glusterfs /rhgs/brick1
#!/bin/bash
# Copy files left on a brick, with their application-specific xattrs, to the volume mount.
MOUNT=$1
BRICK=$2

for file in `find "$BRICK" ! -type d`; do
    # Path of the file relative to the brick root
    rpath=`echo "$file" | sed -e "s#$BRICK\(.*\)#\1#g"`
    rdir=`dirname "$rpath"`

    cp -fv "$file" "$MOUNT/$rdir";

    # Copy every xattr except the ones glusterFS uses internally
    for xattr in `getfattr -e hex -m. -d "$file" 2>/dev/null | sed -e '/^#/d' | grep -v -E "trusted.glusterfs.*" | grep -v -E "trusted.afr.*" | grep -v "trusted.gfid"`; do
        key=`echo "$xattr" | cut -d"=" -f 1`
        value=`echo "$xattr" | cut -d"=" -f 2`

        setfattr "$MOUNT/$rpath" -n "$key" -v "$value"
    done
done
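Before copying, it can help to see which files are actually left on the brick. The following is a minimal sketch, not part of the product: the function name list_leftover_files is hypothetical, and the sketch assumes that the .glusterfs directory at the brick root holds only glusterFS internal data and can be skipped.

```shell
#!/bin/bash
# Hypothetical helper: list the files still present on a brick.
# Assumption: <brick>/.glusterfs contains glusterFS internal data only,
# so it is pruned from the search.
list_leftover_files() {
    local brick
    brick=${1%/}    # drop a trailing slash so the -path match works
    find "$brick" -path "$brick/.glusterfs" -prune -o ! -type d -print
}

# Usage: list_leftover_files /rhgs/brick2
```

An empty result suggests that nothing is left to copy from that brick.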
- To identify a list of files that are in a split-brain state, execute the command:
# gluster volume heal test-volume info split-brain
- If the output of the above command lists any files, compare the files across the bricks in a replica set, delete the bad files from the brick, and retain the correct copy of the file. The system administrator must choose the correct copy of the file manually.
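The status-monitoring step of this procedure (run remove-brick status repeatedly, then commit once the status is complete) can be scripted. The sketch below is illustrative only: the function names remove_brick_state and wait_for_remove_brick and the 30-second default interval are assumptions, and the parser assumes the status column contains the words shown in the example output (in progress, complete) or the word failed.

```shell
#!/bin/bash
# Hypothetical helper: summarize `gluster volume remove-brick ... status`
# output read from stdin as one of: failed, in progress, complete, unknown.
remove_brick_state() {
    local line failed=0 running=0 finished=0
    while IFS= read -r line; do
        case "$line" in
            Node*|-*|"")     ;;                  # header, separator, blank line
            *failed*)        failed=1 ;;
            *"in progress"*) running=1 ;;
            *complete*)      finished=$((finished + 1)) ;;
        esac
    done
    if [ "$failed" -eq 1 ]; then
        echo "failed"
    elif [ "$running" -eq 1 ]; then
        echo "in progress"
    elif [ "$finished" -gt 0 ]; then
        echo "complete"
    else
        echo "unknown"
    fi
}

# Hypothetical helper: poll until the operation is no longer in progress.
# Arguments: VOLNAME BRICK [INTERVAL_SECONDS]
wait_for_remove_brick() {
    local volume brick interval state
    volume=$1
    brick=$2
    interval=${3:-30}
    while :; do
        state=$(gluster volume remove-brick "$volume" "$brick" status | remove_brick_state)
        [ "$state" = "in progress" ] || break
        sleep "$interval"
    done
    echo "$state"
    [ "$state" = "complete" ]
}
```

For example, `wait_for_remove_brick test-volume server2:/rhgs/brick2 && gluster volume remove-brick test-volume server2:/rhgs/brick2 commit` would commit only after the migration reports complete.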
11.5.2. Replacing an Old Brick with a New Brick on a Replicate or Distribute-replicate Volume
- Ensure that the new brick (server5:/rhgs/brick1) that replaces the old brick (server0:/rhgs/brick1) is empty. Ensure that all other bricks are online. The brick that must be replaced can be in an offline state.
- Execute the replace-brick command with the force option:
# gluster volume replace-brick test-volume server0:/rhgs/brick1 server5:/rhgs/brick1 commit force
volume replace-brick: success: replace-brick commit successful
- Check if the new brick is online.
# gluster volume status
Status of volume: test-volume
Gluster process                        Port    Online  Pid
---------------------------------------------------------
Brick server5:/rhgs/brick1             49156   Y       5731
Brick server1:/rhgs/brick1             49153   Y       5354
Brick server2:/rhgs/brick1             49154   Y       5365
Brick server3:/rhgs/brick1             49155   Y       5376
- Data on the newly added brick is healed automatically. Healing can take time, depending on the amount of data to be healed. After replacing a brick, check the heal information to make sure that all the data has been healed before replacing or removing any other brick.
# gluster volume heal VOL_NAME info
For example:
# gluster volume heal test-volume info
Brick server5:/rhgs/brick1
Status: Connected
Number of entries: 0
Brick server1:/rhgs/brick1
Status: Connected
Number of entries: 0
Brick server2:/rhgs/brick1
Status: Connected
Number of entries: 0
Brick server3:/rhgs/brick1
Status: Connected
Number of entries: 0
The value of the Number of entries field is displayed as zero when the heal is complete.
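The heal check above amounts to confirming that every brick reports zero entries. As a minimal sketch, assuming the output format shown in the example, the per-brick counts can be summed. The function names pending_heal_entries and heal_is_complete are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: sum the "Number of entries" values found in
# `gluster volume heal VOLNAME info` output read from stdin.
# Limitation: a non-numeric value counts as zero, so this sketch does not
# detect a brick whose Status is not Connected.
pending_heal_entries() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Hypothetical helper: succeed when no entries are pending for VOLNAME.
heal_is_complete() {
    local pending
    pending=$(gluster volume heal "$1" info | pending_heal_entries)
    [ "$pending" -eq 0 ]
}
```

A caller could loop on `heal_is_complete test-volume` with a sleep between attempts before replacing or removing another brick.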
11.5.3. Replacing an Old Brick with a New Brick on a Distribute Volume
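Important
In case of a Distribute volume type, replacing a brick using this procedure will result in data loss.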
- Replace a brick with the commit force option:
# gluster volume replace-brick VOLNAME <BRICK> <NEW-BRICK> commit force
Example 11.6. Replace a brick on a Distribute Volume
# gluster volume replace-brick test-volume server0:/rhgs/brick1 server5:/rhgs/brick1 commit force
volume replace-brick: success: replace-brick commit successful
- Verify if the new brick is online.
# gluster volume status
Status of volume: test-volume
Gluster process                        Port    Online  Pid
---------------------------------------------------------
Brick server5:/rhgs/brick1             49156   Y       5731
Brick server1:/rhgs/brick1             49153   Y       5354
Brick server2:/rhgs/brick1             49154   Y       5365
Brick server3:/rhgs/brick1             49155   Y       5376
Note
All replace-brick command options except the commit force option are deprecated.
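The online check in this procedure reads the Online column for the new brick. A minimal sketch of that check follows. The function name brick_is_online is hypothetical, and the parser assumes that each brick occupies a single line and that Online is the second-to-last column, as in the example output above.

```shell
#!/bin/bash
# Hypothetical helper: read `gluster volume status` output from stdin and
# succeed if the given brick (HOST:/path) shows Y in the Online column.
# Assumption: Online is the second-to-last column on the brick's line.
brick_is_online() {
    local brick flag
    brick=$1
    flag=$(awk -v b="$brick" '$1 == "Brick" && $2 == b { print $(NF - 1); exit }')
    [ "$flag" = "Y" ]
}

# Usage: gluster volume status test-volume | brick_is_online server5:/rhgs/brick1
```

The function also fails when the brick does not appear in the output at all, which keeps a mistyped brick name from passing the check.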
11.5.4. Replacing an Old Brick with a New Brick on a Dispersed or Distributed-dispersed Volume
- Ensure that the new brick that replaces the old brick is empty. The brick that must be replaced can be in an offline state but all other bricks must be online.
- Execute the replace-brick command with the force option:
# gluster volume replace-brick VOL_NAME old_brick_path new_brick_path commit force
For example:
# gluster volume replace-brick test-volume server1:/rhgs/brick2 server1:/rhgs/brick2new commit force
volume replace-brick: success: replace-brick commit successful
The new brick you are adding could be from the same server or you can add a new server and then a new brick.
- Check if the new brick is online.
# gluster volume status
Status of volume: test-volume
Gluster process                           TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick server1:/rhgs/brick1                49187     0          Y       19927
Brick server1:/rhgs/brick2new             49188     0          Y       19946
Brick server2:/rhgs/brick3                49189     0          Y       19965
Brick server2:/rhgs/brick4                49190     0          Y       19984
Brick server3:/rhgs/brick5                49191     0          Y       20003
Brick server3:/rhgs/brick6                49192     0          Y       20022
NFS Server on localhost                   N/A       N/A        N       N/A
Self-heal Daemon on localhost             N/A       N/A        Y       20043
Task Status of Volume test-volume
------------------------------------------------------------------------------
There are no active volume tasks
- Data on the newly added brick is healed automatically. Healing can take time, depending on the amount of data to be healed. After replacing a brick, check the heal information to make sure that all the data has been healed before replacing or removing any other brick.
# gluster volume heal VOL_NAME info
For example:
# gluster volume heal test-volume info
Brick server1:/rhgs/brick1
Status: Connected
Number of entries: 0
Brick server1:/rhgs/brick2new
Status: Connected
Number of entries: 0
Brick server2:/rhgs/brick3
Status: Connected
Number of entries: 0
Brick server2:/rhgs/brick4
Status: Connected
Number of entries: 0
Brick server3:/rhgs/brick5
Status: Connected
Number of entries: 0
Brick server3:/rhgs/brick6
Status: Connected
Number of entries: 0
The value of the Number of entries field is displayed as zero when the heal is complete.
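The precondition for this procedure is that every brick other than the one being replaced is online. One way to check this before running replace-brick is sketched below. The function name offline_bricks_except is hypothetical, and the parser assumes the status layout shown in the example, with Online as the second-to-last column.

```shell
#!/bin/bash
# Hypothetical helper: read `gluster volume status` output from stdin and
# print every brick that is not online, ignoring the brick to be replaced.
# Assumption: Online is the second-to-last column on each brick line.
offline_bricks_except() {
    awk -v old="$1" '$1 == "Brick" && $2 != old && $(NF - 1) != "Y" { print $2 }'
}

# Usage: gluster volume status test-volume | offline_bricks_except server1:/rhgs/brick2
```

If the function prints nothing, no other brick is reported offline and the replacement can proceed.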
11.5.5. Reconfiguring a Brick in a Volume
The reset-brick subcommand is useful when you want to reconfigure a brick rather than replace it. reset-brick lets you replace a brick with another brick of the same location and UUID. For example, if you initially configured bricks so that they were identified with a hostname, but you want to use that hostname somewhere else, you can use reset-brick to stop the brick, reconfigure it so that it is identified by an IP address instead of the hostname, and return the reconfigured brick to the cluster.
- Ensure that the quorum minimum will still be met when the brick that you want to reset is taken offline.
- If possible, Red Hat recommends stopping I/O, and verifying that no heal operations are pending on the volume.
- Run the following command to kill the brick that you want to reset.
# gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH start
- Configure the offline brick according to your needs.
- Check that the volume's Volume ID displayed by gluster volume info matches the volume-id (if any) of the offline brick.
# gluster volume info VOLNAME
# cat /var/lib/glusterd/vols/VOLNAME/VOLNAME.HOSTNAME.BRICKPATH.vol | grep volume-id
For example, in the following dispersed volume, the Volume ID and the volume-id are both ab8a981a-a6d9-42f2-b8a5-0b28fe2c4548.
# gluster volume info vol
Volume Name: vol
Type: Disperse
Volume ID: ab8a981a-a6d9-42f2-b8a5-0b28fe2c4548
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: myhost:/brick/gluster/vol-1
# cat /var/lib/glusterd/vols/vol/vol.myhost.brick-gluster-vol-1.vol | grep volume-id
option volume-id ab8a981a-a6d9-42f2-b8a5-0b28fe2c4548
- Bring the reconfigured brick back online. There are two options for this:
- If your brick did not have a volume-id in the previous step, run:
# gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit
- If your brick's volume-id matches your volume's identifier, Red Hat recommends adding the force keyword to ensure that the operation succeeds.
# gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit force
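The choice between commit and commit force depends on comparing two identifiers. The sketch below shows one way to script that comparison. All three function names are hypothetical, and the parsers assume the gluster volume info and volfile formats shown in the example above. A mismatch between the two identifiers is not covered by this procedure, so the sketch reports it as an error instead of choosing a command.

```shell
#!/bin/bash
# Hypothetical helpers for deciding how to commit a reset-brick operation.

# Print the Volume ID from `gluster volume info VOLNAME` output on stdin.
volume_id_from_info() {
    awk '$1 == "Volume" && $2 == "ID:" { print $3; exit }'
}

# Print the volume-id option from a brick volfile, or nothing if it is absent.
volume_id_from_volfile() {
    awk '$1 == "option" && $2 == "volume-id" { print $3; exit }' "$1"
}

# Print the keywords to append to the reset-brick command.
# Arguments: VOLUME_ID BRICK_VOLUME_ID (the second may be empty)
reset_brick_commit_mode() {
    local volume_id brick_id
    volume_id=$1
    brick_id=$2
    if [ -z "$brick_id" ]; then
        echo "commit"            # brick has no volume-id
    elif [ "$brick_id" = "$volume_id" ]; then
        echo "commit force"      # identifiers match
    else
        echo "volume-id mismatch: $brick_id != $volume_id" >&2
        return 1
    fi
}
```

With the values from the example, `reset_brick_commit_mode` receives the same identifier twice and prints `commit force`.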