11.10. Replacing Hosts
11.10.1. Replacing a Host Machine with a Different Hostname
Important
In this example, the host with an unrecoverable failure is server0.example.com and the replacement machine is server5.example.com. The brick with an unrecoverable failure is server0.example.com:/rhgs/brick1 and the replacement brick is server5.example.com:/rhgs/brick1.
- Stop the geo-replication session, if one is configured, by executing the following command:
  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force
- Probe the new peer from one of the existing peers to bring it into the cluster.
  # gluster peer probe server5.example.com
- Ensure that the new brick (server5.example.com:/rhgs/brick1) that is replacing the old brick (server0.example.com:/rhgs/brick1) is empty.
- If the geo-replication session is configured, perform the following steps:
  - Set up the geo-replication session by generating the SSH keys:
    # gluster system:: execute gsec_create
  - Create the geo-replication session again with the force option to distribute the keys from the new nodes to the slave nodes.
    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL create push-pem force
  - After successfully setting up the shared storage volume, when a new node replaces an existing node in the cluster, the shared storage is not mounted automatically on this node, and the /etc/fstab entry for the shared storage is not added on this node. To make use of shared storage on this node, execute the following commands:
    # mount -t glusterfs <local node's ip>:gluster_shared_storage /var/run/gluster/shared_storage
    # cp /etc/fstab /var/run/gluster/fstab.tmp
    # echo "<local node's ip>:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults 0 0" >> /etc/fstab
    For more information on setting up the shared storage volume, see Section 11.12, “Setting up Shared Storage Volume”.
  - Configure the meta-volume for geo-replication:
    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config use_meta_volume true
    For more information on configuring the meta-volume, see Section 10.3.5, “Configuring a Meta-Volume”.
- Retrieve the brick paths in server0.example.com using the following command:
  # gluster volume info <VOLNAME>
  In this example, the brick path in server0.example.com is /rhgs/brick1. This has to be replaced with the brick in the newly added host, server5.example.com. An illustrative listing is shown below.
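  The following is a minimal sketch of such a listing, assuming the 1 x 2 replicate volume vol used in this example; exact values will differ in your deployment:
  # gluster volume info vol
  Volume Name: vol
  Type: Replicate
  Volume ID: 8f16258c-88a0-498f-bd53-368706af7496
  Status: Started
  Number of Bricks: 1 x 2 = 2
  Transport-type: tcp
  Bricks:
  Brick1: server0.example.com:/rhgs/brick1
  Brick2: server1.example.com:/rhgs/brick1
  (Output is illustrative; note the brick to be replaced, server0.example.com:/rhgs/brick1.)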
- Create the required brick path in server5.example.com. For example, if /rhgs/brick is the XFS mount point in server5.example.com, create a brick directory in that path:
  # mkdir /rhgs/brick1
- Execute the replace-brick command with the force option:
  # gluster volume replace-brick vol server0.example.com:/rhgs/brick1 server5.example.com:/rhgs/brick1 commit force
  volume replace-brick: success: replace-brick commit successful
- Verify that the new brick is online.
  # gluster volume status
  Status of volume: vol
  Gluster process                                Port     Online   Pid
  Brick server5.example.com:/rhgs/brick1         49156    Y        5731
  Brick server1.example.com:/rhgs/brick1         49153    Y        5354
- Initiate self-heal on the volume:
  # gluster volume heal VOLNAME
- The status of the heal process can be seen by executing the following command:
  # gluster volume heal VOLNAME info
- Detach the original machine from the trusted pool:
  # gluster peer detach server0.example.com
- Ensure that, after the self-heal completes, the extended attributes are set to zero on the other bricks in the replica. An illustrative check is shown below.
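  An illustrative check run on the other brick in the replica (server1.example.com:/rhgs/brick1); the gfid and volume-id values shown are placeholders taken from this example:
  # getfattr -d -m. -e hex /rhgs/brick1
  getfattr: Removing leading '/' from absolute path names
  # file: rhgs/brick1
  trusted.afr.vol-client-0=0x000000000000000000000000
  trusted.afr.vol-client-1=0x000000000000000000000000
  trusted.gfid=0x00000000000000000000000000000001
  trusted.glusterfs.volume-id=0x8f16258c88a0498fbd53368706af7496
  (Illustrative output; the important point is that both trusted.afr.vol-client-* values are all zeros.)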
  In this example, the extended attributes trusted.afr.vol-client-0 and trusted.afr.vol-client-1 have zero values. This means that the data on the two bricks is identical. If these attributes are not zero after self-heal completes, the data has not been synchronized correctly.
- Start the geo-replication session using the force option:
  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start force
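For reference, a minimal end-to-end sketch of the geo-replication commands used in this procedure, assuming a hypothetical master volume vol and a hypothetical slave volume slavevol on slave.example.com; substitute your own volume and host names:
  # gluster volume geo-replication vol slave.example.com::slavevol stop force
  # gluster system:: execute gsec_create
  # gluster volume geo-replication vol slave.example.com::slavevol create push-pem force
  # gluster volume geo-replication vol slave.example.com::slavevol config use_meta_volume true
  # gluster volume geo-replication vol slave.example.com::slavevol start force
  (Hypothetical names; run only the steps that apply to your setup.)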
11.10.2. Replacing a Host Machine with the Same Hostname
In this procedure, the original machine that has had an unrecoverable failure is server0.example.com, and the replacement machine has the same hostname. The replacement is assigned the UUID of the failed host by editing its /var/lib/glusterd/glusterd.info file.
- Stop the geo-replication session, if one is configured, by executing the following command:
  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force
- Stop the glusterd service on server0.example.com:
  # service glusterd stop
  Important
  If glusterd crashes, there is no functionality impact because the crash occurs during shutdown. For more information, see Section 23.3, “Resolving glusterd Crash”.
- Retrieve the UUID of the failed host (server0.example.com) from another node of the Red Hat Gluster Storage Trusted Storage Pool. An illustrative example follows. Note that the UUID of the failed host is b5ab2ec3-5411-45fa-a30f-43bd04caf96b.
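  A minimal sketch, assuming the UUID is looked up with gluster peer status on one of the surviving peers (for example, server1.example.com); the peer count and state shown are illustrative:
  # gluster peer status
  Number of Peers: 1

  Hostname: server0.example.com
  Uuid: b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  State: Peer in Cluster (Disconnected)
  (Illustrative output; only the Uuid of the failed host is needed.)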
- Edit the glusterd.info file in the new host and include the UUID of the host that you retrieved in the previous step.
  # cat /var/lib/glusterd/glusterd.info
  UUID=b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  operating-version=30703
  Note
  The operating version of this node must be the same as on the other nodes of the trusted storage pool.
- Select any host (for example, server1.example.com) in the Red Hat Gluster Storage Trusted Storage Pool and retrieve its UUID from the glusterd.info file.
  # grep -i uuid /var/lib/glusterd/glusterd.info
  UUID=8cc6377d-0153-4540-b965-a4015494461c
- Gather the peer information files from the host selected in the previous step (server1.example.com). Execute the following command on that host:
  # cp -a /var/lib/glusterd/peers /tmp/
- Remove the peer file corresponding to the failed host (server0.example.com) from the /tmp/peers directory:
  # rm /tmp/peers/b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  Note that this UUID corresponds to the UUID of the failed host (server0.example.com) retrieved in Step 3.
- Archive all the files and copy them to the failed host (server0.example.com):
  # cd /tmp; tar -cvf peers.tar peers
- Copy the archive created above to the new peer:
  # scp /tmp/peers.tar root@server0.example.com:/tmp
- Extract the archive and copy its contents to the /var/lib/glusterd/peers directory. Execute the following commands on the newly added host with the same hostname (server0.example.com) and IP address:
  # tar -xvf /tmp/peers.tar
  # cp peers/* /var/lib/glusterd/peers/
- Select any host in the cluster other than the node (server1.example.com) selected in step 5. Copy the peer file corresponding to the UUID retrieved in Step 5 to the new host (server0.example.com) by executing the following command on the selected host:
  # scp /var/lib/glusterd/peers/<UUID-retrieved-from-step5> root@server0.example.com:/var/lib/glusterd/peers/
- Retrieve the brick directory information by executing the following command on any host in the cluster. An illustrative example is shown below.
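  As in the previous procedure, a minimal sketch assuming gluster volume info is used on the volume vol from this example; output abridged to the brick listing, values illustrative:
  # gluster volume info vol
  Volume Name: vol
  Type: Replicate
  Bricks:
  Brick1: server0.example.com:/rhgs/brick1
  Brick2: server1.example.com:/rhgs/brick1
  (Output abridged and illustrative.)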
  In the above example, the brick path in server0.example.com is /rhgs/brick1. If the brick path does not exist in server0.example.com, perform steps a, b, and c below.
  - Create a brick path on the host server0.example.com:
    # mkdir /rhgs/brick1
  - Retrieve the volume ID from an existing brick on another host by executing the following command on any host that contains bricks for the volume:
    # getfattr -d -m. -e hex <brick-path>
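    An illustrative run on an existing brick of the volume (server1.example.com:/rhgs/brick1 in this example); only the trusted.glusterfs.volume-id value is needed, and the other attributes are placeholders:
    # getfattr -d -m. -e hex /rhgs/brick1
    getfattr: Removing leading '/' from absolute path names
    # file: rhgs/brick1
    trusted.afr.vol-client-0=0x000000000000000000000000
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.volume-id=0x8f16258c88a0498fbd53368706af7496
    (Illustrative output.)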
    Copy the volume-id. In the above example, the volume ID is 0x8f16258c88a0498fbd53368706af7496.
  - Set this volume ID on the brick created on the newly added host by executing the following command on the newly added host (server0.example.com):
    # setfattr -n trusted.glusterfs.volume-id -v <volume-id> <brick-path>
    For example:
    # setfattr -n trusted.glusterfs.volume-id -v 0x8f16258c88a0498fbd53368706af7496 /rhgs/brick1
  Data recovery is possible only if the volume type is replicate or distribute-replicate. If the volume type is plain distribute, you can skip steps 12 and 13.
- Create a FUSE mount point to mount the glusterFS volume:
  # mount -t glusterfs <server-name>:/VOLNAME <mount>
  A concrete example for this deployment is shown below.
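  A minimal sketch for this example, assuming the volume is named vol and is mounted from the surviving peer server1.example.com onto /mnt/r2 (the mount directory is created first if it does not exist):
  # mkdir -p /mnt/r2
  # mount -t glusterfs server1.example.com:/vol /mnt/r2
  (The volume name vol and the use of server1.example.com are assumptions based on this example.)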
- Perform the following operations to change the Automatic File Replication extended attributes so that the heal process happens from the other brick (server1.example.com:/rhgs/brick1) in the replica pair to the new brick (server0.example.com:/rhgs/brick1). Note that /mnt/r2 is the FUSE mount path.
  - Create a new directory on the mount point, ensuring that a directory with that name is not already present:
    # mkdir /mnt/r2/<name-of-nonexistent-dir>
  - Delete the directory, then set and remove the trusted.non-existent-key extended attribute on the mount point:
    # rmdir /mnt/r2/<name-of-nonexistent-dir>
    # setfattr -n trusted.non-existent-key -v abc /mnt/r2
    # setfattr -x trusted.non-existent-key /mnt/r2
  - Ensure that the extended attributes on the other brick in the replica (in this example, trusted.afr.vol-client-0 on server1.example.com:/rhgs/brick1) are not set to zero. An illustrative check is shown below.
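    An illustrative check on the surviving brick (server1.example.com:/rhgs/brick1); the non-zero trusted.afr value shown is a placeholder for whatever pending counters appear on your system:
    # getfattr -d -m. -e hex /rhgs/brick1
    getfattr: Removing leading '/' from absolute path names
    # file: rhgs/brick1
    trusted.afr.vol-client-0=0x000000000000000300000002
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.volume-id=0x8f16258c88a0498fbd53368706af7496
    (Values are illustrative; what matters is that trusted.afr.vol-client-0 is non-zero.)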
  Note
  You must perform steps 12, 13, and 14 for all volumes that have bricks from server0.example.com.
- Start the glusterd service:
  # service glusterd start
- Perform the self-heal operation on the restored volume:
  # gluster volume heal VOLNAME
- You can view the self-heal status of the volume by executing the following command:
  # gluster volume heal VOLNAME info
- If the geo-replication session is configured, perform the following steps:
  - Set up the geo-replication session by generating the SSH keys:
    # gluster system:: execute gsec_create
  - Create the geo-replication session again with the force option to distribute the keys from the new nodes to the slave nodes.
    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL create push-pem force
  - After successfully setting up the shared storage volume, when a new node replaces an existing node in the cluster, the shared storage is not mounted automatically on this node, and the /etc/fstab entry for the shared storage is not added on this node. To make use of shared storage on this node, execute the following commands:
    # mount -t glusterfs <local node's ip>:gluster_shared_storage /var/run/gluster/shared_storage
    # cp /etc/fstab /var/run/gluster/fstab.tmp
    # echo "<local node's ip>:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults 0 0" >> /etc/fstab
    For more information on setting up the shared storage volume, see Section 11.12, “Setting up Shared Storage Volume”.
  - Configure the meta-volume for geo-replication:
    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config use_meta_volume true
- Start the geo-replication session using the force option:
  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start force
If there are only two hosts in the Red Hat Gluster Storage Trusted Storage Pool and the host server0.example.com must be replaced, perform the following steps:
- Stop the geo-replication session, if one is configured, by executing the following command:
  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force
- Stop the glusterd service on server0.example.com:
  # service glusterd stop
  Important
  If glusterd crashes, there is no functionality impact because the crash occurs during shutdown. For more information, see Section 23.3, “Resolving glusterd Crash”.
- Retrieve the UUID of the failed host (server0.example.com) from another peer in the Red Hat Gluster Storage Trusted Storage Pool, as in the previous procedure. Note that the UUID of the failed host is b5ab2ec3-5411-45fa-a30f-43bd04caf96b.
- Edit the glusterd.info file in the new host (server0.example.com) and include the UUID of the host that you retrieved in the previous step.
  # cat /var/lib/glusterd/glusterd.info
  UUID=b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  operating-version=30703
  Note
  The operating version of this node must be the same as on the other nodes of the trusted storage pool.
- Create the peer file on the newly created host (server0.example.com) as /var/lib/glusterd/peers/<uuid-of-other-peer>, named after the UUID of the other host (server1.example.com). The UUID of that host can be obtained with the following command:
  # gluster system:: uuid get
  Example 11.6. Example to obtain the UUID of a host
  For example:
  # gluster system:: uuid get
  UUID: 1d9677dc-6159-405e-9319-ad85ec030880
  In this case, the UUID of the other peer is 1d9677dc-6159-405e-9319-ad85ec030880.
- Create the file /var/lib/glusterd/peers/1d9677dc-6159-405e-9319-ad85ec030880 in server0.example.com with the following command:
  # touch /var/lib/glusterd/peers/1d9677dc-6159-405e-9319-ad85ec030880
  The file you create must contain the following information:
  UUID=<uuid-of-other-node>
  state=3
  hostname=<hostname>
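  For this example, the resulting file would contain the following; the UUID is the one retrieved above and the hostname is that of the other peer (server1.example.com). Verify the format against a peer file on a working node:
  # cat /var/lib/glusterd/peers/1d9677dc-6159-405e-9319-ad85ec030880
  UUID=1d9677dc-6159-405e-9319-ad85ec030880
  state=3
  hostname=server1.example.com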
- Continue to perform steps 12 to 18 as documented in the previous procedure.