11.8. Replacing Hosts
11.8.1. Replacing a Host Machine with a Different Hostname
Important

In the following example, the machine that has experienced an unrecoverable failure is server0.example.com and the replacement machine is server5.example.com. The brick with an unrecoverable failure is server0.example.com:/rhgs/brick1 and the replacement brick is server5.example.com:/rhgs/brick1.
- If a geo-replication session is configured, stop it by executing the following command:

  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force

- Probe the new peer from one of the existing peers to bring it into the cluster:

  # gluster peer probe server5.example.com

- Ensure that the new brick (server5.example.com:/rhgs/brick1) that is replacing the old brick (server0.example.com:/rhgs/brick1) is empty.

- If the geo-replication session is configured, perform the following steps:

  - Set up the geo-replication session by generating the ssh keys:

    # gluster system:: execute gsec_create

  - Create the geo-replication session again with the force option to distribute the keys from the new nodes to the Slave nodes:

    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL create push-pem force

  - After successfully setting up the shared storage volume, when a new node is replaced in the cluster, the shared storage is not mounted automatically on this node, and the /etc/fstab entry is not added for the shared storage on this node. To make use of shared storage on this node, execute the following commands:

    # mount -t glusterfs <local node's ip>:gluster_shared_storage /var/run/gluster/shared_storage
    # cp /etc/fstab /var/run/gluster/fstab.tmp
    # echo "<local node's ip>:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults 0 0" >> /etc/fstab

    For more information on setting up the shared storage volume, see Section 11.10, “Setting up Shared Storage Volume”.

  - Configure the meta-volume for geo-replication:

    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config use_meta_volume true

    For more information on configuring the meta-volume, see Section 10.3.5, “Configuring a Meta-Volume”.
- Retrieve the brick paths in server0.example.com using the following command:

  # gluster volume info <VOLNAME>

  Volume Name: vol
  Type: Replicate
  Volume ID: 0xde822e25ebd049ea83bfaa3c4be2b440
  Status: Started
  Snap Volume: no
  Number of Bricks: 1 x 2 = 2
  Transport-type: tcp
  Bricks:
  Brick1: server0.example.com:/rhgs/brick1
  Brick2: server1.example.com:/rhgs/brick1
  Options Reconfigured:
  performance.readdir-ahead: on
  snap-max-hard-limit: 256
  snap-max-soft-limit: 90
  auto-delete: disable

  The brick path in server0.example.com is /rhgs/brick1. This has to be replaced with the brick in the newly added host, server5.example.com.

- Create the required brick path in server5.example.com. For example, if /rhgs is the XFS mount point on server5.example.com, create a brick directory in that path:

  # mkdir /rhgs/brick1

- Execute the replace-brick command with the force option:

  # gluster volume replace-brick vol server0.example.com:/rhgs/brick1 server5.example.com:/rhgs/brick1 commit force
  volume replace-brick: success: replace-brick commit successful

- Verify that the new brick is online:

  # gluster volume status
  Status of volume: vol
  Gluster process                                  Port    Online  Pid
  Brick server5.example.com:/rhgs/brick1           49156   Y       5731
  Brick server1.example.com:/rhgs/brick1           49153   Y       5354

- Initiate self-heal on the volume:

  # gluster volume heal VOLNAME

- The status of the heal process can be seen by executing the following command:

  # gluster volume heal VOLNAME info

- Detach the original machine from the trusted pool:

  # gluster peer detach server0.example.com

- Ensure that after the self-heal completes, the extended attributes are set to zero on the other bricks in the replica:

  # getfattr -d -m. -e hex /rhgs/brick1
  getfattr: Removing leading '/' from absolute path names
  # file: rhgs/brick1
  security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
  trusted.afr.vol-client-0=0x000000000000000000000000
  trusted.afr.vol-client-1=0x000000000000000000000000
  trusted.gfid=0x00000000000000000000000000000001
  trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
  trusted.glusterfs.volume-id=0xde822e25ebd049ea83bfaa3c4be2b440

  In this example, the extended attributes trusted.afr.vol-client-0 and trusted.afr.vol-client-1 have zero values. This means that the data on the two bricks is identical. If these attributes are not zero after self-heal is completed, the data has not been synchronised correctly.

- Start the geo-replication session using the force option:

  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start force
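As an optional sanity check after completing this procedure (a sketch only, not part of the official steps; it reuses the example volume name vol and the hostnames from above), you can re-run the commands already shown to confirm the replacement:

  Confirm that server5.example.com is a trusted peer and server0.example.com has been detached:
  # gluster peer status

  Confirm that the replacement brick server5.example.com:/rhgs/brick1 is online:
  # gluster volume status vol

  Confirm that there are no outstanding heal entries before relying on the new brick:
  # gluster volume heal vol info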
11.8.2. Replacing a Host Machine with the Same Hostname
You can replace a failed host with a new machine that has the same hostname. In the following example, the failed host is server0.example.com. The UUID that identifies a host in the trusted storage pool is stored in the /var/lib/glusterd/glusterd.info file.
- If a geo-replication session is configured, stop it by executing the following command:

  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force

- Stop the glusterd service on server0.example.com:

  # service glusterd stop

  Important

  If glusterd crashes, there is no functional impact because the crash occurs during shutdown. For more information, see Section 24.3, “Resolving glusterd Crash”.

- Retrieve the UUID of the failed host (server0.example.com) from another peer in the Red Hat Gluster Storage Trusted Storage Pool by executing the following command:

  # gluster peer status
  Number of Peers: 2

  Hostname: server1.example.com
  Uuid: 1d9677dc-6159-405e-9319-ad85ec030880
  State: Peer in Cluster (Connected)

  Hostname: server0.example.com
  Uuid: b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  State: Peer Rejected (Connected)

  Note that the UUID of the failed host is b5ab2ec3-5411-45fa-a30f-43bd04caf96b.

- Edit the glusterd.info file in the new host and include the UUID of the host you retrieved in the previous step.

  # cat /var/lib/glusterd/glusterd.info
  UUID=b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  operating-version=30703

  Note

  The operating version of this node must be the same as in the other nodes of the trusted storage pool.

- Select any host (for example, server1.example.com) in the Red Hat Gluster Storage Trusted Storage Pool and retrieve its UUID from the glusterd.info file:

  # grep -i uuid /var/lib/glusterd/glusterd.info
  UUID=8cc6377d-0153-4540-b965-a4015494461c

- Gather the peer information files from the host (server1.example.com) selected in the previous step. Execute the following command on that host (server1.example.com):

  # cp -a /var/lib/glusterd/peers /tmp/

- Remove the peer file corresponding to the failed host (server0.example.com) from the /tmp/peers directory:

  # rm /tmp/peers/b5ab2ec3-5411-45fa-a30f-43bd04caf96b

  Note that the UUID corresponds to the UUID of the failed host (server0.example.com) retrieved in Step 3.

- Archive all the files and copy them to the failed host (server0.example.com):

  # cd /tmp; tar -cvf peers.tar peers

- Copy the archive created above to the new peer:

  # scp /tmp/peers.tar root@server0.example.com:/tmp

- Copy the extracted content to the /var/lib/glusterd/peers directory. Execute the following commands on the newly added host with the same name (server0.example.com) and IP address:

  # tar -xvf /tmp/peers.tar
  # cp peers/* /var/lib/glusterd/peers/

- Select any other host in the cluster other than the node (server1.example.com) selected in step 5. Copy the peer file corresponding to the UUID of the host retrieved in Step 4 to the new host (server0.example.com) by executing the following command:

  # scp /var/lib/glusterd/peers/<UUID-retrieved-from-step4> root@server0.example.com:/var/lib/glusterd/peers/

- Retrieve the brick directory information by executing the following command on any host in the cluster:
  # gluster volume info

  Volume Name: vol
  Type: Replicate
  Volume ID: 0x8f16258c88a0498fbd53368706af7496
  Status: Started
  Snap Volume: no
  Number of Bricks: 1 x 2 = 2
  Transport-type: tcp
  Bricks:
  Brick1: server0.example.com:/rhgs/brick1
  Brick2: server1.example.com:/rhgs/brick1
  Options Reconfigured:
  performance.readdir-ahead: on
  snap-max-hard-limit: 256
  snap-max-soft-limit: 90
  auto-delete: disable

  In the above example, the brick path in server0.example.com is /rhgs/brick1. If the brick path does not exist in server0.example.com, perform steps a, b, and c.

  - Create a brick path in the host, server0.example.com:

    # mkdir /rhgs/brick1

  - Retrieve the volume ID from the existing brick of another host by executing the following command on any host that contains the bricks for the volume:

    # getfattr -d -m. -e hex <brick-path>

    Copy the volume-id.

    # getfattr -d -m. -e hex /rhgs/brick1
    getfattr: Removing leading '/' from absolute path names
    # file: rhgs/brick1
    trusted.afr.vol-client-0=0x000000000000000000000000
    trusted.afr.vol-client-1=0x000000000000000000000000
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
    trusted.glusterfs.volume-id=0x8f16258c88a0498fbd53368706af7496

    In the above example, the volume id is 0x8f16258c88a0498fbd53368706af7496.

  - Set this volume ID on the brick created in the newly added host by executing the following command on the newly added host (server0.example.com):

    # setfattr -n trusted.glusterfs.volume-id -v <volume-id> <brick-path>

    For example:

    # setfattr -n trusted.glusterfs.volume-id -v 0x8f16258c88a0498fbd53368706af7496 /rhgs/brick1

  Data recovery is possible only if the volume type is replicate or distribute-replicate. If the volume type is plain distribute, you can skip steps 12 and 13.

- Create a FUSE mount point to mount the glusterFS volume:
  # mount -t glusterfs <server-name>:/VOLNAME <mount>

- Perform the following operations to change the Automatic File Replication extended attributes so that the heal process happens from the other brick (server1.example.com:/rhgs/brick1) in the replica pair to the new brick (server0.example.com:/rhgs/brick1). Note that /mnt/r2 is the FUSE mount path. (A consolidated sketch of steps 12 to 14 is shown after this procedure.)

  - Create a new directory on the mount point and ensure that a directory with such a name is not already present:

    # mkdir /mnt/r2/<name-of-nonexistent-dir>

  - Delete the directory and set the extended attributes:

    # rmdir /mnt/r2/<name-of-nonexistent-dir>
    # setfattr -n trusted.non-existent-key -v abc /mnt/r2
    # setfattr -x trusted.non-existent-key /mnt/r2

  - Ensure that the extended attributes on the other bricks in the replica (in this example, trusted.afr.vol-client-0) are not set to zero:

    # getfattr -d -m. -e hex /rhgs/brick1
    # file: rhgs/brick1
    security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
    trusted.afr.vol-client-0=0x000000000000000300000002
    trusted.afr.vol-client-1=0x000000000000000000000000
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
    trusted.glusterfs.volume-id=0x8f16258c88a0498fbd53368706af7496

  Note

  You must perform steps 12, 13, and 14 for all the volumes that have bricks from server0.example.com.

- Start the glusterd service:

  # service glusterd start

- Perform the self-heal operation on the restored volume:

  # gluster volume heal VOLNAME

- You can view the gluster volume self-heal status by executing the following command:

  # gluster volume heal VOLNAME info

- If the geo-replication session is configured, perform the following steps:
  - Set up the geo-replication session by generating the ssh keys:

    # gluster system:: execute gsec_create

  - Create the geo-replication session again with the force option to distribute the keys from the new nodes to the Slave nodes:

    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL create push-pem force

  - After successfully setting up the shared storage volume, when a new node is replaced in the cluster, the shared storage is not mounted automatically on this node, and the /etc/fstab entry is not added for the shared storage on this node. To make use of shared storage on this node, execute the following commands:

    # mount -t glusterfs <local node's ip>:gluster_shared_storage /var/run/gluster/shared_storage
    # cp /etc/fstab /var/run/gluster/fstab.tmp
    # echo "<local node's ip>:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults 0 0" >> /etc/fstab

    For more information on setting up the shared storage volume, see Section 11.10, “Setting up Shared Storage Volume”.

  - Configure the meta-volume for geo-replication:

    # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL config use_meta_volume true

- Start the geo-replication session using the force option:

  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start force
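The following is a minimal sketch that consolidates steps 12 to 14 above into one sequence. It is an illustration only, not a replacement for the documented steps; the volume name vol, the mount path /mnt/r2, the choice of server1.example.com as the mount server, and the placeholder directory name testdir123 are assumptions taken from or added to the example above:

  Mount the volume over FUSE from the surviving replica host (the mount directory must exist):
  # mkdir -p /mnt/r2
  # mount -t glusterfs server1.example.com:/vol /mnt/r2

  Create and remove a throw-away directory, then set and remove a dummy extended attribute on the mount root. This marks pending changes against the new, empty brick so that self-heal copies data from the surviving brick:
  # mkdir /mnt/r2/testdir123
  # rmdir /mnt/r2/testdir123
  # setfattr -n trusted.non-existent-key -v abc /mnt/r2
  # setfattr -x trusted.non-existent-key /mnt/r2

  On the surviving host (server1.example.com), verify that the pending counter for the replaced brick, trusted.afr.vol-client-0, is now non-zero:
  # getfattr -d -m. -e hex /rhgs/brick1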
If there are only 2 hosts in the Red Hat Gluster Storage Trusted Storage Pool where the host server0.example.com must be replaced, perform the following steps:
- If a geo-replication session is configured, stop it by executing the following command:

  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop force

- Stop the glusterd service on server0.example.com:

  # service glusterd stop

  Important

  If glusterd crashes, there is no functional impact because the crash occurs during shutdown. For more information, see Section 24.3, “Resolving glusterd Crash”.

- Retrieve the UUID of the failed host (server0.example.com) from the other peer in the Red Hat Gluster Storage Trusted Storage Pool by executing the following command:

  # gluster peer status
  Number of Peers: 1

  Hostname: server0.example.com
  Uuid: b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  State: Peer Rejected (Connected)

  Note that the UUID of the failed host is b5ab2ec3-5411-45fa-a30f-43bd04caf96b.

- Edit the glusterd.info file in the new host (server0.example.com) and include the UUID of the host you retrieved in the previous step.

  # cat /var/lib/glusterd/glusterd.info
  UUID=b5ab2ec3-5411-45fa-a30f-43bd04caf96b
  operating-version=30703

  Note

  The operating version of this node must be the same as in the other nodes of the trusted storage pool.

- Create the peer file in the newly created host (server0.example.com) in /var/lib/glusterd/peers/<uuid-of-other-peer>, named with the UUID of the other host (server1.example.com). The UUID of the host can be obtained with the following command:

  # gluster system:: uuid get

  Example 11.7. Example to obtain the UUID of a host

  For example:
  # gluster system:: uuid get
  UUID: 1d9677dc-6159-405e-9319-ad85ec030880

  In this case the UUID of the other peer is 1d9677dc-6159-405e-9319-ad85ec030880.

- Create the file /var/lib/glusterd/peers/1d9677dc-6159-405e-9319-ad85ec030880 in server0.example.com with the following command:

  # touch /var/lib/glusterd/peers/1d9677dc-6159-405e-9319-ad85ec030880

  The file you create must contain the following information (see the example peer file after this procedure):

  UUID=<uuid-of-other-node>
  state=3
  hostname=<hostname>

- Continue to perform steps 12 to 18 as documented in the previous procedure.
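As an illustration only, filling the template above with the example values from this procedure (the UUID 1d9677dc-6159-405e-9319-ad85ec030880 and the hostname server1.example.com; substitute your own values), the peer file on server0.example.com would contain:

  UUID=1d9677dc-6159-405e-9319-ad85ec030880
  state=3
  hostname=server1.example.com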