Chapter 13. Replacing hosts
13.1. Replacing the primary hyperconverged host using ansible
Follow this section to replace the hyperconverged host that you used to perform all deployment operations.
When self-signed encryption is enabled, replacing a node is a disruptive process that requires virtual machines and the Hosted Engine to be shut down.
- (Optional) If encryption using a Certificate Authority is enabled, follow the steps under Expanding Volumes in the Network Encryption chapter of the Red Hat Gluster Storage 3.4 Administration Guide.
Move the server to be replaced into Maintenance mode.
- In the Administration Portal, click Compute → Hosts and select the host to replace.
- Click Management → Maintenance and click OK to move the host to Maintenance mode.
Install the replacement host
Follow the instructions in Deploying Red Hat Hyperconverged Infrastructure for Virtualization to install the physical machine and configure storage on the new host.
Configure the replacement host
Follow the instructions in Section 13.3, “Preparing a replacement hyperconverged host using ansible”.
(Optional) If encryption with self-signed certificates is enabled:
- Generate the private key and self-signed certificate on the replacement host. See the Red Hat Gluster Storage Administration Guide for details: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/chap-network_encryption#chap-Network_Encryption-Prereqs.
- On a healthy host, create a copy of the /etc/ssl/glusterfs.ca file:
# cp /etc/ssl/glusterfs.ca /etc/ssl/glusterfs.ca.bk
- Append the new host’s certificate to the content of the original /etc/ssl/glusterfs.ca file.
- Distribute the /etc/ssl/glusterfs.ca file to all hosts in the cluster, including the new host.
- Run the following command on the replacement host to enable management encryption:
# touch /var/lib/glusterd/secure-access
- Include the new host in the value of the auth.ssl-allow volume option by running the following command for each volume:
# gluster volume set <volname> auth.ssl-allow "<old_host1>,<old_host2>,<new_host>"
- Restart the glusterd service on all hosts:
# systemctl restart glusterd
- Follow the steps in Section 4.1, “Configuring TLS/SSL using self-signed certificates” to remount all gluster processes.
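As an optional check that is not part of the documented procedure, you can confirm the updated option value on any host before restarting glusterd; <volname> is the same placeholder used above:
# gluster volume get <volname> auth.ssl-allow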
Add the replacement host to the cluster.
Run the following command from any host already in the cluster.
# gluster peer probe <new_host>
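To confirm that the probe succeeded (an optional check), the replacement host should be listed as connected in the peer status output:
# gluster peer status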
Move the Hosted Engine into Maintenance mode:
# hosted-engine --set-maintenance --mode=global
Stop the ovirt-engine service.
# systemctl stop ovirt-engine
Update the database.
# hosted-engine --set-shared-config storage <new_host_IP>:/engine
Start the ovirt-engine service.
# systemctl start ovirt-engine
- Stop all virtual machines except the Hosted Engine.
- Move all storage domains except the Hosted Engine domain into Maintenance mode.
Stop the Hosted Engine virtual machine.
Run the following command on the existing server that hosts the Hosted Engine.
# hosted-engine --vm-shutdown
Stop high availability services on all hosts.
# systemctl stop ovirt-ha-agent
# systemctl stop ovirt-ha-broker
Disconnect Hosted Engine storage from the hyperconverged host.
Run the following command on the existing server that hosts the Hosted Engine.
# hosted-engine --disconnect-storage
Update the Hosted Engine configuration file.
Edit the storage parameter in the /etc/ovirt-hosted-engine/hosted-engine.conf file to use the replacement host:
storage=<new_server_IP>:/engine
Note: To configure the Hosted Engine for new hosts, use the following command:
# hosted-engine --set-shared-config storage <new_server_IP>:/engine
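To confirm the change on each host (an optional check; <new_server_IP> is the same placeholder used above):
# grep '^storage=' /etc/ovirt-hosted-engine/hosted-engine.conf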
Restart high availability services on all hosts.
# systemctl restart ovirt-ha-agent
# systemctl restart ovirt-ha-broker
Reboot the existing and replacement hosts.
Wait until all hosts are available before continuing.
Take the Hosted Engine out of Maintenance mode.
# hosted-engine --set-maintenance --mode=none
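You can then watch the Hosted Engine come back up (an optional check, not part of the documented steps):
# hosted-engine --vm-status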
Verify that the replacement host is used.
On all hyperconverged hosts, verify that the engine volume is mounted from the replacement host by checking the IP address in the output of the mount command.
Activate storage domains.
Verify that storage domains mount using the IP address of the replacement host.
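For example, the following check (the grep pattern is illustrative) shows which server the engine volume is mounted from:
# mount | grep ':/engine'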
Using the RHV Management UI, add the replacement host.
Specify that the replacement host be used to host the Hosted Engine.
Move the replacement host into Maintenance mode.
# hosted-engine --set-maintenance --mode=global
Reboot the replacement host.
Wait until the host is back online before continuing.
Activate the replacement host from the RHV Management UI.
Ensure that all volumes are mounted using the IP address of the replacement host.
Replace engine volume brick.
Replace the brick on the old host that belongs to the engine volume with a new brick on the replacement host.
- Click Storage → Volumes and select the volume.
- Click the Bricks subtab.
- Select the brick to replace, and then click Replace brick.
- Select the host that hosts the brick being replaced.
- In the Replace brick window, provide the path to the new brick.
Remove the old host.
- Click Compute → Hosts and select the old host.
- Click Management → Maintenance to move the host to maintenance mode.
- Click Remove. The Remove Host(s) confirmation dialog appears.
- If there are still volume bricks on this host, or the host is non-responsive, check the Force Remove checkbox.
- Click OK.
Detach the old host from the cluster.
# gluster peer detach <old_host_IP> force
On the replacement host, run the following command to remove metadata from the previous host.
# hosted-engine --clean-metadata --host-id=<old_host_id> --force-clean
13.2. Replacing other hyperconverged hosts using ansible
There are two options for replacing a hyperconverged host that is not the first host:
- Replace the host with a new host that has a different fully-qualified domain name by following the instructions in Section 13.2.1, “Replacing a hyperconverged host to use a different FQDN”.
- Replace the host with a new host that has the same fully-qualified domain name by following the instructions in Section 13.2.2, “Replacing a hyperconverged host to use the same FQDN”.
Follow the instructions in whichever section is appropriate for your deployment.
13.2.1. Replacing a hyperconverged host to use a different FQDN
When self-signed encryption is enabled, replacing a node is a disruptive process that requires virtual machines and the Hosted Engine to be shut down.
Install the replacement host
Follow the instructions in Deploying Red Hat Hyperconverged Infrastructure for Virtualization to install the physical machine.
Stop any existing geo-replication sessions
# gluster volume geo-replication <MASTER_VOL> <SLAVE_HOST>::<SLAVE_VOL> stop
For further information, see the Red Hat Gluster Storage Administration Guide: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-starting_geo-replication#Stopping_a_Geo-replication_Session.
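If you are unsure which sessions exist, you can list all geo-replication sessions and their current state before stopping them (an optional check):
# gluster volume geo-replication status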
Move the host to be replaced into Maintenance mode
Perform the following steps in the Administration Portal:
- Click Compute → Hosts and select the hyperconverged host in the results list.
- Click Management → Maintenance and click OK to move the host to Maintenance mode.
Prepare the replacement host
Configure key-based SSH authentication without a password
Configure key-based SSH authentication without a password from a physical machine still in the cluster to the replacement host. For details, see https://access.redhat.com/documentation/en-us/red_hat_hyperconverged_infrastructure_for_virtualization/1.6/html/deploying_red_hat_hyperconverged_infrastructure_for_virtualization/task-configure-key-based-ssh-auth.
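A minimal sketch of this step, assuming root access, that no key pair exists yet on the existing host, and that newhost.example.com is a placeholder for the replacement host’s FQDN:
# ssh-keygen -t rsa
# ssh-copy-id root@newhost.example.com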
Prepare the replacement host
Follow the instructions in Section 13.3, “Preparing a replacement hyperconverged host using ansible”.
Create replacement brick directories
Ensure the new directories are owned by the vdsm user and the kvm group.
# mkdir /gluster_bricks/engine/engine
# chown vdsm:kvm /gluster_bricks/engine/engine
# mkdir /gluster_bricks/data/data
# chown vdsm:kvm /gluster_bricks/data/data
# mkdir /gluster_bricks/vmstore/vmstore
# chown vdsm:kvm /gluster_bricks/vmstore/vmstore
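You can confirm the ownership afterwards (an optional check using the paths created above):
# ls -ld /gluster_bricks/engine/engine /gluster_bricks/data/data /gluster_bricks/vmstore/vmstore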
(Optional) If encryption is enabled
Generate the private key and self-signed certificate on the new server using the steps in the Red Hat Gluster Storage Administration Guide: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/chap-network_encryption#chap-Network_Encryption-Prereqs.
If encryption using a Certificate Authority is enabled, follow the steps under Expanding Volumes in the Network Encryption chapter of the Red Hat Gluster Storage 3.4 Administration Guide.
Add the new host’s certificate to existing certificates.
- On a healthy host, make a backup copy of the /etc/ssl/glusterfs.ca file.
- Add the new host’s certificate to the /etc/ssl/glusterfs.ca file on the healthy host.
- Distribute the updated /etc/ssl/glusterfs.ca file to all other hosts, including the new host.
Enable management encryption
Run the following command on the new host to enable management encryption:
# touch /var/lib/glusterd/secure-access
Include the new host in the value of the auth.ssl-allow volume option by running the following command for each volume:
# gluster volume set <volname> auth.ssl-allow "<old_host1>,<old_host2>,<new_host>"
Restart the glusterd service on all hosts
# systemctl restart glusterd
- If encryption uses self-signed certificates, follow the steps in Section 4.1, “Configuring TLS/SSL using self-signed certificates” to remount all gluster processes.
Add the new host to the existing cluster
Run the following command from one of the healthy hosts:
# gluster peer probe <new_host>
Add the new host to the cluster in the Administration Portal
- Click Compute → Hosts and then click New to open the New Host dialog.
- Provide a Name, Address, and Password for the new host.
- Uncheck the Automatically configure host firewall checkbox, as firewall rules are already configured by gdeploy.
- In the Hosted Engine tab of the New Host dialog, set the value of Choose hosted engine deployment action to Deploy.
- Click OK.
- When the host is available, click the name of the new host.
- Click the Network Interfaces subtab and then click Setup Host Networks. The Setup Host Networks dialog appears.
- Drag and drop the network you created for gluster to the IP associated with this host, and click OK.
See the Red Hat Virtualization 4.3 Self-Hosted Engine Guide for further details: https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/self-hosted_engine_guide/chap-installing_additional_hosts_to_a_self-hosted_environment.
Configure and mount shared storage on the new host
# cp /etc/fstab /etc/fstab.bk
# echo "<new_host>:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults 0 0" >> /etc/fstab
# mount /var/run/gluster/shared_storage
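To verify that the shared storage volume is mounted (an optional check):
# df -h /var/run/gluster/shared_storage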
Replace the old brick with the brick on the new host
- In the Administration Portal, click Storage → Volumes and select the volume.
- Click the Bricks subtab.
- Select the brick that you want to replace and click Replace Brick. The Replace Brick dialog appears.
- Specify the Host and the Brick Directory of the new brick.
- Verify that brick heal completes successfully (see the check after this list).
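One way to confirm that healing has finished (an optional check; <volname> is a placeholder for each volume with a replaced brick) is to list pending heal entries, which should drop to zero:
# gluster volume heal <volname> info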
Click Compute → Hosts. Select the old host and click Remove.
Use gluster peer status to verify that the old host is no longer part of the cluster. If the old host is still present in the status output, run the following command to forcibly remove it:
# gluster peer detach <old_host> force
Clean old host metadata.
# hosted-engine --clean-metadata --host-id=<old_host_id> --force-clean
Set up new SSH keys for geo-replication of the new brick.
# gluster system:: execute gsec_create
Recreate geo-replication session and distribute new SSH keys.
# gluster volume geo-replication <MASTER_VOL> <SLAVE_HOST>::<SLAVE_VOL> create push-pem force
Start the geo-replication session.
# gluster volume geo-replication <MASTER_VOL> <SLAVE_HOST>::<SLAVE_VOL> start
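To verify that the session is running after it starts (an optional check, using the same placeholders):
# gluster volume geo-replication <MASTER_VOL> <SLAVE_HOST>::<SLAVE_VOL> status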
13.2.2. Replacing a hyperconverged host to use the same FQDN
When self-signed encryption is enabled, replacing a node is a disruptive process that requires virtual machines and the Hosted Engine to be shut down.
- (Optional) If encryption using a Certificate Authority is enabled, follow the steps under Expanding Volumes in the Network Encryption chapter of the Red Hat Gluster Storage 3.4 Administration Guide.
Move the host to be replaced into Maintenance mode
- In the Administration Portal, click Compute → Hosts and select the hyperconverged host.
- Click Management → Maintenance.
- Click OK to move the host to Maintenance mode.
Install the replacement host
Follow the instructions in Deploying Red Hat Hyperconverged Infrastructure for Virtualization to install the physical machine and configure storage on the new host.
Configure the replacement host
Follow the instructions in Section 13.3, “Preparing a replacement hyperconverged host using ansible”.
(Optional) If encryption with self-signed certificates is enabled
- Generate the private key and self-signed certificate on the replacement host. See the Red Hat Gluster Storage Administration Guide for details: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/chap-network_encryption#chap-Network_Encryption-Prereqs.
- On a healthy host, make a backup copy of the /etc/ssl/glusterfs.ca file:
# cp /etc/ssl/glusterfs.ca /etc/ssl/glusterfs.ca.bk
- Append the new host’s certificate to the content of the /etc/ssl/glusterfs.ca file.
- Distribute the /etc/ssl/glusterfs.ca file to all hosts in the cluster, including the new host.
- Run the following command on the replacement host to enable management encryption:
# touch /var/lib/glusterd/secure-access
Replace the host machine
Follow the instructions in the Red Hat Gluster Storage Administration Guide to replace the host: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-replacing_hosts#Replacing_a_Host_Machine_with_the_Same_Hostname.
Restart the glusterd service on all hosts
# systemctl restart glusterd
Verify that all hosts reconnect
# gluster peer status
- (Optional) If encryption uses self-signed certificates, follow the steps in Section 4.1, “Configuring TLS/SSL using self-signed certificates” to remount all gluster processes.
Verify that all hosts reconnect and that brick heal completes successfully
# gluster peer status
Refresh fingerprint
- In the Administration Portal, click Compute → Hosts and select the new host.
- Click Edit.
- Click Advanced Parameters on the General tab.
- Click fetch to fetch the fingerprint from the host.
- Click OK.
- Click Installation → Reinstall and provide the root password when prompted.
- On the Hosted Engine tab, set the value of Choose hosted engine deployment action to Deploy.
Attach the gluster network to the host
- Click Compute → Hosts and click the name of the host.
- Click the Network Interfaces subtab and then click Setup Host Networks.
- Drag and drop the newly created network to the correct interface.
- Ensure that the Verify connectivity between Host and Engine checkbox is checked.
- Ensure that the Save network configuration checkbox is checked.
- Click OK to save.
Verify the health of the network
Click the Network Interfaces tab and check the state of the host’s network. If the network interface enters an "Out of sync" state or does not have an IP Address, click Management → Refresh Capabilities.
13.3. Preparing a replacement hyperconverged host using ansible
Follow this process to replace a hyperconverged host in the cluster.
Prerequisites
- Ensure that the host you intend to replace is not associated with the FQDN that you want to use for the new host.
- Ensure that the new host is associated with the FQDN you want it to use.
Procedure
Create the node_prep_inventory.yml inventory file
Create an inventory file called node_prep_inventory.yml, based on the following example. Replace newhost.example.com with the FQDN that you want to use for the new host, and the device details with details appropriate for your host.
Example node_prep_inventory.yml file:
hc_nodes:
  hosts:
    # New host
    newhost.example.com:

      # Dedupe & Compression config
      # If logicalsize >= 1000G then slabsize=32G else slabsize=2G
      #gluster_infra_vdo:
      #  - { name: 'vdo_sdc', device: '/dev/sdc', logicalsize: '3000G', emulate512: 'on', slabsize: '32G',
      #      blockmapcachesize: '128M', readcachesize: '20M', readcache: 'enabled', writepolicy: 'auto' }

      # With Dedupe & Compression
      #gluster_infra_volume_groups:
      #  - vgname: gluster_vg_sdc
      #    pvname: /dev/mapper/vdo_sdc

      # Without Dedupe & Compression
      gluster_infra_volume_groups:
        - vgname: gluster_vg_sdc
          pvname: /dev/sdc

      gluster_infra_mount_devices:
        - path: /gluster_bricks/engine
          lvname: gluster_lv_engine
          vgname: gluster_vg_sdc
        - path: /gluster_bricks/data
          lvname: gluster_lv_data
          vgname: gluster_vg_sdc
        - path: /gluster_bricks/vmstore
          lvname: gluster_lv_vmstore
          vgname: gluster_vg_sdc

      gluster_infra_thinpools:
        - {vgname: 'gluster_vg_sdc', thinpoolname: 'thinpool_gluster_vg_sdc', thinpoolsize: '500G', poolmetadatasize: '4G'}

      # This is optional
      gluster_infra_cache_vars:
        - vgname: gluster_vg_sdc
          cachedisk: /dev/sde
          cachelvname: cachelv_thinpool_vg_sdc
          cachethinpoolname: thinpool_gluster_vg_sdc # cachethinpoolname is equal to the already created thinpool which you want to attach
          cachelvsize: '10G'
          cachemetalvsize: '2G'
          cachemetalvname: cache_thinpool_vg_sdc
          cachemode: writethrough

      gluster_infra_thick_lvs:
        - vgname: gluster_vg_sdc
          lvname: gluster_lv_engine
          size: 100G

      gluster_infra_lv_logicalvols:
        - vgname: gluster_vg_sdc
          thinpool: thinpool_gluster_vg_sdc
          lvname: gluster_lv_data
          lvsize: 500G
        - vgname: gluster_vg_sdc
          thinpool: thinpool_gluster_vg_sdc
          lvname: gluster_lv_vmstore
          lvsize: 500G

      # Mount the devices
      gluster_infra_mount_devices:
        - { path: '/gluster_bricks/data', vgname: gluster_vg_sdc, lvname: gluster_lv_data }
        - { path: '/gluster_bricks/vmstore', vgname: gluster_vg_sdc, lvname: gluster_lv_vmstore }
        - { path: '/gluster_bricks/engine', vgname: gluster_vg_sdc, lvname: gluster_lv_engine }

  # Common configurations
  vars:
    # Firewall setup
    gluster_infra_fw_ports:
      - 2049/tcp
      - 54321/tcp
      - 5900/tcp
      - 5900-6923/tcp
      - 5666/tcp
      - 16514/tcp
    gluster_infra_fw_permanent: true
    gluster_infra_fw_state: enabled
    gluster_infra_fw_zone: public
    gluster_infra_fw_services:
      - glusterfs
    gluster_infra_disktype: RAID6
    gluster_infra_diskcount: 12
    gluster_infra_stripe_unit_size: 128
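Before running the playbook, you can check that the inventory parses correctly (an optional step using the standard ansible-inventory command):
# ansible-inventory -i node_prep_inventory.yml --list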
Create the node_prep.yml playbook
Create a node_prep.yml playbook file based on the following example.
Example node_prep.yml playbook:
---
# Prepare Node for replace
- name: Setup backend
  hosts: hc_nodes
  remote_user: root
  gather_facts: no
  any_errors_fatal: true

  roles:
    - gluster.infra
    - gluster.features
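You can also validate the playbook before running it (an optional check):
# ansible-playbook -i node_prep_inventory.yml node_prep.yml --syntax-check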
Run the node_prep.yml playbook
# ansible-playbook -i node_prep_inventory.yml node_prep.yml
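When the playbook completes, you can confirm that the bricks were created and mounted on the replacement host (an optional check using the mount points defined in the inventory):
# df -h /gluster_bricks/engine /gluster_bricks/data /gluster_bricks/vmstore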