Chapter 4. Upgrading to Red Hat Hyperconverged Infrastructure for Virtualization 1.8
4.1. Upgrade workflow
The upgrade to Red Hat Hyperconverged Infrastructure for Virtualization (RHHI for Virtualization) 1.8 is not a direct upgrade from previous versions of RHHI for Virtualization using yum update, because RHHI for Virtualization 1.7 runs on the Red Hat Enterprise Linux 7 platform, whereas the new version 1.8 runs on the Red Hat Enterprise Linux 8 platform.
Because this is not a direct upgrade, the engine and the gluster configurations are backed up, and the nodes are reinstalled. The configurations are then restored, with a new gluster volume created for the hosted engine. Each newly installed node is allowed to synchronize with the other nodes in the cluster, and the procedure is repeated across all the nodes, one after the other.
The connected hosts and virtual machines can continue to work while the Manager is being upgraded.
4.2. Prerequisites
- Red Hat recommends minimizing the workload on the virtual machines; this helps shorten the upgrade window. Highly write-intensive workloads take longer to synchronize, leading to a longer upgrade window.
- If there are scheduled geo-replication sessions on the storage domains, Red Hat recommends removing these schedules to avoid overlapping with the upgrade window.
- If geo-replication is in progress, wait for the synchronization to complete before starting the upgrade.
- All data centers and clusters in the environment must have the cluster compatibility level set to version 4.3 before starting the procedure.
4.3. Restrictions
- If the previous version of the RHHI for Virtualization environment did not have deduplication and compression enabled, this feature cannot be enabled during the upgrade to RHHI for Virtualization 1.8.
- Network-Bound Disk Encryption (NBDE) is supported only with new deployments of RHHI for Virtualization 1.8. This feature cannot be enabled during the upgrade.
4.4. Procedure
This section describes the procedure to upgrade to RHHI for Virtualization 1.8 from RHHI for Virtualization 1.7.
The playbooks mentioned in this section are only available in a RHHI for Virtualization 1.7 environment. Make sure that RHHI for Virtualization 1.5 and 1.6 environments are first upgraded to the latest version of RHHI for Virtualization 1.7.
4.4.1. Creating a new gluster volume for Red Hat Virtualization 4.4 Hosted Engine deployment
Procedure
Create a new gluster volume for the new Red Hat Virtualization 4.4 Hosted Engine deployment, with a brick for each host under the existing engine brick mount, /gluster_bricks/engine. Use the free space in the existing engine brick mount path /gluster_bricks/engine on each host to create the new replica 3 volume.

# gluster volume create newengine replica 3 host1:/gluster_bricks/engine/newengine host2:/gluster_bricks/engine/newengine host3:/gluster_bricks/engine/newengine
# gluster volume set newengine group virt
# gluster volume set newengine storage.owner-uid 36
# gluster volume set newengine storage.owner-gid 36
# gluster volume set newengine cluster.granular-entry-heal on
# gluster volume set newengine performance.strict-o-direct on
# gluster volume set newengine network.remote-dio off
# gluster volume start newengine
Verify
Status of the bricks can be verified with the following command:
# gluster volume status newengine
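If you want to script this check, the status output can be screened for offline bricks. The following is a minimal sketch, not part of the product; it assumes each brick's entry fits on a single line of `gluster volume status <vol>` output (long brick paths may wrap on narrow terminals):

```shell
# Sketch: read `gluster volume status` text on stdin, print any brick line whose
# port shows N/A (brick process not running), and exit non-zero if one is found.
bricks_online() {
  awk '/^Brick/ && /N\/A/ { bad = 1; print "offline: " $0 } END { exit bad }'
}
```

Usage would be `gluster volume status newengine | bricks_online`; an empty result with exit status 0 means all brick processes reported a port.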
4.4.2. Backing up the Gluster configurations
Prerequisites
The tasks/backup.yml and archive_config.yml playbooks are available with the latest version of RHV 4.3.z at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment.

If tasks/backup.yml and archive_config.yml are not available at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment, you can create these playbooks from Understanding the archive_config_inventory.yml file.
Procedure
Edit the archive_config_inventory.yml inventory file at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/archive_config_inventory.yml.

- Hosts: the FQDN of every host in the cluster.
- Common variables: the default values are correct for the common variables backup_dir, nbde_setup, and upgrade.

all:
  hosts:
    host1.example.com:
    host2.example.com:
    host3.example.com:
  vars:
    backup_dir: /archive
    nbde_setup: false
    upgrade: true
Run the archive_config.yml playbook using your updated inventory file with the backupfiles tag.

# cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
# ansible-playbook -i archive_config_inventory.yml archive_config.yml --tags backupfiles
The backup configuration tar file is generated on all the hosts under /root with the name rhvh-node-<HOSTNAME>-backup.tar.gz. Copy this backup configuration tar file from all the hosts to a different machine (the backup host).
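Collecting the archives can be scripted; the following sketch only prints the scp commands (hostnames and the /backup target directory are illustrative) so you can review them before running:

```shell
# Sketch: build the scp commands that collect each host's backup archive.
# Host names and the /backup destination are examples; replace with your own.
print_copy_cmds() {
  for h in "$@"; do
    echo scp "root@${h}:/root/rhvh-node-${h}-backup.tar.gz" /backup/
  done
}
print_copy_cmds host1.example.com host2.example.com host3.example.com
```

Dropping the leading `echo` inside the function would perform the copies for real.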
Verify
- Verify that the backup configuration files are generated on all hosts and are copied to a different machine (the backup host).
4.4.3. Migrating the virtual machines
- Click Compute → Hosts and select the first host.
- Click the hostname and select the Virtual Machines tab.
- Select all virtual machines and click Migrate.
- Wait for all virtual machines to migrate to other hosts in the cluster.
4.4.4. Backing up the Hosted Engine configurations
Enable global maintenance for the Hosted Engine. Run the following command on one of the active hosts in the cluster deployed with hosted-engine --deploy.

# hosted-engine --set-maintenance --mode=global
Log in to the Hosted Engine VM using SSH and stop the ovirt-engine service.

# systemctl stop ovirt-engine
Run the following command on the Hosted Engine VM to create a backup of the engine.
# engine-backup --mode=backup --scope=all --file=<backup-file.tar.gz> --log=<logfile>

Example:

# engine-backup --mode=backup --scope=all --file=engine-backup.tar.gz --log=backup.log
Start of engine-backup with mode 'backup'
 scope: all
 archive file: engine-backup.tar.gz
 log file: backup.log
Backing up:
Notifying engine
- Files
- Engine database 'engine'
- DWH database 'ovirt_engine_history'
Packing into file 'engine-backup.tar.gz'
Notifying engine
Done.
Copy the backup file from the Hosted Engine VM to a different machine (backup host).
# scp <backup-file.tar.gz> root@backup-host.example.com:/backup/
- Shut down the Hosted Engine VM by running the poweroff command from the Hosted Engine VM.
4.4.5. Checking self-heal status
Check for any pending self-heal on all the replica 3 volumes and wait for the heal to complete. Run the following command on one of the hosts.
# gluster volume heal <volume> info summary
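When checking many volumes, the summary output can be reduced to a single pending count. This is an illustrative helper, not a product tool; it assumes the summary output contains "Number of entries in heal pending:" lines, as recent GlusterFS releases print:

```shell
# Sketch: sum the pending-heal counters from `gluster volume heal <vol> info summary`
# output (read on stdin) and print the total; 0 means it is safe to proceed.
pending_heals() {
  awk -F': *' '/Number of entries in heal pending/ { sum += $2 } END { print sum + 0 }'
}
```

Usage would be `gluster volume heal engine info summary | pending_heals`, repeated per volume until every total is 0.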
Once you have confirmed that there are no pending self-heals, stop the glusterfs brick processes and unmount all the bricks on the first host (the host you are currently working on) to maintain file system consistency. Run the following on the first host:
# pkill glusterfsd; pkill glusterfs
# systemctl stop glusterd
# umount /gluster_bricks/*
4.4.6. Reinstalling the first host with Red Hat Virtualization Host 4.4
Use the Installing Red Hat Virtualization Host guide to re-install the host with Red Hat Virtualization Host 4.4 ISO, formatting only the OS disk.
Important: Make sure that the installation does not format the other disks, as bricks are created on top of these disks.
Once the node is up after the RHVH 4.4 installation, subscribe to the Red Hat Virtualization Host (RHVH) 4.4 repositories, or install the RHV 4.4 appliance downloaded from the Customer Portal.
# yum install rhvm-appliance
See Configuring software repository access to subscribe to Red Hat Virtualization Host.
4.4.7. Copying the backup files to the newly installed host
Copy the engine backup and host configuration tar files from the backup host to the newly installed host and untar the content.
# scp root@backuphost.example.com:/backupdir/engine-backup.tar.gz /root/
# scp root@backuphost.example.com:/backupdir/rhvh-node-host1.example.com-backup.tar.gz /root/
4.4.8. Restoring gluster configuration files to the newly installed host
Ensure that you remove the existing LVM filter before restoring the backup, and regenerate the LVM filter after restoration.

Remove the existing LVM filter to allow use of the existing physical volumes (PVs).
# sed -i /^filter/d /etc/lvm/lvm.conf
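If you want to see what the sed deletion does before touching the live configuration, you can rehearse it on a scratch copy. The sample file content below is illustrative, not taken from a real lvm.conf:

```shell
# Sketch: demonstrate the filter-line removal on a scratch file.
# The real command in the procedure edits /etc/lvm/lvm.conf in place.
tmp=$(mktemp)
printf 'filter = [ "a|^/dev/sda2$|", "r|.*|" ]\nuse_lvmetad = 0\n' > "$tmp"
sed -i '/^filter/d' "$tmp"   # delete every line that starts with "filter"
```

After the deletion, only the non-filter settings remain in the file.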
Extract the contents of gluster configuration files.
# mkdir /archive
# tar -xvf /root/rhvh-node-host1.example.com-backup.tar.gz -C /archive/
Edit the archive_config_inventory.yml file to restore the configuration files. The archive_config_inventory.yml file is available at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/archive_config_inventory.yml.

all:
  hosts:
    host1.example.com:
  vars:
    backup_dir: /archive
    nbde_setup: false
    upgrade: true
Important: Use only one host under the hosts section of the restoration playbook.
Execute the playbook to restore the configuration files.
# cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
# ansible-playbook -i archive_config_inventory.yml archive_config.yml --tags restorefiles
Regenerate new LVM filters for the newly identified PVs.
# vdsm-tool config-lvm-filter -y
4.4.9. Deploying hosted engine on the newly installed host
Deploy the hosted engine with the option hosted-engine --deploy --restore-from-file=<engine-backup.tar.gz>, pointing to the backed-up archive from the engine.

The hosted engine can be deployed interactively using the hosted-engine --deploy command, providing storage corresponding to the newly created engine volume.

The hosted engine can also be deployed in an automated way using the ovirt-ansible-hosted-engine-setup role, and Red Hat recommends the automated method to avoid errors. The following procedure explains the automated deployment of the Hosted Engine VM:
Create the playbook for the Hosted Engine deployment on the newly installed host at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he.yml.

---
- name: Deploy oVirt hosted engine
  hosts: localhost
  roles:
    - role: ovirt.ovirt.hosted_engine_setup
Update the Hosted Engine related information using the he_gluster_vars.json template file at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json.

# cat /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
{
  "he_appliance_password": "password",
  "he_admin_password": "password",
  "he_domain_type": "glusterfs",
  "he_fqdn": "hostedengine.example.com",
  "he_vm_mac_addr": "00:18:15:20:59:01",
  "he_default_gateway": "19.70.12.254",
  "he_mgmt_network": "ovirtmgmt",
  "he_storage_domain_name": "HostedEngine",
  "he_storage_domain_path": "/newengine",
  "he_storage_domain_addr": "host1.example.com",
  "he_mount_options": "backup-volfile-servers=host2.example.com:host3.example.com",
  "he_bridge_if": "eth0",
  "he_enable_hc_gluster_service": true,
  "he_mem_size_MB": "16384",
  "he_cluster": "Default",
  "he_restore_from_file": "/root/engine-backup.tar.gz",
  "he_vcpus": "4"
}
Note: In the he_gluster_vars.json file, there are two important values:

- he_restore_from_file: This value is not given in the template and should be added. It should point to the absolute file name of the engine backup archive copied to the local machine.
- he_storage_domain_path: This value should refer to the newly created gluster volume.
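Since a syntax error in the variables file only surfaces when the playbook runs, it can be worth validating the edited file first. This is an optional sketch, not part of the documented procedure; it assumes python3 is available on the host:

```shell
# Sketch: sanity-check that a variables file parses as well-formed JSON
# before running the deployment playbook.
check_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1 && echo "JSON OK: $1"
}
# Example call against the file from the procedure:
# check_json /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
```

No output means the file failed to parse and should be fixed before deployment.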
The previous version of Red Hat Virtualization running on the Hosted Engine VM is down and discarded. The MAC address and FQDN corresponding to the older Hosted Engine VM can be reused for the new engine as well.
For a static Hosted Engine network configuration, add the following options:

- he_vm_ip_addr: engine VM IP address
- he_vm_ip_prefix: engine VM IP prefix
- he_dns_addr: engine VM DNS server
- he_default_gateway: engine VM default gateway
Note: If no specific DNS is available, include two more options: he_vm_etc_hosts: true and he_network_test: ping.
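A static configuration fragment might look like the following. The addresses are placeholders from the RFC 5737 documentation ranges; replace them with your own values and merge the keys into he_gluster_vars.json:

```json
{
  "he_vm_ip_addr": "192.0.2.10",
  "he_vm_ip_prefix": "24",
  "he_dns_addr": "192.0.2.1",
  "he_default_gateway": "192.0.2.254"
}
```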
Run the playbook to deploy the Hosted Engine:
# cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
# ansible-playbook he.yml --extra-vars='@he_gluster_vars.json'
Important: If you are using Red Hat Virtualization Host (RHVH) 4.4 SP1 based on Red Hat Enterprise Linux 8.6 (RHEL 8.6), add the -e 'ansible_python_interpreter=/usr/bin/python3.6' parameter:

# ansible-playbook -e 'ansible_python_interpreter=/usr/bin/python3.6' he.yml --extra-vars='@he_gluster_vars.json'
Wait for the Hosted Engine deployment to complete.
Note: If there are any failures during the Hosted Engine deployment, find the problem by looking at the log messages under /var/log/ovirt-hosted-engine-setup and fix it. Clean up the failed hosted engine deployment using the ovirt-hosted-engine-cleanup command and rerun the deployment.

- Log in to the RHV 4.4 Administration Portal on the newly installed RHV Manager and ensure that all the hosts are in the Up state. Wait for the self-heal on the gluster volumes to complete.
4.4.10. Upgrading the next host
Move the next host (the second host, ideally the next in order) to maintenance mode from the RHV Administration Portal and stop the gluster service.

- Click Compute → Hosts and select the next host.
- Click Management → Maintenance. The Maintenance Host(s) dialog box opens.
- Select the Stop Gluster service check box and click OK.
From the command line of the host, unmount the gluster bricks.

# umount /gluster_bricks/*
Reinstall this host with RHVH 4.4.
Important: Ensure that the installation does not format the other disks, as bricks are created on these disks.
Copy the gluster configuration tar files from the backup host to the newly installed host and untar the content.
# scp root@backuphost.example.com:/backupdir/rhvh-node-<hostname>-backup.tar.gz /root/
# tar -xvf /root/rhvh-node-<hostname>-backup.tar.gz -C /archive/
Restore the gluster configuration files on the newly installed host by executing the playbook mentioned in Restoring gluster configuration files to the newly installed host on this host.
Note: Edit the archive_config_inventory.yml playbook and execute it on the newly installed host.

Reinstall the host in the RHV Administration Portal.
Copy the authorized key from the first deployed host in RHV 4.4.
# scp root@host1.example.com:/root/.ssh/authorized_keys /root/.ssh/
- In the RHV Administration Portal, the host will be in Maintenance mode. Click Compute → Hosts, select the host, then click Installation → Reinstall. The host dialog box opens; select the Hosted Engine tab and choose the hosted engine deployment action Deploy.
- Wait for the host to come up.
- Repeat the steps in Upgrading the next host for all the Red Hat Virtualization Host 4.3 hosts in the cluster.
4.4.11. Attaching gluster logical network
(Optional) If a separate gluster logical network exists in the cluster, attach it to the required interface on each host.
- Click Compute → Hosts, select the host, and select the Network Interfaces tab.
- Click Setup Host Networks and drag and drop the gluster logical network onto the appropriate network interface.
4.4.12. Removing old hosted engine storage domain
Identify the old hosted engine storage domain, named hosted_storage, with no golden star next to it.
- Click Storage → Domains, select hosted_storage, then on the Data Center tab click Maintenance.
- Wait for the storage domain to move into Maintenance.
- Once the storage domain is in Maintenance, click Detach; the storage domain becomes unattached.
- Select the unattached storage domain, click Remove, and confirm with OK.
Stop and remove the old engine volume.

- Click Storage → Volumes, select the old engine volume, click Stop, and confirm with OK.
- Click the same volume, click Remove, and confirm with OK.

Remove the engine bricks on the hyperconverged hosts.
# rm -r /gluster_bricks/engine/engine
Note: Be cautious when removing the old engine brick, as the new engine brick directory is also created on the same mount path, /gluster_bricks/engine.
4.4.13. Updating cluster compatibility
Select Compute → Clusters, select the Default cluster, click Edit, update Compatibility Version to 4.6, and click OK.

Note: A warning appears that changing the compatibility version requires the VMs on the cluster to be restarted. Click OK.
4.4.14. Updating data center compatibility
- Select Compute → Data Centers.
- Select the appropriate data center.
- Click Edit. The Edit Data Center dialog box opens.
- Update Compatibility Version to 4.6 from the drop-down list.
4.4.15. Adding new gluster volume options available with RHV 4.4
New gluster volume options are available with RHV 4.4; apply these volume options on all the volumes.
Execute the following on one of the nodes in the cluster.
# for vol in `gluster volume list`; do gluster volume set $vol group virt; done
4.4.16. Removing the archives and extracted content
Remove the archives and extracted contents of backup configuration files from all the nodes.
# rm -rf /root/rhvh-node-<hostname>-backup.tar.gz
# rm -rf /archive/
Disable the gluster volume option cluster.lookup-optimize on all the gluster volumes after the upgrade.
# for volume in `gluster volume list`; do gluster volume set $volume cluster.lookup-optimize off; done
4.4.17. Troubleshooting
GFID mismatch causing the HA agents to fail to synchronize with each other.

A corresponding Input/Output error is seen in /var/log/ovirt-hosted-engine-ha/broker.log.

# grep -i error /var/log/ovirt-hosted-engine-ha/broker.log
MainThread::ERROR::2020-07-13 06:25:16,188::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: [Errno 5] Input/output error: '/rhev/data-center/mnt/glusterSD/rhsqa-grafton10.lab.eng.blr.redhat.com:_newengine/1d94d115-8ddd-41c9-bd9c-477347e95ad4/ha_agent/hosted-engine.lockspace'
Run the following command to check if there is any GFID mismatch on the volume.
# grep -i 'gfid mismatch' /var/log/glusterfs/rhev*

Example:

# grep -i 'gfid mismatch' /var/log/glusterfs/rhev*
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-rhsqa-grafton10.lab.eng.blr.redhat.com:_newengine.log:[2020-07-13 06:14:12.992345] E [MSGID: 108008] [afr-self-heal-common.c:392:afr_gfid_split_brain_source] 0-newengine-replicate-0: Gfid mismatch detected for <gfid:580f8fe2-a42f-4f62-a5b0-7591c3740885>/hosted-engine.metadata>, d6a1fe1d-fc04-48cc-953f-d195d40749c1 on newengine-client-1 and c5e89641-e08f-462f-85ab-13518c21b7dc on newengine-client-0.
If there are entries listed with GFID mismatch, resolve the GFID split-brain.
# gluster volume heal <volume> split-brain latest-mtime <relative_path_of_file_in_brick>

Example:

# gluster volume heal newengine split-brain latest-mtime /1d94d115-8ddd-41c9-bd9c-477347e95ad4/ha_agent/hosted-engine.lockspace
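The file name part of that relative path can be pulled out of the "Gfid mismatch detected" log line. The following is a rough sketch, not a product tool; it extracts only the leaf name, which you then join with its parent directory path inside the brick:

```shell
# Sketch: extract the mismatched file name from a "Gfid mismatch detected"
# glusterfs log line read on stdin.
mismatch_file() {
  sed -n 's/.*Gfid mismatch detected for <gfid:[^>]*>\/\([^>,]*\).*/\1/p'
}
```

Usage would be `grep -i 'gfid mismatch' /var/log/glusterfs/rhev* | mismatch_file | sort -u` to list every affected file name once.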
The RHV Administration Portal shows a gluster volume in a degraded state, with one of the bricks on the upgraded node down.

Check the gluster volume status from the gluster command line on one of the hyperconverged hosts. The brick entry corresponding to the node that was upgraded and rebooted is listed with the brick process and port as N/A.
In the following example, notice that there is no process ID or port information for host rhvh2.example.com:
# gluster volume status engine

Example:

Status of volume: engine
Gluster process                                        TCP Port  RDMA Port  Online  Pid
---------------------------------------------------------------------------------------
Brick rhvh1.example.com:/gluster_bricks/engine/engine  49158     0          Y       94365
Brick rhvh2.example.com:/gluster_bricks/engine/engine  N/A       N/A        Y       11052
Brick rhvh3.example.com:/gluster_bricks/engine/engine  49152     0          Y       31153
Self-heal Daemon on localhost                          N/A       N/A        Y       128608
Self-heal Daemon on rhvh2.example.com                  N/A       N/A        Y       11838
Self-heal Daemon on rhvh3.example.com                  N/A       N/A        Y       9806

Task Status of Volume engine
---------------------------------------------------------------------------------------
There are no active volume tasks
To fix this problem, kill the brick process and restart the glusterd service.

# pkill glusterfsd
# systemctl restart glusterd
Check the gluster volume status once again to make sure that all the brick entries have a brick process ID as well as port information. Wait a couple of minutes for this information to be reflected in the RHV Administration Portal.

# gluster volume status engine
4.5. Verifying the upgrade
Verify that the upgrade has completed successfully.
Verify the RHV Manager version.

Log in to the Administration Portal, click Help (the ? symbol) in the top right, and select About. The software version should be of the form Software Version: 4.4.X.X-X.X.el8ev.

Example: Software Version: 4.4.1.8-0.7.el8ev
Verify the host version.
Run the following command on all the hosts to get the latest version of the host:
# nodectl info | grep default
Example:

# nodectl info | grep default
default: rhvh-4.4.1.1-0.20200707.0 (4.18.0-193.12.1.el8_2.x86_64)