
Chapter 4. Upgrading to Red Hat Hyperconverged Infrastructure for Virtualization 1.8


4.1. Upgrade workflow

You cannot upgrade directly to Red Hat Hyperconverged Infrastructure for Virtualization (RHHI for Virtualization) 1.8 from previous versions using yum update, because RHHI for Virtualization 1.7 is based on the Red Hat Enterprise Linux 7 platform, whereas version 1.8 is based on the Red Hat Enterprise Linux 8 platform.

Because a direct upgrade is not possible, the engine and gluster configurations are backed up and the nodes are reinstalled. The configurations are then restored, and a new gluster volume is created for the hosted engine. Each newly installed node is allowed to synchronize with the other nodes in the cluster, and the procedure is repeated across all the nodes one after the other.

The connected hosts and virtual machines can continue to work while the Manager is being upgraded.

4.2. Prerequisites

  • Red Hat recommends minimizing the workload on the virtual machines; this helps to shorten the upgrade window. Highly write-intensive workloads increase the time required to synchronize data, resulting in a longer upgrade window.
  • If there are scheduled geo-replication sessions on the storage domains, Red Hat recommends removing these schedules to avoid overlapping with the upgrade window.
  • If geo-replication is in progress, wait for the synchronization to complete before starting the upgrade (a status check is shown after this list).
  • All data centers and clusters in the environment must have the cluster compatibility level set to version 4.3 before starting the procedure.
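
To check whether any geo-replication sessions exist and whether they have finished syncing, you can list all sessions from one of the hosts. This check is a suggestion and uses standard gluster commands:

# gluster volume geo-replication status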

4.3. Restrictions

  • If deduplication and compression were not enabled in the previous version of the RHHI for Virtualization environment, this feature cannot be enabled during the upgrade to RHHI for Virtualization 1.8.
  • Network-Bound Disk Encryption (NBDE) is supported only with new deployments of RHHI for Virtualization 1.8. This feature cannot be enabled during the upgrade.

4.4. Procedure

This section describes the procedure to upgrade to RHHI for Virtualization 1.8 from RHHI for Virtualization 1.7.

Important

The playbooks mentioned in this section are only available in the RHHI for Virtualization 1.7 environment. Make sure that RHHI for Virtualization 1.5 and 1.6 deployments are upgraded to the latest version of RHHI for Virtualization 1.7 before you begin.

4.4.1. Creating a new gluster volume for Red Hat Virtualization 4.4 Hosted Engine deployment

Procedure

Create a new gluster volume for the new Red Hat Virtualization 4.4 Hosted Engine deployment, with a brick for each host under the existing engine brick mount point, /gluster_bricks/engine.

  • Use the free space in the existing engine brick mount path /gluster_bricks/engine on each host to create the new replica 3 volume.

    # gluster volume create newengine replica 3 host1:/gluster_bricks/engine/newengine host2:/gluster_bricks/engine/newengine host3:/gluster_bricks/engine/newengine
    # gluster volume set newengine group virt
    # gluster volume set newengine storage.owner-uid 36
    # gluster volume set newengine storage.owner-gid 36
    # gluster volume set newengine cluster.granular-entry-heal on
    # gluster volume set newengine performance.strict-o-direct on
    # gluster volume set newengine network.remote-dio off
    # gluster volume start newengine
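
Optionally, confirm the available space under the existing engine brick mount point and review the layout of the new volume. This check is a suggestion, not part of the original procedure:

# df -h /gluster_bricks/engine
# gluster volume info newengine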

Verify

  • Verify the status of the bricks with the following command:

    # gluster volume status newengine

4.4.2. Backing up the Gluster configurations

Prerequisites

The tasks/backup.yml and archive_config.yml playbooks are available with the latest version of RHV 4.3.z at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment.

Important

If tasks/backup.yml and archive_config.yml are not available at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment, you can create these playbooks from Understanding the archive_config_inventory.yml file.
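
You can quickly confirm that the playbooks are present before proceeding:

# ls /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/archive_config.yml \
     /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/tasks/backup.yml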

Procedure

  1. Edit the archive_config_inventory.yml inventory file at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/archive_config_inventory.yml.

    Hosts
    The FQDNs of all the hosts in the cluster.
    Common variables
    The default values are correct for the common variables backup_dir, nbde_setup, and upgrade.

    all:
      hosts:
        host1.example.com:
        host2.example.com:
        host3.example.com:
      vars:
        backup_dir: /archive
        nbde_setup: false
        upgrade: true
  2. Run the archive_config.yml playbook using your updated inventory file with the backupfiles tag.

    # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
    
    # ansible-playbook -i archive_config_inventory.yml archive_config.yml --tags backupfiles
  3. The backup configuration tar file is generated on each host under /root with the name rhvh-node-<HOSTNAME>-backup.tar.gz. Copy this backup configuration tar file from all the hosts to a different machine (the backup host).
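
For example, to copy the generated archive from the first host to the backup host and directory used later in this guide:

# scp /root/rhvh-node-host1.example.com-backup.tar.gz root@backuphost.example.com:/backupdir/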

Verify

  • Verify that the backup configuration files are generated on all hosts and are copied to the different machine (the backup host).

4.4.3. Migrating the virtual machines

  1. Click Compute → Hosts and select the first host.
  2. Click the host name and select the Virtual Machines tab.
  3. Select all the virtual machines and click Migrate.
  4. Wait for all the virtual machines to migrate to other hosts in the cluster.

4.4.4. Backing up the Hosted Engine configurations

  1. Enable global maintenance mode for the Hosted Engine. Run the following command on one of the active hosts in the cluster (a host that was deployed with hosted-engine --deploy):

    # hosted-engine --set-maintenance --mode=global
  2. Log in to the Hosted Engine VM using SSH and stop the ovirt-engine service.

    # systemctl stop ovirt-engine
  3. Run the following command on the Hosted Engine VM to create a backup of the engine:

    # engine-backup --mode=backup --scope=all --file=<backup-file.tar.gz> --log=<logfile>
    
    Example:
    # engine-backup --mode=backup --scope=all --file=engine-backup.tar.gz --log=backup.log
    Start of engine-backup with mode 'backup'
    scope: all
    archive file: engine-backup.tar.gz
    log file: backup.log
    Backing up:
    Notifying engine
    - Files
    - Engine database 'engine'
    - DWH database 'ovirt_engine_history'
    Packing into file 'engine-backup.tar.gz'
    Notifying engine
    Done.
  4. Copy the backup file from the Hosted Engine VM to a different machine (backup host).

    # scp <backup-file.tar.gz> root@backup-host.example.com:/backup/
  5. Shut down the Hosted Engine VM by running the poweroff command from within the VM.
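
After the Hosted Engine VM shuts down, you can optionally confirm from one of the hosts that global maintenance is active and the engine VM is down. This check is a suggestion; the exact output depends on the ovirt-hosted-engine-ha version, but it should report that the cluster is in global maintenance mode:

# hosted-engine --vm-status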

4.4.5. Checking self-heal status

  1. Check for pending self-heal entries on all the replica 3 volumes and wait for the healing to complete. Run the following command on one of the hosts (example output showing a fully healed volume follows this procedure):

    # gluster volume heal <volume> info summary
  2. Once you have confirmed that there are no pending self-heals, stop the glusterfs brick processes and unmount all the bricks on the first host (the host you are currently working on) to maintain file system consistency. Run the following commands on the first host:

    # pkill glusterfsd; pkill glusterfs
    # systemctl stop glusterd
    # umount /gluster_bricks/*
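
The following shows example output from the heal summary command in step 1 for a fully healed volume (only one brick shown; the exact fields can vary slightly between gluster versions). All counts should be 0 before you proceed:

# gluster volume heal engine info summary
Brick host1.example.com:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0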

4.4.6. Reinstalling the first host with Red Hat Virtualization Host 4.4

  1. Follow the Installing Red Hat Virtualization Host guide to reinstall the host with the Red Hat Virtualization Host 4.4 ISO, formatting only the OS disk.

    Important

    Make sure that the installation does not format the other disks, as bricks are created on top of these disks.

  2. Once the node is up after the RHVH 4.4 installation, subscribe to the Red Hat Virtualization Host (RHVH) 4.4 repositories, or install the RHV 4.4 appliance downloaded from the Customer Portal.

    # yum install rhvm-appliance

See Configuring software repository access to subscribe to Red Hat Virtualization Host.
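
A minimal sketch of the subscription steps on the reinstalled node. The repository ID shown is an assumption based on the RHVH 4.4 documentation; confirm the exact ID in Configuring software repository access:

# subscription-manager register
# subscription-manager repos --enable=rhvh-4-for-rhel-8-x86_64-rpms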

4.4.7. Copying the backup files to the newly installed host

  • Copy the engine backup and host configuration tar files from the backup host to the newly installed host and untar the content.

    # scp root@backuphost.example.com:/backupdir/engine-backup.tar.gz /root/
    # scp root@backuphost.example.com:/backupdir/rhvh-node-host1.example.com-backup.tar.gz /root/

4.4.8. Restoring gluster configuration files to the newly installed host

Note

Ensure that you remove the existing LVM filter before restoring the backup, and regenerate the LVM filter after the restoration.

  1. Remove the existing LVM filter to allow the existing Physical Volumes (PVs) to be used.

    # sed -i /^filter/d /etc/lvm/lvm.conf
  2. Extract the contents of gluster configuration files.

    # mkdir /archive
    # tar -xvf /root/rhvh-node-host1.example.com-backup.tar.gz -C /archive/
  3. Edit the archive_config_inventory.yml file to restore the configuration files. The archive_config_inventory.yml file is available at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/archive_config_inventory.yml

    all:
      hosts:
        host1.example.com:
      vars:
        backup_dir: /archive
        nbde_setup: false
        upgrade: true
    Important

    Use only one host under the hosts section of the restoration inventory file.

  4. Execute the playbook to restore the configuration files.

    # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
    # ansible-playbook -i archive_config_inventory.yml archive_config.yml --tags restorefiles
  5. Regenerate new LVM filters for the newly identified PVs.

    # vdsm-tool config-lvm-filter -y
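
Optionally, confirm that a new LVM filter was generated and that the brick Physical Volumes are visible. This is a suggested check; on RHV 4.4 the generated filter is normally written back to /etc/lvm/lvm.conf:

# grep ^filter /etc/lvm/lvm.conf
# pvs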

4.4.9. Deploying hosted engine on the newly installed host

Deploy the hosted engine using the option hosted-engine --deploy --restore-from-file=<engine-backup.tar.gz>, pointing to the backed-up engine archive.

The hosted engine can be deployed interactively using the hosted-engine --deploy command, providing the storage that corresponds to the newly created engine volume.

The hosted engine can also be deployed in an automated way using the ovirt-ansible-hosted-engine-setup role; Red Hat recommends the automated approach to avoid errors. The following procedure describes the automated deployment of the Hosted Engine VM:

  1. Create the playbook for the Hosted Engine deployment on the newly installed host at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he.yml:

    ---
    - name: Deploy oVirt hosted engine
      hosts: localhost
      roles:
        - role: ovirt.ovirt.hosted_engine_setup
  2. Update the Hosted Engine related information using the he_gluster_vars.json template file at /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json.

    # cat /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
    
    {
      "he_appliance_password": "password",
      "he_admin_password": "password",
      "he_domain_type": "glusterfs",
      "he_fqdn": "hostedengine.example.com",
      "he_vm_mac_addr": "00:18:15:20:59:01",
      "he_default_gateway": "19.70.12.254",
      "he_mgmt_network": "ovirtmgmt",
      "he_storage_domain_name": "HostedEngine",
      "he_storage_domain_path": "/newengine",
      "he_storage_domain_addr": "host1.example.com",
      "he_mount_options": "backup-volfile-servers=host2.example.com:host3.example.com",
      "he_bridge_if": "eth0",
      "he_enable_hc_gluster_service": true,
      "he_mem_size_MB": "16384",
      "he_cluster": "Default",
      "he_restore_from_file": "/root/engine-backup.tar.gz",
      "he_vcpus": "4"
    }
    Note

    In the he_gluster_vars.json file, there are two important values:

    he_restore_from_file
    This value is not provided in the template and must be added. It must point to the absolute path of the engine backup archive that was copied to the local machine.
    he_storage_domain_path
    This value must refer to the newly created gluster volume.

    The Hosted Engine VM running the previous version of Red Hat Virtualization is down and discarded, so the MAC address and FQDN of the older Hosted Engine VM can be reused for the new engine.

  3. For a static Hosted Engine network configuration, add the following options (a JSON sketch is shown after this procedure):

    he_vm_ip_addr
    engine VM IP address
    he_vm_ip_prefix
    engine VM IP prefix
    he_dns_addr
    engine VM DNS server
    he_default_gateway

    engine VM default gateway

    Note

    If no suitable DNS server is available, include two more options: he_vm_etc_hosts: true and he_network_test: ping.

  4. Run the playbook to deploy the Hosted Engine:

    # cd /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment
    # ansible-playbook he.yml --extra-vars='@he_gluster_vars.json'
    Important

    If you are using Red Hat Virtualization Host (RHVH) 4.4 SP1 based on Red Hat Enterprise Linux 8.6 (RHEL 8.6), add the -e 'ansible_python_interpreter=/usr/bin/python3.6' parameter:

    # ansible-playbook -e 'ansible_python_interpreter=/usr/bin/python3.6' he.yml --extra-vars='@he_gluster_vars.json'
  5. Wait for the Hosted Engine deployment to complete.

    Note

    If there are any failures during the Hosted Engine deployment, investigate the log messages under /var/log/ovirt-hosted-engine-setup and fix the problem. Then clean up the failed hosted engine deployment using the ovirt-hosted-engine-cleanup command and rerun the deployment.

  6. Log in to the RHV 4.4 Administration Portal on the newly installed RHV Manager and ensure that all the hosts are in the Up state. Wait for the self-heal on the gluster volumes to complete.
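
As referenced in step 3, the following is a sketch of the additional keys you would append to he_gluster_vars.json for a static Hosted Engine network configuration. All values are placeholders; substitute your own network details:

"he_vm_ip_addr": "192.168.1.10",
"he_vm_ip_prefix": "24",
"he_dns_addr": "192.168.1.1",
"he_default_gateway": "192.168.1.254",
"he_vm_etc_hosts": true,
"he_network_test": "ping"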

4.4.10. Upgrading the next host

  1. Move the next host (the second host, ideally the next in order) to maintenance mode from the RHV Administration Portal and stop the gluster service.

    1. Click Compute → Hosts and select the next host.
    2. Click Management → Maintenance. The Maintenance Host(s) dialog box opens.
    3. Select the Stop Gluster service check box and click OK.
  2. From the command line of the host, unmount the gluster bricks.

    # umount /gluster_bricks/*
  3. Reinstall this host with RHVH 4.4.

    Important

    Ensure that the installation does not format the other disks, as bricks are created on these disks.

  4. Copy the gluster configuration tar files from the backup host to the newly installed host and untar the content.

    # scp root@backuphost.example.com:/backupdir/rhvh-node-<hostname>-backup.tar.gz /root/
    # mkdir /archive
    # tar -xvf /root/rhvh-node-<hostname>-backup.tar.gz -C /archive/
  5. Restore the gluster configuration files on the newly installed host by executing the playbook as described in Restoring gluster configuration files to the newly installed host.

    Note

    Edit the archive_config_inventory.yml inventory file for this host, then execute the archive_config.yml playbook with the restorefiles tag on the newly installed host.

  6. Reinstall the host in RHV Administration Portal.

    1. Copy the authorized key from the first deployed host in RHV 4.4.

      # scp root@host1.example.com:/root/.ssh/authorized_keys /root/.ssh/
    2. In the RHV Administration Portal, the host will be in Maintenance mode. Click Compute → Hosts, then click Installation → Reinstall. In the dialog box that opens, select the Hosted Engine tab and select Deploy as the hosted engine deployment action.
    3. Wait for the host status to change to Up.
  7. Repeat the steps in Upgrading the next host for all the remaining Red Hat Virtualization Host 4.3 hosts in the cluster.
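
After each reinstalled host comes up, you can optionally confirm from the gluster command line that it has rejoined the trusted storage pool and that self-heal has completed before moving on to the next host. A minimal check:

# gluster peer status
# for vol in `gluster volume list`; do gluster volume heal $vol info summary; done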

4.4.11. Attaching gluster logical network

(Optional) If a separate gluster logical network exists in the cluster, attach it to the required interface on each host.

  1. Click Compute → Hosts, select the host, and select the Network Interfaces tab.
  2. Click Setup Host Networks and drag and drop the gluster logical network onto the appropriate network interface.

4.4.12. Removing old hosted engine storage domain

  1. Identify the old hosted engine storage domain, named hosted_storage, which has no golden star next to it.

    1. Click Storage → Domains, select hosted_storage, open the Data Center tab, and click Maintenance.
    2. Wait for the storage domain to move into Maintenance mode.
    3. Once the storage domain is in Maintenance mode, click Detach; the storage domain becomes unattached.
    4. Select the unattached storage domain and click Remove, then OK.
  2. Stop and remove old engine volume.

    1. Click Storage → Volumes, select the old engine volume, click Stop, and confirm with OK.
    2. Select the same volume, click Remove, and confirm with OK.
  3. Remove engine bricks on the hyperconverged hosts.

    # rm -r /gluster_bricks/engine/engine
    Note

    Be cautious when removing the old engine brick, as the new engine brick directory is created under the same mount path, /gluster_bricks/engine (see the listing below).
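
Before removing anything, you can list the mount point to confirm which directory belongs to the old engine volume. Based on the brick paths used earlier in this guide, the old brick directory is engine and the new one is newengine:

# ls /gluster_bricks/engine/
engine  newengine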

4.4.13. Updating cluster compatibility

  • Click Compute → Clusters, select the Default cluster, click Edit, update Compatibility Version to 4.6, and click OK.

    Note

    A warning is displayed stating that changing the compatibility version requires the virtual machines on the cluster to be restarted. Click OK.

4.4.14. Updating data center compatibility

  1. Select Compute Data Centers.
  2. Select the appropriate data center.
  3. Click Edit. The Edit Data Center dialog box opens.
  4. Update Compatibility Version to 4.6 from the drop-down list.

4.4.15. Adding new gluster volume options available with RHV 4.4

New gluster volume options are available with RHV 4.4. Apply these volume options on all the volumes.

Execute the following on one of the nodes in the cluster.

# for vol in `gluster volume list`; do gluster volume set $vol group virt; done

4.4.16. Removing the archives and extracted content

Remove the archives and extracted contents of backup configuration files from all the nodes.

# rm -rf /root/rhvh-node-<hostname>-backup.tar.gz
# rm -rf /archive/
Important

Disable the gluster volume option cluster.lookup-optimize on all the gluster volumes after the upgrade.

# for volume in `gluster volume list`; do gluster volume set $volume cluster.lookup-optimize off; done

4.4.17. Troubleshooting

  1. GFID mismatch leading to HA agents not syncing with each other.

    1. An Input/Output error is seen in /var/log/ovirt-hosted-engine-ha/broker.log:

      # grep -i  error /var/log/ovirt-hosted-engine-ha/broker.log
      
      MainThread::ERROR::2020-07-13 06:25:16,188::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: [Errno 5] Input/output error: '/rhev/data-center/mnt/glusterSD/rhsqa-grafton10.lab.eng.blr.redhat.com:_newengine/1d94d115-8ddd-41c9-bd9c-477347e95ad4/ha_agent/hosted-engine.lockspace'
    2. Run the following command to check if there is any GFID mismatch on the volume.

      # grep -i 'gfid mismatch' /var/log/glusterfs/rhev*
      
      Example:
      # grep -i 'gfid mismatch' /var/log/glusterfs/rhev*
      
      /var/log/glusterfs/rhev-data-center-mnt-glusterSD-rhsqa-grafton10.lab.eng.blr.redhat.com:_newengine.log:[2020-07-13 06:14:12.992345] E [MSGID: 108008] [afr-self-heal-common.c:392:afr_gfid_split_brain_source] 0-newengine-replicate-0: Gfid mismatch detected for <gfid:580f8fe2-a42f-4f62-a5b0-7591c3740885>/hosted-engine.metadata>, d6a1fe1d-fc04-48cc-953f-d195d40749c1 on newengine-client-1 and c5e89641-e08f-462f-85ab-13518c21b7dc on newengine-client-0.
    3. If there are entries listed with GFID mismatch, resolve the GFID split-brain.

      # gluster volume heal <volume> split-brain latest-mtime <relative_path_of_file_in_brick>
      
      Example:
      # gluster volume heal newengine split-brain latest-mtime /1d94d115-8ddd-41c9-bd9c-477347e95ad4/ha_agent/hosted-engine.lockspace
  2. RHV Administration portal shows gluster volume in degraded state with one of the bricks on the upgraded node as down.

    1. Check the gluster volume status from the gluster command line on one of the hyperconverged hosts. The brick entry corresponding to the node that was upgraded and rebooted is listed with the brick process and port as N/A.

      In the following example, notice that there is no process ID or port information for host rhvh2.example.com:

      # gluster volume status engine
      
      Example:
      Status of volume: engine
      Gluster process                             TCP Port  RDMA Port  Online  Pid
      ------------------------------------------------------------------------------
      Brick rhvh1.example.com:/gluster_bricks/eng
      ine/engine                                  49158     0          Y       94365
      Brick rhvh2.example.com:/gluster_bricks/eng
      ine/engine                                  N/A       N/A        Y       11052
      Brick rhvh3.example.com:/gluster_bricks/eng
      ine/engine                                  49152     0          Y       31153
      Self-heal Daemon on localhost               N/A       N/A        Y       128608
      Self-heal Daemon on rhvh2.example.com       N/A       N/A        Y       11838
      Self-heal Daemon on rhvh3.example.com       N/A       N/A        Y       9806
      
      Task Status of Volume engine
      ------------------------------------------------------------------
      There are no active volume tasks
    2. To fix this problem, kill the brick process and restart the glusterd service.

       # pkill glusterfsd
       # systemctl restart glusterd
    3. Check the gluster volume status again to make sure that all the brick entries have a brick process ID and port information. Wait a couple of minutes for this information to be reflected in the RHV Administration Portal.

      # gluster volume status engine

4.5. Verifying the upgrade

Verify that the upgrade has completed successfully.

  1. Verify the RHV Manager version.

    • Log in to the Administration Portal and click Help (the ? symbol) on the top right, then click About.

      • The software version should be displayed as Software Version: 4.4.X.X-X.X.el8ev.

        Example: Software Version:4.4.1.8-0.7.el8ev
  2. Verify the host version.

    • Run the following command on all the hosts to check the installed host version:

      # nodectl info | grep default
      Example:
      # nodectl info | grep default
      default: rhvh-4.4.1.1-0.20200707.0 (4.18.0-193.12.1.el8_2.x86_64)