Chapter 12. Migrating virtual machines
If the current host of a virtual machine (VM) becomes unsuitable or cannot be used anymore, or if you want to redistribute the hosting workload, you can migrate the VM to another KVM host.
12.1. How migrating virtual machines works
The essential part of virtual machine (VM) migration is copying the XML configuration of a VM to a different host machine. If the migrated VM is not shut down, the migration also transfers the state of the VM’s memory and any virtualized devices to a destination host machine. For the VM to remain functional on the destination host, the VM’s disk images must remain available to it.
By default, the migrated VM is transient on the destination host, and also remains defined on the source host.
You can migrate a running VM by using live or non-live migrations. To migrate a shut-off VM, you must use an offline migration. For details, see the following table.
Migration type | Description | Use case | Storage requirements |
---|---|---|---|
Live migration | The VM continues to run on the source host machine while KVM is transferring the VM’s memory pages to the destination host. When the migration is nearly complete, KVM very briefly suspends the VM, and resumes it on the destination host. | Useful for VMs that require constant uptime. However, VMs that modify memory pages faster than KVM can transfer them, such as VMs under heavy I/O load, cannot be live-migrated, and non-live migration must be used instead. | The VM’s disk images must be accessible both to the source host and the destination host during the migration. (1) |
Non-live migration | Suspends the VM, copies its configuration and its memory to the destination host, and resumes the VM. | Creates downtime for the VM, but is generally more reliable than live migration. Recommended for VMs under heavy memory load. | The VM’s disk images must be accessible both to the source host and the destination host during the migration. (1) |
Hybrid migration | Combines live migration and non-live migration. You suspend the source VM during live migration, which prevents additional dirty memory pages from being generated. As a result, the migration is significantly more likely to complete. | Recommended for example when live-migrating a VM that uses very many vCPUs or a large amount of memory, which prevents the migration from completing. Based on guest workload and the number of static pages during migration, a hybrid migration might also cause significantly less downtime than a non-live migration. | The VM’s disk images must be accessible both to the source host and the destination host during the migration. (1) |
Offline migration | Moves the VM’s configuration to the destination host | Recommended for shut-off VMs and in situations when shutting down the VM does not disrupt your workloads. | The VM’s disk images do not have to be available on a shared network, and can be copied or moved manually to the destination host instead. |
(1) To achieve this, use one of the following:
- Storage located on a shared network
- The --copy-storage-all parameter for the virsh migrate command, which copies disk image contents from the source to the destination over the network.
- Storage area network (SAN) logical units (LUNs).
- Ceph storage clusters
12.2. Benefits of migrating virtual machines
Migrating virtual machines (VMs) can be useful for:
- Load balancing
- VMs can be moved to host machines with lower usage if their host becomes overloaded, or if another host is under-utilized.
- Hardware independence
- When you need to upgrade, add, or remove hardware devices on the host machine, you can safely relocate VMs to other hosts. This means that VMs do not experience any downtime for hardware improvements.
- Energy saving
- VMs can be redistributed to other hosts, and the unloaded host systems can thus be powered off to save energy and cut costs during low usage periods.
- Geographic migration
- VMs can be moved to another physical location for lower latency or when required for other reasons.
12.3. Limitations for migrating virtual machines
Before migrating virtual machines (VMs) in RHEL 9, ensure you are aware of the migration’s limitations.
- Migrating VMs from or to a session connection of libvirt is unreliable and therefore not recommended.
- VMs that use certain features and configurations will not work correctly if migrated, or the migration will fail. Such features include:
- Device passthrough
- SR-IOV device assignment
- Mediated devices, such as vGPUs
- A migration between hosts that use Non-Uniform Memory Access (NUMA) pinning works only if the hosts have similar topology. However, the performance on running workloads might be negatively affected by the migration.
- The emulated CPUs, both on the source VM and the destination VM, must be identical; otherwise, the migration might fail. Any differences between the VMs in the following CPU-related areas can cause problems with the migration:
- CPU model
- Migrating between an Intel 64 host and an AMD64 host is unsupported, even though they share the x86-64 instruction set.
- For steps to ensure that a VM will work correctly after migrating to a host with a different CPU model, see Verifying host CPU compatibility for virtual machine migration.
- Firmware settings
- Microcode version
- BIOS version
- BIOS settings
- QEMU version
- Kernel version
- Live migrating a VM that uses more than 1 TB of memory might in some cases be unreliable. For instructions on how to prevent or fix this problem, see Live migration of a VM takes a long time without completing.
12.4. Verifying host CPU compatibility for virtual machine migration
For migrated virtual machines (VMs) to work correctly on the destination host, the CPUs on the source and the destination hosts must be compatible. To ensure that this is the case, calculate a common CPU baseline before you begin the migration.
The instructions in this section use an example migration scenario with the following host CPUs:
- Source host: Intel Core i7-8650U
- Destination host: Intel Xeon CPU E5-2620 v2
Prerequisites
- Virtualization is installed and enabled on your system.
- You have administrator access to the source host and the destination host for the migration.
Procedure
On the source host, obtain its CPU features and paste them into a new XML file, such as domCaps-CPUs.xml.
# virsh domcapabilities | xmllint --xpath "//cpu/mode[@name='host-model']" - > domCaps-CPUs.xml
In the XML file, replace the <mode> </mode> tags with <cpu> </cpu>.
Optional: Verify that the content of the domCaps-CPUs.xml file looks similar to the following:
# cat domCaps-CPUs.xml
<cpu>
  <model fallback="forbid">Skylake-Client-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy="require" name="ss"/>
  <feature policy="require" name="vmx"/>
  <feature policy="require" name="pdcm"/>
  <feature policy="require" name="hypervisor"/>
  <feature policy="require" name="tsc_adjust"/>
  <feature policy="require" name="clflushopt"/>
  <feature policy="require" name="umip"/>
  <feature policy="require" name="md-clear"/>
  <feature policy="require" name="stibp"/>
  <feature policy="require" name="arch-capabilities"/>
  <feature policy="require" name="ssbd"/>
  <feature policy="require" name="xsaves"/>
  <feature policy="require" name="pdpe1gb"/>
  <feature policy="require" name="invtsc"/>
  <feature policy="require" name="ibpb"/>
  <feature policy="require" name="ibrs"/>
  <feature policy="require" name="amd-stibp"/>
  <feature policy="require" name="amd-ssbd"/>
  <feature policy="require" name="rsba"/>
  <feature policy="require" name="skip-l1dfl-vmentry"/>
  <feature policy="require" name="pschange-mc-no"/>
  <feature policy="disable" name="hle"/>
  <feature policy="disable" name="rtm"/>
</cpu>
On the destination host, use the following command to obtain its CPU features:
# virsh domcapabilities | xmllint --xpath "//cpu/mode[@name='host-model']" -
<mode name="host-model" supported="yes">
  <model fallback="forbid">IvyBridge-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy="require" name="ss"/>
  <feature policy="require" name="vmx"/>
  <feature policy="require" name="pdcm"/>
  <feature policy="require" name="pcid"/>
  <feature policy="require" name="hypervisor"/>
  <feature policy="require" name="arat"/>
  <feature policy="require" name="tsc_adjust"/>
  <feature policy="require" name="umip"/>
  <feature policy="require" name="md-clear"/>
  <feature policy="require" name="stibp"/>
  <feature policy="require" name="arch-capabilities"/>
  <feature policy="require" name="ssbd"/>
  <feature policy="require" name="xsaveopt"/>
  <feature policy="require" name="pdpe1gb"/>
  <feature policy="require" name="invtsc"/>
  <feature policy="require" name="ibpb"/>
  <feature policy="require" name="amd-ssbd"/>
  <feature policy="require" name="skip-l1dfl-vmentry"/>
  <feature policy="require" name="pschange-mc-no"/>
</mode>
Add the obtained CPU features from the destination host to the domCaps-CPUs.xml file on the source host. Again, replace the <mode> </mode> tags with <cpu> </cpu> and save the file.
Optional: Verify that the XML file now contains the CPU features from both hosts.
# cat domCaps-CPUs.xml
<cpu>
  <model fallback="forbid">Skylake-Client-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy="require" name="ss"/>
  <feature policy="require" name="vmx"/>
  <feature policy="require" name="pdcm"/>
  <feature policy="require" name="hypervisor"/>
  <feature policy="require" name="tsc_adjust"/>
  <feature policy="require" name="clflushopt"/>
  <feature policy="require" name="umip"/>
  <feature policy="require" name="md-clear"/>
  <feature policy="require" name="stibp"/>
  <feature policy="require" name="arch-capabilities"/>
  <feature policy="require" name="ssbd"/>
  <feature policy="require" name="xsaves"/>
  <feature policy="require" name="pdpe1gb"/>
  <feature policy="require" name="invtsc"/>
  <feature policy="require" name="ibpb"/>
  <feature policy="require" name="ibrs"/>
  <feature policy="require" name="amd-stibp"/>
  <feature policy="require" name="amd-ssbd"/>
  <feature policy="require" name="rsba"/>
  <feature policy="require" name="skip-l1dfl-vmentry"/>
  <feature policy="require" name="pschange-mc-no"/>
  <feature policy="disable" name="hle"/>
  <feature policy="disable" name="rtm"/>
</cpu>
<cpu>
  <model fallback="forbid">IvyBridge-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy="require" name="ss"/>
  <feature policy="require" name="vmx"/>
  <feature policy="require" name="pdcm"/>
  <feature policy="require" name="pcid"/>
  <feature policy="require" name="hypervisor"/>
  <feature policy="require" name="arat"/>
  <feature policy="require" name="tsc_adjust"/>
  <feature policy="require" name="umip"/>
  <feature policy="require" name="md-clear"/>
  <feature policy="require" name="stibp"/>
  <feature policy="require" name="arch-capabilities"/>
  <feature policy="require" name="ssbd"/>
  <feature policy="require" name="xsaveopt"/>
  <feature policy="require" name="pdpe1gb"/>
  <feature policy="require" name="invtsc"/>
  <feature policy="require" name="ibpb"/>
  <feature policy="require" name="amd-ssbd"/>
  <feature policy="require" name="skip-l1dfl-vmentry"/>
  <feature policy="require" name="pschange-mc-no"/>
</cpu>
Use the XML file to calculate the CPU feature baseline for the VM you intend to migrate.
# virsh hypervisor-cpu-baseline domCaps-CPUs.xml
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>IvyBridge-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='vmx'/>
  <feature policy='require' name='pdcm'/>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='arat'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='xsaveopt'/>
  <feature policy='require' name='pdpe1gb'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='require' name='ibpb'/>
  <feature policy='require' name='amd-ssbd'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='require' name='pschange-mc-no'/>
</cpu>
Open the XML configuration of the VM you intend to migrate, and replace the contents of the <cpu> section with the settings obtained in the previous step.
# virsh edit VM-name
If the VM is running, restart it.
# virsh reboot VM-name
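The manual tag replacement in the procedure above (turning each host-model <mode> element into a <cpu> element) can also be scripted. The following is a minimal sketch, not part of the documented procedure: the function names mode_to_cpu and combine_for_baseline are hypothetical, and the sketch assumes that each input string is the unmodified output of the virsh domcapabilities | xmllint command shown above. It only prepares the input file; the baseline itself is still calculated by virsh hypervisor-cpu-baseline.

```python
import xml.etree.ElementTree as ET


def mode_to_cpu(mode_xml):
    """Re-tag a host-model <mode> element as <cpu> (hypothetical helper).

    The attributes of <mode> (name, supported) are dropped, and the child
    elements (model, vendor, feature) are copied unchanged.
    """
    mode = ET.fromstring(mode_xml)
    if mode.tag != "mode":
        raise ValueError(f"expected a <mode> element, got <{mode.tag}>")
    cpu = ET.Element("cpu")
    cpu.extend(list(mode))
    return ET.tostring(cpu, encoding="unicode")


def combine_for_baseline(mode_xmls):
    """Concatenate one <cpu> element per host, as in domCaps-CPUs.xml."""
    return "\n".join(mode_to_cpu(xml) for xml in mode_xmls) + "\n"
```

For example, passing the saved outputs from the source and the destination host to combine_for_baseline and writing the result to domCaps-CPUs.xml yields a file with two <cpu> elements, similar to the one shown in the optional verification step.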
12.5. Sharing virtual machine disk images with other hosts
To perform a live migration of a virtual machine (VM) between supported KVM hosts, you must also migrate the storage of the running VM in a way that makes it possible for the VM to read from and write to the storage during the migration process.
One of the methods to do this is using shared VM storage. The following procedure provides instructions for sharing a locally stored VM image with the source host and the destination host by using the NFS protocol.
Prerequisites
- The VM intended for migration is shut down.
- Optional: A host system is available for hosting the storage that is not the source or destination host, but both the source and the destination host can reach it through the network. This is the optimal solution for shared storage and is recommended by Red Hat.
- Make sure that NFS file locking is not used as it is not supported in KVM.
- The NFS protocol is installed and enabled on the source and destination hosts. See Deploying an NFS server.
Procedure
Connect to the host that will provide shared storage. In this example, it is the example-shared-storage host:
# ssh root@example-shared-storage
root@example-shared-storage's password:
Last login: Mon Sep 24 12:05:36 2019
root~#
Create a directory on the shared storage host that will hold the disk image and will be shared with the migration hosts:
# mkdir /var/lib/libvirt/shared-images
Copy the disk image of the VM from the source host to the newly created directory. The following example copies the disk image example-disk-1 of the VM to the /var/lib/libvirt/shared-images/ directory of the example-shared-storage host:
# scp /var/lib/libvirt/images/example-disk-1.qcow2 root@example-shared-storage:/var/lib/libvirt/shared-images/example-disk-1.qcow2
On the host that you want to use for sharing the storage, add the sharing directory to the /etc/exports file. The following example shares the /var/lib/libvirt/shared-images directory with the example-source-machine and example-destination-machine hosts:
/var/lib/libvirt/shared-images example-source-machine(rw,no_root_squash) example-destination-machine(rw,no_root_squash)
On both the source and destination host, mount the shared directory in the /var/lib/libvirt/images directory:
# mount example-shared-storage:/var/lib/libvirt/shared-images /var/lib/libvirt/images
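When several hosts need access to the shared directory, the /etc/exports entry from the procedure above can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the function name exports_entry is hypothetical, and the default options simply mirror the rw and no_root_squash options used in the example entry.

```python
def exports_entry(directory, clients, options=("rw", "no_root_squash")):
    """Build one /etc/exports line that shares a directory with clients.

    Hypothetical helper: every client receives the same option list.
    """
    if not directory.startswith("/") or any(ch.isspace() for ch in directory):
        raise ValueError("directory must be an absolute path without spaces")
    if not clients:
        raise ValueError("at least one client host is required")
    opts = ",".join(options)
    return directory + " " + " ".join(f"{host}({opts})" for host in clients)
```

Calling exports_entry("/var/lib/libvirt/shared-images", ["example-source-machine", "example-destination-machine"]) returns a line in the same form as the example entry, which you can then append to /etc/exports on the host that provides the storage.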
Verification
- Start the VM on the source host and observe if it boots successfully.
12.6. Migrating a virtual machine by using the command-line interface
If the current host of a virtual machine (VM) becomes unsuitable or cannot be used anymore, or if you want to redistribute the hosting workload, you can migrate the VM to another KVM host. The following procedure provides instructions and examples for various scenarios of such migrations.
Prerequisites
- The source host and the destination host both use the KVM hypervisor.
- The source host and the destination host are able to reach each other over the network. Use the ping utility to verify this.
- Ensure the following ports are open on the destination host.
- Port 22 is needed for connecting to the destination host by using SSH.
- Port 16509 is needed for connecting to the destination host by using TCP.
- Port 16514 is needed for connecting to the destination host by using TLS.
- Ports 49152-49215 are needed by QEMU for transferring the memory and disk migration data.
- For the migration to be supportable by Red Hat, the source host and destination host must be using specific operating systems and machine types. To ensure this is the case, see Supported hosts for virtual machine migration.
- The VM must be compatible with the CPU features of the destination host. To ensure this is the case, see Verifying host CPU compatibility for virtual machine migration.
- The disk images of VMs that will be migrated are accessible to both the source host and the destination host. This is optional for offline migration, but required for migrating a running VM. To ensure storage accessibility for both hosts, one of the following must apply:
- You are using storage area network (SAN) logical units (LUNs).
- You are using Ceph storage clusters.
- You have copied the disk image of the VM to the destination host, and you will use the --copy-storage-all parameter when migrating the VM. Alternatively, you have created a disk image with the same format and size as the source VM disk.
- The disk image is located on a separate networked location. For instructions to set up such shared VM storage, see Sharing virtual machine disk images with other hosts.
- When migrating a running VM, your network bandwidth must be higher than the rate at which the VM generates dirty memory pages.
To obtain the dirty page rate of your VM before you start the live migration, do the following:
Monitor the rate of dirty page generation of the VM for a short period of time.
# virsh domdirtyrate-calc example-VM 30
After the monitoring finishes, obtain its results:
# virsh domstats example-VM --dirtyrate
Domain: 'example-VM'
  dirtyrate.calc_status=2
  dirtyrate.calc_start_time=200942
  dirtyrate.calc_period=30
  dirtyrate.megabytes_per_second=2
In this example, the VM is generating 2 MB of dirty memory pages per second. Attempting to live-migrate such a VM on a network with a bandwidth of 2 MB/s or less will cause the live migration not to progress if you do not pause the VM or lower its workload.
To ensure that the live migration finishes successfully, Red Hat recommends that your network bandwidth is significantly greater than the VM’s dirty page generation rate.
Note: The value of the calc_period option might differ based on the workload and dirty page rate. You can experiment with several calc_period values to determine the most suitable period that aligns with the dirty page rate in your environment.
- When migrating an existing VM in a public bridge tap network, the source and destination hosts must be located on the same network. Otherwise, the VM network will not work after migration.
- When performing a VM migration, the virsh client on the source host can use one of several protocols to connect to the libvirt daemon on the destination host. Examples in the following procedure use an SSH connection, but you can choose a different one.
- If you want libvirt to use an SSH connection, ensure that the virtqemud socket is enabled and running on the destination host.
# systemctl enable --now virtqemud.socket
- If you want libvirt to use a TLS connection, ensure that the virtproxyd-tls socket is enabled and running on the destination host.
# systemctl enable --now virtproxyd-tls.socket
- If you want libvirt to use a TCP connection, ensure that the virtproxyd-tcp socket is enabled and running on the destination host.
# systemctl enable --now virtproxyd-tcp.socket
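The dirty page rate check described in the prerequisites above can be automated by parsing the output of virsh domstats --dirtyrate. The following is a minimal sketch, not an official tool: the function names and the safety_factor parameter are assumptions introduced here, because the prerequisites only say that the bandwidth should be "significantly greater" than the dirty page rate without giving a number.

```python
def parse_dirtyrate(domstats_output):
    """Collect the dirtyrate.* fields of `virsh domstats --dirtyrate` output."""
    stats = {}
    for line in domstats_output.splitlines():
        line = line.strip()
        if not line.startswith("dirtyrate.") or "=" not in line:
            continue  # skip the "Domain:" header and unrelated lines
        key, value = line.split("=", 1)
        key = key[len("dirtyrate."):]
        try:
            stats[key] = int(value)
        except ValueError:
            stats[key] = value  # keep non-numeric values as text
    return stats


def bandwidth_sufficient(stats, bandwidth_mb_per_s, safety_factor=2.0):
    """Return True if the bandwidth exceeds the dirty rate by safety_factor.

    safety_factor is an assumed margin, not a documented value.
    """
    return bandwidth_mb_per_s >= stats["megabytes_per_second"] * safety_factor
```

With the example output above, parse_dirtyrate reports megabytes_per_second as 2, so a 2 MB/s link is rejected while a much faster link passes the check. Adjust safety_factor to match the margin you consider safe in your environment.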
Procedure
Use the virsh migrate command with options appropriate for your migration requirements.
The following command migrates the example-VM-1 VM from your local host to the system connection of the example-destination host by using an SSH tunnel. The VM keeps running during the migration.
# virsh migrate --persistent --live example-VM-1 qemu+ssh://example-destination/system
The following commands enable you to make manual adjustments to the configuration of the example-VM-2 VM running on your local host, and then migrate the VM to the example-destination host. The migrated VM will automatically use the updated configuration.
# virsh dumpxml --migratable example-VM-2 > example-VM-2.xml
# vi example-VM-2.xml
# virsh migrate --live --persistent --xml example-VM-2.xml example-VM-2 qemu+ssh://example-destination/system
This procedure can be useful for example when the destination host needs to use a different path to access the shared VM storage or when configuring a feature specific to the destination host.
The following command suspends the example-VM-3 VM from the example-source host, migrates it to the example-destination host, and instructs it to use the adjusted XML configuration, provided by the example-VM-3-alt.xml file. When the migration is completed, libvirt resumes the VM on the destination host.
# virsh migrate example-VM-3 qemu+ssh://example-source/system qemu+ssh://example-destination/system --xml example-VM-3-alt.xml
After the migration, the VM is in the shut off state on the source host, and the migrated copy is deleted after it is shut down.
The following command deletes the shut-down example-VM-4 VM from the example-source host, and moves its configuration to the example-destination host.
# virsh migrate --offline --persistent --undefinesource example-VM-4 qemu+ssh://example-source/system qemu+ssh://example-destination/system
Note that this type of migration does not require moving the VM’s disk image to shared storage. However, for the VM to be usable on the destination host, you also need to migrate the VM’s disk image. For example:
# scp root@example-source:/var/lib/libvirt/images/example-VM-4.qcow2 root@example-destination:/var/lib/libvirt/images/example-VM-4.qcow2
The following command migrates the example-VM-5 VM to the example-destination host and uses multiple parallel connections, also known as multiple file descriptors (multi-FD) migration. With multi-FD migration, you can speed up the migration by utilizing all of the available network bandwidth for the migration process.
# virsh migrate --parallel --parallel-connections 4 <example-VM-5> qemu+ssh://<example-destination>/system
This example uses 4 multi-FD channels to migrate the example-VM-5 VM. It is recommended to use one channel for each 10 Gbps of available network bandwidth. The default value is 2 channels.
Wait for the migration to complete. The process may take some time depending on network bandwidth, system load, and the size of the VM. If the --verbose option is not used for virsh migrate, the CLI does not display any progress indicators except errors.
When the migration is in progress, you can use the virsh domjobinfo utility to display the migration statistics.
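If you run such migrations from a script, the command lines in the procedure above can be assembled programmatically. The following is a minimal sketch, assuming the migration is started from the source host: the function names are hypothetical, only options that appear in this procedure are used, and rounding the channel count up for bandwidths that are not a multiple of 10 Gbps is an assumption, not a documented rule.

```python
import math
import shlex


def recommended_parallel_connections(bandwidth_gbps):
    """One multi-FD channel per 10 Gbps; rounding up is an assumption."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    return max(1, math.ceil(bandwidth_gbps / 10))


def build_migrate_command(vm, dest_uri, *, live=False, offline=False,
                          persistent=False, undefine_source=False,
                          xml_file=None, parallel_connections=None,
                          verbose=False):
    """Return a `virsh migrate` argument list (hypothetical helper)."""
    if live and offline:
        raise ValueError("--live and --offline are mutually exclusive")
    cmd = ["virsh", "migrate"]
    if live:
        cmd.append("--live")
    if offline:
        cmd.append("--offline")
    if persistent:
        cmd.append("--persistent")
    if undefine_source:
        cmd.append("--undefinesource")
    if verbose:
        cmd.append("--verbose")
    if xml_file:
        cmd += ["--xml", xml_file]
    if parallel_connections:
        cmd += ["--parallel", "--parallel-connections",
                str(parallel_connections)]
    cmd += [vm, dest_uri]
    return cmd


def as_shell(cmd):
    """Render the argument list as a single, safely quoted shell line."""
    return shlex.join(cmd)
```

For example, build_migrate_command("example-VM-1", "qemu+ssh://example-destination/system", live=True, persistent=True) produces the arguments of the first command in the procedure. The list can be passed to subprocess.run, or printed with as_shell for review before running it.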
Verification
On the destination host, list the available VMs to verify if the VM has been migrated:
# virsh list
 Id   Name           State
----------------------------------
 10   example-VM-1   running
If the migration is still running, this command will list the VM state as paused.
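The verification step above can be scripted by parsing the output of virsh list on the destination host. The following is a minimal sketch; the function name parse_virsh_list is hypothetical, and the parser assumes the three-column Id, Name, State layout shown in the example.

```python
def parse_virsh_list(output):
    """Map VM names to their states from `virsh list` output."""
    states = {}
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Data rows start with a numeric Id, or "-" for VMs that are not
        # running; the header and separator lines do not.
        if len(parts) == 3 and (parts[0].isdigit() or parts[0] == "-"):
            states[parts[1]] = parts[2].strip()
    return states
```

A script can then check that the migrated VM is present and that its state is running rather than paused, which indicates that the migration has finished.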
Troubleshooting
- In some cases, the target host will not be compatible with certain values of the migrated VM’s XML configuration, such as the network name or CPU type. As a result, the VM will fail to boot on the target host. To fix these problems, you can update the problematic values by using the virsh edit command. After updating the values, you must restart the VM for the changes to be applied.
- If a live migration is taking a long time to complete, this may be because the VM is under heavy load and too many memory pages are changing for live migration to be possible. To fix this problem, change the migration to a non-live one by suspending the VM.
# virsh suspend example-VM-1
Additional resources
- virsh migrate --help command
- virsh(1) man page on your system
12.7. Live migrating a virtual machine by using the web console
If you wish to migrate a virtual machine (VM) that is performing tasks which require it to be constantly running, you can migrate that VM to another KVM host without shutting it down. This is also known as live migration. The following instructions explain how to do so by using the web console.
For tasks that modify memory pages faster than KVM can transfer them, such as heavy I/O load tasks, it is recommended that you do not live migrate the VM.
Prerequisites
You have installed the RHEL 9 web console.
For instructions, see Installing and enabling the web console.
- The web console VM plug-in is installed on your system.
- The source and destination hosts are running.
Ensure the following ports are open on the destination host.
- Port 22 is needed for connecting to the destination host by using SSH.
- Port 16509 is needed for connecting to the destination host by using TCP.
- Port 16514 is needed for connecting to the destination host by using TLS.
- Ports 49152-49215 are needed by QEMU for transferring the memory and disk migration data.
- The VM must be compatible with the CPU features of the destination host. To ensure this is the case, see Verifying host CPU compatibility for virtual machine migration.
- The VM’s disk images are located on a shared storage that is accessible to the source host as well as the destination host.
When migrating a running VM, your network bandwidth must be higher than the rate at which the VM generates dirty memory pages.
To obtain the dirty page rate of your VM before you start the live migration, do the following in your command-line interface:
Monitor the rate of dirty page generation of the VM for a short period of time.
# virsh domdirtyrate-calc vm-name 30
After the monitoring finishes, obtain its results:
# virsh domstats vm-name --dirtyrate
Domain: 'vm-name'
  dirtyrate.calc_status=2
  dirtyrate.calc_start_time=200942
  dirtyrate.calc_period=30
  dirtyrate.megabytes_per_second=2
In this example, the VM is generating 2 MB of dirty memory pages per second. Attempting to live-migrate such a VM on a network with a bandwidth of 2 MB/s or less will cause the live migration not to progress if you do not pause the VM or lower its workload.
To ensure that the live migration finishes successfully, Red Hat recommends that your network bandwidth is significantly greater than the VM’s dirty page generation rate.
The value of the calc_period option might differ based on the workload and dirty page rate. You can experiment with several calc_period values to determine the most suitable period that aligns with the dirty page rate in your environment.
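The port prerequisites listed above can be spot-checked from the source host before you open the web console. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the check only confirms that something accepts TCP connections on a port. It cannot distinguish a firewall block from a service that is not running, and the QEMU migration ports typically accept connections only while a migration is in progress, so the check is mainly useful for the SSH and libvirt ports.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports, timeout=2.0):
    """Return a {port: reachable} mapping for the given ports."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, check_ports("example-destination", [22, 16509, 16514]) returns a dictionary that shows which of the connection ports currently accept connections on the destination host.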
Procedure
In the Virtual Machines interface of the web console, click the Menu button of the VM that you want to migrate.
A drop-down menu appears with controls for various VM operations.
Click
The Migrate VM to another host dialog appears.
- Enter the URI of the destination host.
Configure the duration of the migration:
- Permanent - Do not check the box if you wish to migrate the VM permanently. Permanent migration completely removes the VM configuration from the source host.
- Temporary - Temporary migration migrates a copy of the VM to the destination host. This copy is deleted from the destination host when the VM is shut down. The original VM remains on the source host.
Click
Your VM is migrated to the destination host.
Verification
To verify whether the VM has been successfully migrated and is working correctly:
- Confirm whether the VM appears in the list of VMs available on the destination host.
- Start the migrated VM and observe if it boots up.
12.8. Live migrating a virtual machine with an attached Mellanox virtual function
As a Technology Preview, you can live migrate a virtual machine (VM) with an attached virtual function (VF) of a Mellanox networking device. Currently, this is only possible when using a Mellanox CX-7 networking device. The VF on the Mellanox CX-7 networking device uses a new mlx5_vfio_pci driver, which adds functionality that is necessary for the live migration, and libvirt binds the new driver to the VF automatically.
Limitations
Currently, some virtualization features cannot be used when live migrating a VM with an attached Mellanox virtual function:
- Calculating the dirty memory page generation rate of the VM.
- Using a post-copy live migration.
- Using a virtual I/O Memory Management Unit (vIOMMU) device in the VM.
This feature is included in RHEL 9 only as a Technology Preview, which means it is not supported.
Prerequisites
You have a Mellanox CX-7 networking device with a firmware version that is equal to or greater than 28.36.1010.
Refer to Mellanox documentation for details about firmware versions.
The mstflint package is installed on both the source and destination host:
# dnf install mstflint
The Mellanox CX-7 networking device has VF_MIGRATION_MODE set to MIGRATION_ENABLED:
# mstconfig -d <device_pci_address> query | grep -i VF_migration
VF_MIGRATION_MODE MIGRATION_ENABLED(2)
You can set VF_MIGRATION_MODE to MIGRATION_ENABLED by using the following command:
# mstconfig -d <device_pci_address> set VF_MIGRATION_MODE=2
The openvswitch package is installed on both the source and destination host:
# dnf install openvswitch
The CPU and the firmware of your host support the I/O Memory Management Unit (IOMMU).
- If using an Intel CPU, it must support the Intel Virtualization Technology for Directed I/O (VT-d).
- If using an AMD CPU, it must support the AMD-Vi feature.
The host system uses Access Control Service (ACS) to provide direct memory access (DMA) isolation for PCIe topology. Verify this with the system vendor.
For additional information, see Hardware Considerations for Implementing SR-IOV.
The host network interface you want to use for creating VFs is running. For example, to activate the eth1 interface and verify it is running, use the following commands:
# ip link set eth1 up
# ip link show eth1
8: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether a0:36:9f:8f:3f:b8 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
For SR-IOV device assignment to work, the IOMMU feature must be enabled in the host BIOS and kernel. To do so:
On an Intel host, enable Intel Virtualization Technology for Directed I/O (VT-d):
- Regenerate the GRUB configuration with the intel_iommu=on and iommu=pt parameters:
# grubby --args="intel_iommu=on iommu=pt" --update-kernel=ALL
- Reboot the host.
On an AMD host, enable AMD-Vi:
- Regenerate the GRUB configuration with the iommu=pt parameter:
# grubby --args="iommu=pt" --update-kernel=ALL
- Reboot the host.
- The source host and the destination host both use the KVM hypervisor.
- The source host and the destination host are able to reach each other over the network. Use the ping utility to verify this.
- The following ports are open on the destination host.
- Port 22 is needed for connecting to the destination host by using SSH.
- Port 16509 is needed for connecting to the destination host by using TCP.
- Port 16514 is needed for connecting to the destination host by using TLS.
- Ports 49152-49215 are needed by QEMU for transferring the memory and disk migration data.
- The source host and destination host are using operating systems and machine types that allow migration. To ensure this is the case, see Supported hosts for virtual machine migration.
- The VM must be compatible with the CPU features of the destination host. To ensure this is the case, see Verifying host CPU compatibility for virtual machine migration.
The disk images of VMs that will be migrated are located on a separate networked location accessible to both the source host and the destination host. This is optional for offline migration, but required for migrating a running VM.
For instructions to set up such shared VM storage, see Sharing virtual machine disk images with other hosts.
- When migrating a running VM, your network bandwidth must be higher than the rate at which the VM generates dirty memory pages.
A virtual network socket is enabled that corresponds to the connection protocol.
When performing a VM migration, the virsh client on the source host can use one of several protocols to connect to the libvirt daemon on the destination host. Examples in the following procedure use an SSH connection, but you can choose a different one.
If you want libvirt to use an SSH connection, ensure that the virtqemud socket is enabled and running on the destination host.
# systemctl enable --now virtqemud.socket
If you want libvirt to use a TLS connection, ensure that the virtproxyd-tls socket is enabled and running on the destination host.
# systemctl enable --now virtproxyd-tls.socket
If you want libvirt to use a TCP connection, ensure that the virtproxyd-tcp socket is enabled and running on the destination host.
# systemctl enable --now virtproxyd-tcp.socket
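The relationship between the connection protocol, the socket to enable, and the destination URI can be summarized in a small helper. The following is a minimal sketch, not part of the documented procedure: the function names and the host name dest.example.com are hypothetical, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Map a libvirt transport (ssh, tls, tcp) to the socket unit named in the
# prerequisites above. Function names are illustrative only.
socket_for_transport() {
    case "$1" in
        ssh) echo "virtqemud.socket" ;;
        tls) echo "virtproxyd-tls.socket" ;;
        tcp) echo "virtproxyd-tcp.socket" ;;
        *)   echo "unsupported transport: $1" >&2; return 1 ;;
    esac
}

# Build the destination URI that virsh migrate expects for --desturi.
desturi_for_transport() {
    socket_for_transport "$1" >/dev/null || return 1
    echo "qemu+$1://$2/system"
}

# Dry run with a hypothetical destination host: print, do not execute.
transport=tls
dest=dest.example.com
echo "On the destination host: systemctl enable --now $(socket_for_transport "$transport")"
echo "On the source host: virsh migrate --live --domain <vm_name> --desturi $(desturi_for_transport "$transport" "$dest")"
```

Printing the commands first makes it easy to confirm that the socket enabled on the destination host matches the URI scheme used on the source host.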
Procedure
On the source host, set the Mellanox networking device to the switchdev mode.
# devlink dev eswitch set pci/<device_pci_address> mode switchdev
On the source host, create a virtual function on the Mellanox device.
# echo 1 > /sys/bus/pci/devices/0000\:e1\:00.0/sriov_numvfs
The /0000\:e1\:00.0/ part of the file path is based on the PCI address of the device. In the example, it is 0000:e1:00.0.
On the source host, unbind the VF from its driver.
# virsh nodedev-detach <vf_pci_address> --driver pci-stub
You can view the PCI address of the VF by using the following command:
# lshw -c network -businfo
Bus info          Device       Class    Description
===========================================================================
pci@0000:e1:00.0  enp225s0np0  network  MT2910 Family [ConnectX-7]
pci@0000:e1:00.1  enp225s0v0   network  ConnectX Family mlx5Gen Virtual Function
On the source host, enable the migration function of the VF.
# devlink port function set pci/0000:e1:00.0/1 migratable enable
In this example, pci/0000:e1:00.0/1 refers to the first VF on the Mellanox device with the given PCI address.
On the source host, configure Open vSwitch (OVS) for the migration of the VF. If the Mellanox device is in switchdev mode, it cannot transfer data over the network.
Ensure the openvswitch service is running.
# systemctl start openvswitch
Enable hardware offloading to improve networking performance.
# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
Increase the maximum idle time to ensure network connections remain open during the migration.
# ovs-vsctl set Open_vSwitch . other_config:max-idle=300000
Create a new bridge in the OVS instance.
# ovs-vsctl add-br <bridge_name>
Restart the openvswitch service.
# systemctl restart openvswitch
Add the physical Mellanox device to the OVS bridge.
# ovs-vsctl add-port <bridge_name> enp225s0np0
In this example, <bridge_name> is the name of the bridge you created in step d and enp225s0np0 is the network interface name of the Mellanox device.
Add the VF of the Mellanox device to the OVS bridge.
# ovs-vsctl add-port <bridge_name> enp225s0npf0vf0
In this example, <bridge_name> is the name of the bridge you created in step d and enp225s0npf0vf0 is the network interface name of the VF.
- Repeat steps 1-5 on the destination host.
On the source host, open a new file, such as mlx_vf.xml, and add the following XML configuration of the VF:
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:56:8c:f7'/>
  <source>
    <address type='pci' domain='0x0000' bus='0xe1' slot='0x00' function='0x1'/>
  </source>
</interface>
This example configures a pass-through of the VF as a network interface for the VM. Ensure the MAC address is unique, and use the PCI address of the VF on the source host.
On the source host, attach the VF XML file to the VM.
# virsh attach-device <vm_name> mlx_vf.xml --live --config
In this example, mlx_vf.xml is the name of the XML file with the VF configuration. Use the --live option to attach the device to a running VM.
On the source host, start the live migration of the running VM with the attached VF.
# virsh migrate --live --domain <vm_name> --desturi qemu+ssh://<destination_host_ip_address>/system
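The interface XML used in the procedure encodes the PCI address of the VF as separate domain, bus, slot, and function attributes. The following sketch shows one way to derive that element from an address in the form reported by lshw. The function name pci_to_libvirt_address is hypothetical, and the sketch assumes a well-formed address of the form dddd:bb:ss.f.

```shell
#!/bin/sh
# Convert a PCI address such as 0000:e1:00.1 into the <address> element
# used inside the <source> element of the hostdev interface XML.
# Illustrative helper; it does not validate the input format.
pci_to_libvirt_address() {
    addr=$1
    domain=${addr%%:*}      # text before the first colon, e.g. 0000
    rest=${addr#*:}         # remainder, e.g. e1:00.1
    bus=${rest%%:*}         # e.g. e1
    slotfn=${rest#*:}       # e.g. 00.1
    slot=${slotfn%%.*}      # e.g. 00
    func=${slotfn#*.}       # e.g. 1
    printf "<address type='pci' domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n" \
        "$domain" "$bus" "$slot" "$func"
}

# The VF address from the lshw example in the procedure.
pci_to_libvirt_address 0000:e1:00.1
```

For the VF address 0000:e1:00.1, the helper prints the same address element that appears in the mlx_vf.xml example.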
Verification
In the migrated VM, view the network interface name of the Mellanox VF.
# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::a00:27ff:fe4e:66a1  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:4e:66:a1  txqueuelen 1000  (Ethernet)
        RX packets 100000  bytes 6543210 (6.5 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 100000  bytes 6543210 (6.5 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
enp4s0f0v0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.3.10  netmask 255.255.255.0  broadcast 192.168.3.255
        inet6 fe80::a00:27ff:fe4e:66c3  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:4e:66:c3  txqueuelen 1000  (Ethernet)
        RX packets 200000  bytes 12345678 (12.3 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 200000  bytes 12345678 (12.3 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
In the migrated VM, check that the Mellanox VF works, for example:
# ping -I <VF_interface_name> 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 192.168.3.10 <VF_interface_name>: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=57 time=27.4 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=57 time=26.9 ms
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 26.944/27.046/27.148/0.102 ms
12.9. Troubleshooting virtual machine migrations
If you are facing one of the following problems when migrating virtual machines (VMs), see the provided instructions to fix or avoid the issue.
12.9.1. Live migration of a VM takes a long time without completing
Cause
In some cases, migrating a running VM might cause the VM to generate dirty memory pages faster than they can be migrated. When this occurs, the migration cannot complete successfully.
The following scenarios frequently cause this problem:
- Live migrating a VM under a heavy load
- Live migrating a VM that uses a large amount of memory, such as 1 TB or more
Important: Red Hat has successfully tested live migration of VMs with up to 6 TB of memory. However, for live migration scenarios that involve VMs with more than 1 TB of memory, customers should reach out to Red Hat technical support.
Diagnosis
If your VM live migration is taking longer than expected, use the virsh domjobinfo command to obtain the memory page data for the VM:
# virsh domjobinfo vm-name
Job type: Unbounded
Operation: Outgoing migration
Time elapsed: 168286974 ms
Data processed: 26.106 TiB
Data remaining: 34.383 MiB
Data total: 10.586 TiB
Memory processed: 26.106 TiB
Memory remaining: 34.383 MiB
Memory total: 10.586 TiB
Memory bandwidth: 29.056 MiB/s
Dirty rate: 17225 pages/s
Page size: 4096 bytes
In this output, the product of Dirty rate and Page size is greater than Memory bandwidth. This means that the VM is generating dirty memory pages faster than the network can migrate them. As a consequence, the state of the VM on the destination host cannot converge with the state of the VM on the source host, which prevents the migration from completing.
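To make the comparison concrete, the following sketch converts the dirty rate to MiB/s and compares it with the memory bandwidth, using the values from the sample output. The function names are hypothetical, and the sketch assumes that awk is available on the host.

```shell
#!/bin/sh
# Convert a dirty rate (pages/s) and a page size (bytes) into MiB/s.
dirty_rate_mib() {
    awk -v pages="$1" -v size="$2" \
        'BEGIN { printf "%.2f\n", pages * size / 1048576 }'
}

# Succeed only if the dirty rate is lower than the bandwidth (MiB/s),
# that is, if pre-copy migration has a chance to converge.
can_converge() {
    awk -v dirty="$(dirty_rate_mib "$1" "$2")" -v bw="$3" \
        'BEGIN { exit !(dirty < bw) }'
}

# Values taken from the virsh domjobinfo sample output.
echo "Dirty rate: $(dirty_rate_mib 17225 4096) MiB/s"   # prints 67.29
if can_converge 17225 4096 29.056; then
    echo "The migration can converge at 29.056 MiB/s"
else
    echo "The migration cannot converge at 29.056 MiB/s"
fi
```

With the sample values, the VM dirties roughly 67.29 MiB of memory per second while only 29.056 MiB/s is transferred, so the script reports that the migration cannot converge.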
Fix
To improve the chances that a stalled live migration finishes successfully, you can do any of the following:
- Reduce the workload of the VM, especially memory updates. To do this, stop or cancel non-essential processes in the guest operating system of the source VM.
- Increase the downtime allowed for the live migration:
Display the current maximum downtime at the end of a live migration for the VM that is being migrated:
# virsh migrate-getmaxdowntime vm-name
Set a higher maximum downtime:
# virsh migrate-setmaxdowntime vm-name downtime-in-milliseconds
The higher you set the maximum downtime, the more likely the migration is to complete.
- Switch the live migration to post-copy mode.
# virsh migrate-start-postcopy vm-name
This ensures that the memory pages of the VM can converge on the destination host, and that the migration can complete.
However, when post-copy mode is active, the VM might slow down significantly, due to remote page requests from the destination host to the source host. In addition, if the network connection between the source host and the destination host stops working during post-copy migration, some of the VM processes might halt due to missing memory pages.
Therefore, do not use post-copy migration if the VM availability is critical or if the migration network is unstable.
- If your workload allows it, suspend the VM and let the migration finish as a non-live migration. This increases the downtime of the VM, but in most cases ensures that the migration completes successfully.
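When choosing a new maximum downtime, a rough starting point is the time needed to copy the remaining memory at the current bandwidth. The following sketch computes that estimate from the Memory remaining and Memory bandwidth values reported by virsh domjobinfo. It assumes that the final switchover happens once the remaining data fits into the allowed downtime, and it treats both values as constant, which they are not during a real migration. The function name is hypothetical.

```shell
#!/bin/sh
# Estimate the downtime in milliseconds needed to copy the remaining
# memory: remaining (MiB) / bandwidth (MiB/s) * 1000.
# This is a heuristic only; both inputs fluctuate during migration.
estimate_downtime_ms() {
    awk -v rem="$1" -v bw="$2" \
        'BEGIN { printf "%.0f\n", rem / bw * 1000 }'
}

# Values from the earlier virsh domjobinfo sample output.
estimate_downtime_ms 34.383 29.056   # prints 1183
```

In this example, a value somewhat above the estimate, such as 1200, could then be passed to virsh migrate-setmaxdowntime, provided that your workload tolerates that much downtime.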
Prevention
The probability of successfully completing a live migration of a VM depends on the following:
The workload of the VM during the migration
- Before starting the migration, stop or cancel non-essential processes in the guest operating system of the VM.
The network bandwidth that the host can use for migration
- For optimal results of a live migration, the bandwidth of the network used for the migration must be significantly higher than the dirty page generation rate of the VM. For instructions on obtaining the VM dirty page generation rate, see the Prerequisites in Migrating a virtual machine by using the command-line interface.
- Both the source host and the destination host must have a dedicated network interface controller (NIC) for the migration. For live migrating a VM with more than 1 TB of memory, Red Hat recommends a NIC with the speed of 25 Gb/s or more.
- You can also specify the network bandwidth assigned to the live migration by using the --bandwidth option when initiating the migration. For migrating very large VMs, assign as much bandwidth as viable for your deployment.
The mode of live migration
- The default pre-copy migration mode copies memory pages repeatedly if they become dirty.
Post-copy migration copies memory pages only once.
To enable your live migration to switch to post-copy mode if the migration stalls, use the --postcopy option with virsh migrate when starting the migration.
The downtime specified for the deployment
- You can adjust this during the migration by using virsh migrate-setmaxdowntime as described previously.
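The preventive options above can be combined when starting the migration. The following sketch assembles such a command and prints it for review instead of running it. The function name, the VM name example-vm, the address 192.0.2.20, and the bandwidth value are hypothetical. The sketch assumes an SSH connection and that the --bandwidth value is given in MiB/s.

```shell
#!/bin/sh
# Assemble a virsh migrate command that enables post-copy as a fallback
# and sets a bandwidth for the migration. The command is only printed.
build_migrate_cmd() {
    vm=$1          # name of the VM to migrate
    dest=$2        # destination host name or IP address
    bandwidth=$3   # migration bandwidth, assumed to be in MiB/s
    echo "virsh migrate --live --postcopy --bandwidth $bandwidth --domain $vm --desturi qemu+ssh://$dest/system"
}

# Dry run with placeholder values.
build_migrate_cmd example-vm 192.0.2.20 1000
```

Note that --postcopy only allows the migration to switch to post-copy mode. If the migration stalls, the switch itself is still triggered with virsh migrate-start-postcopy, as described in the Fix section.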
12.10. Supported hosts for virtual machine migration
For the virtual machine (VM) migration to work properly and be supported by Red Hat, the source and destination hosts must be specific RHEL versions and machine types. The following table shows supported VM migration paths.
Migration method | Release type | Future version example | Support status |
---|---|---|---|
Forward | Minor release | 9.0.1 | On supported RHEL 9 systems: machine type q35. |
Backward | Minor release | 9.1 | On supported RHEL 9 systems: machine type q35. |
Support level is different for other virtualization solutions provided by Red Hat, including RHOSP and OpenShift Virtualization.