Chapter 10. Live migration


10.1. About live migration

Live migration is the process of moving a running virtual machine (VM) to another node in the cluster without interrupting the virtual workload. Live migration enables smooth transitions during cluster upgrades or any time a node needs to be drained for maintenance or configuration changes.

By default, live migration traffic is encrypted using Transport Layer Security (TLS).

10.1.1. Live migration requirements

Live migration has the following requirements:

  • The cluster must have shared storage with ReadWriteMany (RWX) access mode.
  • The cluster must have sufficient RAM and network bandwidth.

    Note

    You must ensure that there is enough memory request capacity in the cluster to support node drains that result in live migrations. You can determine the approximate required spare memory by using the following calculation:

    Required spare memory ≈ (Maximum number of nodes that can drain in parallel) × (Highest total VM memory request allocations across nodes); a worked example with hypothetical values follows this list.

    The default number of migrations that can run in parallel in the cluster is 5.

  • If a VM uses a host model CPU, the nodes must support the CPU.
  • Configuring a dedicated Multus network for live migration is highly recommended. A dedicated network minimizes the effects of network saturation on tenant workloads during migration.
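
For example, using hypothetical values: if at most 2 nodes can drain in parallel and the node with the largest footprint has a total of 64 GiB of VM memory requests, the cluster needs approximately 2 × 64 GiB = 128 GiB of spare memory request capacity to absorb the resulting live migrations.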

10.1.2. VM migration tuning

You can adjust your cluster-wide live migration settings based on the type of workload and migration scenario. This enables you to control how many VMs migrate at the same time, the network bandwidth you want to use for each migration, and how long OpenShift Virtualization attempts to complete the migration before canceling the process. Configure these settings in the HyperConverged custom resource (CR).

If you are migrating multiple VMs per node at the same time, set a bandwidthPerMigration limit to prevent a large or busy VM from using a large portion of the node’s network bandwidth. By default, the bandwidthPerMigration value is 0, which means unlimited.

A large VM running a heavy workload (for example, database processing), with higher memory dirty rates, requires a higher bandwidth to complete the migration.
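
For example, before tuning these settings you can review the current values in the HyperConverged CR. This read-only check assumes the default resource name and namespace used throughout this chapter:

    $ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o jsonpath='{.spec.liveMigrationConfig}{"\n"}'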

Note

Post copy mode, when enabled, triggers if the initial pre-copy phase does not complete within the defined timeout. During post copy, the VM CPUs pause on the source host while transferring the minimum required memory pages. Then the VM CPUs activate on the destination host, and the remaining memory pages transfer into the destination node at runtime. This can impact performance during the transfer.

Post copy mode should not be used for critical data, or with unstable networks.


10.2. Configuring live migration

You can configure live migration settings to ensure that the migration processes do not overwhelm the cluster.

You can configure live migration policies to apply different migration configurations to groups of virtual machines (VMs).

10.2.1. Configuring live migration limits and timeouts

Configure live migration limits and timeouts for the cluster by updating the HyperConverged custom resource (CR), which is located in the openshift-cnv namespace.

Procedure

  • Edit the HyperConverged CR and add the necessary live migration parameters:

    $ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv

    Example configuration file

    apiVersion: hco.kubevirt.io/v1beta1
    kind: HyperConverged
    metadata:
      name: kubevirt-hyperconverged
      namespace: openshift-cnv
    spec:
      liveMigrationConfig:
        bandwidthPerMigration: 64Mi 1
        completionTimeoutPerGiB: 800 2
        parallelMigrationsPerCluster: 5 3
        parallelOutboundMigrationsPerNode: 2 4
        progressTimeout: 150 5
        allowPostCopy: false 6

    1
    Bandwidth limit of each migration, where the value is the quantity of bytes per second. For example, a value of 2048Mi means 2048 MiB/s. Default: 0, which is unlimited.
    2
    The migration is canceled if it has not completed in this time, in seconds per GiB of memory. For example, a VM with 6GiB memory times out if it has not completed migration in 4800 seconds. If the Migration Method is BlockMigration, the size of the migrating disks is included in the calculation.
    3
    Number of migrations running in parallel in the cluster. Default: 5.
    4
    Maximum number of outbound migrations per node. Default: 2.
    5
    The migration is canceled if memory copy fails to make progress in this time, in seconds. Default: 150.
    6
    If a VM is running a heavy workload and the memory dirty rate is too high, this can prevent the migration from one node to another from converging. To prevent this, you can enable post copy mode. By default, allowPostCopy is set to false.
Note

You can restore the default value for any spec.liveMigrationConfig field by deleting that key/value pair and saving the file. For example, delete progressTimeout: <value> to restore the default progressTimeout: 150.
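
As an alternative to editing the CR interactively, you can set the same fields with a merge patch. The following is a minimal sketch, and the field values shown are only examples:

    $ oc patch hyperconverged kubevirt-hyperconverged -n openshift-cnv \
      --type merge \
      -p '{"spec":{"liveMigrationConfig":{"bandwidthPerMigration":"64Mi","parallelMigrationsPerCluster":5}}}'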

10.2.2. Configure live migration for heavy workloads

When migrating a VM running a heavy workload (for example, database processing) with higher memory dirty rates, you need a higher bandwidth to complete the migration.

If the dirty rate is too high, the migration from one node to another does not converge. To prevent this, enable post copy mode.

Post copy mode triggers if the initial pre-copy phase does not complete within the defined timeout. During post copy, the VM CPUs pause on the source host while transferring the minimum required memory pages. Then the VM CPUs activate on the destination host, and the remaining memory pages transfer into the destination node at runtime.

Configure live migration for heavy workloads by updating the HyperConverged custom resource (CR), which is located in the openshift-cnv namespace.

Procedure

  1. Edit the HyperConverged CR and add the necessary parameters for migrating heavy workloads:

    $ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv

    Example configuration file

    apiVersion: hco.kubevirt.io/v1beta1
    kind: HyperConverged
    metadata:
      name: kubevirt-hyperconverged
      namespace: openshift-cnv
    spec:
      liveMigrationConfig:
        bandwidthPerMigration: 0Mi 1
        completionTimeoutPerGiB: 150 2
        parallelMigrationsPerCluster: 5 3
        parallelOutboundMigrationsPerNode: 1 4
        progressTimeout: 150 5
        allowPostCopy: true 6

    1
    Bandwidth limit of each migration, where the value is the quantity of bytes per second. The default is 0, which is unlimited.
    2
    If the migration is not completed in this time, measured in seconds per GiB of memory, it is canceled, or it switches to post copy mode when post copy is enabled. Lower the completionTimeoutPerGiB value to trigger post copy mode earlier in the migration process, or raise it to trigger post copy mode later.
    3
    Number of migrations running in parallel in the cluster. The default is 5. Keeping the parallelMigrationsPerCluster setting low is better when migrating heavy workloads.
    4
    Maximum number of outbound migrations per node. Configure a single VM per node for heavy workloads.
    5
    The migration is canceled if memory copy fails to make progress in this time. This value is measured in seconds. Increase this parameter for large memory sizes running heavy workloads.
    6
    Use post copy mode when memory dirty rates are high to ensure the migration converges. Set allowPostCopy to true to enable post copy mode.
  2. Optional: If your main network is too busy for the migration, configure a secondary, dedicated migration network, as shown in the sketch after the following note.
Note

Post copy mode can impact performance during the transfer, and should not be used for critical data, or with unstable networks.
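
The following is a minimal sketch of the dedicated migration network configuration mentioned in step 2 of the procedure. It assumes that a Multus NetworkAttachmentDefinition named migration-network already exists in the openshift-cnv namespace and that the network field under liveMigrationConfig selects it for migration traffic:

    apiVersion: hco.kubevirt.io/v1beta1
    kind: HyperConverged
    metadata:
      name: kubevirt-hyperconverged
      namespace: openshift-cnv
    spec:
      liveMigrationConfig:
        network: migration-network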


10.2.4. Live migration policies

You can create live migration policies to apply different migration configurations to groups of VMs that are defined by VM or project labels.

Tip

You can create live migration policies by using the OpenShift Virtualization web console.

10.2.4.1. Creating a live migration policy by using the command line

You can create a live migration policy by using the command line. KubeVirt applies the live migration policy to selected virtual machines (VMs) by using any combination of labels:

  • VM labels such as size, os, or gpu
  • Project labels such as priority, bandwidth, or hpc-workload

For a policy to apply to a specific group of VMs, the VMs, and the project that contains them, must have all of the labels that are specified in the policy.

Note

If multiple live migration policies apply to a VM, the policy with the greatest number of matching labels takes precedence.

If multiple policies match the same number of labels, the policies are sorted alphabetically by the matching label keys, and the first policy in that order takes precedence.
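
For example, with hypothetical policy names and settings: a VM labeled with both kubevirt.io/environment: "production" and kubevirt.io/size: "large" matches both of the following policies, but the second policy takes precedence because it matches two labels instead of one:

    apiVersion: migrations.kubevirt.io/v1alpha1
    kind: MigrationPolicy
    metadata:
      name: production-policy
    spec:
      bandwidthPerMigration: 64Mi
      selectors:
        virtualMachineInstanceSelector:
          kubevirt.io/environment: "production"
    ---
    apiVersion: migrations.kubevirt.io/v1alpha1
    kind: MigrationPolicy
    metadata:
      name: production-large-policy
    spec:
      bandwidthPerMigration: 128Mi
      selectors:
        virtualMachineInstanceSelector:
          kubevirt.io/environment: "production"
          kubevirt.io/size: "large"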

Procedure

  1. Edit the VM object to which you want to apply a live migration policy, and add the corresponding VM labels.

    1. Open the YAML configuration of the resource:

      $ oc edit vm <vm_name>
    2. Adjust the required label values in the .spec.template.metadata.labels section of the configuration. For example, to mark the VM as a production VM for the purposes of migration policies, add the kubevirt.io/environment: production line:

      apiVersion: kubevirt.io/v1
      kind: VirtualMachine
      metadata:
        name: <vm_name>
        namespace: default
        labels:
          app: my-app
          environment: production
      spec:
        template:
          metadata:
            labels:
              kubevirt.io/domain: <vm_name>
              kubevirt.io/size: large
              kubevirt.io/environment: production
      # ...
    3. Save and exit the configuration.
  2. Configure a MigrationPolicy object with the corresponding labels. The following example configures a policy that applies to all VMs that are labeled as production:

    apiVersion: migrations.kubevirt.io/v1alpha1
    kind: MigrationPolicy
    metadata:
      name: <migration_policy>
    spec:
      selectors:
        namespaceSelector: 1
          hpc-workloads: "True"
          xyz-workloads-type: ""
        virtualMachineInstanceSelector: 2
          kubevirt.io/environment: "production"
    1
    Specify project labels.
    2
    Specify VM labels.
  3. Create the migration policy by running the following command:

    $ oc create -f <migration_policy>.yaml
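
Verification

  • Optional: Confirm that the policy was created and review its selectors, for example:

    $ oc get migrationpolicy <migration_policy> -o yaml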


10.3. Initiating and canceling live migration

You can initiate the live migration of a virtual machine (VM) to another node by using the OpenShift Container Platform web console or the command line.

You can cancel a live migration by using the web console or the command line. The VM remains on its original node.

Tip

You can also initiate and cancel live migration by using the virtctl migrate <vm_name> and virtctl migrate-cancel <vm_name> commands.

10.3.1. Initiating live migration

10.3.1.1. Initiating live migration by using the web console

You can live migrate a running virtual machine (VM) to a different node in the cluster by using the OpenShift Container Platform web console.

Note

The Migrate action is visible to all users but only cluster administrators can initiate a live migration.

Prerequisites

  • The VM must be migratable.
  • If the VM is configured with a host model CPU, the cluster must have an available node that supports the CPU model.

Procedure

  1. Navigate to Virtualization → VirtualMachines in the web console.
  2. Select Migrate from the Options menu beside a VM.
  3. Click Migrate.

10.3.1.2. Initiating live migration by using the command line

You can initiate the live migration of a running virtual machine (VM) by using the command line to create a VirtualMachineInstanceMigration object for the VM.

Procedure

  1. Create a VirtualMachineInstanceMigration manifest for the VM that you want to migrate:

    apiVersion: kubevirt.io/v1
    kind: VirtualMachineInstanceMigration
    metadata:
      name: <migration_name>
    spec:
      vmiName: <vm_name>
  2. Create the object by running the following command:

    $ oc create -f <migration_name>.yaml

    The VirtualMachineInstanceMigration object triggers a live migration of the VM. This object exists in the cluster for as long as the virtual machine instance is running, unless manually deleted.

Verification

  • Obtain the VM status by running the following command:

    $ oc describe vmi <vm_name> -n <namespace>

    Example output

    # ...
    Status:
      Conditions:
        Last Probe Time:       <nil>
        Last Transition Time:  <nil>
        Status:                True
        Type:                  LiveMigratable
      Migration Method:  LiveMigration
      Migration State:
        Completed:                    true
        End Timestamp:                2018-12-24T06:19:42Z
        Migration UID:                d78c8962-0743-11e9-a540-fa163e0c69f1
        Source Node:                  node2.example.com
        Start Timestamp:              2018-12-24T06:19:35Z
        Target Node:                  node1.example.com
        Target Node Address:          10.9.0.18:43891
        Target Node Domain Detected:  true

10.3.2. Canceling live migration

10.3.2.1. Canceling live migration by using the web console

You can cancel the live migration of a virtual machine (VM) by using the OpenShift Container Platform web console.

Procedure

  1. Navigate to Virtualization → VirtualMachines in the web console.
  2. Select Cancel Migration on the Options menu beside a VM.

10.3.2.2. Canceling live migration by using the command line

Cancel the live migration of a virtual machine by deleting the VirtualMachineInstanceMigration object associated with the migration.

Procedure

  • Delete the VirtualMachineInstanceMigration object that triggered the live migration, migration-job in this example:

    $ oc delete vmim migration-job
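
Verification

  • Optional: Confirm that the VM remains on its original node by checking the node name reported in the VirtualMachineInstance status. This sketch assumes the status.nodeName field:

    $ oc get vmi <vm_name> -n <namespace> -o jsonpath='{.status.nodeName}{"\n"}'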