Chapter 14. Backup and restore
14.1. Backup and restore by using VM snapshots
You can back up and restore virtual machines (VMs) by using snapshots. Snapshots are supported by the following storage providers:
- Red Hat OpenShift Data Foundation
- Any other cloud storage provider with the Container Storage Interface (CSI) driver that supports the Kubernetes Volume Snapshot API
Online snapshots have a default time deadline of five minutes (5m) that can be changed, if needed.
Online snapshots are supported for virtual machines that have hot plugged virtual disks. However, hot plugged disks that are not in the virtual machine specification are not included in the snapshot.
To create snapshots of an online (Running state) VM with the highest integrity, install the QEMU guest agent if it is not included with your operating system. The QEMU guest agent is included with the default Red Hat templates.
The QEMU guest agent takes a consistent snapshot by attempting to quiesce the VM file system as much as possible, depending on the system workload. This ensures that in-flight I/O is written to the disk before the snapshot is taken. If the guest agent is not present, quiescing is not possible and a best-effort snapshot is taken. The conditions under which the snapshot was taken are reflected in the snapshot indications that are displayed in the web console or CLI.
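For example, you can check whether the guest agent is connected before taking an online snapshot. The following is a minimal sketch that assumes the VM is running and relies on the AgentConnected condition that KubeVirt reports on the virtual machine instance:

$ oc get vmi <vm_name> -o jsonpath='{.status.conditions[?(@.type=="AgentConnected")].status}'

If the command prints True, the guest agent is reachable and a quiesced snapshot is possible.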
14.1.1. About snapshots
A snapshot represents the state and data of a virtual machine (VM) at a specific point in time. You can use a snapshot to restore an existing VM to a previous state (represented by the snapshot) for backup and disaster recovery or to rapidly roll back to a previous development version.
A VM snapshot is created from a VM that is powered off (Stopped state) or powered on (Running state).
When taking a snapshot of a running VM, the controller checks that the QEMU guest agent is installed and running. If so, it freezes the VM file system before taking the snapshot, and thaws the file system after the snapshot is taken.
The snapshot stores a copy of each Container Storage Interface (CSI) volume attached to the VM and a copy of the VM specification and metadata. Snapshots cannot be changed after creation.
You can perform the following snapshot actions:
- Create a new snapshot
- Create a copy of a virtual machine from a snapshot
- List all snapshots attached to a specific VM
- Restore a VM from a snapshot
- Delete an existing VM snapshot
VM snapshot controller and custom resources
The VM snapshot feature introduces three new API objects defined as custom resource definitions (CRDs) for managing snapshots:
- VirtualMachineSnapshot: Represents a user request to create a snapshot. It contains information about the current state of the VM.
- VirtualMachineSnapshotContent: Represents a provisioned resource on the cluster (a snapshot). It is created by the VM snapshot controller and contains references to all resources required to restore the VM.
- VirtualMachineRestore: Represents a user request to restore a VM from a snapshot.

The VM snapshot controller binds a VirtualMachineSnapshotContent object with the VirtualMachineSnapshot object for which it was created, with a one-to-one mapping.
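For example, you can list all three resource types for a namespace to see how snapshot requests, content objects, and restore requests relate. This sketch assumes the short names registered by the CRDs (vmsnapshot, vmsnapshotcontent, vmrestore):

$ oc get vmsnapshot,vmsnapshotcontent,vmrestore -n <namespace>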
14.1.2. About application-consistent snapshots and backups
You can configure application-consistent snapshots and backups for Linux or Windows virtual machines (VMs) through a cycle of freezing and thawing. For any application, you can either configure a script on a Linux VM or register the application on a Windows VM to be notified when a snapshot or backup is due to begin.
On a Linux VM, freeze and thaw processes trigger automatically when a snapshot is taken or a backup is started by using, for example, a plugin from Velero or another backup vendor. The freeze process, performed by QEMU Guest Agent (QEMU GA) freeze hooks, ensures that before the snapshot or backup of a VM occurs, all of the VM’s filesystems are frozen and each appropriately configured application is informed that a snapshot or backup is about to start. This notification gives each application the opportunity to quiesce its state. Depending on the application, quiescing might involve temporarily refusing new requests, finishing in-progress operations, and flushing data to disk. The operating system is then directed to quiesce the filesystems by flushing outstanding writes to disk and freezing new write activity. All new connection requests are refused. When all applications have become inactive, the QEMU GA freezes the filesystems, and a snapshot is taken or a backup is initiated. After the snapshot is taken or the backup starts, the thawing process begins: filesystem writes are reactivated and applications receive notification to resume normal operations.
The same cycle of freezing and thawing is available on a Windows VM. Applications register with the Volume Shadow Copy Service (VSS) to receive notifications that they should flush out their data because a backup or snapshot is imminent. Thawing of the applications after the backup or snapshot is complete returns them to an active state. For more details, see the Windows Server documentation about the Volume Shadow Copy Service.
14.1.3. Creating snapshots
You can create snapshots of virtual machines (VMs) by using the OpenShift Container Platform web console or the command line.
14.1.3.1. Creating a snapshot by using the web console
You can create a snapshot of a virtual machine (VM) by using the OpenShift Container Platform web console.
The VM snapshot includes disks that meet the following requirements:
- Must be either a data volume or a persistent volume claim
- Must belong to a storage class that supports Container Storage Interface (CSI) volume snapshots
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- Click the Snapshots tab and then click Take Snapshot.
- Enter the snapshot name.
- Expand Disks included in this Snapshot to see the storage volumes to be included in the snapshot.
- If your VM has disks that cannot be included in the snapshot and you wish to proceed, select I am aware of this warning and wish to proceed.
- Click Save.
14.1.3.2. Creating a snapshot by using the command line
You can create a virtual machine (VM) snapshot for an offline or online VM by creating a VirtualMachineSnapshot object.
Prerequisites
- Ensure that the persistent volume claims (PVCs) are in a storage class that supports Container Storage Interface (CSI) volume snapshots.
- Install the OpenShift CLI (oc).
- Optional: Power down the VM for which you want to create a snapshot.
Procedure
- Create a YAML file to define a VirtualMachineSnapshot object that specifies the name of the new VirtualMachineSnapshot and the name of the source VM, as in the following example:

apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  name: <snapshot_name>
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: <vm_name>

- Create the VirtualMachineSnapshot object:

$ oc create -f <snapshot_name>.yaml

The snapshot controller creates a VirtualMachineSnapshotContent object, binds it to the VirtualMachineSnapshot, and updates the status and readyToUse fields of the VirtualMachineSnapshot object.

- Optional: If you are taking an online snapshot, you can use the wait command to monitor the status of the snapshot:
  - Enter the following command:

    $ oc wait <vm_name> <snapshot_name> --for condition=Ready

  - Verify the status of the snapshot:
    - InProgress - The online snapshot operation is still in progress.
    - Succeeded - The online snapshot operation completed successfully.
    - Failed - The online snapshot operation failed.

Note: Online snapshots have a default time deadline of five minutes (5m). If the snapshot does not complete successfully in five minutes, the status is set to failed. Afterwards, the file system is thawed and the VM is unfrozen, but the status remains failed until you delete the failed snapshot image.

To change the default time deadline, add the FailureDeadline attribute to the VM snapshot spec with the time, in minutes (m) or seconds (s), that you want to allow before the snapshot operation times out. To set no deadline, you can specify 0, though this is generally not recommended because it can result in an unresponsive VM. If you do not specify a unit of time such as m or s, the default is seconds (s).
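For example, a snapshot spec with a custom deadline might look like the following sketch. The failureDeadline field name is an assumption based on the attribute described above; verify it against the VirtualMachineSnapshot CRD in your cluster:

apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  name: <snapshot_name>
spec:
  failureDeadline: 3m   # fail the snapshot if it does not complete within 3 minutes
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: <vm_name>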
Verification
Verify that the VirtualMachineSnapshot object is created and bound with VirtualMachineSnapshotContent and that the readyToUse flag is set to true:

$ oc describe vmsnapshot <snapshot_name>
Example output
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  creationTimestamp: "2020-09-30T14:41:51Z"
  finalizers:
  - snapshot.kubevirt.io/vmsnapshot-protection
  generation: 5
  name: mysnap
  namespace: default
  resourceVersion: "3897"
  selfLink: /apis/snapshot.kubevirt.io/v1beta1/namespaces/default/virtualmachinesnapshots/my-vmsnapshot
  uid: 28eedf08-5d6a-42c1-969c-2eda58e2a78d
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: my-vm
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:42:03Z"
    reason: Operation complete
    status: "False" 1
    type: Progressing
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:42:03Z"
    reason: Operation complete
    status: "True" 2
    type: Ready
  creationTime: "2020-09-30T14:42:03Z"
  readyToUse: true 3
  sourceUID: 355897f3-73a0-4ec4-83d3-3c2df9486f4f
  virtualMachineSnapshotContentName: vmsnapshot-content-28eedf08-5d6a-42c1-969c-2eda58e2a78d 4
1 The status field of the Progressing condition specifies if the snapshot is still being created.
2 The status field of the Ready condition specifies if the snapshot creation process is complete.
3 Specifies if the snapshot is ready to be used.
4 Specifies that the snapshot is bound to a VirtualMachineSnapshotContent object created by the snapshot controller.
- Check the spec:volumeBackups property of the VirtualMachineSnapshotContent resource to verify that the expected PVCs are included in the snapshot.
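For example, you can inspect the content object directly. This sketch assumes the content object name reported in the virtualMachineSnapshotContentName field of the snapshot status:

$ oc get vmsnapshotcontent <content_name> -o yaml

Review the volumeBackups list under spec in the output and confirm that each expected PVC appears.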
14.1.4. Verifying online snapshots by using snapshot indications
Snapshot indications are contextual information about online virtual machine (VM) snapshot operations. Indications are not available for offline VM snapshot operations. They are helpful for describing details about how an online snapshot was created.
Prerequisites
- You must have attempted to create an online VM snapshot.
Procedure
- Display the output from the snapshot indications by performing one of the following actions:
  - Use the command line to view indicator output in the status stanza of the VirtualMachineSnapshot object YAML.
  - In the web console, click VirtualMachineSnapshot → Status in the Snapshot details screen.
- Verify the status of your online VM snapshot by viewing the values of the status.indications parameter:
  - Online indicates that the VM was running during online snapshot creation.
  - GuestAgent indicates that the QEMU guest agent was running during online snapshot creation.
  - NoGuestAgent indicates that the QEMU guest agent was not running during online snapshot creation. The QEMU guest agent could not be used to freeze and thaw the file system, either because the QEMU guest agent was not installed or running, or because of another error.
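For example, the following sketch extracts only the indications from the snapshot status:

$ oc get vmsnapshot <snapshot_name> -o jsonpath='{.status.indications}'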
14.1.5. Restoring virtual machines from snapshots
You can restore virtual machines (VMs) from snapshots by using the OpenShift Container Platform web console or the command line.
14.1.5.1. Restoring a VM from a snapshot by using the web console
You can restore a virtual machine (VM) to a previous configuration represented by a snapshot in the OpenShift Container Platform web console.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- If the VM is running, click the options menu and select Stop to power it down.
- Click the Snapshots tab to view a list of snapshots associated with the VM.
- Select a snapshot to open the Snapshot Details screen.
- Click the options menu and select Restore VirtualMachine from snapshot.
- Click Restore.
14.1.5.2. Restoring a VM from a snapshot by using the command line
You can restore an existing virtual machine (VM) to a previous configuration by using the command line. You can only restore from an offline VM snapshot.
Prerequisites
- Power down the VM you want to restore.
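For example, assuming the virtctl client is installed, you can stop the VM from the command line:

$ virtctl stop <vm_name>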
Procedure
- Create a YAML file to define a VirtualMachineRestore object that specifies the name of the VM you want to restore and the name of the snapshot to be used as the source, as in the following example:

apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineRestore
metadata:
  name: <vm_restore>
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: <vm_name>
  virtualMachineSnapshotName: <snapshot_name>

- Create the VirtualMachineRestore object:

$ oc create -f <vm_restore>.yaml

The snapshot controller updates the status fields of the VirtualMachineRestore object and replaces the existing VM configuration with the snapshot content.
Verification
Verify that the VM is restored to the previous state represented by the snapshot and that the complete flag is set to true:

$ oc get vmrestore <vm_restore>
Example output
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineRestore
metadata:
  creationTimestamp: "2020-09-30T14:46:27Z"
  generation: 5
  name: my-vmrestore
  namespace: default
  ownerReferences:
  - apiVersion: kubevirt.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: VirtualMachine
    name: my-vm
    uid: 355897f3-73a0-4ec4-83d3-3c2df9486f4f
  resourceVersion: "5512"
  selfLink: /apis/snapshot.kubevirt.io/v1beta1/namespaces/default/virtualmachinerestores/my-vmrestore
  uid: 71c679a8-136e-46b0-b9b5-f57175a6a041
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: my-vm
  virtualMachineSnapshotName: my-vmsnapshot
status:
  complete: true 1
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:46:28Z"
    reason: Operation complete
    status: "False" 2
    type: Progressing
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:46:28Z"
    reason: Operation complete
    status: "True" 3
    type: Ready
  deletedDataVolumes:
  - test-dv1
  restoreTime: "2020-09-30T14:46:28Z"
  restores:
  - dataVolumeName: restore-71c679a8-136e-46b0-b9b5-f57175a6a041-datavolumedisk1
    persistentVolumeClaim: restore-71c679a8-136e-46b0-b9b5-f57175a6a041-datavolumedisk1
    volumeName: datavolumedisk1
    volumeSnapshotName: vmsnapshot-28eedf08-5d6a-42c1-969c-2eda58e2a78d-volume-datavolumedisk1

1 Specifies whether the restore process is complete.
2 The status field of the Progressing condition specifies if the restore is still in progress.
3 The status field of the Ready condition specifies if the restore process is complete.
14.1.6. Deleting snapshots
You can delete snapshots of virtual machines (VMs) by using the OpenShift Container Platform web console or the command line.
14.1.6.1. Deleting a snapshot by using the web console
You can delete an existing virtual machine (VM) snapshot by using the web console.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- Click the Snapshots tab to view a list of snapshots associated with the VM.
- Click the options menu beside a snapshot and select Delete snapshot.
- Click Delete.
14.1.6.2. Deleting a snapshot by using the command line
You can delete an existing virtual machine (VM) snapshot by deleting the appropriate VirtualMachineSnapshot object.
Prerequisites
- Install the OpenShift CLI (oc).
Procedure
- Delete the VirtualMachineSnapshot object:

$ oc delete vmsnapshot <snapshot_name>

The snapshot controller deletes the VirtualMachineSnapshot along with the associated VirtualMachineSnapshotContent object.
Verification
Verify that the snapshot is deleted and no longer attached to this VM:
$ oc get vmsnapshot
14.2. Backing up and restoring virtual machines
Red Hat supports using OpenShift Virtualization 4.14 or later with OADP 1.3.x or later. OADP versions earlier than 1.3.0 are not supported for backup and restore of OpenShift Virtualization.
Back up and restore virtual machines by using the OpenShift API for Data Protection.
You can install the OpenShift API for Data Protection (OADP) with OpenShift Virtualization by installing the OADP Operator and configuring a backup location. You can then install the Data Protection Application.
OpenShift API for Data Protection with OpenShift Virtualization supports the following backup and restore storage options:
- Container Storage Interface (CSI) backups
- Container Storage Interface (CSI) backups with DataMover
The following storage options are excluded:
- File system backup and restore
- Volume snapshot backup and restore
For more information, see Backing up applications with File System Backup: Kopia or Restic.
To install the OADP Operator in a restricted network environment, you must first disable the default OperatorHub sources and mirror the Operator catalog.
See Using Operator Lifecycle Manager in disconnected environments for details.
14.2.1. Installing and configuring OADP with OpenShift Virtualization
As a cluster administrator, you install OADP by installing the OADP Operator.
The latest version of the OADP Operator installs Velero 1.14.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
Procedure
- Install the OADP Operator according to the instructions for your storage provider.
- Install the Data Protection Application (DPA) with the kubevirt and openshift OADP plugins.
- Back up virtual machines by creating a Backup custom resource (CR).

  Warning: Red Hat support is limited to only the following options:
  - CSI backups
  - CSI backups with DataMover

- Restore the Backup CR by creating a Restore CR. A sketch of both CRs follows this procedure.
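As an illustration, minimal Backup and Restore CRs might look like the following sketch. The placeholder names are hypothetical, and the snapshotMoveData field, which enables the DataMover, should be verified against the Velero API shipped with your OADP version:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
  namespace: openshift-adp
spec:
  includedNamespaces:
  - <vm_namespace>
  snapshotMoveData: true   # move CSI snapshot data to the backup location (DataMover)
---
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore_name>
  namespace: openshift-adp
spec:
  backupName: <backup_name>
  restorePVs: true   # restore the persistent volumes referenced by the backup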
14.2.2. Installing the Data Protection Application
You install the Data Protection Application (DPA) by creating an instance of the DataProtectionApplication API.
Prerequisites
- You must install the OADP Operator.
- You must configure object storage as a backup location.
- If you use snapshots to back up PVs, your cloud provider must support either a native snapshot API or Container Storage Interface (CSI) snapshots.
- If the backup and snapshot locations use the same credentials, you must create a Secret with the default name, cloud-credentials.

  Note: If you do not want to specify backup or snapshot locations during the installation, you can create a default Secret with an empty credentials-velero file. If there is no default Secret, the installation fails.
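For example, assuming your provider credentials are stored in a local credentials-velero file, you can create the default Secret as follows:

$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero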
Procedure
- Click Operators → Installed Operators and select the OADP Operator.
- Under Provided APIs, click Create instance in the DataProtectionApplication box.
- Click YAML View and update the parameters of the DataProtectionApplication manifest:

apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp 1
spec:
  configuration:
    velero:
      defaultPlugins:
      - kubevirt 2
      - gcp 3
      - csi 4
      - openshift 5
      resourceTimeout: 10m 6
    nodeAgent: 7
      enable: true 8
      uploaderType: kopia 9
      podConfig:
        nodeSelector: <node_selector> 10
  backupLocations:
  - velero:
      provider: gcp 11
      default: true
      credential:
        key: cloud
        name: <default_secret> 12
      objectStorage:
        bucket: <bucket_name> 13
        prefix: <prefix> 14
1 The default namespace for OADP is openshift-adp. The namespace is a variable and is configurable.
2 The kubevirt plugin is mandatory for OpenShift Virtualization.
3 Specify the plugin for the backup provider, for example, gcp, if it exists.
4 The csi plugin is mandatory for backing up PVs with CSI snapshots. The csi plugin uses the Velero CSI beta snapshot APIs. You do not need to configure a snapshot location.
5 The openshift plugin is mandatory.
6 Specify how many minutes to wait for several Velero resources before timeout occurs, such as Velero CRD availability, volumeSnapshot deletion, and backup repository availability. The default is 10m.
7 The administrative agent that routes the administrative requests to servers.
8 Set this value to true if you want to enable nodeAgent and perform File System Backup.
9 Enter kopia as your uploader to use the built-in DataMover. The nodeAgent deploys a daemon set, which means that the nodeAgent pods run on each working node. You can configure File System Backup by adding spec.defaultVolumesToFsBackup: true to the Backup CR.
10 Specify the nodes on which Kopia is available. By default, Kopia runs on all nodes.
11 Specify the backup provider.
12 Specify the correct default name for the Secret, for example, cloud-credentials-gcp, if you use a default plugin for the backup provider. If you specify a custom name, the custom name is used for the backup location. If you do not specify a Secret name, the default name is used.
13 Specify a bucket as the backup storage location. If the bucket is not a dedicated bucket for Velero backups, you must specify a prefix.
14 Specify a prefix for Velero backups, for example, velero, if the bucket is used for multiple purposes.
- Click Create.
Verification
Verify the installation by viewing the OpenShift API for Data Protection (OADP) resources by running the following command:
$ oc get all -n openshift-adp
Example output
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8    2/2     Running   0          2m8s
pod/node-agent-9cq4q                                     1/1     Running   0          94s
pod/node-agent-m4lts                                     1/1     Running   0          94s
pod/node-agent-pv4kr                                     1/1     Running   0          95s
pod/velero-588db7f655-n842v                              1/1     Running   0          95s

NAME                                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
Verify that the DataProtectionApplication (DPA) is reconciled by running the following command:

$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
Example output
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
- Verify that the type is set to Reconciled.
- Verify the backup storage location and confirm that the PHASE is Available by running the following command:

$ oc get backupstoragelocations.velero.io -n openshift-adp
Example output
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 1s 3d16h true
14.3. Disaster recovery
OpenShift Virtualization supports using disaster recovery (DR) solutions to ensure that your environment can recover after a site outage. To use these methods, you must plan your OpenShift Virtualization deployment in advance.
14.3.1. About disaster recovery methods
For an overview of disaster recovery (DR) concepts, architecture, and planning considerations, see the Red Hat OpenShift Virtualization disaster recovery guide in the Red Hat Knowledgebase.
The two primary DR methods for OpenShift Virtualization are Metropolitan Disaster Recovery (Metro-DR) and Regional-DR.
14.3.1.1. Metro-DR
Metro-DR uses synchronous replication. It writes to storage at both the primary and secondary sites so that the data is always synchronized between sites. Because the storage provider is responsible for ensuring that the synchronization succeeds, the environment must meet the throughput and latency requirements of the storage provider.
14.3.1.2. Regional-DR
Regional-DR uses asynchronous replication. The data in the primary site is synchronized with the secondary site at regular intervals. For this type of replication, you can have a higher latency connection between the primary and secondary sites.
14.3.2. Defining applications for disaster recovery
Define applications for disaster recovery by using VMs that Red Hat Advanced Cluster Management (RHACM) manages or discovers.
14.3.2.1. Best practices when defining an RHACM-managed VM
An RHACM-managed application that includes a VM must be created by using a GitOps workflow and by creating an RHACM application or ApplicationSet.
There are several actions you can take to improve your experience and chance of success when defining an RHACM-managed VM.
Use a PVC and populator to define storage for the VM
Because data volumes create persistent volume claims (PVCs) implicitly, data volumes and VMs with data volume templates do not fit as neatly into the GitOps model.
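For example, defining the disk as an explicit PVC whose contents are filled by a volume populator keeps the storage definition declarative in Git. The following is a sketch that assumes the VolumeImportSource populator API provided by the Containerized Data Importer (cdi.kubevirt.io); names and sizes are placeholders:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <vm_disk_pvc>
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 30Gi
  dataSourceRef:                     # populator reference instead of an implicit data volume
    apiGroup: cdi.kubevirt.io
    kind: VolumeImportSource
    name: <volume_import_source>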
Use the import method when choosing a population source for your VM disk
Select a RHEL image from the software catalog to use the import method. Red Hat recommends using a specific version of the image rather than a floating tag for consistent results. The KubeVirt community maintains container disks for other operating systems in a Quay repository.
Use pullMethod: node
Set pullMethod: node when creating a data volume from a registry source to take advantage of the OpenShift Container Platform pull secret, which is required to pull container images from the Red Hat registry.
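For example, a registry-based population source that uses the node pull method might look like the following sketch; the image URL is a placeholder, and the field layout assumes the CDI VolumeImportSource API:

apiVersion: cdi.kubevirt.io/v1beta1
kind: VolumeImportSource
metadata:
  name: <volume_import_source>
spec:
  source:
    registry:
      url: "docker://<registry_image_url>"   # pin a specific version rather than a floating tag
      pullMethod: node                       # use the node pull secret for the Red Hat registry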
14.3.2.2. Best practices when defining an RHACM-discovered VM
You can configure any VM in the cluster that is not an RHACM-managed application as an RHACM-discovered application. This includes VMs imported by using the Migration Toolkit for Virtualization (MTV), VMs created by using the OpenShift Virtualization web console, or VMs created by any other means, such as the CLI.
There are several actions you can take to improve your experience and chance of success when defining an RHACM-discovered VM.
Protect the VM when using MTV, the OpenShift Virtualization web console, or a custom VM
Because automatic labeling is not currently available, the application owner must manually label the components of the VM application when using MTV, the OpenShift Virtualization web console, or a custom VM.
After creating the VM, apply a common label to the following resources associated with the VM: VirtualMachine, DataVolume, PersistentVolumeClaim, Service, Route, Secret, ConfigMap, VirtualMachinePreference, and VirtualMachineInstancetype. Do not label virtual machine instances (VMIs) or pods; OpenShift Virtualization creates and manages these automatically.
You must apply the common label to everything in the namespace that you want to protect, including objects that you added to the VM that are not listed here.
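For example, a sketch of applying a common label from the command line, where app=<application_name> is a hypothetical label and the command is repeated for each resource type listed above:

$ oc label vm <vm_name> app=<application_name> -n <namespace>
$ oc label dv <data_volume_name> app=<application_name> -n <namespace>
$ oc label pvc <pvc_name> app=<application_name> -n <namespace>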
Include more than the VirtualMachine object in the VM
Working VMs typically also contain data volumes, persistent volume claims (PVCs), services, routes, secrets, ConfigMap objects, and VirtualMachineSnapshot objects.
Include the VM as part of a larger logical application
This includes other pod-based workloads and VMs.
14.3.3. VM behavior during disaster recovery scenarios
VMs typically act similarly to pod-based workloads during both relocate and failover disaster recovery flows.
Relocate
Use relocate to move an application from the primary environment to the secondary environment when the primary environment is still accessible. During relocate, the VM is gracefully terminated, any unreplicated data is synchronized to the secondary environment, and the VM starts in the secondary environment.
Because the VM terminates gracefully, there is no data loss. Therefore, the VM operating system will not perform crash recovery.
Failover
Use failover when there is a critical failure in the primary environment that makes it impractical or impossible to use relocation to move the workload to a secondary environment. When failover is executed, the storage is fenced from the primary environment, the I/O to the VM disks is abruptly halted, and the VM restarts in the secondary environment using the replicated data.
You should expect data loss due to failover. The extent of loss depends on whether you use Metro-DR, which uses synchronous replication, or Regional-DR, which uses asynchronous replication. Because Regional-DR uses snapshot-based replication intervals, the window of data loss is proportional to the replication interval length. When the VM restarts, the operating system might perform crash recovery.
14.3.4. Disaster recovery solutions for Red Hat managed clusters
The following DR solutions combine Red Hat Advanced Cluster Management (RHACM), Red Hat Ceph Storage, and OpenShift Data Foundation components. You can use them to fail over applications from the primary to the secondary site, and to relocate the applications back to the primary site after you restore the disaster site.
14.3.4.1. Metro-DR for Red Hat OpenShift Data Foundation
OpenShift Virtualization supports the Metro-DR solution for OpenShift Data Foundation, which provides two-way synchronous data replication between managed OpenShift Virtualization clusters installed on primary and secondary sites.
Metro-DR differences
- This synchronous solution is only available to metropolitan distance data centers with a network round-trip latency of 10 milliseconds or less.
- Multiple disk VMs are supported.
- To prevent data corruption, you must ensure that storage is fenced during failover.

  Tip: Fencing means isolating a node so that workloads do not run on it.
For more information about using the Metro-DR solution for OpenShift Data Foundation with OpenShift Virtualization, see IBM’s OpenShift Data Foundation Metro-DR documentation.
14.3.4.2. Regional-DR for Red Hat OpenShift Data Foundation
OpenShift Virtualization supports the Regional-DR solution for OpenShift Data Foundation, which provides asynchronous data replication at regular intervals between managed OpenShift Virtualization clusters installed on primary and secondary sites.
Regional-DR differences
- Regional-DR supports higher network latency between the primary and secondary sites.
- Regional-DR uses RBD snapshots to replicate data asynchronously. Currently, your applications must be resilient to small variances between VM disks. You can prevent these variances by using single disk VMs.
- Using the import method when selecting a population source for your VM disk is recommended. However, you can protect VMs that use cloned PVCs if you select a VolumeReplicationClass that enables image flattening. For more information, see the OpenShift Data Foundation documentation.
For more information about using the Regional-DR solution for OpenShift Data Foundation with OpenShift Virtualization, see IBM’s OpenShift Data Foundation Regional-DR documentation.