Chapter 14. Backup and restore
14.1. Backup and restore by using VM snapshots
You can back up and restore virtual machines (VMs) by using snapshots. Snapshots are supported by the following storage providers:
- Red Hat OpenShift Data Foundation
- Any other cloud storage provider with the Container Storage Interface (CSI) driver that supports the Kubernetes Volume Snapshot API
Online snapshots have a default time deadline of five minutes (5m) that can be changed, if needed.
Online snapshots are supported for virtual machines that have hot plugged virtual disks. However, hot plugged disks that are not in the virtual machine specification are not included in the snapshot.
To create snapshots of an online (Running state) VM with the highest integrity, install the QEMU guest agent if it is not included with your operating system. The QEMU guest agent is included with the default Red Hat templates.
The QEMU guest agent takes a consistent snapshot by attempting to quiesce the VM file system as much as possible, depending on the system workload. This ensures that in-flight I/O is written to the disk before the snapshot is taken. If the guest agent is not present, quiescing is not possible and a best-effort snapshot is taken. The conditions under which the snapshot was taken are reflected in the snapshot indications that are displayed in the web console or CLI.
14.1.1. About snapshots
A snapshot represents the state and data of a virtual machine (VM) at a specific point in time. You can use a snapshot to restore an existing VM to a previous state (represented by the snapshot) for backup and disaster recovery or to rapidly roll back to a previous development version.
A VM snapshot is created from a VM that is powered off (Stopped state) or powered on (Running state).
When taking a snapshot of a running VM, the controller checks that the QEMU guest agent is installed and running. If so, it freezes the VM file system before taking the snapshot, and thaws the file system after the snapshot is taken.
The snapshot stores a copy of each Container Storage Interface (CSI) volume attached to the VM and a copy of the VM specification and metadata. Snapshots cannot be changed after creation.
You can perform the following snapshot actions:
- Create a new snapshot
- Create a copy of a virtual machine from a snapshot
- List all snapshots attached to a specific VM
- Restore a VM from a snapshot
- Delete an existing VM snapshot
VM snapshot controller and custom resources
The VM snapshot feature introduces three new API objects defined as custom resource definitions (CRDs) for managing snapshots:
- VirtualMachineSnapshot: Represents a user request to create a snapshot. It contains information about the current state of the VM.
- VirtualMachineSnapshotContent: Represents a provisioned resource on the cluster (a snapshot). It is created by the VM snapshot controller and contains references to all resources required to restore the VM.
- VirtualMachineRestore: Represents a user request to restore a VM from a snapshot.
The VM snapshot controller binds a VirtualMachineSnapshotContent object with the VirtualMachineSnapshot object for which it was created, with a one-to-one mapping.
14.1.2. Creating snapshots
You can create snapshots of virtual machines (VMs) by using the OpenShift Container Platform web console or the command line.
14.1.2.1. Creating a snapshot by using the web console
You can create a snapshot of a virtual machine (VM) by using the OpenShift Container Platform web console.
The VM snapshot includes disks that meet the following requirements:
- The disk is either a data volume or a persistent volume claim.
- The disk belongs to a storage class that supports Container Storage Interface (CSI) volume snapshots.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- If the VM is running, click the options menu and select Stop to power it down.
- Click the Snapshots tab and then click Take Snapshot.
- Enter the snapshot name.
- Expand Disks included in this Snapshot to see the storage volumes to be included in the snapshot.
- If your VM has disks that cannot be included in the snapshot and you wish to proceed, select I am aware of this warning and wish to proceed.
- Click Save.
14.1.2.2. Creating a snapshot by using the command line
You can create a virtual machine (VM) snapshot for an offline or online VM by creating a VirtualMachineSnapshot object.
Prerequisites
- Ensure that the persistent volume claims (PVCs) are in a storage class that supports Container Storage Interface (CSI) volume snapshots.
- Install the OpenShift CLI (oc).
- Optional: Power down the VM for which you want to create a snapshot.
Procedure
- Create a YAML file to define a VirtualMachineSnapshot object that specifies the name of the new VirtualMachineSnapshot and the name of the source VM, as in the following example:
    apiVersion: snapshot.kubevirt.io/v1alpha1
    kind: VirtualMachineSnapshot
    metadata:
      name: <snapshot_name>
    spec:
      source:
        apiGroup: kubevirt.io
        kind: VirtualMachine
        name: <vm_name>
- Create the VirtualMachineSnapshot object:
    $ oc create -f <snapshot_name>.yaml
  The snapshot controller creates a VirtualMachineSnapshotContent object, binds it to the VirtualMachineSnapshot, and updates the status and readyToUse fields of the VirtualMachineSnapshot object.
- Optional: If you are taking an online snapshot, you can use the wait command and monitor the status of the snapshot:
  - Enter the following command:
      $ oc wait vmsnapshot <snapshot_name> --for condition=Ready
  - Verify the status of the snapshot:
    - InProgress - The online snapshot operation is still in progress.
    - Succeeded - The online snapshot operation completed successfully.
    - Failed - The online snapshot operation failed.
  Note: Online snapshots have a default time deadline of five minutes (5m). If the snapshot does not complete successfully in five minutes, the status is set to failed. Afterwards, the file system is thawed and the VM is unfrozen, but the status remains failed until you delete the failed snapshot image.
  To change the default time deadline, add the FailureDeadline attribute to the VM snapshot spec with the time designated in minutes (m) or in seconds (s) that you want to specify before the snapshot operation times out.
  To set no deadline, you can specify 0, though this is generally not recommended, as it can result in an unresponsive VM.
  If you do not specify a unit of time such as m or s, the default is seconds (s).
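The create step and the optional deadline can be combined in a small script. The following is a minimal sketch rather than part of the documented procedure: snapshot_manifest is a hypothetical helper name, and the sketch assumes that the deadline attribute is written as failureDeadline directly under spec and that oc wait accepts the vmsnapshot resource name used by the other commands in this section.

```shell
# Print a VirtualMachineSnapshot manifest on standard output.
# Usage: snapshot_manifest <snapshot_name> <vm_name> [failure_deadline]
snapshot_manifest() {
  snap_name="$1"
  vm_name="$2"
  deadline="${3:-}"   # optional, for example 10m; empty keeps the default

  cat <<EOF
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineSnapshot
metadata:
  name: ${snap_name}
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: ${vm_name}
EOF
  # Assumption: the deadline is the failureDeadline key under spec.
  if [ -n "$deadline" ]; then
    printf '  failureDeadline: %s\n' "$deadline"
  fi
}

# Example (requires a logged-in oc session):
#   snapshot_manifest my-vmsnapshot my-vm 10m | oc create -f -
#   oc wait vmsnapshot my-vmsnapshot --for condition=Ready --timeout=10m
```

Piping the manifest to oc create -f - avoids leaving a temporary YAML file behind. The --timeout value passed to oc wait only limits how long the client waits and is independent of the snapshot deadline.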
Verification
- Verify that the VirtualMachineSnapshot object is created and bound with VirtualMachineSnapshotContent and that the readyToUse flag is set to true:
    $ oc describe vmsnapshot <snapshot_name>
Example output
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineSnapshot
metadata:
  creationTimestamp: "2020-09-30T14:41:51Z"
  finalizers:
  - snapshot.kubevirt.io/vmsnapshot-protection
  generation: 5
  name: mysnap
  namespace: default
  resourceVersion: "3897"
  selfLink: /apis/snapshot.kubevirt.io/v1alpha1/namespaces/default/virtualmachinesnapshots/my-vmsnapshot
  uid: 28eedf08-5d6a-42c1-969c-2eda58e2a78d
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: my-vm
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:42:03Z"
    reason: Operation complete
    status: "False" # 1
    type: Progressing
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:42:03Z"
    reason: Operation complete
    status: "True" # 2
    type: Ready
  creationTime: "2020-09-30T14:42:03Z"
  readyToUse: true # 3
  sourceUID: 355897f3-73a0-4ec4-83d3-3c2df9486f4f
  virtualMachineSnapshotContentName: vmsnapshot-content-28eedf08-5d6a-42c1-969c-2eda58e2a78d # 4
  - 1: The status field of the Progressing condition specifies if the snapshot is still being created.
  - 2: The status field of the Ready condition specifies if the snapshot creation process is complete.
  - 3: Specifies if the snapshot is ready to be used.
  - 4: Specifies that the snapshot is bound to a VirtualMachineSnapshotContent object created by the snapshot controller.
- Check the spec:volumeBackups property of the VirtualMachineSnapshotContent resource to verify that the expected PVCs are included in the snapshot.
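The verification steps above can also be checked from a script instead of reading the full object by eye. This is a sketch under stated assumptions: snapshot_ready and snapshot_content_name are hypothetical helper names, and the JSONPath expressions rely on the readyToUse and virtualMachineSnapshotContentName fields appearing under status, as in the example output.

```shell
# Succeed only when the named snapshot reports readyToUse: true.
snapshot_ready() {
  ready="$(oc get vmsnapshot "$1" -o jsonpath='{.status.readyToUse}')"
  [ "$ready" = "true" ]
}

# Print the name of the VirtualMachineSnapshotContent object that is
# bound to the named snapshot.
snapshot_content_name() {
  oc get vmsnapshot "$1" \
    -o jsonpath='{.status.virtualMachineSnapshotContentName}'
}

# Example (requires a logged-in oc session):
#   snapshot_ready my-vmsnapshot && \
#     oc get virtualmachinesnapshotcontent \
#       "$(snapshot_content_name my-vmsnapshot)" \
#       -o jsonpath='{.spec.volumeBackups}'
```

The last command prints the volumeBackups property so that you can compare the listed volumes with the PVCs you expect.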
14.1.3. Verifying online snapshots by using snapshot indications
Snapshot indications are contextual information about online virtual machine (VM) snapshot operations and help describe the details of online snapshot creation. Indications are not available for offline VM snapshot operations.
Prerequisites
- You must have attempted to create an online VM snapshot.
Procedure
- Display the output from the snapshot indications by performing one of the following actions:
  - Use the command line to view indicator output in the status stanza of the VirtualMachineSnapshot object YAML.
  - In the web console, click VirtualMachineSnapshot → Status in the Snapshot details screen.
- Verify the status of your online VM snapshot by viewing the values of the status.indications parameter:
  - Online indicates that the VM was running during online snapshot creation.
  - GuestAgent indicates that the QEMU guest agent was running during online snapshot creation.
  - NoGuestAgent indicates that the QEMU guest agent was not running during online snapshot creation. The QEMU guest agent could not be used to freeze and thaw the file system, either because the QEMU guest agent was not installed or running or due to another error.
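The indication values described above can be read and interpreted from the command line. The following sketch is illustrative only: snapshot_indications and describe_indications are hypothetical helper names, and the sketch assumes that status.indications is a list of plain strings.

```shell
# Print the indications of a snapshot, one per line.
snapshot_indications() {
  oc get vmsnapshot "$1" -o jsonpath='{.status.indications[*]}' | tr ' ' '\n'
}

# Read indications from standard input and print what each one means.
describe_indications() {
  # The extra test handles a final line that has no trailing newline.
  while read -r ind || [ -n "$ind" ]; do
    case "$ind" in
      Online)       echo "Online: the VM was running during snapshot creation" ;;
      GuestAgent)   echo "GuestAgent: the QEMU guest agent was running" ;;
      NoGuestAgent) echo "NoGuestAgent: the QEMU guest agent was not used" ;;
      "")           ;;
      *)            echo "unrecognized indication: $ind" ;;
    esac
  done
}

# Example (requires a logged-in oc session):
#   snapshot_indications my-vmsnapshot | describe_indications
```

A NoGuestAgent result means that the snapshot was taken on a best-effort basis without freezing the file system.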
14.1.4. Restoring virtual machines from snapshots
You can restore virtual machines (VMs) from snapshots by using the OpenShift Container Platform web console or the command line.
14.1.4.1. Restoring a VM from a snapshot by using the web console
You can restore a virtual machine (VM) to a previous configuration represented by a snapshot in the OpenShift Container Platform web console.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- If the VM is running, click the options menu and select Stop to power it down.
- Click the Snapshots tab to view a list of snapshots associated with the VM.
- Select a snapshot to open the Snapshot Details screen.
- Click the options menu and select Restore VirtualMachineSnapshot.
- Click Restore.
14.1.4.2. Restoring a VM from a snapshot by using the command line
You can restore an existing virtual machine (VM) to a previous configuration by using the command line. You can only restore from an offline VM snapshot.
Prerequisites
- Power down the VM you want to restore.
Procedure
- Create a YAML file to define a VirtualMachineRestore object that specifies the name of the VM you want to restore and the name of the snapshot to be used as the source, as in the following example:
    apiVersion: snapshot.kubevirt.io/v1alpha1
    kind: VirtualMachineRestore
    metadata:
      name: <vm_restore>
    spec:
      target:
        apiGroup: kubevirt.io
        kind: VirtualMachine
        name: <vm_name>
      virtualMachineSnapshotName: <snapshot_name>
- Create the VirtualMachineRestore object:
    $ oc create -f <vm_restore>.yaml
  The snapshot controller updates the status fields of the VirtualMachineRestore object and replaces the existing VM configuration with the snapshot content.
Verification
- Verify that the VM is restored to the previous state represented by the snapshot and that the complete flag is set to true:
    $ oc get vmrestore <vm_restore>
Example output
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineRestore
metadata:
  creationTimestamp: "2020-09-30T14:46:27Z"
  generation: 5
  name: my-vmrestore
  namespace: default
  ownerReferences:
  - apiVersion: kubevirt.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: VirtualMachine
    name: my-vm
    uid: 355897f3-73a0-4ec4-83d3-3c2df9486f4f
  resourceVersion: "5512"
  selfLink: /apis/snapshot.kubevirt.io/v1alpha1/namespaces/default/virtualmachinerestores/my-vmrestore
  uid: 71c679a8-136e-46b0-b9b5-f57175a6a041
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: my-vm
  virtualMachineSnapshotName: my-vmsnapshot
status:
  complete: true # 1
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:46:28Z"
    reason: Operation complete
    status: "False" # 2
    type: Progressing
  - lastProbeTime: null
    lastTransitionTime: "2020-09-30T14:46:28Z"
    reason: Operation complete
    status: "True" # 3
    type: Ready
  deletedDataVolumes:
  - test-dv1
  restoreTime: "2020-09-30T14:46:28Z"
  restores:
  - dataVolumeName: restore-71c679a8-136e-46b0-b9b5-f57175a6a041-datavolumedisk1
    persistentVolumeClaim: restore-71c679a8-136e-46b0-b9b5-f57175a6a041-datavolumedisk1
    volumeName: datavolumedisk1
    volumeSnapshotName: vmsnapshot-28eedf08-5d6a-42c1-969c-2eda58e2a78d-volume-datavolumedisk1
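The restore procedure above can be sketched as a pair of shell helpers. The names restore_manifest and restore_complete are hypothetical, and the completion check assumes that the complete flag is exposed at status.complete, as in the example output.

```shell
# Print a VirtualMachineRestore manifest on standard output.
# Usage: restore_manifest <vm_restore> <vm_name> <snapshot_name>
restore_manifest() {
  cat <<EOF
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineRestore
metadata:
  name: $1
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: $2
  virtualMachineSnapshotName: $3
EOF
}

# Succeed only when the named restore reports complete: true.
restore_complete() {
  [ "$(oc get vmrestore "$1" -o jsonpath='{.status.complete}')" = "true" ]
}

# Example (requires a logged-in oc session and a stopped VM):
#   restore_manifest my-vmrestore my-vm my-vmsnapshot | oc create -f -
#   restore_complete my-vmrestore && echo "restore finished"
```

Reading a single field with JSONPath makes the result easy to test in a script, whereas the full object is more useful for troubleshooting.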
14.1.5. Deleting snapshots
You can delete snapshots of virtual machines (VMs) by using the OpenShift Container Platform web console or the command line.
14.1.5.1. Deleting a snapshot by using the web console
You can delete an existing virtual machine (VM) snapshot by using the web console.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- Click the Snapshots tab to view a list of snapshots associated with the VM.
- Click the options menu beside a snapshot and select Delete VirtualMachineSnapshot.
- Click Delete.
14.1.5.2. Deleting a virtual machine snapshot in the CLI
You can delete an existing virtual machine (VM) snapshot by deleting the appropriate VirtualMachineSnapshot object.
Prerequisites
- Install the OpenShift CLI (oc).
Procedure
- Delete the VirtualMachineSnapshot object:
    $ oc delete vmsnapshot <snapshot_name>
  The snapshot controller deletes the VirtualMachineSnapshot along with the associated VirtualMachineSnapshotContent object.
Verification
Verify that the snapshot is deleted and no longer attached to this VM:
$ oc get vmsnapshot
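The delete and verification steps can be combined into one helper. This is a minimal sketch with a hypothetical function name, delete_snapshot. It relies on oc get returning a non-zero exit status when the named object no longer exists.

```shell
# Delete a snapshot and confirm that it is gone.
# Usage: delete_snapshot <snapshot_name>
delete_snapshot() {
  oc delete vmsnapshot "$1" || return 1

  # If the object can still be fetched, the deletion has not completed.
  if oc get vmsnapshot "$1" >/dev/null 2>&1; then
    echo "snapshot $1 still exists" >&2
    return 1
  fi
  echo "snapshot $1 deleted"
}

# Example (requires a logged-in oc session):
#   delete_snapshot my-vmsnapshot
```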
14.2. Backing up and restoring virtual machines
Red Hat supports using OpenShift Virtualization 4.14 or later with OADP 1.3.x or later.
OADP versions earlier than 1.3.0 are not supported for backup and restore of OpenShift Virtualization.
Back up and restore virtual machines by using the OpenShift API for Data Protection.
You can install the OpenShift API for Data Protection (OADP) with OpenShift Virtualization by installing the OADP Operator and configuring a backup location. You can then install the Data Protection Application.
OpenShift API for Data Protection with OpenShift Virtualization supports the following backup and restore storage options:
- Container Storage Interface (CSI) backups
- Container Storage Interface (CSI) backups with DataMover
The following storage options are excluded:
- File system backup and restore
- Volume snapshot backup and restore
For more information, see Backing up applications with File System Backup: Kopia or Restic.
To install the OADP Operator in a restricted network environment, you must first disable the default OperatorHub sources and mirror the Operator catalog.
See Using Operator Lifecycle Manager on restricted networks for details.
14.2.1. Installing and configuring OADP with OpenShift Virtualization
As a cluster administrator, you install OADP by installing the OADP Operator.
The latest version of the OADP Operator installs Velero 1.14.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
Procedure
- Install the OADP Operator according to the instructions for your storage provider.
- Install the Data Protection Application (DPA) with the kubevirt and openshift OADP plugins.
- Back up virtual machines by creating a Backup custom resource (CR).
  Warning: Red Hat support is limited to only the following options:
  - CSI backups
  - CSI backups with DataMover.
You restore the Backup CR by creating a Restore CR.
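For orientation, the Backup and Restore CRs mentioned above are Velero resources. The following sketch prints minimal manifests for both. The helper names velero_backup_manifest and velero_restore_manifest are hypothetical, and the field selection (includedNamespaces and backupName only, created in the openshift-adp namespace) is an illustrative assumption; a supported configuration needs the CSI-related settings described in the OADP documentation.

```shell
# Print a minimal Velero Backup manifest that covers one namespace.
# Usage: velero_backup_manifest <backup_name> <vm_namespace> [oadp_namespace]
velero_backup_manifest() {
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: $1
  namespace: ${3:-openshift-adp}
spec:
  includedNamespaces:
  - $2
EOF
}

# Print a minimal Velero Restore manifest that restores a named backup.
# Usage: velero_restore_manifest <restore_name> <backup_name> [oadp_namespace]
velero_restore_manifest() {
  cat <<EOF
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: $1
  namespace: ${3:-openshift-adp}
spec:
  backupName: $2
EOF
}

# Example (requires a logged-in oc session and a configured DPA):
#   velero_backup_manifest vm-backup my-vm-namespace | oc create -f -
#   velero_restore_manifest vm-restore vm-backup | oc create -f -
```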
14.2.2. Installing the Data Protection Application
You install the Data Protection Application (DPA) by creating an instance of the DataProtectionApplication API.
Prerequisites
- You must install the OADP Operator.
- You must configure object storage as a backup location.
- If you use snapshots to back up PVs, your cloud provider must support either a native snapshot API or Container Storage Interface (CSI) snapshots.
- If the backup and snapshot locations use the same credentials, you must create a Secret with the default name, cloud-credentials.
  Note: If you do not want to specify backup or snapshot locations during the installation, you can create a default Secret with an empty credentials-velero file. If there is no default Secret, the installation will fail.
Procedure
- Click Operators → Installed Operators and select the OADP Operator.
- Under Provided APIs, click Create instance in the DataProtectionApplication box.
- Click YAML View and update the parameters of the DataProtectionApplication manifest:
    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
      name: <dpa_sample>
      namespace: openshift-adp # 1
    spec:
      configuration:
        velero:
          defaultPlugins:
            - kubevirt # 2
            - gcp # 3
            - csi # 4
            - openshift # 5
          resourceTimeout: 10m # 6
        nodeAgent: # 7
          enable: true # 8
          uploaderType: kopia # 9
          podConfig:
            nodeSelector: <node_selector> # 10
      backupLocations:
        - velero:
            provider: gcp # 11
            default: true
            credential:
              key: cloud
              name: <default_secret> # 12
            objectStorage:
              bucket: <bucket_name> # 13
              prefix: <prefix> # 14
  - 1: The default namespace for OADP is openshift-adp. The namespace is a variable and is configurable.
  - 2: The kubevirt plugin is mandatory for OpenShift Virtualization.
  - 3: Specify the plugin for the backup provider, for example, gcp, if it exists.
  - 4: The csi plugin is mandatory for backing up PVs with CSI snapshots. The csi plugin uses the Velero CSI beta snapshot APIs. You do not need to configure a snapshot location.
  - 5: The openshift plugin is mandatory.
  - 6: Specify how many minutes to wait for several Velero resources before timeout occurs, such as Velero CRD availability, volumeSnapshot deletion, and backup repository availability. The default is 10m.
  - 7: The administrative agent that routes the administrative requests to servers.
  - 8: Set this value to true if you want to enable nodeAgent and perform File System Backup.
  - 9: Enter kopia as your uploader to use the Built-in DataMover. The nodeAgent deploys a daemon set, which means that the nodeAgent pods run on each working node. You can configure File System Backup by adding spec.defaultVolumesToFsBackup: true to the Backup CR.
  - 10: Specify the nodes on which Kopia is available. By default, Kopia runs on all nodes.
  - 11: Specify the backup provider.
  - 12: Specify the correct default name for the Secret, for example, cloud-credentials-gcp, if you use a default plugin for the backup provider. If you specify a custom name, the custom name is used for the backup location. If you do not specify a Secret name, the default name is used.
  - 13: Specify a bucket as the backup storage location. If the bucket is not a dedicated bucket for Velero backups, you must specify a prefix.
  - 14: Specify a prefix for Velero backups, for example, velero, if the bucket is used for multiple purposes.
- Click Create.
Verification
Verify the installation by viewing the OpenShift API for Data Protection (OADP) resources by running the following command:
$ oc get all -n openshift-adp
Example output
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2     Running   0          2m8s
pod/node-agent-9cq4q                                    1/1     Running   0          94s
pod/node-agent-m4lts                                    1/1     Running   0          94s
pod/node-agent-pv4kr                                    1/1     Running   0          95s
pod/velero-588db7f655-n842v                             1/1     Running   0          95s

NAME                                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
- Verify that the DataProtectionApplication (DPA) is reconciled by running the following command:
    $ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
  Example output
    {"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
- Verify the type is set to Reconciled.
- Verify the backup storage location and confirm that the PHASE is Available by running the following command:
    $ oc get backupstoragelocations.velero.io -n openshift-adp
  Example output
    NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
    dpa-sample-1   Available   1s               3d16h   true
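The two checks above can be scripted so that they return an exit status instead of output to inspect. This is a sketch under stated assumptions: dpa_reconciled and bsl_available are hypothetical helper names, and the JSONPath expressions rely on the condition type Reconciled and the status.phase field shown in the example output.

```shell
# Succeed only when the DPA has a Reconciled condition with status True.
# Usage: dpa_reconciled <dpa_name> [namespace]
dpa_reconciled() {
  status="$(oc get dpa "$1" -n "${2:-openshift-adp}" \
    -o jsonpath='{.status.conditions[?(@.type=="Reconciled")].status}')"
  [ "$status" = "True" ]
}

# Succeed only when at least one backup storage location exists and
# every location reports the Available phase.
# Usage: bsl_available [namespace]
bsl_available() {
  phases="$(oc get backupstoragelocations.velero.io -n "${1:-openshift-adp}" \
    -o jsonpath='{.items[*].status.phase}')"
  [ -n "$phases" ] || return 1
  # Word splitting is intentional: phases are separated by spaces.
  for p in $phases; do
    [ "$p" = "Available" ] || return 1
  done
}

# Example (requires a logged-in oc session):
#   dpa_reconciled dpa-sample && bsl_available && echo "OADP is ready"
```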
14.3. Disaster recovery
OpenShift Virtualization supports using disaster recovery (DR) solutions to ensure that your environment can recover after a site outage. To use these methods, you must plan your OpenShift Virtualization deployment in advance.
14.3.1. About disaster recovery methods
For an overview of disaster recovery (DR) concepts, architecture, and planning considerations, see the Red Hat OpenShift Virtualization disaster recovery guide in the Red Hat Knowledgebase.
The two primary DR methods for OpenShift Virtualization are Metropolitan Disaster Recovery (Metro-DR) and Regional-DR.
- Metro-DR
- Metro-DR uses synchronous replication. It writes to storage at both the primary and secondary sites so that the data is always synchronized between sites. Because the storage provider is responsible for ensuring that the synchronization succeeds, the environment must meet the throughput and latency requirements of the storage provider.
- Regional-DR
- Regional-DR uses asynchronous replication. The data in the primary site is synchronized with the secondary site at regular intervals. For this type of replication, you can have a higher latency connection between the primary and secondary sites.
14.3.1.1. Metro-DR for Red Hat OpenShift Data Foundation
OpenShift Virtualization supports the Metro-DR solution for OpenShift Data Foundation, which provides two-way synchronous data replication between managed OpenShift Virtualization clusters installed on primary and secondary sites. This solution combines Red Hat Advanced Cluster Management (RHACM), Red Hat Ceph Storage, and OpenShift Data Foundation components.
Use this solution during a site disaster to fail over applications from the primary site to the secondary site, and to relocate the applications back to the primary site after the disaster site is restored.
This synchronous solution is available only to metropolitan-distance data centers with a latency of 10 milliseconds or less.
For more information about using the Metro-DR solution for OpenShift Data Foundation with OpenShift Virtualization, see the Red Hat Knowledgebase.