Chapter 13. Backup and restore
13.1. Backup and restore by using VM snapshots
You can back up and restore virtual machines (VMs) by using snapshots. Snapshots are supported by the following storage providers:
- Any cloud storage provider with the Container Storage Interface (CSI) driver that supports the Kubernetes Volume Snapshot API
Online snapshots have a default time deadline of five minutes (5m) that can be changed, if needed.
Online snapshots are supported for virtual machines that have hot plugged virtual disks. However, hot plugged disks that are not in the virtual machine specification are not included in the snapshot.
To create snapshots of an online (Running state) VM with the highest integrity, install the QEMU guest agent if it is not included with your operating system. The QEMU guest agent is included with the default Red Hat templates.
The QEMU guest agent takes a consistent snapshot by attempting to quiesce the VM file system as much as possible, depending on the system workload. This ensures that in-flight I/O is written to the disk before the snapshot is taken. If the guest agent is not present, quiescing is not possible and a best-effort snapshot is taken. The conditions under which the snapshot was taken are reflected in the snapshot indications that are displayed in the web console or CLI.
13.1.1. About snapshots
A snapshot represents the state and data of a virtual machine (VM) at a specific point in time. You can use a snapshot to restore an existing VM to a previous state (represented by the snapshot) for backup and disaster recovery or to rapidly roll back to a previous development version.
A VM snapshot is created from a VM that is powered off (Stopped state) or powered on (Running state).
When taking a snapshot of a running VM, the controller checks that the QEMU guest agent is installed and running. If so, it freezes the VM file system before taking the snapshot, and thaws the file system after the snapshot is taken.
The snapshot stores a copy of each Container Storage Interface (CSI) volume attached to the VM and a copy of the VM specification and metadata. Snapshots cannot be changed after creation.
You can perform the following snapshot actions:
- Create a new snapshot
- Create a copy of a virtual machine from a snapshot
- List all snapshots attached to a specific VM
- Restore a VM from a snapshot
- Delete an existing VM snapshot
VM snapshot controller and custom resources
The VM snapshot feature introduces three new API objects defined as custom resource definitions (CRDs) for managing snapshots:
- VirtualMachineSnapshot: Represents a user request to create a snapshot. It contains information about the current state of the VM.
- VirtualMachineSnapshotContent: Represents a provisioned resource on the cluster (a snapshot). It is created by the VM snapshot controller and contains references to all resources required to restore the VM.
- VirtualMachineRestore: Represents a user request to restore a VM from a snapshot.
The VM snapshot controller binds a VirtualMachineSnapshotContent object with the VirtualMachineSnapshot object for which it was created, with a one-to-one mapping.
13.1.2. About application-consistent snapshots and backups
You can configure application-consistent snapshots and backups for Linux or Windows virtual machines (VMs) through a cycle of freezing and thawing. For any application, you can either configure a script on a Linux VM or register on a Windows VM to be notified when a snapshot or backup is due to begin.
On a Linux VM, freeze and thaw processes trigger automatically when a snapshot is taken or a backup is started by using, for example, a plugin from Velero or another backup vendor. The freeze process, performed by QEMU Guest Agent (QEMU GA) freeze hooks, ensures that before the snapshot or backup of a VM occurs, all of the VM’s filesystems are frozen and each appropriately configured application is informed that a snapshot or backup is about to start. This notification gives each application the opportunity to quiesce its state. Depending on the application, quiescing might involve temporarily refusing new requests, finishing in-progress operations, and flushing data to disk. The operating system is then directed to quiesce the filesystems by flushing outstanding writes to disk and freezing new write activity. All new connection requests are refused. When all applications have become inactive, the QEMU GA freezes the filesystems, and a snapshot is taken or a backup is initiated. After the snapshot is taken or the backup starts, the thawing process begins: filesystem writes are reactivated and applications are notified to resume normal operations.
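As a rough illustration of the script-based approach on a Linux VM, the following sketch shows the general shape of a freeze hook. It assumes that the guest agent calls the hook with a single argument, freeze or thaw. The function name app_fsfreeze_hook, the HOOK_LOG path, and the placeholder actions are hypothetical and are not taken from this document. Where hook scripts must be installed depends on the guest operating system and is not covered here.

```shell
#!/bin/sh
# Sketch of an application freeze hook.
# Assumption: the guest agent passes "freeze" before the file systems are
# frozen and "thaw" after they are thawed.

# Hypothetical log location; override HOOK_LOG for your environment.
HOOK_LOG="${HOOK_LOG:-/tmp/app-fsfreeze-hook.log}"

app_fsfreeze_hook() {
  action="$1"
  case "$action" in
    freeze)
      # Placeholder: ask the application to flush data and pause writes.
      echo "freeze: quiescing application" >> "$HOOK_LOG"
      ;;
    thaw)
      # Placeholder: tell the application to resume normal operation.
      echo "thaw: resuming application" >> "$HOOK_LOG"
      ;;
    *)
      echo "unknown action: $action" >&2
      return 1
      ;;
  esac
}

# When installed as a hook script, forward the argument from the agent:
# app_fsfreeze_hook "$1"
```

Replace the two placeholder echo lines with the commands that your application provides for flushing and resuming. Keep the freeze branch short, because the snapshot cannot start until the hook returns.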
The same cycle of freezing and thawing is available on a Windows VM. Applications register with the Volume Shadow Copy Service (VSS) to receive notifications that they should flush out their data because a backup or snapshot is imminent. Thawing of the applications after the backup or snapshot is complete returns them to an active state. For more details, see the Windows Server documentation about the Volume Shadow Copy Service.
13.1.3. Creating snapshots
You can create snapshots of virtual machines (VMs) by using the Red Hat OpenShift Service on AWS web console or the command line.
13.1.3.1. Creating a snapshot by using the web console
You can create a snapshot of a virtual machine (VM) by using the Red Hat OpenShift Service on AWS web console.
The VM snapshot includes disks that meet the following requirements:
- Are either a data volume or a persistent volume claim
- Belong to a storage class that supports Container Storage Interface (CSI) volume snapshots
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- Click the Snapshots tab and then click Take Snapshot.
- Enter the snapshot name.
- Expand Disks included in this Snapshot to see the storage volumes to be included in the snapshot.
- If your VM has disks that cannot be included in the snapshot and you wish to proceed, select I am aware of this warning and wish to proceed.
- Click Save.
13.1.3.2. Creating a snapshot by using the command line
You can create a virtual machine (VM) snapshot for an offline or online VM by creating a VirtualMachineSnapshot object.
Prerequisites
- Ensure that the persistent volume claims (PVCs) are in a storage class that supports Container Storage Interface (CSI) volume snapshots.
- Install the OpenShift CLI (oc).
- Optional: Power down the VM for which you want to create a snapshot.
Procedure
- Create a YAML file to define a VirtualMachineSnapshot object that specifies the name of the new VirtualMachineSnapshot and the name of the source VM, as in the following example:

      apiVersion: snapshot.kubevirt.io/v1beta1
      kind: VirtualMachineSnapshot
      metadata:
        name: <snapshot_name>
      spec:
        source:
          apiGroup: kubevirt.io
          kind: VirtualMachine
          name: <vm_name>
- Create the VirtualMachineSnapshot object:

      $ oc create -f <snapshot_name>.yaml
  The snapshot controller creates a VirtualMachineSnapshotContent object, binds it to the VirtualMachineSnapshot, and updates the status and readyToUse fields of the VirtualMachineSnapshot object.
- Optional: If you are taking an online snapshot, you can use the wait command and monitor the status of the snapshot:
  - Enter the following command:

        $ oc wait vmsnapshot <snapshot_name> --for condition=Ready
  - Verify the status of the snapshot:
    - InProgress - The online snapshot operation is still in progress.
    - Succeeded - The online snapshot operation completed successfully.
    - Failed - The online snapshot operation failed.

  Note: Online snapshots have a default time deadline of five minutes (5m). If the snapshot does not complete successfully in five minutes, the status is set to failed. Afterwards, the file system is thawed and the VM is unfrozen, but the status remains failed until you delete the failed snapshot image.

  To change the default time deadline, add the FailureDeadline attribute to the VM snapshot spec with the time designated in minutes (m) or in seconds (s) that you want to specify before the snapshot operation times out.

  To set no deadline, you can specify 0, though this is generally not recommended, as it can result in an unresponsive VM.

  If you do not specify a unit of time such as m or s, the default is seconds (s).
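The deadline described in the note can be set when the snapshot manifest is generated. The following is a minimal sketch, not a definitive implementation. The helper name snapshot_manifest is hypothetical, and the sketch assumes that the FailureDeadline attribute is written in the manifest as the spec.failureDeadline key.

```shell
# Prints a VirtualMachineSnapshot manifest to stdout.
# Arguments: <snapshot_name> <vm_name> [deadline]
# Assumption: the deadline is expressed with the spec.failureDeadline key.
snapshot_manifest() {
  snapshot_name="$1"
  vm_name="$2"
  deadline="${3:-}"
  cat <<EOF
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  name: ${snapshot_name}
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: ${vm_name}
EOF
  # Emit the deadline only when one was requested; otherwise the
  # five-minute default applies.
  if [ -n "$deadline" ]; then
    echo "  failureDeadline: ${deadline}"
  fi
}

# Example: a snapshot of my-vm that times out after 10 minutes.
snapshot_manifest my-vmsnapshot my-vm 10m
```

The output can be redirected to a file and passed to oc create -f, in the same way as the manifest in the procedure above.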
Verification
- Verify that the VirtualMachineSnapshot object is created and bound with VirtualMachineSnapshotContent and that the readyToUse flag is set to true:

      $ oc describe vmsnapshot <snapshot_name>
Example output
      apiVersion: snapshot.kubevirt.io/v1beta1
      kind: VirtualMachineSnapshot
      metadata:
        creationTimestamp: "2020-09-30T14:41:51Z"
        finalizers:
        - snapshot.kubevirt.io/vmsnapshot-protection
        generation: 5
        name: my-vmsnapshot
        namespace: default
        resourceVersion: "3897"
        selfLink: /apis/snapshot.kubevirt.io/v1beta1/namespaces/default/virtualmachinesnapshots/my-vmsnapshot
        uid: 28eedf08-5d6a-42c1-969c-2eda58e2a78d
      spec:
        source:
          apiGroup: kubevirt.io
          kind: VirtualMachine
          name: my-vm
      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2020-09-30T14:42:03Z"
          reason: Operation complete
          status: "False" # <1>
          type: Progressing
        - lastProbeTime: null
          lastTransitionTime: "2020-09-30T14:42:03Z"
          reason: Operation complete
          status: "True" # <2>
          type: Ready
        creationTime: "2020-09-30T14:42:03Z"
        readyToUse: true # <3>
        sourceUID: 355897f3-73a0-4ec4-83d3-3c2df9486f4f
        virtualMachineSnapshotContentName: vmsnapshot-content-28eedf08-5d6a-42c1-969c-2eda58e2a78d # <4>
  <1> The status field of the Progressing condition specifies if the snapshot is still being created.
  <2> The status field of the Ready condition specifies if the snapshot creation process is complete.
  <3> Specifies if the snapshot is ready to be used.
  <4> Specifies that the snapshot is bound to a VirtualMachineSnapshotContent object created by the snapshot controller.
- Check the spec:volumeBackups property of the VirtualMachineSnapshotContent resource to verify that the expected PVCs are included in the snapshot.
13.1.4. Verifying online snapshots by using snapshot indications
Snapshot indications are contextual information about online virtual machine (VM) snapshot operations. They describe the conditions under which an online snapshot was created. Indications are not available for offline VM snapshot operations.
Prerequisites
- You must have attempted to create an online VM snapshot.
Procedure
- Display the output from the snapshot indications by performing one of the following actions:
  - Use the command line to view indicator output in the status stanza of the VirtualMachineSnapshot object YAML.
  - In the web console, click VirtualMachineSnapshot → Status in the Snapshot details screen.
- Verify the status of your online VM snapshot by viewing the values of the status.indications parameter:
  - Online indicates that the VM was running during online snapshot creation.
  - GuestAgent indicates that the QEMU guest agent was running during online snapshot creation.
  - NoGuestAgent indicates that the QEMU guest agent was not running during online snapshot creation. The QEMU guest agent could not be used to freeze and thaw the file system, either because the QEMU guest agent was not installed or running or due to another error.
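The indication values listed above can be turned into a short summary, for example in a script that checks snapshots after they are taken. The following is a minimal sketch. The function name summarize_indications and the wording of its messages are hypothetical, and it expects the indication values as separate arguments.

```shell
# Summarizes the status.indications values of an online VM snapshot.
# Arguments: the indication values, for example: Online GuestAgent
summarize_indications() {
  online=no
  agent=unreported
  for indication in "$@"; do
    case "$indication" in
      Online) online=yes ;;
      GuestAgent) agent=running ;;
      NoGuestAgent) agent="not used, best-effort snapshot" ;;
    esac
  done
  if [ "$online" = yes ]; then
    echo "online snapshot, QEMU guest agent: $agent"
  else
    echo "no Online indication reported"
  fi
}

summarize_indications Online GuestAgent
# prints: online snapshot, QEMU guest agent: running
```

One possible way to obtain the values, assuming that the standard jsonpath output option of oc is available, is oc get vmsnapshot <snapshot_name> -o jsonpath='{.status.indications[*]}'.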
13.1.5. Restoring virtual machines from snapshots
You can restore virtual machines (VMs) from snapshots by using the Red Hat OpenShift Service on AWS web console or the command line.
13.1.5.1. Restoring a VM from a snapshot by using the web console
You can restore a virtual machine (VM) to a previous configuration represented by a snapshot in the Red Hat OpenShift Service on AWS web console.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- If the VM is running, click the options menu and select Stop to power it down.
- Click the Snapshots tab to view a list of snapshots associated with the VM.
- Select a snapshot to open the Snapshot Details screen.
- Click the options menu and select Restore VirtualMachine from snapshot.
- Click Restore.
13.1.5.2. Restoring a VM from a snapshot by using the command line
You can restore an existing virtual machine (VM) to a previous configuration by using the command line. You can only restore from an offline VM snapshot.
Prerequisites
- Power down the VM you want to restore.
Procedure
- Create a YAML file to define a VirtualMachineRestore object that specifies the name of the VM you want to restore and the name of the snapshot to be used as the source, as in the following example:

      apiVersion: snapshot.kubevirt.io/v1beta1
      kind: VirtualMachineRestore
      metadata:
        name: <vm_restore>
      spec:
        target:
          apiGroup: kubevirt.io
          kind: VirtualMachine
          name: <vm_name>
        virtualMachineSnapshotName: <snapshot_name>
- Create the VirtualMachineRestore object:

      $ oc create -f <vm_restore>.yaml

  The snapshot controller updates the status fields of the VirtualMachineRestore object and replaces the existing VM configuration with the snapshot content.
Verification
- Verify that the VM is restored to the previous state represented by the snapshot and that the complete flag is set to true:

      $ oc get vmrestore <vm_restore>
Example output
      apiVersion: snapshot.kubevirt.io/v1beta1
      kind: VirtualMachineRestore
      metadata:
        creationTimestamp: "2020-09-30T14:46:27Z"
        generation: 5
        name: my-vmrestore
        namespace: default
        ownerReferences:
        - apiVersion: kubevirt.io/v1
          blockOwnerDeletion: true
          controller: true
          kind: VirtualMachine
          name: my-vm
          uid: 355897f3-73a0-4ec4-83d3-3c2df9486f4f
        resourceVersion: "5512"
        selfLink: /apis/snapshot.kubevirt.io/v1beta1/namespaces/default/virtualmachinerestores/my-vmrestore
        uid: 71c679a8-136e-46b0-b9b5-f57175a6a041
      spec:
        target:
          apiGroup: kubevirt.io
          kind: VirtualMachine
          name: my-vm
        virtualMachineSnapshotName: my-vmsnapshot
      status:
        complete: true
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2020-09-30T14:46:28Z"
          reason: Operation complete
          status: "False"
          type: Progressing
        - lastProbeTime: null
          lastTransitionTime: "2020-09-30T14:46:28Z"
          reason: Operation complete
          status: "True"
          type: Ready
        deletedDataVolumes:
        - test-dv1
        restoreTime: "2020-09-30T14:46:28Z"
        restores:
        - dataVolumeName: restore-71c679a8-136e-46b0-b9b5-f57175a6a041-datavolumedisk1
          persistentVolumeClaim: restore-71c679a8-136e-46b0-b9b5-f57175a6a041-datavolumedisk1
          volumeName: datavolumedisk1
          volumeSnapshotName: vmsnapshot-28eedf08-5d6a-42c1-969c-2eda58e2a78d-volume-datavolumedisk1
13.1.6. Deleting snapshots
You can delete snapshots of virtual machines (VMs) by using the Red Hat OpenShift Service on AWS web console or the command line.
13.1.6.1. Deleting a snapshot by using the web console
You can delete an existing virtual machine (VM) snapshot by using the web console.
Procedure
- Navigate to Virtualization → VirtualMachines in the web console.
- Select a VM to open the VirtualMachine details page.
- Click the Snapshots tab to view a list of snapshots associated with the VM.
- Click the options menu beside a snapshot and select Delete snapshot.
- Click Delete.
13.1.6.2. Deleting a virtual machine snapshot in the CLI
You can delete an existing virtual machine (VM) snapshot by deleting the appropriate VirtualMachineSnapshot object.
Prerequisites
- Install the OpenShift CLI (oc).
Procedure
- Delete the VirtualMachineSnapshot object:

      $ oc delete vmsnapshot <snapshot_name>

  The snapshot controller deletes the VirtualMachineSnapshot along with the associated VirtualMachineSnapshotContent object.
Verification
Verify that the snapshot is deleted and no longer attached to this VM:
$ oc get vmsnapshot
13.2. Backing up and restoring virtual machines
Red Hat supports using OpenShift Virtualization 4.14 or later with OADP 1.3.x or later.
OADP versions earlier than 1.3.0 are not supported for backup and restore of OpenShift Virtualization.
Back up and restore virtual machines by using the OpenShift API for Data Protection.
You can install the OpenShift API for Data Protection (OADP) with OpenShift Virtualization by installing the OADP Operator and configuring a backup location. You can then install the Data Protection Application.
OpenShift API for Data Protection with OpenShift Virtualization supports the following backup and restore storage options:
- Container Storage Interface (CSI) backups
- Container Storage Interface (CSI) backups with DataMover
The following storage options are excluded:
- File system backup and restore
- Volume snapshot backup and restore
To install the OADP Operator in a restricted network environment, you must first disable the default OperatorHub sources and mirror the Operator catalog.
13.2.1. Installing and configuring OADP with OpenShift Virtualization
As a cluster administrator, you install OADP by installing the OADP Operator.
The latest version of the OADP Operator installs Velero 1.14.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
Procedure
- Install the OADP Operator according to the instructions for your storage provider.
- Install the Data Protection Application (DPA) with the kubevirt and openshift OADP plugins.
- Back up virtual machines by creating a Backup custom resource (CR).

  Warning: Red Hat support is limited to only the following options:
  - CSI backups
  - CSI backups with DataMover.

You restore the Backup CR by creating a Restore CR.
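To make the Backup and Restore step more concrete, the following sketch prints minimal manifests for both CRs. It is an illustration under stated assumptions, not a supported configuration: the helper names backup_manifest and restore_manifest and the example object names are hypothetical, the CRs are assumed to live in the openshift-adp namespace, and only the Velero fields includedNamespaces, storageLocation, and backupName are shown. The location name dpa-sample-1 is taken from the example output elsewhere in this chapter.

```shell
# Prints a Velero Backup CR.
# Arguments: <backup_name> <vm_namespace> <backup_storage_location>
backup_manifest() {
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: $1
  namespace: openshift-adp
spec:
  includedNamespaces:
    - $2
  storageLocation: $3
EOF
}

# Prints a Velero Restore CR that restores from an existing Backup CR.
# Arguments: <restore_name> <backup_name>
restore_manifest() {
  cat <<EOF
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: $1
  namespace: openshift-adp
spec:
  backupName: $2
EOF
}

# Example with hypothetical names.
backup_manifest vm-backup my-vm-namespace dpa-sample-1
restore_manifest vm-restore vm-backup
```

Each manifest can be saved to a file and created with oc create -f. Options for CSI backups with DataMover are intentionally left out of this sketch.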
13.2.2. Installing the Data Protection Application
You install the Data Protection Application (DPA) by creating an instance of the DataProtectionApplication API.
Prerequisites
- You must install the OADP Operator.
- You must configure object storage as a backup location.
- If you use snapshots to back up PVs, your cloud provider must support either a native snapshot API or Container Storage Interface (CSI) snapshots.
- If the backup and snapshot locations use the same credentials, you must create a Secret with the default name, cloud-credentials.

  Note: If you do not want to specify backup or snapshot locations during the installation, you can create a default Secret with an empty credentials-velero file. If there is no default Secret, the installation will fail.
Procedure
- Click Operators → Installed Operators and select the OADP Operator.
- Under Provided APIs, click Create instance in the DataProtectionApplication box.
- Click YAML View and update the parameters of the DataProtectionApplication manifest:

      apiVersion: oadp.openshift.io/v1alpha1
      kind: DataProtectionApplication
      metadata:
        name: <dpa_sample>
        namespace: openshift-adp # <1>
      spec:
        configuration:
          velero:
            defaultPlugins:
              - kubevirt # <2>
              - gcp # <3>
              - csi # <4>
              - openshift # <5>
            resourceTimeout: 10m # <6>
          nodeAgent: # <7>
            enable: true # <8>
            uploaderType: kopia # <9>
            podConfig:
              nodeSelector: <node_selector> # <10>
        backupLocations:
          - velero:
              provider: gcp # <11>
              default: true
              credential:
                key: cloud
                name: <default_secret> # <12>
              objectStorage:
                bucket: <bucket_name> # <13>
                prefix: <prefix> # <14>
  <1> The default namespace for OADP is openshift-adp. The namespace is a variable and is configurable.
  <2> The kubevirt plugin is mandatory for OpenShift Virtualization.
  <3> Specify the plugin for the backup provider, for example, gcp, if it exists.
  <4> The csi plugin is mandatory for backing up PVs with CSI snapshots. The csi plugin uses the Velero CSI beta snapshot APIs. You do not need to configure a snapshot location.
  <5> The openshift plugin is mandatory.
  <6> Specify how many minutes to wait for several Velero resources before timeout occurs, such as Velero CRD availability, volumeSnapshot deletion, and backup repository availability. The default is 10m.
  <7> The administrative agent that routes the administrative requests to servers.
  <8> Set this value to true if you want to enable nodeAgent and perform File System Backup.
  <9> Enter kopia as your uploader to use the Built-in DataMover. The nodeAgent deploys a daemon set, which means that the nodeAgent pods run on each working node. You can configure File System Backup by adding spec.defaultVolumesToFsBackup: true to the Backup CR.
  <10> Specify the nodes on which Kopia is available. By default, Kopia runs on all nodes.
  <11> Specify the backup provider.
  <12> Specify the correct default name for the Secret, for example, cloud-credentials-gcp, if you use a default plugin for the backup provider. If you specify a custom name, the custom name is used for the backup location. If you do not specify a Secret name, the default name is used.
  <13> Specify a bucket as the backup storage location. If the bucket is not a dedicated bucket for Velero backups, you must specify a prefix.
  <14> Specify a prefix for Velero backups, for example, velero, if the bucket is used for multiple purposes.
- Click Create.
Verification
Verify the installation by viewing the OpenShift API for Data Protection (OADP) resources by running the following command:
$ oc get all -n openshift-adp
Example output
    NAME                                                     READY   STATUS    RESTARTS   AGE
    pod/oadp-operator-controller-manager-67d9494d47-6l8z8    2/2     Running   0          2m8s
    pod/node-agent-9cq4q                                     1/1     Running   0          94s
    pod/node-agent-m4lts                                     1/1     Running   0          94s
    pod/node-agent-pv4kr                                     1/1     Running   0          95s
    pod/velero-588db7f655-n842v                              1/1     Running   0          95s

    NAME                                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
    service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
    service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

    NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

    NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
    deployment.apps/velero                             1/1     1            1           96s

    NAME                                                          DESIRED   CURRENT   READY   AGE
    replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
    replicaset.apps/velero-588db7f655                             1         1         1       96s
Verify that the DataProtectionApplication (DPA) is reconciled by running the following command:

    $ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
Example output
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
- Verify the type is set to Reconciled.
- Verify the backup storage location and confirm that the PHASE is Available by running the following command:

      $ oc get backupStorageLocation -n openshift-adp
Example output
    NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
    dpa-sample-1   Available   1s               3d16h   true
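The DataProtectionApplication reconcile check in the verification steps above can be scripted. The following is a deliberately crude sketch: the function name dpa_is_reconciled is hypothetical, and it assumes that the status JSON is compact and lists keys in the same order as the example output in this section. A JSON-aware tool would be more robust.

```shell
# Reads the DPA status JSON on stdin and succeeds only if it contains a
# Reconciled condition whose status is "True".
# Assumption: compact JSON with "status" directly before "type", as in the
# example output in this section.
dpa_is_reconciled() {
  grep -q '"status":"True","type":"Reconciled"'
}

# Example input copied from the example output in this section.
status='{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}'

if printf '%s\n' "$status" | dpa_is_reconciled; then
  echo "DPA reconciled"
else
  echo "DPA not reconciled yet"
fi
# prints: DPA reconciled
```

In practice, the input would come from the oc get dpa command shown in the verification steps instead of a hard-coded string.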