Backup and restore
Backing up and restoring your Red Hat OpenShift Service on AWS cluster
Chapter 1. OADP Application backup and restore
1.1. Introduction to OpenShift API for Data Protection
The OpenShift API for Data Protection (OADP) product safeguards customer applications on Red Hat OpenShift Service on AWS. It offers comprehensive disaster recovery protection, covering Red Hat OpenShift Service on AWS applications, application-related cluster resources, persistent volumes, and internal images. OADP is also capable of backing up both containerized applications and virtual machines (VMs).
OADP supports customer workload namespaces and cluster-scoped resources.
Full cluster backup and restore are not supported.
1.1.1. OpenShift API for Data Protection APIs
OpenShift API for Data Protection (OADP) provides APIs that enable multiple approaches to customizing backups and preventing the inclusion of unnecessary or inappropriate resources.
OADP provides the following APIs:
1.1.1.1. Support for OpenShift API for Data Protection
Version | OCP version | General availability | Full support ends | Maintenance ends | Extended Update Support (EUS) | Extended Update Support Term 2 (EUS Term 2) |
1.4 | | 10 Jul 2024 | Release of 1.5 | Release of 1.6 | 27 Jun 2026 (EUS must be on OCP 4.16) | 27 Jun 2027 (EUS Term 2 must be on OCP 4.16) |
1.3 | | 29 Nov 2023 | 10 Jul 2024 | Release of 1.5 | 31 Oct 2025 (EUS must be on OCP 4.14) | 31 Oct 2026 (EUS Term 2 must be on OCP 4.14) |
1.1.1.1.1. Unsupported versions of the OADP Operator
Version | General availability | Full support ended | Maintenance ended |
1.2 | 14 Jun 2023 | 29 Nov 2023 | 10 Jul 2024 |
1.1 | 01 Sep 2022 | 14 Jun 2023 | 29 Nov 2023 |
1.0 | 09 Feb 2022 | 01 Sep 2022 | 14 Jun 2023 |
For more details about EUS, see Extended Update Support.
For more details about EUS Term 2, see Extended Update Support Term 2.
1.2. OADP release notes
1.2.1. OADP 1.4 release notes
The release notes for OpenShift API for Data Protection (OADP) describe new features and enhancements, deprecated features, product recommendations, known issues, and resolved issues.
For additional information about OADP, see OpenShift API for Data Protection (OADP) FAQs.
1.2.1.1. OADP 1.4.3 release notes
The OpenShift API for Data Protection (OADP) 1.4.3 release notes list the following new feature.
1.2.1.1.1. New features
Notable changes in the kubevirt velero plugin in version 0.7.1
With this release, the kubevirt velero plugin has been updated to version 0.7.1. Notable improvements include the following bug fix and new features:
- Virtual machine instances (VMIs) are no longer ignored from backup when the owner VM is excluded.
- Object graphs now include all extra objects during backup and restore operations.
- Optionally generated labels are now added to new firmware Universally Unique Identifiers (UUIDs) during restore operations.
- Switching VM run strategies during restore operations is now possible.
- Clearing a MAC address by label is now supported.
- The restore-specific checks during the backup operation are now skipped.
- The VirtualMachineClusterInstancetype and VirtualMachineClusterPreference custom resource definitions (CRDs) are now supported.
1.2.1.2. OADP 1.4.2 release notes
The OpenShift API for Data Protection (OADP) 1.4.2 release notes list new features, resolved issues and bugs, and known issues.
1.2.1.2.1. New features
Backing up different volumes in the same namespace by using the VolumePolicy feature is now possible
With this release, Velero provides resource policies to back up different volumes in the same namespace by using the VolumePolicy feature. The VolumePolicy feature supports the skip, snapshot, and fs-backup actions. OADP-1071
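As a rough illustration of the three actions, the following sketch writes a Velero resource policy file that assigns a different action to volumes based on their storage class. The file name and the storage class names (gp3-csi, efs-sc, scratch) are placeholders chosen for this example, not values from the release notes. Check the Velero resource policy documentation for the conditions that your version supports.

```shell
# Write a hypothetical resource policy file.
# The storage class names below are placeholders; replace them with your own.
cat > volume-policy.yaml <<'EOF'
version: v1
volumePolicies:
  # Take a snapshot of volumes on a snapshot-capable storage class.
  - conditions:
      storageClass:
        - gp3-csi
    action:
      type: snapshot
  # Use file system backup for volumes on this storage class.
  - conditions:
      storageClass:
        - efs-sc
    action:
      type: fs-backup
  # Leave scratch volumes out of the backup.
  - conditions:
      storageClass:
        - scratch
    action:
      type: skip
EOF
```

A policy file like this is typically stored in a config map in the OADP namespace, for example with oc create configmap <policy_name> --from-file=volume-policy.yaml -n openshift-adp, and then referenced from the spec.resourcePolicy field of the Backup CR by using kind: configmap and the config map name.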
File system backup and data mover can now use short-term credentials
File system backup and data mover can now use short-term credentials such as AWS Security Token Service (STS) and GCP WIF. With this support, backups complete successfully without a PartiallyFailed status. OADP-5095
1.2.1.2.2. Resolved issues
DPA now reports errors if VSL contains an incorrect provider value
Previously, if the provider of a Volume Snapshot Location (VSL) spec was incorrect, the Data Protection Application (DPA) reconciled successfully. With this update, the DPA reports an error and requests a valid provider value. OADP-5044
Data Mover restore is successful irrespective of using different OADP namespaces for backup and restore
Previously, when a backup operation was run by using OADP installed in one namespace but was restored by using OADP installed in a different namespace, the Data Mover restore failed. With this update, the Data Mover restore is successful. OADP-5460
SSE-C backup works with the calculated MD5 of the secret key
Previously, backup failed with the following error:
Requests specifying Server Side Encryption with Customer provided keys must provide the client calculated MD5 of the secret key.
With this update, the missing Server-Side Encryption with Customer-Provided Keys (SSE-C) base64 and MD5 hash are fixed. As a result, SSE-C backup works with the calculated MD5 of the secret key. In addition, incorrect error handling for the customerKey size is also fixed. OADP-5388
For a complete list of all issues resolved in this release, see the list of OADP 1.4.2 resolved issues in Jira.
1.2.1.2.3. Known issues
The nodeSelector spec is not supported for the Data Mover restore action
When a Data Protection Application (DPA) is created with the nodeSelector field set in the nodeAgent parameter, Data Mover restore partially fails instead of completing the restore operation. OADP-5260
The S3 storage does not use proxy environment when TLS skip verify is specified
In the image registry backup, the S3 storage does not use the proxy environment when the insecureSkipTLSVerify parameter is set to true. OADP-3143
Kopia does not delete artifacts after backup expiration
Even after you delete a backup, Kopia does not delete the volume artifacts from the ${bucket_name}/kopia/$openshift-adp path on the S3 location after the backup expires. For more information, see "About Kopia repository maintenance". OADP-5131
1.2.1.3. OADP 1.4.1 release notes
The OpenShift API for Data Protection (OADP) 1.4.1 release notes list new features, resolved issues and bugs, and known issues.
1.2.1.3.1. New features
New DPA fields to update client qps and burst
You can now change the Velero server Kubernetes API queries per second (QPS) and burst values by using the new Data Protection Application (DPA) fields. The new DPA fields are spec.configuration.velero.client-qps and spec.configuration.velero.client-burst, which both default to 100. OADP-4076
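A minimal sketch of how these two fields might look in a DPA follows. The values 300 and 500 are arbitrary examples, and the fragment omits the backup locations and other settings that a working DPA needs, so treat it as something to merge into an existing DPA rather than apply as is.

```shell
# Write an illustrative DPA fragment; the numeric values are hypothetical.
cat > dpa-client-limits.yaml <<'EOF'
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
  namespace: openshift-adp
spec:
  configuration:
    velero:
      client-qps: 300     # example value; the default is 100
      client-burst: 500   # example value; the default is 100
      defaultPlugins:
        - openshift
        - aws
EOF
```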
Enabling non-default algorithms with Kopia
With this update, you can now configure the hash, encryption, and splitter algorithms in Kopia to select non-default options to optimize performance for different backup workloads.
To configure these algorithms, set the env variable of a velero pod in the podConfig section of the DataProtectionApplication (DPA) configuration. If this variable is not set, or an unsupported algorithm is chosen, Kopia defaults to its standard algorithms. OADP-4640
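The following sketch shows one way the podConfig section could carry such settings. The environment variable names and the algorithm values are assumptions based on Kopia's algorithm naming and are not confirmed by these release notes, so verify both against the OADP documentation for your version before using them.

```shell
# Write an illustrative DPA fragment.
# Assumption: the variable names and algorithm values below are examples only.
cat > dpa-kopia-algorithms.yaml <<'EOF'
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
    velero:
      podConfig:
        env:
          - name: KOPIA_HASHING_ALGORITHM
            value: BLAKE3-256
          - name: KOPIA_ENCRYPTION_ALGORITHM
            value: CHACHA20-POLY1305-HMAC-SHA256
          - name: KOPIA_SPLITTER_ALGORITHM
            value: DYNAMIC-8M-RABINKARP
EOF
```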
1.2.1.3.2. Resolved issues
Restoring a backup without pods is now successful
Previously, restoring a backup without pods when the StorageClass VolumeBindingMode was set to WaitForFirstConsumer resulted in the PartiallyFailed status with the error: fail to patch dynamic PV, err: context deadline exceeded. With this update, patching the dynamic PV is skipped and restoring a backup is successful without a PartiallyFailed status. OADP-4231
PodVolumeBackup CR now displays correct message
Previously, the PodVolumeBackup custom resource (CR) generated an incorrect message, which was: get a podvolumebackup with status "InProgress" during the server starting, mark it as "Failed". With this update, the message produced is now:
found a podvolumebackup with status "InProgress" during the server starting, mark it as "Failed".
Overriding imagePullPolicy is now possible with DPA
Previously, OADP set the imagePullPolicy parameter to Always for all images. With this update, OADP checks whether each image contains a sha256 or sha512 digest. If it does, OADP sets imagePullPolicy to IfNotPresent; otherwise, imagePullPolicy is set to Always. You can now override this policy by using the new spec.containerImagePullPolicy DPA field. OADP-4172
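The default decision described above can be sketched as a small shell function. This is an illustration of the rule, not the Operator's code, and it assumes that a digest-pinned image reference contains the text @sha256: or @sha512:. The function name and image references are hypothetical.

```shell
# Sketch of the default pull-policy rule.
# Assumption: digest-pinned references contain "@sha256:" or "@sha512:".
default_pull_policy() {
  case "$1" in
    *@sha256:*|*@sha512:*) echo "IfNotPresent" ;;  # pinned by digest, content cannot change
    *)                     echo "Always" ;;        # tag only, content might change
  esac
}

default_pull_policy "quay.io/example/velero@sha256:0123abcd"   # prints IfNotPresent
default_pull_policy "quay.io/example/velero:latest"            # prints Always
```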
OADP Velero can now retry updating the restore status if initial update fails
Previously, OADP Velero failed to update the restored CR status. This left the status at InProgress indefinitely. Components that relied on the backup and restore CR status to determine completion would fail. With this update, the restore CR status correctly proceeds to the Completed or Failed status. OADP-3227
Restoring BuildConfig Build from a different cluster is successful without any errors
Previously, when performing a restore of the BuildConfig Build resource from a different cluster, the application generated an error on TLS verification to the internal image registry. The resulting error was failed to verify certificate: x509: certificate signed by unknown authority. With this update, the restore of the BuildConfig build resources to a different cluster proceeds successfully without generating the failed to verify certificate error. OADP-4692
Restoring an empty PVC is successful
Previously, downloading data failed while restoring an empty persistent volume claim (PVC). It failed with the following error:
data path restore failed: Failed to run kopia restore: Unable to load snapshot : snapshot not found
With this update, downloading data completes correctly when restoring an empty PVC, and the error message is not generated. OADP-3106
There is no Velero memory leak in CSI and DataMover plugins
Previously, a Velero memory leak was caused by using the CSI and DataMover plugins. When the backup ended, the Velero plugin instance was not deleted and the memory leak consumed memory until an Out of Memory (OOM) condition was generated in the Velero pod. With this update, there is no resulting Velero memory leak when using the CSI and DataMover plugins. OADP-4448
Post-hook operation does not start before the related PVs are released
Previously, due to the asynchronous nature of the Data Mover operation, a post-hook might be attempted before the Data Mover persistent volume claim (PVC) releases the persistent volumes (PVs) of the related pods. This problem would cause the backup to fail with a PartiallyFailed status. With this update, the post-hook operation is not started until the related PVs are released by the Data Mover PVC, eliminating the PartiallyFailed backup status. OADP-3140
Deploying a DPA works as expected in namespaces with more than 37 characters
Previously, when you installed the OADP Operator in a namespace with a name longer than 37 characters and created a new DPA, labeling the "cloud-credentials" Secret failed and the DPA reported the following error:
The generated label name is too long.
With this update, creating a DPA does not fail in namespaces with more than 37 characters in the name. OADP-3960
Restore is successfully completed by overriding the timeout error
Previously, in a large-scale environment, the restore operation would result in a PartiallyFailed status with the error: fail to patch dynamic PV, err: context deadline exceeded. With this update, the resourceTimeout Velero server argument is used to override this timeout error, resulting in a successful restore. OADP-4344
For a complete list of all issues resolved in this release, see the list of OADP 1.4.1 resolved issues in Jira.
1.2.1.3.3. Known issues
Cassandra application pods enter into the CrashLoopBackOff status after restoring OADP
After OADP restores, the Cassandra application pods might enter the CrashLoopBackOff status. To work around this problem, delete the StatefulSet pods that are in the CrashLoopBackOff state after restoring OADP. The StatefulSet controller then recreates these pods and they run normally. OADP-4407
Deployment referencing ImageStream is not restored properly leading to corrupted pod and volume contents
During a File System Backup (FSB) restore operation, a Deployment resource referencing an ImageStream is not restored properly. The restored pod that runs the FSB and the postHook is terminated prematurely.
During the restore operation, the OpenShift Container Platform controller updates the spec.template.spec.containers[0].image field in the Deployment resource with an updated ImageStreamTag hash. The update triggers the rollout of a new pod, terminating the pod on which velero runs the FSB along with the post-hook. For more information about image stream trigger, see Triggering updates on image stream changes.
The workaround for this behavior is a two-step restore process:
Perform a restore excluding the Deployment resources, for example:
$ velero restore create <RESTORE_NAME> \
  --from-backup <BACKUP_NAME> \
  --exclude-resources=deployment.apps
Once the first restore is successful, perform a second restore by including these resources, for example:
$ velero restore create <RESTORE_NAME> \
  --from-backup <BACKUP_NAME> \
  --include-resources=deployment.apps
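The two-step workaround can be scripted so that the second restore starts only if the first one completes. The following is a sketch under stated assumptions: the velero and oc clients are both on the PATH, OADP runs in the openshift-adp namespace, and the function name and restore name suffixes are made up for this example.

```shell
# Sketch: run the two restores in order and stop if the first does not complete.
# Assumes velero and oc are on PATH; names and suffixes are hypothetical.
two_step_restore() {
  backup=$1
  ns=${2:-openshift-adp}
  first="${backup}-without-deployments"
  second="${backup}-deployments-only"

  # Step 1: restore everything except Deployment resources and wait for it to finish.
  velero restore create "$first" -n "$ns" \
    --from-backup "$backup" \
    --exclude-resources=deployment.apps --wait

  # Read the final phase from the Restore CR before continuing.
  phase=$(oc -n "$ns" get restores.velero.io "$first" -o jsonpath='{.status.phase}')
  if [ "$phase" != "Completed" ]; then
    echo "first restore ended in phase '$phase'; not starting the second restore" >&2
    return 1
  fi

  # Step 2: restore only the Deployment resources.
  velero restore create "$second" -n "$ns" \
    --from-backup "$backup" \
    --include-resources=deployment.apps --wait
}
```

For example, two_step_restore <BACKUP_NAME> runs both restores against the default namespace.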
1.2.1.4. OADP 1.4.0 release notes
The OpenShift API for Data Protection (OADP) 1.4.0 release notes list resolved issues and known issues.
1.2.1.4.1. Resolved issues
Restore works correctly in Red Hat OpenShift Service on AWS 4.16
Previously, while restoring the deleted application namespace, the restore operation partially failed with the resource name may not be empty error in Red Hat OpenShift Service on AWS 4.16. With this update, restore works as expected in Red Hat OpenShift Service on AWS 4.16. OADP-4075
Data Mover backups work properly in the Red Hat OpenShift Service on AWS 4.16 cluster
Previously, Velero was using the earlier version of SDK where the Spec.SourceVolumeMode field did not exist. As a consequence, Data Mover backups failed in the Red Hat OpenShift Service on AWS 4.16 cluster on the external snapshotter with version 4.2. With this update, external snapshotter is upgraded to version 7.0 and later. As a result, backups do not fail in the Red Hat OpenShift Service on AWS 4.16 cluster. OADP-3922
For a complete list of all issues resolved in this release, see the list of OADP 1.4.0 resolved issues in Jira.
1.2.1.4.2. Known issues
Backup fails when checksumAlgorithm is not set for MCG
While performing a backup of any application with NooBaa as the backup location, if the checksumAlgorithm configuration parameter is not set, backup fails. To fix this problem, if you do not provide a value for checksumAlgorithm in the Backup Storage Location (BSL) configuration, an empty value is added. The empty value is only added for BSLs that are created using Data Protection Application (DPA) custom resource (CR), and this value is not added if BSLs are created using any other method. OADP-4274
For a complete list of all known issues in this release, see the list of OADP 1.4.0 known issues in Jira.
1.2.1.4.3. Upgrade notes
Always upgrade to the next minor version. Do not skip versions. To update to a later version, upgrade only one channel at a time. For example, to upgrade from OpenShift API for Data Protection (OADP) 1.1 to 1.3, upgrade first to 1.2, and then to 1.3.
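The one-channel-at-a-time rule can be illustrated with a small helper that lists the channels to step through. This sketch assumes that every minor release has a channel named stable-1.<minor>, which this document confirms only for stable-1.3 and stable-1.4, and that both versions are in the 1.x series. The function name is hypothetical.

```shell
# Sketch: list the channels to step through, one minor version at a time.
# Assumption: channels follow the pattern stable-1.<minor>.
upgrade_channels() {
  from=${1#1.}          # minor version you are on, for example 1 for OADP 1.1
  to=${2#1.}            # minor version you want to reach
  next=$((from + 1))
  while [ "$next" -le "$to" ]; do
    echo "stable-1.$next"
    next=$((next + 1))
  done
}

upgrade_channels 1.1 1.3
# stable-1.2
# stable-1.3
```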
1.2.1.4.3.1. Changes from OADP 1.3 to 1.4
The Velero server has been updated from version 1.12 to 1.14. Note that there are no changes in the Data Protection Application (DPA).
This changes the following:
- The velero-plugin-for-csi code is now available in the Velero code, which means an init container is no longer required for the plugin.
- Velero changed client Burst and QPS defaults from 30 and 20 to 100 and 100, respectively.
- The velero-plugin-for-aws plugin updated the default value of the spec.config.checksumAlgorithm field in BackupStorageLocation objects (BSLs) from "" (no checksum calculation) to the CRC32 algorithm. The checksum algorithm types are known to work only with AWS. Several S3 providers require the md5sum to be disabled by setting the checksum algorithm to "". Confirm md5sum algorithm support and configuration with your storage provider.
  In OADP 1.4, the default value for BSLs created within DPA for this configuration is "". This default value means that the md5sum is not checked, which is consistent with OADP 1.3. For BSLs created within DPA, update it by using the spec.backupLocations[].velero.config.checksumAlgorithm field in the DPA. If your BSLs are created outside DPA, you can update this configuration by using spec.config.checksumAlgorithm in the BSLs.
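For a BSL that is created within the DPA, the checksum setting sits under the config section of the backup location. The following fragment is a sketch with placeholder bucket, prefix, and region values. It sets CRC32 explicitly, so use "" instead if your S3 provider requires checksum calculation to be disabled.

```shell
# Write an illustrative DPA fragment; bucket, prefix, and region are placeholders.
cat > dpa-bsl-checksum.yaml <<'EOF'
spec:
  backupLocations:
    - velero:
        provider: aws
        default: true
        objectStorage:
          bucket: <bucket_name>
          prefix: <prefix>
        config:
          region: <region>
          checksumAlgorithm: "CRC32"   # use "" to disable checksum calculation
EOF
```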
1.2.1.4.3.2. Backing up the DPA configuration
You must back up your current DataProtectionApplication (DPA) configuration.
Procedure
Save your current DPA configuration by running the following command:
Example command
$ oc get dpa -n openshift-adp -o yaml > dpa.orig.backup
1.2.1.4.3.3. Upgrading the OADP Operator
Use the following procedure when upgrading the OpenShift API for Data Protection (OADP) Operator.
Procedure
- Change your subscription channel for the OADP Operator from stable-1.3 to stable-1.4.
- Wait for the Operator and containers to update and restart.
1.2.1.4.4. Converting DPA to the new version
To upgrade from OADP 1.3 to 1.4, no Data Protection Application (DPA) changes are required.
1.2.1.4.5. Verifying the upgrade
Use the following procedure to verify the upgrade.
Procedure
Verify the installation by viewing the OpenShift API for Data Protection (OADP) resources by running the following command:
$ oc get all -n openshift-adp
Example output
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2     Running   0          2m8s
pod/restic-9cq4q                                        1/1     Running   0          94s
pod/restic-m4lts                                        1/1     Running   0          94s
pod/restic-pv4kr                                        1/1     Running   0          95s
pod/velero-588db7f655-n842v                             1/1     Running   0          95s

NAME                                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s

NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/restic   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
Verify that the DataProtectionApplication (DPA) is reconciled by running the following command:
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
Example output
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
- Verify the type is set to Reconciled.
Verify the backup storage location and confirm that the PHASE is Available by running the following command:
$ oc get backupstoragelocations.velero.io -n openshift-adp
Example output
NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true
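The last two checks in this procedure can be combined into a helper for scripted upgrades. This is a sketch, not part of the documented procedure: it assumes the DPA is named dpa-sample in the openshift-adp namespace, as in the examples above, and the function name and 120-second timeout are arbitrary choices.

```shell
# Sketch: wait for the DPA to report the Reconciled condition,
# then print each backup storage location with its phase.
# The default names and the timeout are assumptions for this example.
verify_oadp_upgrade() {
  ns=${1:-openshift-adp}
  dpa=${2:-dpa-sample}

  # Blocks until the Reconciled condition is True or the timeout expires.
  oc wait "dpa/$dpa" -n "$ns" --for=condition=Reconciled --timeout=120s || return 1

  # One line per backup storage location: <name><TAB><phase>
  oc get backupstoragelocations.velero.io -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}
```

Calling verify_oadp_upgrade with no arguments uses the default namespace and DPA name. Each printed line should show the Available phase.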
1.3. OADP performance
1.3.1. OADP recommended network settings
For a supported experience with OpenShift API for Data Protection (OADP), you should have a stable and resilient network across OpenShift nodes, S3 storage, and supported cloud environments that meets the OpenShift network requirement recommendations.
To ensure successful backup and restore operations for deployments with remote S3 buckets located off-cluster with suboptimal data paths, it is recommended that your network settings meet the following minimum requirements in such less optimal conditions:
- Bandwidth (network upload speed to object storage): Greater than 2 Mbps for small backups and 10-100 Mbps depending on the data volume for larger backups.
- Packet loss: 1%
- Packet corruption: 1%
- Latency: 100ms
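To see why the bandwidth threshold scales with data volume, the following sketch estimates the upload time for a backup at a given bandwidth. It is a rough lower bound that ignores protocol overhead, packet loss, and latency. The 1000 MB backup size is an arbitrary example, and the function name is hypothetical.

```shell
# Rough estimate: seconds = size in megabytes * 8 bits per byte / bandwidth in Mbps.
# Integer arithmetic; ignores protocol overhead, packet loss, and latency.
upload_seconds() {
  size_mb=$1
  mbps=$2
  echo $(( size_mb * 8 / mbps ))
}

upload_seconds 1000 2     # prints 4000, about 67 minutes
upload_seconds 1000 100   # prints 80
```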
Ensure that your Red Hat OpenShift Service on AWS network performs optimally and meets Red Hat OpenShift Service on AWS network requirements.
Although Red Hat provides support for standard backup and restore failures, it does not provide support for failures caused by network settings that do not meet the recommended thresholds.
1.4. OADP features and plugins
OpenShift API for Data Protection (OADP) features provide options for backing up and restoring applications.
The default plugins enable Velero to integrate with certain cloud providers and to back up and restore Red Hat OpenShift Service on AWS resources.
1.4.1. OADP features
OpenShift API for Data Protection (OADP) supports the following features:
- Backup
You can use OADP to back up all applications on the OpenShift Platform, or you can filter the resources by type, namespace, or label.
OADP backs up Kubernetes objects and internal images by saving them as an archive file on object storage. OADP backs up persistent volumes (PVs) by creating snapshots with the native cloud snapshot API or with the Container Storage Interface (CSI). For cloud providers that do not support snapshots, OADP backs up resources and PV data with Restic.
Note: You must exclude Operators from the backup of an application for backup and restore to succeed.
- Restore
You can restore resources and PVs from a backup. You can restore all objects in a backup or filter the objects by namespace, PV, or label.
Note: You must exclude Operators from the backup of an application for backup and restore to succeed.
- Schedule
You can schedule backups at specified intervals.
- Hooks
You can use hooks to run commands in a container on a pod, for example, fsfreeze to freeze a file system. You can configure a hook to run before or after a backup or restore. Restore hooks can run in an init container or in the application container.
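The backup filtering, schedule, and hook features can be combined in one Velero Schedule CR. The following sketch writes such a manifest to a file. The namespace my-app, the label, the container name, the mount path, and the cron expression are all placeholders for this example, and the hook assumes that the fsfreeze binary exists in the application container.

```shell
# Write an illustrative Schedule manifest.
# All names, labels, paths, and the cron expression are placeholders.
cat > schedule-with-hooks.yaml <<'EOF'
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-my-app
  namespace: openshift-adp
spec:
  schedule: "0 7 * * *"          # every day at 07:00
  template:
    includedNamespaces:          # filter by namespace
      - my-app
    labelSelector:               # filter by label
      matchLabels:
        app: my-app
    hooks:
      resources:
        - name: freeze-data
          includedNamespaces:
            - my-app
          pre:                   # runs before the pod is backed up
            - exec:
                container: my-app
                command: ["/sbin/fsfreeze", "--freeze", "/var/lib/data"]
          post:                  # runs after the pod is backed up
            - exec:
                container: my-app
                command: ["/sbin/fsfreeze", "--unfreeze", "/var/lib/data"]
EOF
```

After reviewing the file, you could create the schedule with oc create -f schedule-with-hooks.yaml.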
1.4.2. OADP plugins
The OpenShift API for Data Protection (OADP) provides default Velero plugins that are integrated with storage providers to support backup and snapshot operations. You can create custom plugins based on the Velero plugins.
OADP also provides plugins for Red Hat OpenShift Service on AWS resource backups, OpenShift Virtualization resource backups, and Container Storage Interface (CSI) snapshots.
OADP plugin | Function | Storage location |
---|---|---|
aws | Backs up and restores Kubernetes objects. | AWS S3 |
aws | Backs up and restores volumes with snapshots. | AWS EBS |
openshift | Backs up and restores Red Hat OpenShift Service on AWS resources. [1] | Object store |
kubevirt | Backs up and restores OpenShift Virtualization resources. [2] | Object store |
csi | Backs up and restores volumes with CSI snapshots. [3] | Cloud storage that supports CSI snapshots |
| VolumeSnapshotMover relocates snapshots from the cluster into an object store to be used during a restore process to recover stateful applications, in situations such as cluster deletion. [4] | Object store |
[1] Mandatory.
[2] Virtual machine disks are backed up with CSI snapshots or Restic.
[3] The csi plugin uses the Kubernetes CSI snapshot API.
- OADP 1.1 or later uses snapshot.storage.k8s.io/v1
- OADP 1.0 uses snapshot.storage.k8s.io/v1beta1
[4] OADP 1.2 only.
1.4.3. About OADP Velero plugins
You can configure two types of plugins when you install Velero:
- Default cloud provider plugins
- Custom plugins
Both types of plugin are optional, but most users configure at least one cloud provider plugin.
1.4.3.1. Default Velero cloud provider plugins
You can install any of the following default Velero cloud provider plugins when you configure the oadp_v1alpha1_dpa.yaml file during deployment:
- aws (Amazon Web Services)
- openshift (OpenShift Velero plugin)
- csi (Container Storage Interface)
- kubevirt (KubeVirt)
You specify the desired default plugins in the oadp_v1alpha1_dpa.yaml file during deployment.
Example file
The following .yaml file installs the openshift, aws, azure, and gcp plugins:
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - azure
      - gcp
1.4.3.2. Custom Velero plugins
You can install a custom Velero plugin by specifying the plugin image and name when you configure the oadp_v1alpha1_dpa.yaml file during deployment.
You specify the desired custom plugins in the oadp_v1alpha1_dpa.yaml file during deployment.
Example file
The following .yaml file installs the default openshift, azure, and gcp plugins and a custom plugin that has the name custom-plugin-example and the image quay.io/example-repo/custom-velero-plugin:
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - azure
      - gcp
      customPlugins:
      - name: custom-plugin-example
        image: quay.io/example-repo/custom-velero-plugin
1.4.3.3. Velero plugins returning "received EOF, stopping recv loop" message
Velero plugins are started as separate processes. After the Velero operation has completed, either successfully or not, they exit. Receiving a received EOF, stopping recv loop message in the debug logs indicates that a plugin operation has completed. It does not mean that an error has occurred.
1.4.4. OADP plugins known issues
The following section describes known issues in OpenShift API for Data Protection (OADP) plugins:
1.4.4.1. Velero plugin panics during imagestream backups due to a missing secret
When the backup and the Backup Storage Location (BSL) are managed outside the scope of the Data Protection Application (DPA), the OADP controller (that is, the DPA reconciliation) does not create the relevant oadp-<bsl_name>-<bsl_provider>-registry-secret.
When the backup is run, the OpenShift Velero plugin panics on the imagestream backup, with the following panic error:
2024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item" backup=openshift-adp/<backup name> error="error executing custom action (groupResource=imagestreams.image.openshift.io, namespace=<BSL Name>, name=postgres): rpc error: code = Aborted desc = plugin panicked: runtime error: index out of range with length 1, stack trace: goroutine 94…
1.4.4.1.1. Workaround to avoid the panic error
To avoid the Velero plugin panic error, perform the following steps:
Label the custom BSL with the relevant label:
$ oc label backupstoragelocations.velero.io <bsl_name> app.kubernetes.io/component=bsl
After the BSL is labeled, wait until the DPA reconciles.
Note: You can force the reconciliation by making any minor change to the DPA itself.
When the DPA reconciles, confirm that the relevant oadp-<bsl_name>-<bsl_provider>-registry-secret has been created and that the correct registry data has been populated into it:
$ oc -n openshift-adp get secret/oadp-<bsl_name>-<bsl_provider>-registry-secret -o json | jq -r '.data'
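Because the secret name is derived from the BSL name and provider, the confirmation step can be scripted. The following sketch builds the name from the pattern given above and checks whether the secret exists. The function names and the example BSL name my-bsl are hypothetical.

```shell
# Build the registry secret name from the BSL name and provider,
# following the oadp-<bsl_name>-<bsl_provider>-registry-secret pattern.
registry_secret_name() {
  printf 'oadp-%s-%s-registry-secret\n' "$1" "$2"
}

# Return success if the secret exists in the OADP namespace (default: openshift-adp).
registry_secret_exists() {
  oc -n "${3:-openshift-adp}" get secret "$(registry_secret_name "$1" "$2")" >/dev/null 2>&1
}

registry_secret_name my-bsl aws
# oadp-my-bsl-aws-registry-secret
```

For example, registry_secret_exists my-bsl aws returns a zero exit status once the DPA has reconciled and created the secret.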
1.4.4.2. OpenShift ADP Controller segmentation fault
If you configure a DPA with both cloudstorage and restic enabled, the openshift-adp-controller-manager pod crashes and restarts indefinitely until the pod fails with a crash loop segmentation fault.
You can have either velero or cloudstorage defined, because they are mutually exclusive fields.
- If you have both velero and cloudstorage defined, the openshift-adp-controller-manager fails.
- If you have neither velero nor cloudstorage defined, the openshift-adp-controller-manager fails.
For more information about this issue, see OADP-1054.
1.4.4.2.1. OpenShift ADP Controller segmentation fault workaround
You must define either velero or cloudstorage when you configure a DPA. If you define both APIs in your DPA, the openshift-adp-controller-manager pod fails with a crash loop segmentation fault.
1.5. OADP use cases
1.5.1. Backing up workloads on OADP with ROSA STS
1.5.1.1. Performing a backup with OADP and ROSA STS
The following example hello-world application has no persistent volumes (PVs) attached. Perform a backup with OpenShift API for Data Protection (OADP) with Red Hat OpenShift Service on AWS (ROSA) STS.
Either Data Protection Application (DPA) configuration will work.
Create a workload to back up by running the following commands:
$ oc create namespace hello-world
$ oc new-app -n hello-world --image=docker.io/openshift/hello-openshift
Expose the route by running the following command:
$ oc expose service/hello-openshift -n hello-world
Check that the application is working by running the following command:
$ curl `oc get route/hello-openshift -n hello-world -o jsonpath='{.spec.host}'`
Example output
Hello OpenShift!
Back up the workload by running the following command:
$ cat << EOF | oc create -f -
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: hello-world
  namespace: openshift-adp
spec:
  includedNamespaces:
  - hello-world
  storageLocation: ${CLUSTER_NAME}-dpa-1
  ttl: 720h0m0s
EOF
Monitor the backup until its phase is Completed by running the following command:
$ watch "oc -n openshift-adp get backup hello-world -o json | jq .status"
Example output
{
  "completionTimestamp": "2022-09-07T22:20:44Z",
  "expiration": "2022-10-07T22:20:22Z",
  "formatVersion": "1.1.0",
  "phase": "Completed",
  "progress": {
    "itemsBackedUp": 58,
    "totalItems": 58
  },
  "startTimestamp": "2022-09-07T22:20:22Z",
  "version": 1
}
Delete the demo workload by running the following command:
$ oc delete ns hello-world
Restore the workload from the backup by running the following command:
$ cat << EOF | oc create -f -
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hello-world
  namespace: openshift-adp
spec:
  backupName: hello-world
EOF
Wait for the Restore to finish by running the following command:
$ watch "oc -n openshift-adp get restore hello-world -o json | jq .status"
Example output
{
  "completionTimestamp": "2022-09-07T22:25:47Z",
  "phase": "Completed",
  "progress": {
    "itemsRestored": 38,
    "totalItems": 38
  },
  "startTimestamp": "2022-09-07T22:25:28Z",
  "warnings": 9
}
Check that the workload is restored by running the following command:
$ oc -n hello-world get pods
Example output
NAME                              READY   STATUS    RESTARTS   AGE
hello-openshift-9f885f7c6-kdjpj   1/1     Running   0          90s
Check that the restored application responds on its route by running the following command:
$ curl `oc get route/hello-openshift -n hello-world -o jsonpath='{.spec.host}'`
Example output
Hello OpenShift!
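In a script, the two watch steps in this procedure can be replaced by a polling helper that returns as soon as the backup or restore reaches a final phase. The following is a sketch, not part of the documented procedure: the function name, the retry count, and the delay are arbitrary, and the list of failure phases is an assumption based on Velero's phase names.

```shell
# Sketch: poll a Backup or Restore CR until it reaches a final phase.
# Usage: wait_for_phase <backup|restore> <name> [namespace] [tries] [delay_seconds]
wait_for_phase() {
  kind=$1
  name=$2
  ns=${3:-openshift-adp}
  tries=${4:-60}
  delay=${5:-10}

  while [ "$tries" -gt 0 ]; do
    phase=$(oc -n "$ns" get "$kind" "$name" -o jsonpath='{.status.phase}')
    case "$phase" in
      Completed)
        echo "$kind/$name: $phase"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "$kind/$name: $phase" >&2
        return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done

  echo "$kind/$name: timed out, last phase '${phase:-unknown}'" >&2
  return 1
}
```

For example, wait_for_phase backup hello-world before deleting the demo workload, and wait_for_phase restore hello-world before checking the pods.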
For troubleshooting tips, see the OADP team’s troubleshooting documentation.
1.5.1.2. Cleaning up a cluster after a backup with OADP and ROSA STS
If you need to uninstall the OpenShift API for Data Protection (OADP) Operator together with the backups and the S3 bucket from this example, follow these instructions.
Procedure
Delete the workload by running the following command:
$ oc delete ns hello-world
Delete the Data Protection Application (DPA) by running the following command:
$ oc -n openshift-adp delete dpa ${CLUSTER_NAME}-dpa
Delete the cloud storage by running the following command:
$ oc -n openshift-adp delete cloudstorage ${CLUSTER_NAME}-oadp
Warning: If this command hangs, you might need to delete the finalizer by running the following command:
$ oc -n openshift-adp patch cloudstorage ${CLUSTER_NAME}-oadp -p '{"metadata":{"finalizers":null}}' --type=merge
If the Operator is no longer required, remove it by running the following command:
$ oc -n openshift-adp delete subscription oadp-operator
Remove the Operator namespace by running the following command:
$ oc delete ns openshift-adp
If the backup and restore resources are no longer required, remove them from the cluster by running the following command:
$ oc delete backups.velero.io hello-world
To delete the backup, restore, and remote objects in AWS S3, run the following command:
$ velero backup delete hello-world
If you no longer need the custom resource definitions (CRDs), remove them from the cluster by running the following command:
$ for CRD in `oc get crds | grep velero | awk '{print $1}'`; do oc delete crd $CRD; done
Delete the AWS S3 bucket by running the following commands:
$ aws s3 rm s3://${CLUSTER_NAME}-oadp --recursive
$ aws s3api delete-bucket --bucket ${CLUSTER_NAME}-oadp
Detach the policy from the role by running the following command:
$ aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
Delete the role by running the following command:
$ aws iam delete-role --role-name "${ROLE_NAME}"
1.5.2. OpenShift API for Data Protection (OADP) restore use case
Following is a use case for using OADP to restore a backup to a different namespace.
1.5.2.1. Restoring an application to a different namespace using OADP
Restore a backup of an application by using OADP to a new target namespace, test-restore-application. To restore a backup, you create a restore custom resource (CR) as shown in the following example. In the restore CR, the source namespace refers to the application namespace that you included in the backup. You then verify the restore by changing your project to the new restored namespace and verifying the resources.
Prerequisites
- You installed the OADP Operator.
- You have the backup of an application to be restored.
Procedure
Create a restore CR as shown in the following example:
Example restore CR
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: test-restore
  namespace: openshift-adp
spec:
  backupName: <backup_name>
  restorePVs: true
  namespaceMapping:
    <application_namespace>: test-restore-application
Apply the restore CR by running the following command:
$ oc apply -f <restore_cr_filename>
Verification
Verify that the restore is in the Completed phase by running the following command:
$ oc describe restores.velero.io <restore_name> -n openshift-adp
Change to the restored namespace test-restore-application by running the following command:
$ oc project test-restore-application
Verify the restored resources such as persistent volume claim (pvc), service (svc), deployment, secret, and config map by running the following command:
$ oc get pvc,svc,deployment,secret,configmap
Example output
NAME                          STATUS   VOLUME
persistentvolumeclaim/mysql   Bound    pvc-9b3583db-...-14b86

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/mysql      ClusterIP   172....157   <none>        3306/TCP   2m56s
service/todolist   ClusterIP   172.....15   <none>        8000/TCP   2m56s

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/mysql   0/1     1            0           2m55s

NAME                                         TYPE                      DATA   AGE
secret/builder-dockercfg-6bfmd               kubernetes.io/dockercfg   1      2m57s
secret/default-dockercfg-hz9kz               kubernetes.io/dockercfg   1      2m57s
secret/deployer-dockercfg-86cvd              kubernetes.io/dockercfg   1      2m57s
secret/mysql-persistent-sa-dockercfg-rgp9b   kubernetes.io/dockercfg   1      2m57s

NAME                                 DATA   AGE
configmap/kube-root-ca.crt           1      2m57s
configmap/openshift-service-ca.crt   1      2m57s
1.6. Installing and configuring OADP
1.6.1. Installing OADP
You can use OpenShift API for Data Protection (OADP) with Red Hat OpenShift Service on AWS (ROSA) clusters to back up and restore application data.
Before installing OpenShift API for Data Protection (OADP), you must set up role and policy credentials for OADP so that it can use the Amazon Web Services API.
This process is performed in the following two stages:
- Prepare AWS credentials
- Install the OADP Operator and give it an IAM role
1.6.1.1. Preparing AWS credentials for OADP
An Amazon Web Services account must be prepared and configured to accept an OpenShift API for Data Protection (OADP) installation.
Procedure
Create the following environment variables by running the following commands:
Important: Change the cluster name to match your ROSA cluster, and ensure that you are logged in to the cluster as an administrator. Ensure that all fields are output correctly before continuing.
$ export CLUSTER_NAME=my-cluster # 1
  export ROSA_CLUSTER_ID=$(rosa describe cluster -c ${CLUSTER_NAME} --output json | jq -r .id)
  export REGION=$(rosa describe cluster -c ${CLUSTER_NAME} --output json | jq -r .region.id)
  export OIDC_ENDPOINT=$(oc get authentication.config.openshift.io cluster -o jsonpath='{.spec.serviceAccountIssuer}' | sed 's|^https://||')
  export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
  export CLUSTER_VERSION=$(rosa describe cluster -c ${CLUSTER_NAME} -o json | jq -r .version.raw_id | cut -f -2 -d '.')
  export ROLE_NAME="${CLUSTER_NAME}-openshift-oadp-aws-cloud-credentials"
  export SCRATCH="/tmp/${CLUSTER_NAME}/oadp"
  mkdir -p ${SCRATCH}
  echo "Cluster ID: ${ROSA_CLUSTER_ID}, Region: ${REGION}, OIDC Endpoint: ${OIDC_ENDPOINT}, AWS Account ID: ${AWS_ACCOUNT_ID}"
- 1: Replace my-cluster with your ROSA cluster name.
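The instruction to confirm that every field is populated before continuing can be turned into a small check. The following is a minimal sketch, assuming a POSIX shell; require_vars is a hypothetical helper name and is not part of the rosa, oc, or aws command-line tools.

```shell
# require_vars is a hypothetical helper, not part of the rosa, oc, or aws CLIs.
# It returns non-zero and names each listed variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name-}"
    # jq -r prints the literal string "null" when a field is absent.
    if [ -z "$value" ] || [ "$value" = "null" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, after running the export commands from this procedure:
# require_vars CLUSTER_NAME ROSA_CLUSTER_ID REGION OIDC_ENDPOINT \
#   AWS_ACCOUNT_ID CLUSTER_VERSION ROLE_NAME SCRATCH \
#   || echo "fix the values listed above before continuing" >&2
```

The check only confirms that each value is non-empty. It does not confirm that a value is correct for your cluster.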
On the AWS account, create an IAM policy to allow access to AWS S3:
Check to see if the policy exists by running the following command:
$ POLICY_ARN=$(aws iam list-policies --query "Policies[?PolicyName=='RosaOadpVer1'].{ARN:Arn}" --output text) # 1
- 1: Replace RosaOadpVer1 with your policy name.
Enter the following command to create the policy JSON file and then create the policy in AWS:
Note: If the policy ARN is not found, the command creates the policy. If the policy ARN already exists, the if statement intentionally skips the policy creation.
$ if [[ -z "${POLICY_ARN}" ]]; then
  cat << EOF > ${SCRATCH}/policy.json # 1
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:CreateBucket",
        "s3:DeleteBucket",
        "s3:PutBucketTagging",
        "s3:GetBucketTagging",
        "s3:PutEncryptionConfiguration",
        "s3:GetEncryptionConfiguration",
        "s3:PutLifecycleConfiguration",
        "s3:GetLifecycleConfiguration",
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucketMultipartUploads",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts",
        "ec2:DescribeSnapshots",
        "ec2:DescribeVolumes",
        "ec2:DescribeVolumeAttribute",
        "ec2:DescribeVolumesModifications",
        "ec2:DescribeVolumeStatus",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    }
  ]
}
EOF
  POLICY_ARN=$(aws iam create-policy --policy-name "RosaOadpVer1" \
    --policy-document file:///${SCRATCH}/policy.json --query Policy.Arn \
    --tags Key=rosa_openshift_version,Value=${CLUSTER_VERSION} Key=rosa_role_prefix,Value=ManagedOpenShift Key=operator_namespace,Value=openshift-oadp Key=operator_name,Value=openshift-oadp \
    --output text)
  fi
- 1: SCRATCH is the temporary directory that you created when you set the environment variables.
View the policy ARN by running the following command:
$ echo ${POLICY_ARN}
Create an IAM role trust policy for the cluster:
Create the trust policy file by running the following command:
$ cat <<EOF > ${SCRATCH}/trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT}"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "${OIDC_ENDPOINT}:sub": [
          "system:serviceaccount:openshift-adp:openshift-adp-controller-manager",
          "system:serviceaccount:openshift-adp:velero"]
      }
    }
  }]
}
EOF
Create the role by running the following command:
$ ROLE_ARN=$(aws iam create-role --role-name \
    "${ROLE_NAME}" \
    --assume-role-policy-document file://${SCRATCH}/trust-policy.json \
    --tags Key=rosa_cluster_id,Value=${ROSA_CLUSTER_ID} \
           Key=rosa_openshift_version,Value=${CLUSTER_VERSION} \
           Key=rosa_role_prefix,Value=ManagedOpenShift \
           Key=operator_namespace,Value=openshift-adp \
           Key=operator_name,Value=openshift-oadp \
    --query Role.Arn --output text)
View the role ARN by running the following command:
$ echo ${ROLE_ARN}
Attach the IAM policy to the IAM role by running the following command:
$ aws iam attach-role-policy --role-name "${ROLE_NAME}" \
    --policy-arn ${POLICY_ARN}
1.6.1.2. Installing the OADP Operator and providing the IAM role
AWS Security Token Service (AWS STS) is a global web service that provides short-term credentials for IAM or federated users. Red Hat OpenShift Service on AWS (ROSA) with STS is the recommended credential mode for ROSA clusters. This document describes how to install OpenShift API for Data Protection (OADP) on ROSA with AWS STS.
Restic is unsupported.
Kopia file system backup (FSB) is supported when backing up file systems that do not have Container Storage Interface (CSI) snapshotting support.
Example file systems include the following:
- Amazon Elastic File System (EFS)
- Network File System (NFS)
- emptyDir volumes
- Local volumes
For backing up volumes, OADP on ROSA with AWS STS supports only native snapshots and Container Storage Interface (CSI) snapshots.
In an Amazon ROSA cluster that uses STS authentication, restoring backed-up data in a different AWS region is not supported.
The Data Mover feature is not currently supported in ROSA clusters. You can use native AWS S3 tools for moving data.
Prerequisites
- A Red Hat OpenShift Service on AWS (ROSA) cluster with the required access and tokens. For instructions, see the previous procedure, Preparing AWS credentials for OADP. If you plan to use two different clusters for backing up and restoring, you must prepare AWS credentials, including ROLE_ARN, for each cluster.
Procedure
Create a Red Hat OpenShift Service on AWS secret from your AWS token file by entering the following commands:
Create the credentials file:
$ cat <<EOF > ${SCRATCH}/credentials
[default]
role_arn = ${ROLE_ARN}
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
region = <aws_region>
EOF
Replace <aws_region> with the AWS region to use for the STS endpoint.
Create a namespace for OADP:
$ oc create namespace openshift-adp
Create the Red Hat OpenShift Service on AWS secret:
$ oc -n openshift-adp create secret generic cloud-credentials \
    --from-file=${SCRATCH}/credentials
Note: In Red Hat OpenShift Service on AWS versions 4.15 and later, the OADP Operator supports a new standardized STS workflow through the Operator Lifecycle Manager (OLM) and Cloud Credentials Operator (CCO). In this workflow, you do not need to create the preceding secret. You only need to supply the role ARN when you install OLM-managed operators by using the Red Hat OpenShift Service on AWS web console. For more information, see Installing from OperatorHub using the web console.
The preceding secret is created automatically by CCO.
Install the OADP Operator:
- In the Red Hat OpenShift Service on AWS web console, browse to Operators → OperatorHub.
- Search for the OADP Operator.
- In the role_ARN field, paste the role_arn that you created previously and click Install.
Create AWS cloud storage using your AWS credentials by entering the following command:
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: CloudStorage
metadata:
  name: ${CLUSTER_NAME}-oadp
  namespace: openshift-adp
spec:
  creationSecret:
    key: credentials
    name: cloud-credentials
  enableSharedConfig: true
  name: ${CLUSTER_NAME}-oadp
  provider: aws
  region: $REGION
EOF
Check the storage class that your application uses by entering the following command:
$ oc get pvc -n <namespace>
Example output
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
applog   Bound    pvc-351791ae-b6ab-4e8b-88a4-30f73caf5ef8   1Gi        RWO            gp3-csi        4d19h
mysql    Bound    pvc-16b8e009-a20a-4379-accc-bc81fedd0621   1Gi        RWO            gp3-csi        4d19h
Get the storage class by running the following command:
$ oc get storageclass
Example output
NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2                 kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   4d21h
gp2-csi             ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
gp3                 ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
gp3-csi (default)   ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
Note: The following storage classes will work:
- gp3-csi
- gp2-csi
- gp3
- gp2
If the application or applications that are being backed up are all using persistent volumes (PVs) with Container Storage Interface (CSI), it is advisable to include the CSI plugin in the OADP DPA configuration.
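To help decide whether the CSI plugin applies, you can list the storage classes that are backed by a CSI provisioner. The following is a minimal sketch that filters the table printed by oc get storageclass. The helper name csi_storage_classes is hypothetical, and matching on the substring "csi" in the provisioner name is an assumption that fits ebs.csi.aws.com in the example output but might not fit every driver.

```shell
# csi_storage_classes is a hypothetical helper. It reads the table printed by
# "oc get storageclass" on stdin and prints the names of the storage classes
# whose provisioner name contains "csi" (an assumption, see the lead-in).
csi_storage_classes() {
  awk 'NR > 1 {
    # The default class has an extra "(default)" column after its name.
    provisioner = ($2 == "(default)") ? $3 : $2
    if (provisioner ~ /csi/) print $1
  }'
}

# Usage against a live cluster:
# oc get storageclass | csi_storage_classes
```

Compare the printed names with the STORAGECLASS column of oc get pvc -n <namespace> to see whether every volume of the application uses a CSI-backed class.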
Create the DataProtectionApplication resource to configure the connection to the storage where the backups and volume snapshots are stored:
If you are using only CSI volumes, deploy a Data Protection Application by entering the following command:
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ${CLUSTER_NAME}-dpa
  namespace: openshift-adp
spec:
  backupImages: true # 1
  features:
    dataMover:
      enable: false
  backupLocations:
  - bucket:
      cloudStorageRef:
        name: ${CLUSTER_NAME}-oadp
      credential:
        key: credentials
        name: cloud-credentials
      prefix: velero
      default: true
      config:
        region: ${REGION}
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
    nodeAgent: # 2
      enable: false
      uploaderType: kopia # 3
EOF
- 1: ROSA supports internal image backup. Set this field to false if you do not want to use image backup.
- 2: See the important note regarding the nodeAgent attribute.
- 3: The type of uploader. The possible values are restic or kopia. The built-in Data Mover uses Kopia as the default uploader mechanism regardless of the value of the uploaderType field.
If you are using CSI or non-CSI volumes, deploy a Data Protection Application by entering the following command:
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ${CLUSTER_NAME}-dpa
  namespace: openshift-adp
spec:
  backupImages: true # 1
  backupLocations:
  - bucket:
      cloudStorageRef:
        name: ${CLUSTER_NAME}-oadp
      credential:
        key: credentials
        name: cloud-credentials
      prefix: velero
      default: true
      config:
        region: ${REGION}
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
    nodeAgent: # 2
      enable: false
      uploaderType: restic
  snapshotLocations:
    - velero:
        config:
          credentialsFile: /tmp/credentials/openshift-adp/cloud-credentials-credentials # 3
          enableSharedConfig: "true" # 4
          profile: default # 5
          region: ${REGION} # 6
        provider: aws
EOF
- 1: ROSA supports internal image backup. Set this field to false if you do not want to use image backup.
- 2: See the important note regarding the nodeAgent attribute.
- 3: The credentialsFile field is the mounted location of the bucket credential on the pod.
- 4: The enableSharedConfig field allows the snapshotLocations to share or reuse the credential defined for the bucket.
- 5: Use the profile name set in the AWS credentials file.
- 6: Specify region as your AWS region. This must be the same as the cluster region.
You are now ready to back up and restore Red Hat OpenShift Service on AWS applications, as described in Backing up applications.
The enable parameter of nodeAgent is set to false in this configuration, because OADP does not support Restic in ROSA environments.
If you use OADP 1.2, replace this configuration:
nodeAgent:
  enable: false
  uploaderType: restic
with the following configuration:
restic:
  enable: false
If you want to use two different clusters for backing up and restoring, the two clusters must have the same AWS S3 storage names in both the cloud storage CR and the OADP DataProtectionApplication configuration.
1.6.1.3. Updating the IAM role ARN in the OADP Operator subscription
While installing the OADP Operator on a ROSA Security Token Service (STS) cluster, if you provide an incorrect IAM role Amazon Resource Name (ARN), the openshift-adp-controller pod gives an error. The credential requests that are generated contain the wrong IAM role ARN. To update the credential requests object with the correct IAM role ARN, you can edit the OADP Operator subscription and patch the IAM role ARN with the correct value. By editing the OADP Operator subscription, you do not have to uninstall and reinstall OADP to update the IAM role ARN.
Prerequisites
- You have a Red Hat OpenShift Service on AWS STS cluster with the required access and tokens.
- You have installed OADP on the ROSA STS cluster.
Procedure
To verify that the OADP subscription has the wrong IAM role ARN environment variable set, run the following command:
$ oc get sub -o yaml redhat-oadp-operator
Example subscription
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  annotations:
  creationTimestamp: "2025-01-15T07:18:31Z"
  generation: 1
  labels:
    operators.coreos.com/redhat-oadp-operator.openshift-adp: ""
  name: redhat-oadp-operator
  namespace: openshift-adp
  resourceVersion: "77363"
  uid: 5ba00906-5ad2-4476-ae7b-ffa90986283d
spec:
  channel: stable-1.4
  config:
    env:
    - name: ROLEARN
      value: arn:aws:iam::11111111:role/wrong-role-arn # 1
  installPlanApproval: Manual
  name: redhat-oadp-operator
  source: prestage-operators
  sourceNamespace: openshift-marketplace
  startingCSV: oadp-operator.v1.4.2
- 1: Verify the value of ROLEARN that you want to update.
Update the ROLEARN field of the subscription with the correct role ARN by running the following command:
$ oc patch subscription redhat-oadp-operator -p '{"spec": {"config": {"env": [{"name": "ROLEARN", "value": "<role_arn>"}]}}}' --type='merge'
where:
<role_arn>: Specifies the IAM role ARN to be updated. For example, arn:aws:iam::160…..6956:role/oadprosa…..8wlf.
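Because a mistyped role ARN is the cause of the error that this procedure fixes, it can help to check the shape of the value before patching the subscription. The following is a minimal sketch; is_iam_role_arn is a hypothetical helper, and the pattern assumes the standard aws partition with a 12-digit account ID, so other partitions such as aws-us-gov would need an adjusted pattern.

```shell
# is_iam_role_arn is a hypothetical helper. It checks only the shape of the
# string: the "aws" partition, a 12-digit account ID, and a role name or path.
# It does not confirm that the role exists or that its trust policy is correct.
is_iam_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws:iam::[0-9]{12}:role/[A-Za-z0-9+=,.@_/-]+$'
}

# Usage before patching the subscription (ROLE_ARN is assumed to be set):
# is_iam_role_arn "$ROLE_ARN" || echo "ROLE_ARN does not look like an IAM role ARN" >&2
```

With this pattern, the placeholder value arn:aws:iam::11111111:role/wrong-role-arn from the example subscription is rejected because its account ID has only eight digits.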
Verify that the secret object is updated with the correct role ARN value by running the following command:
$ oc get secret cloud-credentials -o jsonpath='{.data.credentials}' | base64 -d
Example output
[default]
sts_regional_endpoints = regional
role_arn = arn:aws:iam::160.....6956:role/oadprosa.....8wlf
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
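To compare the role ARN in the secret with the value that you patched into the subscription, you can extract just the role_arn line from the decoded credentials. The following is a minimal sketch; credentials_role_arn is a hypothetical helper, and it assumes the key = value layout shown in the example output.

```shell
# credentials_role_arn is a hypothetical helper. It reads an AWS shared
# credentials file on stdin and prints the value of the first role_arn key.
credentials_role_arn() {
  awk '/^[ \t]*role_arn[ \t]*=/ {
    sub(/^[^=]*=[ \t]*/, "")   # drop the key and the first "="
    sub(/[ \t]+$/, "")         # drop trailing whitespace
    print
    exit
  }'
}

# Usage against a live cluster:
# oc get secret cloud-credentials -n openshift-adp -o jsonpath='{.data.credentials}' \
#   | base64 -d | credentials_role_arn
```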
Configure the DataProtectionApplication custom resource (CR) manifest file as shown in the following example:
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-rosa-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - bucket:
      config:
        region: us-east-1
      cloudStorageRef:
        name: <cloud_storage> # 1
      credential:
        name: cloud-credentials
        key: credentials
      prefix: velero
      default: true
  configuration:
    velero:
      defaultPlugins:
      - aws
      - openshift
- 1: Specify the CloudStorage CR.
Create the DataProtectionApplication CR by running the following command:
$ oc create -f <dpa_manifest_file>
Verify that the DataProtectionApplication CR is reconciled and the status is set to "True" by running the following command:
$ oc get dpa -n openshift-adp -o yaml
Example DataProtectionApplication
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
status:
  conditions:
  - lastTransitionTime: "2023-07-31T04:48:12Z"
    message: Reconcile complete
    reason: Complete
    status: "True"
    type: Reconciled
Verify that the BackupStorageLocation CR is in an available state by running the following command:
$ oc get backupstoragelocations.velero.io -n openshift-adp
Example BackupStorageLocation
NAME       PHASE       LAST VALIDATED   AGE   DEFAULT
ts-dpa-1   Available   3s               6s    true
1.7. Uninstalling OADP
1.7.1. Uninstalling the OpenShift API for Data Protection
You uninstall the OpenShift API for Data Protection (OADP) by deleting the OADP Operator. See Deleting Operators from a cluster for details.
1.8. OADP backing up
1.8.1. Backing up applications
Frequent backups might consume storage on the backup storage location. If you use non-local backups, for example, S3 buckets, check the frequency of backups, the retention time, and the amount of data in the persistent volumes (PVs). Because every backup remains until it expires, also check the time to live (TTL) setting of the schedule.
You can back up applications by creating a Backup custom resource (CR). For more information, see Creating a Backup CR.
The Backup CR creates backup files for Kubernetes resources and internal images on S3 object storage.
1.8.1.1. Previewing resources before running backup and restore
OADP backs up application resources based on the type, namespace, or label. This means that you can view the resources after the backup is complete. Similarly, you can view the restored objects based on the namespace, persistent volume (PV), or label after a restore operation is complete. To preview the resources in advance, you can do a dry run of the backup and restore operations.
Prerequisites
- You have installed the OADP Operator.
Procedure
To preview the resources included in the backup before running the actual backup, run the following command:
$ velero backup create <backup_name> --snapshot-volumes=false # 1
- 1: Set the --snapshot-volumes parameter to false.
To view more details about the backup resources, run the following command:
$ velero describe backup <backup_name> --details # 1
- 1: Specify the name of the backup.
To preview the resources included in the restore before running the actual restore, run the following command:
$ velero restore create --from-backup <backup_name> # 1
- 1: Specify the name of the backup created to review the backup resources.
Important: The velero restore create command creates restore resources in the cluster. After you review the resources, you must delete the resources that were created as part of the restore.
To view more details about the restore resources, run the following command:
$ velero describe restore <restore_name> --details # 1
- 1: Specify the name of the restore.
You can create backup hooks to run commands before or after the backup operation. See Creating backup hooks.
You can schedule backups by creating a Schedule CR instead of a Backup CR. See Scheduling backups using Schedule CR.
1.8.1.2. Known issues
Red Hat OpenShift Service on AWS 4 enforces a pod security admission (PSA) policy that can hinder the readiness of pods during a Restic restore process.
This issue has been resolved in the OADP 1.1.6 and OADP 1.2.2 releases. Upgrade to one of these releases to avoid the issue.
1.8.2. Creating a Backup CR
You back up Kubernetes resources, internal images, and persistent volumes (PVs) by creating a Backup custom resource (CR).
Prerequisites
- You must install the OpenShift API for Data Protection (OADP) Operator.
- The DataProtectionApplication CR must be in a Ready state.
Backup location prerequisites:
- You must have S3 object storage configured for Velero.
- You must have a backup location configured in the DataProtectionApplication CR.
Snapshot location prerequisites:
- Your cloud provider must have a native snapshot API or support Container Storage Interface (CSI) snapshots.
- For CSI snapshots, you must create a VolumeSnapshotClass CR to register the CSI driver.
- You must have a volume location configured in the DataProtectionApplication CR.
Procedure
Retrieve the backupStorageLocations CRs by entering the following command:
$ oc get backupstoragelocations.velero.io -n openshift-adp
Example output
NAMESPACE       NAME              PHASE       LAST VALIDATED   AGE   DEFAULT
openshift-adp   velero-sample-1   Available   11s              31m
Create a Backup CR, as in the following example:
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup>
  labels:
    velero.io/storage-location: default
  namespace: openshift-adp
spec:
  hooks: {}
  includedNamespaces:
  - <namespace> # 1
  includedResources: [] # 2
  excludedResources: [] # 3
  storageLocation: <velero-sample-1> # 4
  ttl: 720h0m0s
  labelSelector: # 5
    matchLabels:
      app: <label_1>
      app: <label_2>
      app: <label_3>
  orLabelSelectors: # 6
  - matchLabels:
      app: <label_1>
      app: <label_2>
      app: <label_3>
- 1: Specify an array of namespaces to back up.
- 2: Optional: Specify an array of resources to include in the backup. Resources might be shortcuts (for example, 'po' for 'pods') or fully-qualified. If unspecified, all resources are included.
- 3: Optional: Specify an array of resources to exclude from the backup. Resources might be shortcuts (for example, 'po' for 'pods') or fully-qualified.
- 4: Specify the name of the backupStorageLocations CR.
- 5: Map of {key,value} pairs of backup resources that have all the specified labels.
- 6: Map of {key,value} pairs of backup resources that have one or more of the specified labels.
Verify that the status of the Backup CR is Completed:
$ oc get backups.velero.io -n openshift-adp <backup> -o jsonpath='{.status.phase}'
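A backup does not reach the Completed phase immediately, so the status command usually has to be repeated. The following is a minimal sketch of a polling loop around the same oc get command; wait_for_phase is a hypothetical helper, and the attempt count and delay defaults are arbitrary assumptions.

```shell
# wait_for_phase is a hypothetical helper built on the same "oc get" command.
# Arguments: resource type, resource name, number of attempts, seconds between attempts.
# It succeeds on Completed and stops early on phases that indicate failure.
wait_for_phase() (
  kind=$1 name=$2 attempts=${3:-60} delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase=$(oc get "$kind" "$name" -n openshift-adp -o jsonpath='{.status.phase}')
    case "$phase" in
      Completed) echo "$kind/$name: $phase"; exit 0 ;;
      Failed|PartiallyFailed|FailedValidation) echo "$kind/$name: $phase" >&2; exit 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$kind/$name: timed out in phase ${phase:-unknown}" >&2
  exit 1
)

# Usage:
# wait_for_phase backups.velero.io <backup> 60 10
```

The function body runs in a subshell, so its exit calls end only the function and not your interactive shell.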
1.8.3. Creating backup hooks
When performing a backup, it is possible to specify one or more commands to execute in a container within a pod, based on the pod being backed up.
The commands can be configured to run before any custom action processing (Pre hooks), or after all custom actions have been completed and any additional items specified by the custom action have been backed up (Post hooks).
You create backup hooks to run commands in a container in a pod by editing the Backup custom resource (CR).
Procedure
Add a hook to the spec.hooks block of the Backup CR, as in the following example:
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup>
  namespace: openshift-adp
spec:
  hooks:
    resources:
      - name: <hook_name>
        includedNamespaces:
        - <namespace> # 1
        excludedNamespaces: # 2
        - <namespace>
        includedResources:
        - pods # 3
        excludedResources: [] # 4
        labelSelector: # 5
          matchLabels:
            app: velero
            component: server
        pre: # 6
          - exec:
              container: <container> # 7
              command:
              - /bin/uname # 8
              - -a
              onError: Fail # 9
              timeout: 30s # 10
        post: # 11
...
- 1: Optional: You can specify namespaces to which the hook applies. If this value is not specified, the hook applies to all namespaces.
- 2: Optional: You can specify namespaces to which the hook does not apply.
- 3: Currently, pods are the only supported resource that hooks can apply to.
- 4: Optional: You can specify resources to which the hook does not apply.
- 5: Optional: This hook only applies to objects matching the label. If this value is not specified, the hook applies to all objects.
- 6: Array of hooks to run before the backup.
- 7: Optional: If the container is not specified, the command runs in the first container in the pod.
- 8: The command to run in the container, specified as an array.
- 9: Allowed values for error handling are Fail and Continue. The default is Fail.
- 10: Optional: How long to wait for the commands to run. The default is 30s.
- 11: This block defines an array of hooks to run after the backup, with the same parameters as the pre-backup hooks.
1.8.4. Scheduling backups using Schedule CR
The schedule operation allows you to create a backup of your data at a particular time, specified by a Cron expression.
You schedule backups by creating a Schedule custom resource (CR) instead of a Backup CR.
Leave enough time in your backup schedule for a backup to finish before another backup is created.
For example, if a backup of a namespace typically takes 10 minutes, do not schedule backups more frequently than every 15 minutes.
Prerequisites
- You must install the OpenShift API for Data Protection (OADP) Operator.
- The DataProtectionApplication CR must be in a Ready state.
Procedure
Retrieve the backupStorageLocations CRs:
$ oc get backupStorageLocations -n openshift-adp
Example output
NAMESPACE       NAME              PHASE       LAST VALIDATED   AGE   DEFAULT
openshift-adp   velero-sample-1   Available   11s              31m
Create a Schedule CR, as in the following example:
$ cat << EOF | oc apply -f -
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: <schedule>
  namespace: openshift-adp
spec:
  schedule: 0 7 * * * # 1
  template:
    hooks: {}
    includedNamespaces:
    - <namespace> # 2
    storageLocation: <velero-sample-1> # 3
    defaultVolumesToFsBackup: true # 4
    ttl: 720h0m0s
EOF
- 1: cron expression to schedule the backup, for example, 0 7 * * * to perform a backup every day at 7:00.
Note: To schedule a backup at specific intervals, enter the <duration_in_minutes> in the following format:
schedule: "*/10 * * * *"
Enter the minutes value between quotation marks (" ").
- 2: Array of namespaces to back up.
- 3: Name of the backupStorageLocations CR.
- 4: Optional: In OADP version 1.2 and later, add the defaultVolumesToFsBackup: true key-value pair to your configuration when performing backups of volumes with Restic. In OADP version 1.1, add the defaultVolumesToRestic: true key-value pair when you back up volumes with Restic.
Verify that the status of the Schedule CR is Completed after the scheduled backup runs:
$ oc get schedule -n openshift-adp <schedule> -o jsonpath='{.status.phase}'
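The spacing guidance in this section (leave more time between runs than a backup takes) and the interval format for the schedule field can be combined in a small check. The following is a minimal sketch; cron_every_minutes is a hypothetical helper, and restricting the interval to divisors of 60 is a design choice of this sketch, not a Velero requirement.

```shell
# cron_every_minutes is a hypothetical helper. It prints a cron expression that
# runs every <interval> minutes, and refuses intervals that are not longer than
# the typical backup duration. Only divisors of 60 are accepted, because "*/N"
# restarts at minute 0 of every hour and is evenly spaced only when N divides 60.
cron_every_minutes() (
  interval=$1 backup_minutes=$2
  if [ "$interval" -le "$backup_minutes" ]; then
    echo "interval must be longer than the backup duration" >&2
    exit 1
  fi
  if [ "$interval" -lt 1 ] || [ "$interval" -gt 59 ] || [ $((60 % interval)) -ne 0 ]; then
    echo "interval must be a divisor of 60 between 1 and 30" >&2
    exit 1
  fi
  echo "*/$interval * * * *"
)

# Usage: a backup that typically takes 10 minutes, scheduled every 15 minutes.
# cron_every_minutes 15 10    # prints */15 * * * *
```

Quote the printed expression when you place it in the schedule field, as described in the note about interval schedules.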
1.8.5. Deleting backups
You can delete a backup by creating the DeleteBackupRequest custom resource (CR) or by running the velero backup delete command as explained in the following procedures.
The volume backup artifacts are deleted at different times depending on the backup method:
- Restic: The artifacts are deleted in the next full maintenance cycle, after the backup is deleted.
- Container Storage Interface (CSI): The artifacts are deleted immediately when the backup is deleted.
- Kopia: The artifacts are deleted after three full maintenance cycles of the Kopia repository, after the backup is deleted.
1.8.5.1. Deleting a backup by creating a DeleteBackupRequest CR
You can delete a backup by creating a DeleteBackupRequest custom resource (CR).
Prerequisites
- You have run a backup of your application.
Procedure
Create a DeleteBackupRequest CR manifest file:
apiVersion: velero.io/v1
kind: DeleteBackupRequest
metadata:
  name: deletebackuprequest
  namespace: openshift-adp
spec:
  backupName: <backup_name> # 1
- 1: Specify the name of the backup.
Apply the DeleteBackupRequest CR to delete the backup:
$ oc apply -f <deletebackuprequest_cr_filename>
1.8.5.2. Deleting a backup by using the Velero CLI
You can delete a backup by using the Velero CLI.
Prerequisites
- You have run a backup of your application.
- You downloaded the Velero CLI and can access the Velero binary in your cluster.
Procedure
To delete the backup, run the following Velero command:
$ velero backup delete <backup_name> -n openshift-adp # 1
- 1: Specify the name of the backup.
1.8.5.3. About Kopia repository maintenance
There are two types of Kopia repository maintenance:
Quick maintenance
- Runs every hour to keep the number of index blobs (n) low. A high number of indexes negatively affects the performance of Kopia operations.
- Does not delete any metadata from the repository without ensuring that another copy of the same metadata exists.
Full maintenance
- Runs every 24 hours to perform garbage collection of repository contents that are no longer needed.
- snapshot-gc, a full maintenance task, finds all files and directory listings that are no longer accessible from snapshot manifests and marks them as deleted.
- A full maintenance is a resource-costly operation, as it requires scanning all directories in all snapshots that are active in the cluster.
1.8.5.3.1. Kopia maintenance in OADP
The repo-maintain-job jobs are executed in the namespace where OADP is installed, as shown in the following example:
pod/repo-maintain-job-173...2527-2nbls   0/1   Completed   0   168m
pod/repo-maintain-job-173....536-fl9tm   0/1   Completed   0   108m
pod/repo-maintain-job-173...2545-55ggx   0/1   Completed   0   48m
You can check the logs of the repo-maintain-job for more details about the cleanup and the removal of artifacts in the backup object storage. The repo-maintain-job logs contain a note, as shown in the following example, that states when the next full maintenance cycle is due:
not due for full maintenance cycle until 2024-00-00 18:29:4
Three successful executions of a full maintenance cycle are required for the objects to be deleted from the backup object storage. This means you can expect up to 72 hours for all the artifacts in the backup object storage to be deleted.
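The 72-hour figure follows from three full maintenance cycles at 24 hours each. The following is a minimal sketch of that arithmetic as a check; repo_cleanup_due is a hypothetical helper, and the deletion timestamp is a value that you record yourself. The result is a worst-case estimate, so the repo-maintain-job logs remain the authoritative source.

```shell
# repo_cleanup_due is a hypothetical helper. Given the time a backup was deleted
# and the current time (both as seconds since the Unix epoch), it succeeds once
# the worst-case window of 3 full maintenance cycles x 24 hours has elapsed.
repo_cleanup_due() (
  deleted_at=$1 now=$2
  cycles=3 hours_per_cycle=24
  window=$((cycles * hours_per_cycle * 3600))   # 72 hours = 259200 seconds
  [ $((now - deleted_at)) -ge "$window" ]
)

# Usage (DELETED_AT_EPOCH is an assumed variable that you set when deleting the backup):
# repo_cleanup_due "$DELETED_AT_EPOCH" "$(date +%s)" && echo "maintenance window has elapsed"
```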
1.8.5.4. Deleting a backup repository
After you delete the backup, and after the Kopia repository maintenance cycles to delete the related artifacts are complete, the backup is no longer referenced by any metadata or manifest objects. You can then delete the backuprepository custom resource (CR) to complete the backup deletion process.
Prerequisites
- You have deleted the backup of your application.
- You have waited up to 72 hours after the backup is deleted. This time frame allows Kopia to run the repository maintenance cycles.
Procedure
To get the name of the backup repository CR for a backup, run the following command:
$ oc get backuprepositories.velero.io -n openshift-adp
To delete the backup repository CR, run the following command:
$ oc delete backuprepository <backup_repository_name> -n openshift-adp # 1
- 1: Specify the name of the backup repository from the earlier step.
1.9. OADP restoring
1.9.1. Restoring applications
You restore application backups by creating a Restore custom resource (CR). See Creating a Restore CR.
You can create restore hooks to run commands in a container in a pod by editing the Restore CR. See Creating restore hooks.
1.9.1.1. Previewing resources before running backup and restore
OADP backs up application resources based on the type, namespace, or label. This means that you can view the resources after the backup is complete. Similarly, you can view the restored objects based on the namespace, persistent volume (PV), or label after a restore operation is complete. To preview the resources in advance, you can do a dry run of the backup and restore operations.
Prerequisites
- You have installed the OADP Operator.
Procedure
To preview the resources included in the backup before running the actual backup, run the following command:
$ velero backup create <backup_name> --snapshot-volumes=false 1
- 1
- Specify the value of the --snapshot-volumes parameter as false.
To view more details about the backup resources, run the following command:
$ velero describe backup <backup_name> --details 1
- 1
- Specify the name of the backup.
To preview the resources included in the restore before running the actual restore, run the following command:
$ velero restore create --from-backup <backup_name> 1
- 1
- Specify the name of the backup created to review the backup resources.
Important: The velero restore create command creates restore resources in the cluster. You must delete the resources created as part of the restore after you review them.
To view more details about the restore resources, run the following command:
$ velero describe restore <restore_name> --details 1
- 1
- Specify the name of the restore.
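To find the objects that a preview restore created, so that they can be reviewed and then deleted, a label selector can help. The following is a sketch, not an OADP feature: the helper name restore_selector is hypothetical, and it assumes the upstream Velero behavior of labeling restored objects with velero.io/restore-name=<restore_name>. Confirm that the label is present on your restored objects before deleting anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of OADP or Velero): build a label selector
# for objects created by a given restore. Assumes that restored objects carry
# the label velero.io/restore-name=<restore_name>. Names longer than
# 63 characters are shortened in label values and are not handled here.
restore_selector() {
  if [ "${#1}" -gt 63 ]; then
    echo "restore name longer than 63 characters: not handled" >&2
    return 1
  fi
  printf 'velero.io/restore-name=%s\n' "$1"
}

# Usage against a cluster: list the objects and review them first.
#   oc get all -n <namespace> -l "$(restore_selector <restore_name>)"
```

Note that oc get all covers only a subset of resource types, so other restored objects, such as persistent volume claims, need a separate query.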
1.9.1.2. Creating a Restore CR
You restore a Backup custom resource (CR) by creating a Restore CR.
Prerequisites
- You must install the OpenShift API for Data Protection (OADP) Operator.
- The DataProtectionApplication CR must be in a Ready state.
- You must have a Velero Backup CR.
- The persistent volume (PV) capacity must match the requested size at backup time. Adjust the requested size if needed.
Procedure
Create a Restore CR, as in the following example:
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore>
  namespace: openshift-adp
spec:
  backupName: <backup> # 1
  includedResources: [] # 2
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  restorePVs: true # 3
- 1
- Name of the Backup CR.
- 2
- Optional: Specify an array of resources to include in the restore process. Resources might be shortcuts (for example, po for pods) or fully-qualified. If unspecified, all resources are included.
- 3
- Optional: The restorePVs parameter can be set to false to turn off restore of PersistentVolumes from VolumeSnapshot of Container Storage Interface (CSI) snapshots or from native snapshots when VolumeSnapshotLocation is configured.
Verify that the status of the Restore CR is Completed by entering the following command:
$ oc get restores.velero.io -n openshift-adp <restore> -o jsonpath='{.status.phase}'
Verify that the backup resources have been restored by entering the following command:
$ oc get all -n <namespace> 1
- 1
- Namespace that you backed up.
If you restore DeploymentConfig objects with volumes or if you use post-restore hooks, run the dc-post-restore.sh cleanup script, passing the name of the restore, by entering the following command:
$ bash dc-post-restore.sh <restore_name>
Note: During the restore process, the OADP Velero plug-ins scale down the DeploymentConfig objects and restore the pods as standalone pods. This is done to prevent the cluster from deleting the restored DeploymentConfig pods immediately on restore and to allow the restore and post-restore hooks to complete their actions on the restored pods. The cleanup script shown below removes these disconnected pods and scales any DeploymentConfig objects back up to the appropriate number of replicas.
Example 1.1. dc-post-restore.sh cleanup script (previously named dc-restic-post-restore.sh)
#!/bin/bash
set -e

# if sha256sum exists, use it to check the integrity of the file
if command -v sha256sum >/dev/null 2>&1; then
  CHECKSUM_CMD="sha256sum"
else
  CHECKSUM_CMD="shasum -a 256"
fi

# Print a valid label value for the given name: names of up to 63 characters
# are used as-is; longer names become the first 57 characters plus the first
# 6 characters of the SHA-256 checksum of the name.
label_name () {
    if [ "${#1}" -le "63" ]; then
        echo "$1"
        return
    fi
    sha=$(echo -n "$1"|$CHECKSUM_CMD)
    echo "${1:0:57}${sha:0:6}"
}

if [[ $# -ne 1 ]]; then
    echo "usage: ${BASH_SOURCE} restore-name"
    exit 1
fi

echo "restore: $1"

label=$(label_name "$1")
echo "label: $label"

echo Deleting disconnected restore pods
oc delete pods --all-namespaces -l "oadp.openshift.io/disconnected-from-dc=$label"

# For each DeploymentConfig modified by the restore, read its namespace, name,
# and the original replica count and paused state from its annotations.
for dc in $(oc get dc --all-namespaces -l "oadp.openshift.io/replicas-modified=$label" -o jsonpath='{range .items[*]}{.metadata.namespace}{","}{.metadata.name}{","}{.metadata.annotations.oadp\.openshift\.io/original-replicas}{","}{.metadata.annotations.oadp\.openshift\.io/original-paused}{"\n"}')
do
    IFS=',' read -ra dc_arr <<< "$dc"
    if [ ${#dc_arr[0]} -gt 0 ]; then
        echo Found deployment ${dc_arr[0]}/${dc_arr[1]}, setting replicas: ${dc_arr[2]}, paused: ${dc_arr[3]}
        cat <<EOF | oc patch dc -n "${dc_arr[0]}" "${dc_arr[1]}" --patch-file /dev/stdin
spec:
  replicas: ${dc_arr[2]}
  paused: ${dc_arr[3]}
EOF
    fi
done
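The cleanup script derives the label value from the restore name: names of up to 63 characters are used unchanged, and longer names are cut to 57 characters with the first 6 characters of the SHA-256 checksum appended. The following standalone sketch repeats that logic, with quoting added, so the rule can be tried without a cluster. The variable long_name is made up for illustration.

```shell
#!/bin/bash
# Standalone copy of the label_name logic from the cleanup script, with
# quoting added, so the truncation rule can be tried without a cluster.
if command -v sha256sum >/dev/null 2>&1; then
  checksum_cmd="sha256sum"
else
  checksum_cmd="shasum -a 256"
fi

label_name() {
  local name="$1" sha
  if [ "${#name}" -le 63 ]; then
    printf '%s\n' "$name"
    return
  fi
  # First 57 characters of the name plus the first 6 hex digits of its SHA-256.
  sha=$(printf '%s' "$name" | $checksum_cmd)
  printf '%s%s\n' "${name:0:57}" "${sha:0:6}"
}

# A 70-character name, made up for illustration.
long_name=$(printf 'r%.0s' $(seq 1 70))

label_name "short-restore"                             # prints the name unchanged
label_name "$long_name" | awk '{ print length($0) }'   # prints 63
```

Because 57 + 6 = 63, the result always fits the 63-character limit on label values.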
1.9.1.3. Creating restore hooks
You create restore hooks to run commands in a container in a pod by editing the Restore custom resource (CR).
You can create two types of restore hooks:
- An init hook adds an init container to a pod to perform setup tasks before the application container starts.
  If you restore a Restic backup, the restic-wait init container is added before the restore hook init container.
- An exec hook runs commands or scripts in a container of a restored pod.
Procedure
Add a hook to the spec.hooks block of the Restore CR, as in the following example:
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore>
  namespace: openshift-adp
spec:
  hooks:
    resources:
      - name: <hook_name>
        includedNamespaces:
        - <namespace> # 1
        excludedNamespaces:
        - <namespace>
        includedResources:
        - pods # 2
        excludedResources: []
        labelSelector: # 3
          matchLabels:
            app: velero
            component: server
        postHooks:
        - init:
            initContainers:
            - name: restore-hook-init
              image: alpine:latest
              volumeMounts:
              - mountPath: /restores/pvc1-vm
                name: pvc1-vm
              command:
              - /bin/ash
              - -c
            timeout: # 4
        - exec:
            container: <container> # 5
            command:
            - /bin/bash # 6
            - -c
            - "psql < /backup/backup.sql"
            waitTimeout: 5m # 7
            execTimeout: 1m # 8
            onError: Continue # 9
- 1
- Optional: Array of namespaces to which the hook applies. If this value is not specified, the hook applies to all namespaces.
- 2
- Currently, pods are the only supported resource that hooks can apply to.
- 3
- Optional: This hook only applies to objects matching the label selector.
- 4
- Optional: Timeout specifies the maximum length of time Velero waits for initContainers to complete.
- 5
- Optional: If the container is not specified, the command runs in the first container in the pod.
- 6
- The command that the exec hook runs in the container, given as an array.
- 7
- Optional: How long to wait for a container to become ready. This should be long enough for the container to start and for any preceding hooks in the same container to complete. If not set, the restore process waits indefinitely.
- 8
- Optional: How long to wait for the commands to run. The default is 30s.
- 9
- Allowed values for error handling are Fail and Continue:
  - Continue: Only command failures are logged.
  - Fail: No more restore hooks run in any container in any pod. The status of the Restore CR will be PartiallyFailed.
During a File System Backup (FSB) restore operation, a Deployment resource that references an ImageStream is not restored properly. The restored pod that runs the FSB and the postHook is terminated prematurely.
This happens because, during the restore operation, the OpenShift controller updates the spec.template.spec.containers[0].image field in the Deployment resource with an updated ImageStreamTag hash. The update triggers the rollout of a new pod, terminating the pod on which velero runs the FSB and the post-restore hook. For more information about image stream triggers, see "Triggering updates on image stream changes".
The workaround for this behavior is a two-step restore process:
First, perform a restore excluding the Deployment resources, for example:
$ velero restore create <RESTORE_NAME> \
  --from-backup <BACKUP_NAME> \
  --exclude-resources=deployment.apps
After the first restore is successful, perform a second restore by including these resources, for example:
$ velero restore create <RESTORE_NAME> \
  --from-backup <BACKUP_NAME> \
  --include-resources=deployment.apps
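The two-step workaround can be scripted so that the second restore runs only when the first one reports Completed. The following is a sketch under stated assumptions, not a supported tool: the function names are hypothetical, the velero CLI is assumed to be set up to reach the Velero instance in the openshift-adp namespace, and its --wait flag is assumed to block until the restore finishes. Each restore needs its own name.

```shell
#!/bin/bash
# Sketch of the two-step workaround. All function names are hypothetical.

# Succeed only if the given Restore phase is Completed.
restore_succeeded() {
  [ "$1" = "Completed" ]
}

# Read the phase of a Restore CR, using the same query as the verification
# step in "Creating a Restore CR".
restore_phase() {
  oc get restores.velero.io -n openshift-adp "$1" -o jsonpath='{.status.phase}'
}

two_step_restore() {
  local backup="$1" first="$2" second="$3"
  # Step 1: restore everything except Deployment resources.
  velero restore create "$first" --from-backup "$backup" \
    --exclude-resources=deployment.apps --wait || return 1
  if ! restore_succeeded "$(restore_phase "$first")"; then
    echo "first restore did not complete; skipping the second restore" >&2
    return 1
  fi
  # Step 2: restore only the Deployment resources.
  velero restore create "$second" --from-backup "$backup" \
    --include-resources=deployment.apps --wait
}

# Usage:
#   two_step_restore <BACKUP_NAME> <FIRST_RESTORE_NAME> <SECOND_RESTORE_NAME>
```

Stopping after a first restore that is not Completed keeps the Deployment rollout from starting while the earlier restore still needs attention.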
Legal Notice
Copyright © 2024 Red Hat, Inc.
OpenShift documentation is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0).
Modified versions must remove all Red Hat trademarks.
Portions adapted from https://github.com/kubernetes-incubator/service-catalog/ with modifications by Red Hat.
Red Hat, Red Hat Enterprise Linux, the Red Hat logo, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.