Chapter 15. Backing up and restoring a director Operator deployed overcloud


To back up a Red Hat OpenStack Platform (RHOSP) overcloud that was deployed with director Operator (OSPdO), you must back up the Red Hat OpenShift Container Platform (RHOCP) OSPdO resources, and then use the Relax-and-Recover (ReaR) tool to back up the control plane and overcloud.

15.1. Backing up and restoring director Operator resources

Red Hat OpenStack Platform (RHOSP) director Operator (OSPdO) provides custom resource definitions (CRDs) for backing up and restoring a deployment. You do not have to manually export and import multiple configurations. Because OSPdO is aware of the state of all resources, it knows which custom resources (CRs), including the ConfigMap and Secret CRs, it needs to create a complete backup. Therefore, OSPdO does not back up any configuration that is in an incomplete or error state.

To back up and restore an OSPdO deployment, you create an OpenStackBackupRequest CR to initiate the creation or restoration of a backup. Your OpenStackBackupRequest CR creates the OpenStackBackup CR that stores the backup of the custom resources (CRs), the ConfigMap and the Secret configurations for the specified namespace.

15.1.1. Backing up director Operator resources

To create a backup, you must create an OpenStackBackupRequest custom resource (CR) for the namespace. OSPdO creates the OpenStackBackup CR when the OpenStackBackupRequest object is created in save mode.

Procedure

  1. Create a file named openstack_backup.yaml on your workstation.
  2. Add the following configuration to your openstack_backup.yaml file to create the OpenStackBackupRequest custom resource (CR):

    apiVersion: osp-director.openstack.org/v1beta1
    kind: OpenStackBackupRequest
    metadata:
      name: openstackbackupsave
      namespace: openstack
    spec:
      mode: save                  # (1)
      additionalConfigMaps: []    # (2)
      additionalSecrets: []       # (3)

    (1) Set the mode to save to request creation of an OpenStackBackup CR.
    (2) Optional: Include any ConfigMap resources that you created manually.
    (3) Optional: Include any Secret resources that you created manually.

    Note

    OSPdO attempts to include all ConfigMap and Secret objects associated with the OSPdO CRs in the namespace, such as OpenStackControlPlane and OpenStackBaremetalSet. You do not need to include those in the additional lists.

  3. Save the openstack_backup.yaml file.
  4. Create the OpenStackBackupRequest CR:

    $ oc create -f openstack_backup.yaml -n openstack
  5. Monitor the creation status of the OpenStackBackupRequest CR:

    $ oc get openstackbackuprequest openstackbackupsave -n openstack
    • The Quiescing state indicates that OSPdO is waiting for the CRs to reach their finished state. The number of CRs can affect how long it takes to finish creating the backup.

      NAME                     OPERATION   SOURCE   STATUS      COMPLETION TIMESTAMP
      openstackbackupsave      save                 Quiescing

      If the status remains in the Quiescing state for longer than expected, you can investigate the OSPdO logs to check progress:

      $ oc logs <operator_pod> -c manager -f
      2022-01-11T18:26:15.180Z        INFO    controllers.OpenStackBackupRequest      Quiesce for save for OpenStackBackupRequest openstackbackupsave is waiting for: [OpenStackBaremetalSet: compute, OpenStackControlPlane: overcloud, OpenStackVMSet: controller]
      • Replace <operator_pod> with the name of the Operator pod.
    • The Saved state indicates that the OpenStackBackup CR is created.

      NAME                     OPERATION   SOURCE   STATUS   COMPLETION TIMESTAMP
      openstackbackupsave      save                 Saved    2022-01-11T19:12:58Z
    • The Error state indicates the backup has failed to create. Review the request contents to find the error:

      $ oc get openstackbackuprequest openstackbackupsave -o yaml -n openstack
  6. View the OpenStackBackup resource to confirm it exists:

    $ oc get openstackbackup -n openstack
    NAME                                AGE
    openstackbackupsave-1641928378      6m7s
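
The monitoring in step 5 can be scripted. The following is a minimal Bash sketch and not part of OSPdO: the function names request_state and wait_for_request, the 30-second default interval, and the return codes are assumptions made for illustration. Because the SOURCE column is empty for save requests, the sketch looks for one of the state names described in this section instead of relying on a fixed column position.

```shell
#!/usr/bin/env bash

# Print the state found in one row of `oc get openstackbackuprequest` output.
# Only the states named in this section are recognized.
request_state() {
  local field
  # Intentionally unquoted: split the row on whitespace.
  for field in $1; do
    case "$field" in
      Quiescing|Saved|Loading|Reconciling|Restored|Error)
        printf '%s\n' "$field"
        return 0
        ;;
    esac
  done
  return 1
}

# Poll a request until it reports Saved or Restored (returns 0) or Error (returns 1).
# Hypothetical helper: arguments are request name, namespace, and interval in seconds.
wait_for_request() {
  local name="$1" namespace="${2:-openstack}" interval="${3:-30}"
  local row state
  while true; do
    # Stop with return code 2 if the request cannot be read at all.
    row="$(oc get openstackbackuprequest "$name" -n "$namespace" --no-headers)" || return 2
    state="$(request_state "$row")" || state="unknown"
    echo "${name}: ${state}"
    case "$state" in
      Saved|Restored) return 0 ;;
      Error) return 1 ;;
    esac
    sleep "$interval"
  done
}
```

For example, `wait_for_request openstackbackupsave openstack 30` prints the state every 30 seconds and returns once the request reports Saved or Error.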

15.1.2. Restoring director Operator resources from a backup

When you request to restore a backup, Red Hat OpenStack Platform (RHOSP) director Operator (OSPdO) takes the contents of the specified OpenStackBackup resource and attempts to apply them to all existing custom resources (CRs), ConfigMap and Secret resources present within the namespace. OSPdO overwrites any existing resources in the namespace, and creates new resources for those not found within the namespace.

Procedure

  1. List the available backups:

    $ oc get osbackup
  2. Inspect the details of a specific backup:

    $ oc get osbackup <name> -o yaml
    • Replace <name> with the name of the backup you want to inspect.
  3. Create a file named openstack_restore.yaml on your workstation.
  4. Add the following configuration to your openstack_restore.yaml file to create the OpenStackBackupRequest custom resource (CR):

    apiVersion: osp-director.openstack.org/v1beta1
    kind: OpenStackBackupRequest
    metadata:
      name: openstackbackuprestore
      namespace: openstack
    spec:
      mode: <mode>
      restoreSource: <restore_source>
    • Replace <mode> with one of the following options:

      • restore: Requests a restore from an existing OpenStackBackup.
      • cleanRestore: Completely wipes the existing OSPdO resources within the namespace before restoring and creating new resources from the existing OpenStackBackup.
    • Replace <restore_source> with the ID of the OpenStackBackup to restore, for example, openstackbackupsave-1641928378.
  5. Save the openstack_restore.yaml file.
  6. Create the OpenStackBackupRequest CR:

    $ oc create -f openstack_restore.yaml -n openstack
  7. Monitor the creation status of the OpenStackBackupRequest CR:

    $ oc get openstackbackuprequest openstackbackuprestore -n openstack
    • The Loading state indicates that all resources from the OpenStackBackup are being applied against the cluster.

      NAME                     OPERATION  SOURCE                           STATUS     COMPLETION TIMESTAMP
      openstackbackuprestore   restore    openstackbackupsave-1641928378   Loading
    • The Reconciling state indicates that all resources are loaded and OSPdO has begun reconciling to attempt to provision all resources.

      NAME                     OPERATION  SOURCE                           STATUS       COMPLETION TIMESTAMP
      openstackbackuprestore   restore    openstackbackupsave-1641928378   Reconciling
    • The Restored state indicates that the OpenStackBackup CR has been restored.

      NAME                     OPERATION  SOURCE                           STATUS     COMPLETION TIMESTAMP
      openstackbackuprestore   restore    openstackbackupsave-1641928378   Restored   2022-01-12T13:48:57Z
    • The Error state indicates the restoration has failed. Review the request contents to find the error:

      $ oc get openstackbackuprequest openstackbackuprestore -o yaml -n openstack
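
Steps 3 to 5 can also be generated instead of typed by hand. The following Bash sketch prints the restore manifest shown in step 4 and rejects any mode other than the two listed there. The function name render_restore_request is a hypothetical helper introduced for illustration; the manifest fields are the ones from this procedure.

```shell
#!/usr/bin/env bash

# Print an OpenStackBackupRequest manifest for a restore.
# Hypothetical helper: arguments are the mode and the name of the OpenStackBackup.
render_restore_request() {
  local mode="$1" restore_source="$2"
  case "$mode" in
    restore|cleanRestore) ;;
    *)
      echo "mode must be restore or cleanRestore, got: ${mode}" >&2
      return 1
      ;;
  esac
  if [ -z "$restore_source" ]; then
    echo "restoreSource must not be empty" >&2
    return 1
  fi
  cat <<EOF
apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackBackupRequest
metadata:
  name: openstackbackuprestore
  namespace: openstack
spec:
  mode: ${mode}
  restoreSource: ${restore_source}
EOF
}
```

For example, `render_restore_request restore openstackbackupsave-1641928378 > openstack_restore.yaml` writes the file that step 6 passes to `oc create`.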

15.2. Backing up and restoring a director Operator deployed overcloud with the Relax-and-Recover tool

To back up a director Operator deployed overcloud with the Relax-and-Recover (ReaR) tool, you configure the backup node, install the ReaR tool on the control plane, and create the backup image. You can create backups as a part of your regular environment maintenance.

In addition, you must back up the control plane before performing updates or upgrades. You can use the backups to restore the control plane to its previous state if an error occurs during an update or upgrade.

15.2.1. Supported backup formats and protocols

The backup and restore process uses the open-source tool Relax-and-Recover (ReaR) to create and restore bootable backup images. ReaR is written in Bash and supports multiple image formats and multiple transport protocols.

The following list shows the backup formats and protocols that Red Hat OpenStack Platform supports when you use ReaR to back up and restore a director Operator deployed control plane.

Bootable media formats
  • ISO
File transport protocols
  • SFTP
  • NFS

15.2.2. Configuring the backup storage location

You can install and configure an NFS server to store the backup file. Before you create a backup of the control plane, configure the backup storage location in the bar-vars.yaml environment file. This file stores the key-value parameters that you want to pass to the backup execution.

Important
  • If you previously installed and configured an NFS or SFTP server, you do not need to complete this procedure. You enter the server information when you set up ReaR on the node that you want to back up.
  • By default, the Relax-and-Recover (ReaR) IP address parameter for the NFS server is 192.168.24.1. You must add the parameter tripleo_backup_and_restore_server to set the IP address value that matches your environment.

Procedure

  1. Create an NFS backup directory on your workstation:

    $ mkdir -p /home/nfs/backup
    $ chmod 777 /home/nfs/backup
    $ cat >/etc/exports.d/backup.exports<<EOF
    /home/nfs/backup *(rw,sync,no_root_squash)
    EOF
    $ exportfs -av
  2. Create the bar-vars.yaml file on your workstation:

    $ touch /home/stack/bar-vars.yaml
  3. In the bar-vars.yaml file, configure the backup storage location:

    tripleo_backup_and_restore_server: <ip_address>
    tripleo_backup_and_restore_shared_storage_folder: <backup_dir>
    • Replace <ip_address> with the IP address of your NFS server, for example, 172.22.0.1. The default IP address is 192.168.24.1.
    • Replace <backup_dir> with the location of the backup storage folder, for example, /home/nfs/backup.
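
As an illustration of step 3, the following Bash sketch prints the two parameters for a given server address and directory. The function name render_bar_vars is hypothetical, and the IPv4 pattern is an assumption that checks the format only; it does not test whether the NFS server is reachable.

```shell
#!/usr/bin/env bash

# Print the bar-vars.yaml content for an NFS server address and backup directory.
# Hypothetical helper; the address check is a format check only.
render_bar_vars() {
  local ip_address="$1" backup_dir="$2"
  if ! printf '%s\n' "$ip_address" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "not an IPv4 address: ${ip_address}" >&2
    return 1
  fi
  if [ -z "$backup_dir" ]; then
    echo "backup directory must not be empty" >&2
    return 1
  fi
  printf 'tripleo_backup_and_restore_server: %s\n' "$ip_address"
  printf 'tripleo_backup_and_restore_shared_storage_folder: %s\n' "$backup_dir"
}
```

For example, `render_bar_vars 172.22.0.1 /home/nfs/backup > /home/stack/bar-vars.yaml` fills the file created in step 2 with the example values from this procedure.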

15.2.3. Performing a backup of the control plane

To create a backup of the control plane, you must install and configure Relax-and-Recover (ReaR) on each of the Controller virtual machines (VMs).

Important

Due to a known issue, the ReaR backup of overcloud nodes continues even if a Controller node is down. Ensure that all your Controller nodes are running before you run the ReaR backup. A fix is planned for a later Red Hat OpenStack Platform (RHOSP) release. For more information, see BZ#2077335 - Back up of the overcloud ctlplane keeps going even if one controller is unreachable.

Procedure

  1. Extract the static Ansible inventory file from the location in which it was saved during installation:

    $ oc rsh openstackclient
    $ cd
    $ find . -name tripleo-ansible-inventory.yaml
    $ cp ~/overcloud-deploy/<stack>/tripleo-ansible-inventory.yaml .
    • Replace <stack> with the name of your stack, for example, cloud-admin. By default, the name of the stack is overcloud.
  2. Install ReaR on each Controller virtual machine (VM):

    $ openstack overcloud backup --setup-rear --extra-vars /home/cloud-admin/bar-vars.yaml --inventory /home/cloud-admin/tripleo-ansible-inventory.yaml
  3. Open the /etc/rear/local.conf file on each Controller VM:

    $ ssh controller-0
    [cloud-admin@controller-0 ~]$ sudo -i
    [root@controller-0 ~]# cat >>/etc/rear/local.conf<<EOF
  4. In the /etc/rear/local.conf file, add the NETWORKING_PREPARATION_COMMANDS parameter to configure the Controller VM networks in the following format:

    NETWORKING_PREPARATION_COMMANDS=('<command_1>' '<command_2>' ...'<command_n>')
    • Replace <command_1>, <command_2>, and all commands up to <command_n>, with commands that configure the network interfaces or IP addresses. For example, you can add the ip link add br-ctlplane type bridge command to create the control plane bridge, or add the ip link set eth0 up command to bring up the eth0 interface. You can add more commands to the parameter based on your network configuration.
  5. Repeat the following command on each Controller VM to back up their config-drive partitions:

    [root@controller-0 ~]# dd if=/dev/vda1 of=/mnt/config-drive
  6. Create a backup of the Controller VMs:

    $ oc rsh openstackclient
    $ openstack overcloud backup --inventory /home/cloud-admin/tripleo-ansible-inventory.yaml

    The backup process runs sequentially on each Controller VM without disrupting the service to your environment.

    Note

    You cannot use cron to schedule backups because cron cannot be used on the openstackclient pod.
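
For step 4, a filled-in NETWORKING_PREPARATION_COMMANDS value might look like the following sketch. It uses only the two example commands named in that step; the bridge and interface names are illustrative and must match the networks of your Controller VMs. ReaR configuration files are Bash scripts, so the parameter is written as a Bash array with one quoted command per element.

```shell
# Example value for /etc/rear/local.conf, built from the two commands named in step 4.
# br-ctlplane and eth0 are illustrative names; use the ones from your Controller VMs.
NETWORKING_PREPARATION_COMMANDS=(
  'ip link add br-ctlplane type bridge'
  'ip link set eth0 up'
)
```

Quoting each command keeps it as a single array element, so a command that contains spaces is not split into several entries.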

15.2.4. Restoring the control plane

If an error occurs during an update or upgrade, you can restore the control plane to its previous state by using the backup ISO image that you created using the Relax-and-Recover (ReaR) tool.

To restore the control plane, you must restore all Controller virtual machines (VMs) to ensure state consistency.

You can find the backup ISO images on the backup node.

Note

Red Hat supports backups of Red Hat OpenStack Platform with native SDNs, such as Open vSwitch (OVS) and the default Open Virtual Network (OVN). For information about third-party SDNs, refer to the third-party SDN documentation.

Prerequisites

  • You have created a backup of the control plane nodes.
  • You have access to the backup node.
  • A vncviewer package is installed on the workstation.

Procedure

  1. Power off each Controller VM. Ensure that all the Controller VMs are powered off completely:

    $ oc get vm
  2. Upload the backup ISO images for each Controller VM into a cluster PVC:

    $ virtctl image-upload pvc <backup_image> \
       --pvc-size=<pvc_size> \
       --image-path=<image_path> \
       --insecure
    • Replace <backup_image> with the name of the PVC backup image for the Controller VM. For example, backup-controller-0-202310231141.
    • Replace <pvc_size> with the size of the PVC required for the image specified with the --image-path option. For example, 4G.
    • Replace <image_path> with the path to the backup ISO image for the Controller VM. For example, /home/nfs/backup/controller-0/controller-0-202310231141.iso.
  3. Disable the director Operator by changing its replicas to 0:

    $ oc patch csv -n openstack <csv> --type json -p='[{"op": "replace", "path": "/spec/install/spec/deployments/0/spec/replicas", "value": 0}]'
    • Replace <csv> with the CSV from the environment, for example, osp-director-operator.v1.3.1.
  4. Verify that the osp-director-operator-controller-manager pod is stopped:

    $ oc get pods -n openstack | grep osp-director-operator-controller-manager
  5. Create a backup of each Controller VM resource:

    $ oc get vm controller-0 -o yaml > controller-0-bk.yaml
  6. Update the Controller VM resource with bootOrder set to 1 and attach the uploaded PVC as a CD-ROM:

    $ oc edit vm controller-0
    @@ -96,10 +96,7 @@
             devices:
               disks:
               - bootOrder: 1
    +            cdrom:
    +              bus: sata
    +            name: cdromiso
    +          - dedicatedIOThread: false
    -            dedicatedIOThread: false
                 disk:
                   bus: virtio
                 name: rootdisk
    @@ -177,9 +174,6 @@
             name: tenant
           terminationGracePeriodSeconds: 0
           volumes:
    +      - name: cdromiso
    +        persistentVolumeClaim:
    +          claimName: <backup_image>
           - dataVolume:
               name: controller-0-36a1
             name: rootdisk
    • Replace <backup_image> with the name of the PVC backup image uploaded for the Controller VM in step 2. For example, backup-controller-0-202310231141.
  7. Start each Controller VM:

    $ virtctl start controller-0
  8. Wait until the status of each Controller VM is RUNNING.
  9. Connect to each Controller VM by using VNC:

    $ virtctl vnc controller-0
    Note

    If you are using SSH to access the Red Hat OpenShift Container Platform (RHOCP) CLI on a remote system, ensure the SSH X11 forwarding is correctly configured. For more information, see the Red Hat Knowledgebase solution How do I configure X11 forwarding over SSH in Red Hat Enterprise Linux?.

  10. ReaR starts automatic recovery after a timeout by default. If recovery does not start automatically, you can manually select the Recover option from the Relax-and-Recover boot menu and specify the name of the control plane node to recover.
  11. Wait until the recovery is finished. When the control plane node restoration process completes, the console displays the following message:

    Finished recovering your system
    Exiting rear recover
    Running exit tasks
  12. Enter the recovery shell as root.
  13. When the command line console is available, restore the config-drive partition of each control plane node:

    # once completed, restore the config-drive partition (which is ISO9660)
    RESCUE <control_plane_node>:~ # dd if=/mnt/local/mnt/config-drive of=<config_drive_partition>
  14. Power off each node:

    RESCUE <control_plane_node>:~ # poweroff
  15. Update the Controller VM resource and detach the CD-ROM. Make sure that the rootdisk disk has bootOrder: 1.
  16. Enable the director Operator by changing its replicas to 1:

    $ oc patch csv -n openstack <csv> --type json -p='[{"op": "replace", "path": "/spec/install/spec/deployments/0/spec/replicas", "value": 1}]'
  17. Verify that the osp-director-operator-controller-manager pod is started.
  18. Start each Controller VM:

    $ virtctl start controller-0
    $ virtctl start controller-1
    $ virtctl start controller-2
  19. Wait until the Controller VMs are running. SELinux is relabelled on first boot.
  20. Check the cluster status:

    $ pcs status

    If the Galera cluster does not restore as part of the restoration procedure, you must restore Galera manually. For more information, see Restoring the Galera cluster manually.
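
Steps 3 and 16 of this procedure differ only in the replica count, so the patch can be built by one helper. The following Bash sketch is an illustration, not a supported tool: the function names replicas_patch and set_operator_replicas are hypothetical, and the sketch passes the count as a JSON number on the assumption that the replicas field of the CSV is an integer.

```shell
#!/usr/bin/env bash

# JSON pointer to the replica count of the first deployment in the CSV,
# as used in steps 3 and 16.
CSV_REPLICAS_PATH="/spec/install/spec/deployments/0/spec/replicas"

# Print the JSON patch that sets the replica count (0 to disable, 1 to enable).
replicas_patch() {
  case "$1" in
    0|1) ;;
    *)
      echo "replica count must be 0 or 1, got: $1" >&2
      return 1
      ;;
  esac
  printf '[{"op": "replace", "path": "%s", "value": %s}]\n' "$CSV_REPLICAS_PATH" "$1"
}

# Apply the patch to a CSV in the openstack namespace.
# Hypothetical wrapper: arguments are the CSV name and the replica count.
set_operator_replicas() {
  local csv="$1" count="$2" patch
  patch="$(replicas_patch "$count")" || return 1
  oc patch csv -n openstack "$csv" --type json -p "$patch"
}
```

For example, `set_operator_replicas osp-director-operator.v1.3.1 0` disables the Operator before the restore, and the same call with 1 enables it again afterwards.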
