Chapter 15. Backing up and restoring a director Operator deployed overcloud


To back up a Red Hat OpenStack Platform (RHOSP) overcloud that was deployed with director Operator (OSPdO), you must back up the Red Hat OpenShift Container Platform (RHOCP) resources that OSPdO uses, and then use the Relax-and-Recover (ReaR) tool to back up the control plane and the overcloud.

15.1. Backing up and restoring director Operator resources

Red Hat OpenStack Platform (RHOSP) director Operator (OSPdO) provides custom resource definitions (CRDs) for backing up and restoring a deployment, so you do not have to manually export and import multiple configurations. Because OSPdO is aware of the state of all of its resources, it knows which custom resources (CRs), including the ConfigMap and Secret CRs, it needs to create a complete backup. Therefore, OSPdO does not back up any configuration that is in an incomplete or error state.

To back up and restore an OSPdO deployment, you create an OpenStackBackupRequest CR to initiate the creation or restoration of a backup. The OpenStackBackupRequest CR creates an OpenStackBackup CR that stores the backup of the custom resources (CRs), the ConfigMap configurations, and the Secret configurations for the specified namespace.

15.1.1. Backing up director Operator resources

To create a backup, you must create an OpenStackBackupRequest custom resource (CR) for the namespace. The OpenStackBackup CR is created when the OpenStackBackupRequest object is created in save mode.

Procedure

  1. Create a file named openstack_backup.yaml on your workstation.
  2. Add the following configuration to your openstack_backup.yaml file to create the OpenStackBackupRequest custom resource (CR):

    apiVersion: osp-director.openstack.org/v1beta1
    kind: OpenStackBackupRequest
    metadata:
      name: openstackbackupsave
      namespace: openstack
    spec:
      mode: save 1
      additionalConfigMaps: [] 2
      additionalSecrets: [] 3
    1 Set the mode to save to request creation of an OpenStackBackup CR.
    2 Optional: Include any ConfigMap resources that you created manually.
    3 Optional: Include any Secret resources that you created manually.
    Note

    OSPdO attempts to include all ConfigMap and Secret objects associated with the OSPdO CRs in the namespace, such as OpenStackControlPlane and OpenStackBaremetalSet. You do not need to include those in the additional lists.

  3. Save the openstack_backup.yaml file.
  4. Create the OpenStackBackupRequest CR:

    $ oc create -f openstack_backup.yaml -n openstack
  5. Monitor the creation status of the OpenStackBackupRequest CR:

    $ oc get openstackbackuprequest openstackbackupsave -n openstack
    • The Quiescing state indicates that OSPdO is waiting for the CRs to reach their finished state. The number of CRs can affect how long it takes to finish creating the backup.

      NAME                     OPERATION   SOURCE   STATUS      COMPLETION TIMESTAMP
      openstackbackupsave      save                 Quiescing

      If the status remains in the Quiescing state for longer than expected, you can investigate the OSPdO logs to check progress:

      $ oc logs <operator_pod> -c manager -f
      2022-01-11T18:26:15.180Z        INFO    controllers.OpenStackBackupRequest      Quiesce for save for OpenStackBackupRequest openstackbackupsave is waiting for: [OpenStackBaremetalSet: compute, OpenStackControlPlane: overcloud, OpenStackVMSet: controller]
      • Replace <operator_pod> with the name of the Operator pod.
    • The Saved state indicates that the OpenStackBackup CR is created.

      NAME                     OPERATION   SOURCE   STATUS   COMPLETION TIMESTAMP
      openstackbackupsave      save                 Saved    2022-01-11T19:12:58Z
    • The Error state indicates that creation of the backup failed. Review the request contents to find the error:

      $ oc get openstackbackuprequest openstackbackupsave -o yaml -n openstack
  6. View the OpenStackBackup resource to confirm it exists:

    $ oc get openstackbackup -n openstack
    NAME                                AGE
    openstackbackupsave-1641928378      6m7s
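If you script this status check, the STATUS column can be extracted from the table output shown above. The following is a minimal sketch under the assumption that the table layout matches the examples in this procedure; the live oc invocation is left as a comment so the parsing can be verified offline:

```shell
# Hypothetical helper: extract the STATUS column for a save request from
# `oc get openstackbackuprequest <name> -n openstack` table output.
# For save requests the SOURCE column is empty, so STATUS is field 3 of the
# data row (row 2).
status_of() {
  awk 'NR==2 {print $3}' <<<"$1"
}

# In a live cluster you would feed it real output, for example:
#   status_of "$(oc get openstackbackuprequest openstackbackupsave -n openstack)"
sample='NAME                 OPERATION   SOURCE   STATUS   COMPLETION TIMESTAMP
openstackbackupsave   save                 Saved    2022-01-11T19:12:58Z'
status_of "$sample"   # prints: Saved
```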

15.1.2. Restoring director Operator resources

When you request to restore a backup, Red Hat OpenStack Platform (RHOSP) director Operator (OSPdO) takes the contents of the specified OpenStackBackup resource and attempts to apply them to all existing custom resources (CRs), ConfigMap and Secret resources present within the namespace. OSPdO overwrites any existing resources in the namespace, and creates new resources for those not found within the namespace.

Procedure

  1. List the available backups:

    $ oc get osbackup
  2. Inspect the details of a specific backup:

    $ oc get osbackup <name> -o yaml
    • Replace <name> with the name of the backup you want to inspect.
  3. Create a file named openstack_restore.yaml on your workstation.
  4. Add the following configuration to your openstack_restore.yaml file to create the OpenStackBackupRequest custom resource (CR):

    apiVersion: osp-director.openstack.org/v1beta1
    kind: OpenStackBackupRequest
    metadata:
      name: openstackbackuprestore
      namespace: openstack
    spec:
      mode: <mode>
      restoreSource: <restore_source>
    • Replace <mode> with one of the following options:

      • restore: Requests a restore from an existing OpenStackBackup.
      • cleanRestore: Completely wipes the existing OSPdO resources within the namespace before restoring and creating new resources from the existing OpenStackBackup.
    • Replace <restore_source> with the ID of the OpenStackBackup to restore, for example, openstackbackupsave-1641928378.
  5. Save the openstack_restore.yaml file.
  6. Create the OpenStackBackupRequest CR:

    $ oc create -f openstack_restore.yaml -n openstack
  7. Monitor the creation status of the OpenStackBackupRequest CR:

    $ oc get openstackbackuprequest openstackbackuprestore -n openstack
    • The Loading state indicates that all resources from the OpenStackBackup are being applied against the cluster.

      NAME                     OPERATION  SOURCE                           STATUS     COMPLETION TIMESTAMP
      openstackbackuprestore   restore    openstackbackupsave-1641928378   Loading
    • The Reconciling state indicates that all resources are loaded and OSPdO has begun reconciling to attempt to provision all resources.

      NAME                     OPERATION  SOURCE                           STATUS       COMPLETION TIMESTAMP
      openstackbackuprestore   restore    openstackbackupsave-1641928378   Reconciling
    • The Restored state indicates that the OpenStackBackup CR has been restored.

      NAME                     OPERATION  SOURCE                           STATUS     COMPLETION TIMESTAMP
      openstackbackuprestore   restore    openstackbackupsave-1641928378   Restored   2022-01-12T13:48:57Z
    • The Error state indicates that the restoration failed. Review the request contents to find the error:

      $ oc get openstackbackuprequest openstackbackuprestore -o yaml -n openstack

15.2. Backing up and restoring a director Operator deployed overcloud with the Relax-and-Recover (ReaR) tool

To back up a director Operator deployed overcloud with the Relax-and-Recover (ReaR) tool, you configure the backup node, install the ReaR tool on the control plane, and create the backup image. You can create backups as part of your regular environment maintenance.

In addition, you must back up the control plane before performing updates or upgrades. You can use the backups to restore the control plane to its previous state if an error occurs during an update or upgrade.

15.2.1. Supported backup formats and protocols

The backup and restore process uses the open-source tool Relax-and-Recover (ReaR) to create and restore bootable backup images. ReaR is written in Bash and supports multiple image formats and multiple transport protocols.

The following list shows the backup formats and protocols that Red Hat OpenStack Platform supports when you use ReaR to back up and restore a director Operator deployed control plane.

Bootable media formats
  • ISO
File transport protocols
  • SFTP
  • NFS

15.2.2. Configuring the backup storage location

You can install and configure an NFS server to store the backup file. Before you create a backup of the control plane, configure the backup storage location in the bar-vars.yaml environment file. This file stores the key-value parameters that you want to pass to the backup execution.

Important
  • If you previously installed and configured an NFS or SFTP server, you do not need to complete this procedure. You enter the server information when you set up ReaR on the node that you want to back up.
  • By default, the Relax-and-Recover (ReaR) IP address parameter for the NFS server is 192.168.24.1. You must add the parameter tripleo_backup_and_restore_server to set the IP address value that matches your environment.

Procedure

  1. Create an NFS backup directory on your workstation:

    $ mkdir -p /home/nfs/backup
    $ chmod 777 /home/nfs/backup
    $ cat >/etc/exports.d/backup.exports<<EOF
    /home/nfs/backup *(rw,sync,no_root_squash)
    EOF
    $ exportfs -av
  2. Create the bar-vars.yaml file on your workstation:

    $ touch /home/stack/bar-vars.yaml
  3. In the bar-vars.yaml file, configure the backup storage location:

    tripleo_backup_and_restore_server: <ip_address>
    tripleo_backup_and_restore_shared_storage_folder: <backup_dir>
    • Replace <ip_address> with the IP address of your NFS server, for example, 172.22.0.1. The default IP address is 192.168.24.1.
    • Replace <backup_dir> with the location of the backup storage folder, for example, /home/nfs/backup.
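For the NFS directory exported in step 1 and a hypothetical server address of 172.22.0.1, the finished bar-vars.yaml file would look like this:

```yaml
tripleo_backup_and_restore_server: 172.22.0.1
tripleo_backup_and_restore_shared_storage_folder: /home/nfs/backup
```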

15.2.3. Performing a backup of the control plane

To create a backup of the control plane, you must install and configure Relax-and-Recover (ReaR) on each of the Controller virtual machines (VMs).

Important

Due to a known issue, the ReaR backup of overcloud nodes continues even if a Controller node is down. Ensure that all your Controller nodes are running before you run the ReaR backup. A fix is planned for a later Red Hat OpenStack Platform (RHOSP) release. For more information, see BZ#2077335 - Back up of the overcloud ctlplane keeps going even if one controller is unreachable.

Procedure

  1. Extract the static Ansible inventory file from the location in which it was saved during installation:

    $ oc rsh openstackclient
    $ cd
    $ find . -name tripleo-ansible-inventory.yaml
    $ cp ~/overcloud-deploy/<stack>/tripleo-ansible-inventory.yaml .
    • Replace <stack> with the name of your stack. By default, the name of the stack is overcloud.
  2. Install ReaR on each Controller virtual machine (VM):

    $ openstack overcloud backup --setup-rear --extra-vars /home/cloud-admin/bar-vars.yaml --inventory /home/cloud-admin/tripleo-ansible-inventory.yaml
  3. Connect to each Controller VM and switch to the root user:

    $ ssh controller-0
    [cloud-admin@controller-0 ~]$ sudo -i
  4. Append the NETWORKING_PREPARATION_COMMANDS parameter to the /etc/rear/local.conf file to configure the Controller VM networks in the following format:

    [root@controller-0 ~]# cat >>/etc/rear/local.conf<<EOF
    NETWORKING_PREPARATION_COMMANDS=('<command_1>' '<command_2>' ... '<command_n>')
    EOF
    • Replace <command_1>, <command_2>, and all commands up to <command_n>, with commands that configure the network interface names or IP addresses. For example, you can add the ip link add br-ctlplane type bridge command to configure the control plane bridge name, or the ip link set eth0 up command to bring up an interface. You can add more commands to the parameter based on your network configuration.
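As an illustration only, an entry for a Controller VM on the control plane network might look like the following. The bridge name matches the example above; the IP address is hypothetical and must be replaced with the node's own ctlplane address:

```shell
# Hypothetical example for /etc/rear/local.conf: recreate the control plane
# bridge and give it this node's ctlplane address before recovery starts.
# Each array element runs as one command in the ReaR rescue environment.
NETWORKING_PREPARATION_COMMANDS=('ip link add br-ctlplane type bridge'
                                 'ip link set br-ctlplane up'
                                 'ip addr add 192.168.25.44/24 dev br-ctlplane')

echo "${#NETWORKING_PREPARATION_COMMANDS[@]} commands configured"
```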
  5. Repeat the following command on each Controller VM to back up its config-drive partition:

    [root@controller-0 ~]# dd if=/dev/vda1 of=/mnt/config-drive
  6. Create a backup of the Controller VMs:

    $ oc rsh openstackclient
    $ openstack overcloud backup --inventory /home/cloud-admin/tripleo-ansible-inventory.yaml

    The backup process runs sequentially on each Controller VM without disrupting service to your environment.

    Note

    You cannot use cron to schedule backups because cron cannot be used on the openstackclient pod.
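If you want periodic backups anyway, one workaround is to schedule them from a host that has oc access to the cluster, for example in that host's crontab. The schedule and paths below are examples, not requirements:

```shell
# Example crontab entry on the workstation: run the ReaR backup every Sunday
# at 02:00 through the openstackclient pod.
0 2 * * 0 oc rsh -n openstack openstackclient openstack overcloud backup --inventory /home/cloud-admin/tripleo-ansible-inventory.yaml
```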

15.2.4. Restoring the control plane

If an error occurs during an update or upgrade, you can restore the control plane to its previous state by using the backup ISO image that you created with the Relax-and-Recover (ReaR) tool.

To restore the control plane, you must restore all Controller virtual machines (VMs) to ensure state consistency.

You can find the backup ISO images on the backup node.

Note

Red Hat supports backups of Red Hat OpenStack Platform with native SDNs, such as Open vSwitch (OVS) and the default Open Virtual Network (OVN). For information about third-party SDNs, refer to the third-party SDN documentation.

Prerequisites

  • You have created a backup of the control plane nodes.
  • You have access to the backup node.
  • The vncviewer package is installed on your workstation.

Procedure

  1. Power off each Controller VM. Ensure that all the Controller VMs are powered off completely:

    $ oc get vm
  2. Upload the backup ISO images for each Controller VM into a cluster PVC:

    $ virtctl image-upload pvc <backup_image> \
       --pvc-size=<pvc_size> \
       --image-path=<image_path> \
       --insecure
    • Replace <backup_image> with the name of the PVC backup image for the Controller VM, for example, backup-controller-0-202310231141.
    • Replace <pvc_size> with the size of the PVC required for the image specified with the --image-path option, for example, 4G.
    • Replace <image_path> with the path to the backup ISO image for the Controller VM, for example, /home/nfs/backup/controller-0/controller-0-202310231141.iso.
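With three controllers this step is repetitive, so a small loop can generate the uploads. The PVC names, size, timestamp, and ISO paths below follow the examples above and are assumptions about your backup layout; the commands are printed first so the loop can be checked before anything is executed:

```shell
# Sketch: generate the virtctl upload command for each Controller VM.
# The PVC names, size, and ISO paths are examples from this procedure.
upload_cmd() {
  local i=$1 ts=202310231141
  printf 'virtctl image-upload pvc backup-controller-%s-%s --pvc-size=4G --image-path=/home/nfs/backup/controller-%s/controller-%s-%s.iso --insecure\n' \
    "$i" "$ts" "$i" "$i" "$ts"
}

# Print one upload command per controller for review.
for i in 0 1 2; do
  upload_cmd "$i"
done
# Once the output looks right, execute them, for example:
#   for i in 0 1 2; do upload_cmd "$i" | sh; done
```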
  3. Disable the director Operator by changing its replicas to 0:

    $ oc patch csv -n openstack <csv> --type json -p='[{"op": "replace", "path": "/spec/install/spec/deployments/0/spec/replicas", "value": 0}]'
    • Replace <csv> with the CSV from the environment, for example, osp-director-operator.v1.3.1.
  4. Verify that the osp-director-operator-controller-manager pod is stopped:

    $ oc get pods -n openstack | grep osp-director-operator-controller-manager
  5. Create a backup of each Controller VM resource:

    $ oc get vm controller-0 -o yaml > controller-0-bk.yaml
  6. Update the Controller VM resource with bootOrder set to 1 and attach the uploaded PVC as a CD-ROM:

    $ oc edit vm controller-0
    @@ -96,7 +96,10 @@
             devices:
               disks:
               - bootOrder: 1
    -            dedicatedIOThread: false
    +            cdrom:
    +              bus: sata
    +            name: cdromiso
    +          - dedicatedIOThread: false
                 disk:
                   bus: virtio
                 name: rootdisk
    @@ -177,6 +180,9 @@
             name: tenant
           terminationGracePeriodSeconds: 0
           volumes:
    +      - name: cdromiso
    +        persistentVolumeClaim:
    +          claimName: <backup_image>
           - dataVolume:
               name: controller-0-36a1
             name: rootdisk
    • Replace <backup_image> with the name of the PVC backup image that you uploaded for the Controller VM in step 2. For example, backup-controller-0-202310231141.
  7. Start each Controller VM:

    $ virtctl start controller-0
  8. Wait until the status of each Controller VM is RUNNING.
  9. Connect to each Controller VM by using VNC:

    $ virtctl vnc controller-0
    Note

    If you are using SSH to access the Red Hat OpenShift Container Platform (RHOCP) CLI on a remote system, ensure the SSH X11 forwarding is correctly configured. For more information, see the Red Hat Knowledgebase solution How do I configure X11 forwarding over SSH in Red Hat Enterprise Linux?.

  10. ReaR starts automatic recovery after a timeout by default. If the recovery does not start automatically, you can manually select the Recover option from the Relax-and-Recover boot menu and specify the name of the control plane node to recover.
  11. Wait until the recovery is finished. When the control plane node restoration process completes, the console displays the following message:

    Finished recovering your system
    Exiting rear recover
    Running exit tasks
  12. Enter the recovery shell as root.
  13. When the command line console is available, restore the config-drive partition of each control plane node:

    # once completed, restore the config-drive partition (which is ISO9660)
    RESCUE <control_plane_node>:~ $ dd if=/mnt/local/mnt/config-drive of=<config_drive_partition>
  14. Power off each node:

    RESCUE <control_plane_node>:~ # poweroff
  15. Update the Controller VM resource and detach the CD-ROM. Ensure that the rootdisk has bootOrder: 1.
  16. Enable the director Operator by changing its replicas to 1:

    $ oc patch csv -n openstack <csv> --type json -p='[{"op": "replace", "path": "/spec/install/spec/deployments/0/spec/replicas", "value": 1}]'
  17. Verify that the osp-director-operator-controller-manager pod is started.
  18. Start each Controller VM:

    $ virtctl start controller-0
    $ virtctl start controller-1
    $ virtctl start controller-2
  19. Wait until the Controller VMs are running. SELinux is relabeled on the first boot.
  20. Check the cluster status:

    $ pcs status

    If the Galera cluster does not restore as part of the restoration procedure, you must restore Galera manually. For more information, see Restoring the Galera cluster manually.
