Chapter 3. Backing up control plane nodes that use composable roles

If you deployed the control plane with composable roles, also known as custom roles, configure the backup process to capture each type of node based on your composable role configuration. To back up the control plane nodes, you configure the backup node, install the Relax-and-Recover tool on the control plane nodes, and create the backup image.

You must back up the control plane nodes before performing updates or upgrades. You can use the backups to restore the control plane nodes to their previous state if an error occurs during an update or upgrade. You can also create backups as a part of your regular environment maintenance.

3.1. Supported backup formats and protocols

The undercloud and control plane backup and restore process uses the open-source tool Relax-and-Recover (ReaR) to create and restore bootable backup images. ReaR is written in Bash and supports multiple image formats and multiple transport protocols.

The following list shows the backup formats and protocols that Red Hat OpenStack Platform supports.

Bootable media formats
  • ISO
File transport protocols
  • SFTP
  • NFS
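
For reference, these formats and protocols map onto standard ReaR configuration variables. The following sketch shows how they might appear together in a ReaR configuration file; the server address, export path, and credentials are placeholders, not values that this chapter configures for you:

    # /etc/rear/local.conf (illustrative values only)
    OUTPUT=ISO                                      # bootable media format
    BACKUP=NETFS                                    # file-based backup method
    BACKUP_URL=nfs://<nfs_server_ip>/<export_path>  # NFS file transport
    # Alternatively, send the resulting image over SFTP:
    # OUTPUT_URL=sftp://<user>:<password>@<backup_node_ip>/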

3.2. Installing and configuring an NFS server on the backup node

You can install and configure a new NFS server to store the backup file. To install and configure an NFS server on the backup node, you create an inventory file and an Ansible playbook, copy an SSH key to the backup node, and run the ansible-playbook command to configure the NFS server.

Important
  • If you previously installed and configured an NFS or SFTP server, you do not need to complete this procedure. You enter the server information when you set up ReaR on the node that you want to back up.
  • By default, the Relax and Recover (ReaR) configuration assumes that the IP address of the NFS server is 192.168.24.1. If your NFS server has a different IP address, add the tripleo_backup_and_restore_server parameter to the ReaR setup command.

Procedure

  1. On the undercloud node, source the undercloud credentials:

    [stack@undercloud ~]$ source stackrc
    (undercloud) [stack@undercloud ~]$
  2. On the undercloud node, create an inventory file for the backup node. Replace <backup_node>, <ip_address>, and <user> with the values that apply to your environment:

    (undercloud) [stack@undercloud ~]$ cat <<'EOF' > ~/nfs-inventory.yaml
    [BackupNode]
    <backup_node> ansible_host=<ip_address> ansible_user=<user>
    EOF
  3. On the undercloud node, create the following Ansible playbook and replace <backup_node> with the host name of the backup node:

    (undercloud) [stack@undercloud ~]$ cat <<'EOF' > ~/bar_nfs_setup.yaml
    # Playbook
    # Substitute <backup_node> with the host name of your backup node.
    - become: true
      hosts: <backup_node>
      name: Setup NFS server for ReaR
      roles:
      - role: backup-and-restore
    EOF
  4. Copy the public SSH key from the undercloud node to the backup node.

    (undercloud) [stack@undercloud ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub <backup_node>

    Replace <backup_node> with the host name or IP address of the backup node.

  5. On the undercloud node, enter the following ansible-playbook command to configure the backup node:

    (undercloud) [stack@undercloud ~]$ ansible-playbook \
        -v -i ~/nfs-inventory.yaml \
        --extra-vars="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
        --become \
        --become-user root \
        --tags bar_setup_nfs_server \
        ~/bar_nfs_setup.yaml
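
    Before you continue, you can confirm that the backup node exports an NFS share. The following check is a sketch that assumes the default export directory, /ctl_plane_backups, that the backup-and-restore role configures; if you customized the export path, look for that path instead:

    (undercloud) [stack@undercloud ~]$ showmount -e <backup_node_ip>

    Replace <backup_node_ip> with the IP address of the backup node. The export list should include the backup directory.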

3.3. Installing ReaR on the control plane nodes

Before you create a backup of the control plane nodes, install and configure Relax and Recover (ReaR) on each of the control plane nodes.

Important

Due to a known issue, the ReaR backup of overcloud nodes continues even if a Controller node is down. Ensure that all your Controller nodes are running before you run the ReaR backup. A fix is planned for a later Red Hat OpenStack Platform (RHOSP) release. For more information, see BZ#2077335 - Back up of the overcloud ctlplane keeps going even if one controller is unreachable.

Prerequisites

  • You have installed and configured an NFS or SFTP server on the backup node. For more information, see Section 3.2, "Installing and configuring an NFS server on the backup node".

Procedure

  1. On the undercloud node, create the following Ansible playbook:

    (undercloud) [stack@undercloud ~]$ cat <<'EOF' > ~/bar_rear_setup-controller.yaml
    # Playbook
    # Install and configure ReaR on the control plane nodes
    - become: true
      hosts: Controller
      name: Install ReaR
      roles:
      - role: backup-and-restore
    EOF
    Note

    If you deployed the control plane nodes with composable roles, replace the host type Controller with the types of nodes in your control plane. For example, if you deployed the database, messaging, and networking on separate nodes, enter ControllerOpenstack,Database,Messaging,Networker.

  2. Choose one of the following options:

    1. If you use NFS and the IP address of the NFS server is the default value 192.168.24.1, on the undercloud node, enter the following Ansible command to install ReaR on the control plane nodes:

      (undercloud) [stack@undercloud ~]$ ansible-playbook \
          -v -i ~/tripleo-inventory.yaml \
          --extra-vars="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
          --become \
          --become-user root \
          --tags bar_setup_rear \
          ~/bar_rear_setup-controller.yaml
    2. If you use NFS and the IP address of the NFS server is not the default value 192.168.24.1, enter the following Ansible command to install ReaR on the control plane nodes:

      (undercloud) [stack@undercloud ~]$ ansible-playbook \
          -v -i ~/tripleo-inventory.yaml \
          --extra-vars="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
          --become \
          --become-user root \
          -e tripleo_backup_and_restore_server=<nfs_ip> \
          --tags bar_setup_rear \
          ~/bar_rear_setup-controller.yaml

      Replace <nfs_ip> with the IP address of your NFS server.

    3. If you use SFTP, enter the following Ansible command to install ReaR on the control plane nodes:

      (undercloud) [stack@undercloud ~]$ ansible-playbook \
          -v -i ~/tripleo-inventory.yaml \
          --extra-vars="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
          --become \
          --become-user root \
          -e tripleo_backup_and_restore_output_url=sftp://<user>:<password>@<backup_node_ip>/ \
          -e tripleo_backup_and_restore_backup_url=iso:///backup/ \
          --tags bar_setup_rear \
          ~/bar_rear_setup-controller.yaml
  3. If your system uses the UEFI boot loader, perform the following steps on the control plane nodes:

    1. Install the following tools:

      $ sudo dnf install dosfstools efibootmgr
    2. Enable UEFI backup in the ReaR configuration file located in /etc/rear/local.conf by replacing the USING_UEFI_BOOTLOADER parameter value 0 with the value 1.
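
      For example, one minimal way to make this change non-interactively, assuming the parameter is already present in the file with the value 0:

      $ sudo sed -i 's/^USING_UEFI_BOOTLOADER=0/USING_UEFI_BOOTLOADER=1/' /etc/rear/local.conf
      $ sudo grep USING_UEFI_BOOTLOADER /etc/rear/local.conf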

3.4. Configuring Open vSwitch (OVS) interfaces for backup

If you use an Open vSwitch (OVS) bridge in your environment, you must manually configure the OVS interfaces before you create a backup of the undercloud or control plane nodes. The restoration process uses this information to restore the network interfaces.

Procedure

  • In the /etc/rear/local.conf file, add the NETWORKING_PREPARATION_COMMANDS parameter in the following format:

    NETWORKING_PREPARATION_COMMANDS=('<command_1>' '<command_2>' ...)

    Replace <command_1> and <command_2> with commands that configure the network interface names or IP addresses. For example, you can add the ip link add br-ctlplane type bridge command to configure the control plane bridge name, or add the ip link set eth0 up command to bring up the eth0 interface. You can add more commands to the parameter based on your network configuration, as shown in the following example.
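
    A complete entry might look like the following sketch. The bridge name matches the example above, but the interface name and the IP address and prefix are illustrative placeholders; substitute the values from your own network configuration:

    NETWORKING_PREPARATION_COMMANDS=('ip link add br-ctlplane type bridge' 'ip link set eth0 up' 'ip addr add <ctlplane_ip>/24 dev br-ctlplane')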

3.5. Creating a backup of control plane nodes that use composable roles

To create a backup of control plane nodes that use composable roles, use the backup-and-restore Ansible role. You can then use the backup to restore the control plane nodes to their previous state in case the nodes become corrupted or inaccessible. The backup of the control plane nodes includes the backup of the database that runs on the control plane nodes.

Prerequisites

  • You have installed ReaR on the control plane nodes. For more information, see Section 3.3, "Installing ReaR on the control plane nodes".
  • You have installed and configured an NFS or SFTP server on the backup node. For more information, see Section 3.2, "Installing and configuring an NFS server on the backup node".

Procedure

  1. On each Controller node, back up the config-drive partition of each node:

    [heat-admin@controller-x ~]$ mkdir /mnt/config-drive
    [heat-admin@controller-x ~]$ dd if=<config_drive_partition> of=/mnt/config-drive
    Note

    You only need to perform this step on the Controller nodes.

  2. On the undercloud node, create the following Ansible playbook:

    (undercloud) [stack@undercloud ~]$ cat <<'EOF' > ~/bar_rear_create_restore_images-controller.yaml
    # Playbook
    # Using ReaR on the Control Plane - Composable Roles
    
    - become: true
      hosts: ControllerOpenstack,Database,Messaging,Networker
      name: Stop service management
      tasks:
        - include_role:
            name: backup-and-restore
            tasks_from: ../backup/tasks/service_manager_pause
          when:
            - tripleo_backup_and_restore_service_manager
    
    - become: true
      hosts: Database
      name: Database Backup
      tasks:
        - include_role:
            name: backup-and-restore
            tasks_from: ../backup/tasks/db_backup
    
    - become: true
      hosts: pacemaker
      name: Backup pacemaker configuration
      tasks:
        - include_role:
            name: backup-and-restore
            tasks_from: pacemaker_backup
    
    - become: true
      hosts: ControllerOpenstack,Database,Messaging,Networker
      name: Create recovery images with ReaR
      tasks:
        - include_role:
            name: backup-and-restore
            tasks_from: ../backup/tasks/main
    
    - become: true
      hosts: pacemaker
      name: Enable pacemaker
      tasks:
        - name: Enable pacemaker
          command: pcs cluster start --all
          when: enabled_galera
          run_once: true
          tags:
            - bar_create_recover_image
    
    - become: true
      hosts: Database
      name: Restart galera
      tasks:
        - name: unPause database container
          command: "{{ tripleo_container_cli }} unpause {{ tripleo_backup_and_restore_mysql_container }}"
          when:
            - tripleo_container_cli is defined
            - not enabled_galera
            - tripleo_backup_and_restore_mysql_container is defined
          tags:
            - bar_create_recover_image
    
    - become: true
      hosts: ControllerOpenstack,Database,Messaging,Networker
      name: Unpause everything
      tasks:
        - name: Gather Container Service Name
          shell: |
            set -o pipefail
            /usr/bin/{{ tripleo_container_cli }} ps -a --filter='status=paused' --format '{{ '{{' }}.Names {{ '}}' }}'
          register: container_services
          changed_when: container_services.stdout is defined
          tags:
            - bar_create_recover_image
    
        - name: Unpause containers for database backup.
          command: "{{ tripleo_container_cli }} unpause {{ item }}"
          with_items: "{{ container_services.stdout_lines }}"
          when: tripleo_container_cli is defined
          tags:
            - bar_create_recover_image
  3. On the undercloud node, enter the following ansible-playbook command to create a backup of the control plane nodes:

    Important

    Do not operate the stack while the backup is in progress. Stopping the pacemaker cluster and the containers temporarily interrupts control plane services to Compute nodes, and also disrupts network connectivity, Ceph, and the NFS or SFTP data plane service. You cannot create instances, migrate instances, authenticate requests, or monitor the health of the cluster until the pacemaker cluster and the containers return to service after the final step of this procedure.

    (undercloud) [stack@undercloud ~]$ ansible-playbook \
        -v -i ~/tripleo-inventory.yaml \
        --extra-vars="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
        --become \
        --become-user root \
        --tags bar_create_recover_image \
        ~/bar_rear_create_restore_images-controller.yaml
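
    When the playbook completes, you can verify that the recovery images reached the backup node. The following check is a sketch that assumes the default NFS export directory, /ctl_plane_backups; adjust the path if you customized it:

    [<user>@<backup_node> ~]$ sudo ls -lR /ctl_plane_backups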

3.6. Scheduling control plane node backups with cron

Important

This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.

You can use the Ansible backup-and-restore role to configure a cron job that creates backups of the control plane nodes with ReaR. You can view the logs in the /var/log/rear-cron.<date> files that the backup script creates.

Prerequisites

  • You have installed and configured ReaR on the control plane nodes. For more information, see Section 3.3, "Installing ReaR on the control plane nodes".

Procedure

  1. On the undercloud node, enter the following command to create the backup script:

    [stack@undercloud ~]$ cat <<'EOF' > /home/stack/execute-rear-cron.sh
    #!/bin/bash
    
    OWNER="stack"
    TODAY=`date +%Y%m%d`
    FILE="/var/log/rear-cron.${TODAY}"
    sudo touch ${FILE}
    sudo chown ${OWNER}:${OWNER} ${FILE}
    
    CURRENTTIME=`date`
    echo "[$CURRENTTIME] rear start" >> ${FILE}
    source /home/stack/stackrc && /usr/bin/openstack overcloud backup >> ${FILE} 2>&1
    CURRENTTIME=`date`
    echo "[$CURRENTTIME] rear end" >> ${FILE}
    EOF
  2. Set executable privileges for the /home/stack/execute-rear-cron.sh script:

    [stack@undercloud ~]$ chmod 755 /home/stack/execute-rear-cron.sh
  3. Edit the crontab file with the crontab -e command and use an editor of your choice to add the following cron job. Ensure you save the changes to the file:

    [stack@undercloud ~]$ crontab -e
    # Add the following line:
    0 0 * * * /home/stack/execute-rear-cron.sh

    The /home/stack/execute-rear-cron.sh script is scheduled to be executed by the stack user at midnight.

  4. To verify that the cron job is scheduled, enter the following command:

    [stack@undercloud ~]$ crontab -l

    The command output displays the scheduled cron jobs:

    0 0 * * * /home/stack/execute-rear-cron.sh
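
    After the first scheduled run, you can also check the log for that date to confirm that the backup completed. The log file name follows the date stamp that the backup script computes, for example:

    [stack@undercloud ~]$ tail /var/log/rear-cron.$(date +%Y%m%d)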
