Chapter 5. Executing the restore procedure
If an error occurs during an update or upgrade, you can restore the undercloud node, the overcloud control plane nodes, or both to their previous state. If the Galera cluster does not restore automatically as part of the restoration procedure, you must restore the cluster manually.
You can also restore the undercloud or overcloud control plane nodes with colocated Ceph monitors.
When you boot from an ISO file, ensure that the NFS server is reachable by the undercloud and overcloud.
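For example, you can confirm reachability of the backup node's NFS exports before you reboot any node into the recovery image. This is a minimal sketch only; <BACKUP_NODE_IP_ADDRESS> is a placeholder for your backup node, and it assumes that the NFS client tools (nfs-utils) are installed on the node where you run it:

  $ ping -c 3 <BACKUP_NODE_IP_ADDRESS>          # basic network reachability
  $ showmount -e <BACKUP_NODE_IP_ADDRESS>       # lists the NFS exports, for example /ctl_plane_backups

The export list should include the /ctl_plane_backups directory, and /ceph_backups if you use colocated Ceph monitors. If these commands fail, fix the network or NFS configuration before you boot the recovery ISO.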
Use the following general steps:
- Burn the bootable ISO image to a DVD or load it through ILO remote access (see the example after this list).
- Boot the node that requires restoration from the recovery medium.
- Select Recover <HOSTNAME>. Replace <HOSTNAME> with the name of the node to restore.
- Log in as the root user.
- Recover the backup.
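As an illustration of the first step, one common way to burn the ISO image to a DVD on a RHEL-based system is with growisofs from the dvd+rw-tools package. This is a sketch only; <BACKUP_ISO_FILE> and the /dev/sr0 device path are placeholders for your environment:

  $ sudo growisofs -dvd-compat -Z /dev/sr0=<BACKUP_ISO_FILE>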
5.1. Restoring the undercloud
If an error occurs during a fast-forward upgrade, you can restore the undercloud node to its previously saved state by using the ISO image that you created in Section 4.2, “Backing up the undercloud”. The backup procedure stores the ISO images on the backup node in the folders that you created in Section 2.2, “Creating and exporting the backup directory”.
Procedure
- Shut down the undercloud node. Ensure that the undercloud node is shut down completely before you proceed.
- Restore the undercloud node by booting it with the ISO image that you created during the backup process. The ISO image is located in the /ctl_plane_backups directory of the backup node.
- When the Relax-and-Recover boot menu appears, select Recover <UNDERCLOUD_NODE>, where <UNDERCLOUD_NODE> is the name of the undercloud node.
- Log in as the root user. The following message displays:

  Welcome to Relax-and-Recover. Run "rear recover" to restore your system!
  RESCUE <UNDERCLOUD_NODE>:~ # rear recover

  The image restore progresses quickly. When it is complete, the console displays the following message:

  Finished recovering your system
  Exiting rear recover
  Running exit tasks

- When the command line interface is available, the image is restored. Switch the node off:

  RESCUE <UNDERCLOUD_NODE>:~ # poweroff

  On boot up, the node resumes its previous state. A basic post-restore check is sketched after this procedure.
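After the undercloud boots, you can run a basic check to confirm that it is functional. The following is a minimal sketch that assumes the stack user and the stackrc credentials file from your original deployment were captured in the backup:

  [stack@undercloud ~]$ source ~/stackrc
  [stack@undercloud ~]$ openstack server list

If the command returns the overcloud nodes that existed before the failure, the undercloud services and database were restored correctly.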
5.2. Restoring the control plane
If an error occurs during a fast-forward upgrade, you can use the ISO images that you created in Section 4.3, “Backing up the control plane” to restore the control plane nodes to their previously saved state. To ensure state consistency, you must restore all control plane nodes to the previous state.
Red Hat supports backups of Red Hat OpenStack Platform with native SDNs, such as Open vSwitch (OVS) and the default Open Virtual Network (OVN). For information about third-party SDNs, refer to the third-party SDN documentation.
Procedure
- Shut down each control plane node. Ensure that the control plane nodes are shut down completely before you proceed.
- Restore the control plane nodes by booting them with the ISO image that you created during the backup process. The ISO images are located in the /ctl_plane_backups directory of the backup node.
- When the Relax-and-Recover boot menu appears, select Recover <CONTROL_PLANE_NODE>. Replace <CONTROL_PLANE_NODE> with the name of the control plane node.

  The following message displays:

  Welcome to Relax-and-Recover. Run "rear recover" to restore your system!
  RESCUE <CONTROL_PLANE_NODE>:~ # rear recover

  The image restore progresses quickly. When the restore completes, the console displays the following message:

  Finished recovering your system
  Exiting rear recover
  Running exit tasks

- When the command line interface is available, the image is restored. Switch the node off:

  RESCUE <CONTROL_PLANE_NODE>:~ # poweroff

  Set the boot sequence to the normal boot device. On boot up, the node resumes its previous state.
- To ensure that the services are running correctly, check the status of Pacemaker. Log in to a Controller node as the root user and run the following command:

  # pcs status

  A Galera-specific check is sketched after this procedure.
- To view the status of the overcloud, use Tempest. For more information about Tempest, see Chapter 4 of the OpenStack Integration Test Suite Guide.
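In addition to pcs status, you can check the Galera cluster directly by reusing the clustercheck command from Section 5.3, “Troubleshooting the Galera cluster”. This sketch assumes the containerized galera-bundle naming that the rest of this guide uses:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "clustercheck"

If the output does not report that the Galera cluster node is synced, continue with Section 5.3 to restore the cluster manually.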
5.3. Troubleshooting the Galera cluster
If the Galera cluster does not restore as part of the restoration procedure, you must restore Galera manually.
In this procedure, you must perform some steps on a single Controller node, referred to as Controller-0. Ensure that you perform those steps on the same Controller node throughout the procedure.
Procedure
- On Controller-0, retrieve the Galera cluster virtual IP address:

  $ sudo hiera -c /etc/puppet/hiera.yaml mysql_vip

- Disable the database connections through the virtual IP address on all Controller nodes. The $MYSQL_VIP variable holds the address that you retrieved in the previous step; see the sketch after this procedure for one way to set it and the other shell variables that this procedure uses:

  $ sudo iptables -I INPUT -p tcp --destination-port 3306 -d $MYSQL_VIP -j DROP

- On Controller-0, retrieve the MySQL root password:

  $ sudo hiera -c /etc/puppet/hiera.yaml mysql::server::root_password

- On Controller-0, set the Galera resource to unmanaged mode:

  $ sudo pcs resource unmanage galera-bundle
- Stop the MySQL containers on all Controller nodes:

  $ sudo docker container stop $(sudo docker container ls --all --format "{{ .Names }}" --filter=name=galera-bundle)

- Move the current MySQL data directory aside on all Controller nodes:

  $ sudo mv /var/lib/mysql /var/lib/mysql-save

- Create the new directory /var/lib/mysql on all Controller nodes:

  $ sudo mkdir /var/lib/mysql

- Start the MySQL containers on all Controller nodes:

  $ sudo docker container start $(sudo docker container ls --all --format "{{ .Names }}" --filter=name=galera-bundle)

- Create the MySQL database on all Controller nodes:

  $ sudo docker exec -i $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mysql_install_db --datadir=/var/lib/mysql --user=mysql"

- Start the database on all Controller nodes:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mysqld_safe --skip-networking --wsrep-on=OFF" &
- Move the .my.cnf Galera configuration file on all Controller nodes:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mv /root/.my.cnf /root/.my.cnf.bck"

- Reset the Galera root password on all Controller nodes:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mysql -uroot -e'use mysql;update user set password=PASSWORD(\"$ROOTPASSWORD\") where User=\"root\";flush privileges;'"

- Restore the .my.cnf Galera configuration file inside the Galera container on all Controller nodes:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mv /root/.my.cnf.bck /root/.my.cnf"

- On Controller-0, copy the backup database files to /var/lib/mysql:

  $ sudo cp $BACKUP_FILE /var/lib/mysql
  $ sudo cp $BACKUP_GRANT_FILE /var/lib/mysql

  Note: The path to these files is /home/heat-admin/.
- On Controller-0, restore the MySQL database:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mysql -u root -p$ROOT_PASSWORD < \"/var/lib/mysql/$BACKUP_FILE\""
  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mysql -u root -p$ROOT_PASSWORD < \"/var/lib/mysql/$BACKUP_GRANT_FILE\""

- Shut down the databases on all Controller nodes:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "mysqladmin shutdown"

- On Controller-0, start the bootstrap node:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" --filter=name=galera-bundle) \
      /usr/bin/mysqld_safe --pid-file=/var/run/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock --datadir=/var/lib/mysql \
      --log-error=/var/log/mysql_cluster.log --user=mysql --open-files-limit=16384 \
      --wsrep-cluster-address=gcomm:// &

- Verification: On Controller-0, check the status of the cluster:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "clustercheck"

  Ensure that the message "Galera cluster node is synced" is displayed. Otherwise, you must recreate the node.
- On Controller-0, retrieve the cluster address from the configuration:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "grep wsrep_cluster_address /etc/my.cnf.d/galera.cnf" | awk '{print $3}'

- On each of the remaining Controller nodes, start the database and validate the cluster:

  - Start the database. The $CLUSTER_ADDRESS variable holds the cluster address that you retrieved in the previous step:

    $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
        --filter=name=galera-bundle) /usr/bin/mysqld_safe --pid-file=/var/run/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock \
        --datadir=/var/lib/mysql --log-error=/var/log/mysql_cluster.log --user=mysql --open-files-limit=16384 \
        --wsrep-cluster-address=$CLUSTER_ADDRESS &

  - Check the status of the MySQL cluster:

    $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
        --filter=name=galera-bundle) bash -c "clustercheck"

    Ensure that the message "Galera cluster node is synced" is displayed. Otherwise, you must recreate the node.
- Stop the MySQL container on all Controller nodes:

  $ sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" --filter=name=galera-bundle) \
      /usr/bin/mysqladmin -u root shutdown

- On all Controller nodes, remove the firewall rule that you created earlier, to allow database connections through the virtual IP address again:

  $ sudo iptables -D INPUT -p tcp --destination-port 3306 -d $MYSQL_VIP -j DROP

- Restart the MySQL container on all Controller nodes:

  $ sudo docker container restart $(sudo docker container ls --all --format "{{ .Names }}" --filter=name=galera-bundle)

- Restart the clustercheck container on all Controller nodes:

  $ sudo docker container restart $(sudo docker container ls --all --format "{{ .Names }}" --filter=name=clustercheck)

- On Controller-0, set the Galera resource to managed mode:

  $ sudo pcs resource manage galera-bundle
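The commands in this procedure reference shell variables, for example $MYSQL_VIP, $ROOTPASSWORD, $ROOT_PASSWORD, $BACKUP_FILE, $BACKUP_GRANT_FILE, and $CLUSTER_ADDRESS, but do not show how to set them. The following is a minimal sketch of one way to populate them; the backup file names are hypothetical examples, so substitute the names of the database dump files that you copied to /home/heat-admin/ during the backup, and set each variable on every node where you run a command that uses it:

  $ MYSQL_VIP=$(sudo hiera -c /etc/puppet/hiera.yaml mysql_vip)
  $ ROOTPASSWORD=$(sudo hiera -c /etc/puppet/hiera.yaml mysql::server::root_password)
  $ ROOT_PASSWORD=$ROOTPASSWORD                              # both spellings appear in this procedure
  $ BACKUP_FILE=openstack-backup-mysql.sql                   # hypothetical file name; use your own backup file
  $ BACKUP_GRANT_FILE=openstack-backup-mysql-grants.sql      # hypothetical file name; use your own grants file
  $ CLUSTER_ADDRESS=$(sudo docker exec $(sudo docker container ls --all --format "{{ .Names }}" \
      --filter=name=galera-bundle) bash -c "grep wsrep_cluster_address /etc/my.cnf.d/galera.cnf" | awk '{print $3}')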
5.4. Restoring the undercloud and control plane nodes with colocated Ceph monitors
If an error occurs during an update or upgrade, you can use ReaR backups to restore either the undercloud or overcloud control plane nodes, or both, to their previous state.
Prerequisites
- Install and configure ReaR. For more information, see Install and configure ReaR.
- Prepare the backup node. For more information, see Prepare the backup node.
- Execute the backup procedure. For more information, see Execute the backup procedure.
Procedure
- On the backup node, export the NFS directory to host the Ceph backups. Replace <IP_ADDRESS/24> with the IP address and subnet mask of the network (a sketch for re-reading the export table follows this procedure):

  [root@backup ~]# cat >> /etc/exports << EOF
  /ceph_backups <IP_ADDRESS/24>(rw,sync,no_root_squash,no_subtree_check)
  EOF

- On the undercloud node, source the undercloud credentials and run the following script to stop the Ceph Monitor and Ceph Manager services on the Controller nodes:
  # source stackrc

  #! /bin/bash
  for i in `openstack server list -c Name -c Networks -f value | grep controller | awk -F'=' '{print $2}' | awk -F' ' '{print $1}'`; do
      ssh -q heat-admin@$i 'sudo systemctl stop ceph-mon@$(hostname -s) ceph-mgr@$(hostname -s)'
  done

- To verify that the ceph-mgr@controller.service container has stopped, enter the following command:

  [heat-admin@overcloud-controller-x ~]# sudo docker ps | grep ceph
- On the undercloud node, source the undercloud credentials and run the following script:

  # source stackrc

- On the node that you want to restore, complete the following tasks:
- Power off the node before you proceed.
- Restore the node with the ReaR backup file that you created during the backup process. The file is located in the /ceph_backups directory of the backup node.
- From the Relax-and-Recover boot menu, select Recover <CONTROL_PLANE_NODE>, where <CONTROL_PLANE_NODE> is the name of the control plane node. At the prompt, enter the following command:

  RESCUE <CONTROL_PLANE_NODE>:~ # rear recover

  When the image restoration process completes, the console displays the following message:

  Finished recovering your system
  Exiting rear recover
  Running exit tasks

- For the node that you want to restore, copy the Ceph backup from the /ceph_backups directory into the /var/lib/ceph directory:
  - Identify the system mount points. The /dev/vda2 file system is mounted on /mnt/local.
  - Create a temporary directory and mount the NFS export from the backup node:
    RESCUE <CONTROL_PLANE_NODE>:~ # mkdir /tmp/restore
    RESCUE <CONTROL_PLANE_NODE>:~ # mount -v -t nfs -o rw,noatime <BACKUP_NODE_IP_ADDRESS>:/ceph_backups /tmp/restore/

  - On the control plane node, remove the existing /var/lib/ceph directory:

    RESCUE <CONTROL_PLANE_NODE>:~ # rm -rf /mnt/local/var/lib/ceph/*

  - Restore the previous Ceph maps. Replace <CONTROL_PLANE_NODE> with the name of your control plane node:

    RESCUE <CONTROL_PLANE_NODE>:~ # tar -xvC /mnt/local/ -f /tmp/restore/<CONTROL_PLANE_NODE>/<CONTROL_PLANE_NODE>.tar.gz --xattrs --xattrs-include='*.*' var/lib/ceph

  - Verify that the files are restored.
- Power off the node:

  RESCUE <CONTROL_PLANE_NODE>:~ # poweroff

- Power on the node. The node resumes its previous state.
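The first step of this procedure appends an entry to /etc/exports on the backup node but does not show re-reading the export table. The following is a minimal sketch, assuming that the NFS server configured during backup node preparation is already running:

  [root@backup ~]# exportfs -ra     # re-export everything listed in /etc/exports
  [root@backup ~]# exportfs -v      # confirm that /ceph_backups is exported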