Chapter 2. Automated recovery from manual backups
You can automatically restore data from manual backups when MicroShift fails to start by configuring automatic recovery.
2.1. Modifying backup and restore commands to automate data recovery
Use automatic recovery options to store all of your backups in a single directory, then automatically select the latest one to restore. Modifying existing backup and restore commands enables you to set up automatic recovery.
The --auto-recovery option treats the PATH argument as a path to a directory that holds all the backups for automated recovery, and not just as a path to a particular backup file. You can use the --auto-recovery option with both backup and restore commands.
- For example, if you use the automatic recovery option with restore, such as in microshift restore --auto-recovery PATH, running the modified command automatically selects and restores the most recent backup.
- If you use the same option in the microshift backup command, such as in microshift backup --auto-recovery PATH, a new backup is created in the PATH directory.
- By default, microshift restore --auto-recovery PATH creates a backup of the failed MicroShift data in PATH/failed. You can add the --dont-save-failed option to disable the creation of failed backup data.
You can only use the --dont-save-failed option with the restore command.
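The two restore variants described above can be collected in a small wrapper. The following is a minimal sketch, assuming a hypothetical helper named build_restore_cmd that is not part of MicroShift. It only prints a command line and uses no options other than --auto-recovery and --dont-save-failed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MicroShift): prints the restore command
# line for a backups directory, optionally adding --dont-save-failed.
build_restore_cmd() {
    local backups_dir=$1
    local save_failed=${2:-yes}   # "yes" keeps the default copy in PATH/failed

    local cmd=(microshift restore --auto-recovery)
    if [ "$save_failed" = "no" ]; then
        cmd+=(--dont-save-failed)
    fi
    cmd+=("$backups_dir")

    # Join the array with single spaces and print it.
    printf '%s\n' "${cmd[*]}"
}
```

For example, build_restore_cmd /var/lib/microshift-auto-recovery no prints microshift restore --auto-recovery --dont-save-failed /var/lib/microshift-auto-recovery. Review the printed command before running it with sudo.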
2.2. Creating backups using the auto-recovery feature
Use the following procedure to create backups using automatic recovery options.
Creating backups requires stopping MicroShift, so you must determine the best time to stop it.
Prerequisites
- You stopped MicroShift.
Procedure
Create and store backups in the directory you choose by running the following command:
$ sudo microshift backup --auto-recovery <path_of_directory>
Replace <path_of_directory> with the path of the directory that stores backups. For example, /var/lib/microshift-auto-recovery.
Note
The --auto-recovery option changes the interpretation of the PATH argument from the final backup path to a directory that holds all of the backups for automated recovery.
Example output
I1104 09:18:52.100725 8906 system.go:58] "OSTree deployments" deployments=[{"id":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","booted":true,"staged":false,"pinned":false},{"id":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","booted":false,"staged":false,"pinned":false}]
I1104 09:18:52.100895 8906 data_manager.go:83] "Copying data to backup directory" storage="/var/lib/microshift-auto-recovery" name="20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1" data="/var/lib/microshift"
I1104 09:18:52.102296 8906 disk_space.go:33] Calculated size of "/var/lib/microshift": 261M - increasing by 10% for safety: 287M
I1104 09:18:52.102321 8906 disk_space.go:44] Calculated available disk space for "/var/lib/microshift-auto-recovery": 1658M
I1104 09:18:52.105700 8906 atomic_dir_copy.go:66] "Made an intermediate copy" cmd="/bin/cp --verbose --recursive --preserve --reflink=auto /var/lib/microshift /var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.99142"
I1104 09:18:52.105732 8906 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.99142" dest="/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"
I1104 09:18:52.105749 8906 data_manager.go:120] "Copied data to backup directory" backup="/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1" data="/var/lib/microshift"
/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1
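The example output shows a size calculation with a 10% safety margin followed by a check of the available disk space. A similar check can be run before stopping MicroShift. The following is a minimal sketch, not MicroShift's own implementation: the function names are hypothetical, sizes are in KiB, and the margin is rounded up, which can differ slightly from the figures that MicroShift logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a 10% safety margin to a size in KiB, rounding up.
with_safety_margin() {
    local size_kib=$1
    echo $(( (size_kib * 110 + 99) / 100 ))
}

# Hypothetical helper: succeeds when the file system that holds backups_dir
# has room for a copy of data_dir plus the safety margin.
has_room_for_copy() {
    local data_dir=$1
    local backups_dir=$2
    local size_kib avail_kib

    # du -sk prints "<size-in-KiB><TAB><path>"; keep only the size.
    size_kib=$(du -sk "$data_dir" | cut -f1)
    # df -Pk prints a header line and one data line; column 4 is the available space.
    avail_kib=$(df -Pk "$backups_dir" | awk 'NR == 2 { print $4 }')

    [ "$avail_kib" -ge "$(with_safety_margin "$size_kib")" ]
}
```

For example, running has_room_for_copy /var/lib/microshift /var/lib/microshift-auto-recovery as root returns a zero exit status when the backup is expected to fit. Root access is assumed because the MicroShift data directory is not readable by regular users.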
Verification
Verify that the backup you created exists in your customized storage directory by running the following command:
$ sudo ls -la <path_of_directory>
Replace <path_of_directory> with the path of the directory that stores backups. For example, /var/lib/microshift-auto-recovery.
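In the example output in this chapter, each backup directory name starts with a 14-digit creation timestamp followed by an underscore. Assuming that naming holds on your system, the following minimal sketch lists the backups in chronological order and prints the newest one. The function names are hypothetical, and the sketch does not reproduce the selection logic of the restore command, which according to its example output also filters by deployment version and skips the previously restored backup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print backup directory names, oldest first.
# Assumes names start with a YYYYMMDDhhmmss timestamp and an underscore,
# so subdirectories such as "failed" and "restored" are ignored.
list_backups() {
    local backups_dir=$1
    find "$backups_dir" -mindepth 1 -maxdepth 1 -type d -printf '%f\n' \
        | grep -E '^[0-9]{14}_' \
        | LC_ALL=C sort
}

# Hypothetical helper: print the name of the newest backup, if any.
latest_backup() {
    list_backups "$1" | tail -n 1
}
```

Because the timestamp prefix has a fixed width, sorting the names as plain text also sorts them by creation time. The -printf option requires GNU find, which is assumed here.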
2.3. Restoring backups using the auto-recovery feature
You can restore backups after system events that remove or damage required data. Use the following procedure to restore backups using automatic recovery. Automatic recovery selects the most recent backup and restores it. Previously restored backups that used automatic recovery are moved to your PATH/restored directory.
Prerequisites
- You have stopped MicroShift.
Procedure
Restore the latest backup from your backups directory by running the following command:
$ sudo microshift restore --auto-recovery <path_of_directory>
Replace <path_of_directory> with the path of the directory that stores backups. For example, /var/lib/microshift-auto-recovery.
Note
- The --auto-recovery option copies the MicroShift data to /var/lib/microshift-auto-recovery/failed/ for later investigation, selects the most recent backup, and restores it.
- The --dont-save-failed option disables the backing up of failed MicroShift data.
Example output
I1104 09:19:28.617225 8950 state.go:80] "Read state from the disk" state={"LastBackup":"20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"}
I1104 09:19:28.617323 8950 storage.go:78] "Auto-recovery backup storage read and parsed" dirs=["20241022101255_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","20241022101520_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","restored"] backups=[{"CreationTime":"2024-10-22T10:12:55Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-10-22T10:15:20Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-10-22T10:15:28Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
I1104 09:19:28.617350 8950 storage.go:40] "Filtered list of backups - removed previously restored backup" removed="20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0" newList=[{"CreationTime":"2024-10-22T10:12:55Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-10-22T10:15:20Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
I1104 09:19:28.633237 8950 system.go:58] "OSTree deployments" deployments=[{"id":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","booted":true,"staged":false,"pinned":false},{"id":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","booted":false,"staged":false,"pinned":false}]
I1104 09:19:28.633258 8950 storage.go:49] "Filtered list of backups by version" version="default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1" newList=[{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
I1104 09:19:28.633268 8950 restore.go:170] "Potential backups" bz=[{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
I1104 09:19:28.633277 8950 restore.go:173] "Candidate backup for restore" b={"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}
I1104 09:19:28.634007 8950 disk_space.go:33] Calculated size of "/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1": 261M - increasing by 10% for safety: 287M
I1104 09:19:28.634096 8950 disk_space.go:44] Calculated available disk space for "/var/lib": 1658M
I1104 09:19:28.634507 8950 disk_space.go:33] Calculated size of "/var/lib/microshift": 261M - increasing by 10% for safety: 287M
I1104 09:19:28.634522 8950 disk_space.go:44] Calculated available disk space for "/var/lib/microshift-auto-recovery": 1658M
I1104 09:19:28.649719 8950 system.go:58] "OSTree deployments" deployments=[{"id":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","booted":true,"staged":false,"pinned":false},{"id":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","booted":false,"staged":false,"pinned":false}]
I1104 09:19:28.653880 8950 atomic_dir_copy.go:66] "Made an intermediate copy" cmd="/bin/cp --verbose --recursive --preserve --reflink=auto /var/lib/microshift /var/lib/microshift-auto-recovery/failed/20241104091928_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.22742"
I1104 09:19:28.657362 8950 atomic_dir_copy.go:66] "Made an intermediate copy" cmd="/bin/cp --verbose --recursive --preserve --reflink=auto /var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1 /var/lib/microshift.tmp.482"
I1104 09:19:28.657385 8950 state.go:40] "Saving intermediate state" state="{\"LastBackup\":\"20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1\"}" path="/var/lib/microshift-auto-recovery/state.json.tmp.41544"
I1104 09:19:28.662438 8950 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift.tmp.482" dest="/var/lib/microshift"
I1104 09:19:28.662451 8950 state.go:46] "Moving state file to final path" intermediatePath="/var/lib/microshift-auto-recovery/state.json.tmp.41544" finalPath="/var/lib/microshift-auto-recovery/state.json"
I1104 09:19:28.662521 8950 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift-auto-recovery/failed/20241104091928_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.22742" dest="/var/lib/microshift-auto-recovery/failed/20241104091928_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"
I1104 09:19:28.662969 8950 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift-auto-recovery/20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0" dest="/var/lib/microshift-auto-recovery/restored/20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"
I1104 09:19:28.662983 8950 restore.go:141] "Auto-recovery restore completed".
Important
- The restore command does not restart MicroShift after restoration. When you run this command, the MicroShift service has either already failed or you have stopped it.
- MicroShift does not monitor the disk space of any filesystem. You must ensure that your automation handles old backup removal. For example, you can add this process to the auto-recovery service or add another service that runs periodically.
Restart MicroShift by running the following command:
$ sudo systemctl restart microshift
Verification
Verify that MicroShift has started successfully by running the following command:
$ oc get pods -A
Example output
NAMESPACE                  NAME                                                      READY   STATUS    RESTARTS   AGE
default                    i-06166fbb376f14a8bus-west-2computeinternal-debug-qtwcr   1/1     Running   0          46m
kube-system                csi-snapshot-controller-5c6586d546-lprv4                  1/1     Running   0          51m
openshift-dns              dns-default-45jl7                                         2/2     Running   0          50m
openshift-dns              node-resolver-7wmzf                                       1/1     Running   0          51m
openshift-ingress          router-default-78b86fbf9d-qvj9s                           1/1     Running   0          51m
openshift-ovn-kubernetes   ovnkube-master-5rfhh                                      4/4     Running   0          51m
openshift-ovn-kubernetes   ovnkube-node-gcnt6                                        1/1     Running   0          51m
openshift-service-ca       service-ca-bf5b7c9f8-pn6rk                                1/1     Running   0          51m
openshift-storage          topolvm-controller-549f7fbdd5-7vrmv                       5/5     Running   0          51m
openshift-storage          topolvm-node-rht2m                                        3/3     Running   0          50m
Note
This example output shows basic MicroShift. If you have installed optional RPMs, the status of pods running those services is also expected to be shown in your output.
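Because MicroShift does not remove old backups, the restore procedure leaves cleanup to your automation. The following is a minimal sketch of one possible pruning step, assuming the timestamp-prefixed directory names shown in the example output. The prune_backups function is hypothetical and is not part of MicroShift.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete all but the newest <keep> backups in a directory.
# Only directories whose names start with a 14-digit timestamp and an
# underscore are considered, so "failed" and "restored" are left untouched.
prune_backups() {
    local backups_dir=$1
    local keep=$2

    # Refuse values that would delete every backup or are not numbers.
    if ! [[ "$keep" =~ ^[1-9][0-9]*$ ]]; then
        echo "keep must be a positive integer" >&2
        return 1
    fi

    # Sort oldest first, drop the newest <keep> names from the list,
    # and remove whatever remains (GNU find and head are assumed).
    find "$backups_dir" -mindepth 1 -maxdepth 1 -type d -printf '%f\n' \
        | grep -E '^[0-9]{14}_' \
        | LC_ALL=C sort \
        | head -n "-${keep}" \
        | while IFS= read -r name; do
            rm -rf -- "${backups_dir:?}/${name}"
        done
}
```

For example, prune_backups /var/lib/microshift-auto-recovery 3 keeps the three newest backups. The failed and restored subdirectories also grow over time and need their own cleanup policy. How pruning interacts with the state.json file in the backups directory is not covered here, so test the sketch on a non-production system first.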
2.3.1. Using automatic recovery in RPM systems
When MicroShift enters a failed state, the systemd service starts the microshift-auto-recovery.service unit. This unit executes the auto-recovery restore process.
As a use case, consider the following example situation in which you want to automate the automatic recovery process for RPM systems that use the systemd service.
Procedure
Create a directory for the microshift systemd service by running the following command:
$ sudo mkdir -p /usr/lib/systemd/system/microshift.service.d
To instruct systemd to run microshift-auto-recovery.service when the microshift.service fails, create the 10-auto-recovery.conf file by running the following command:
$ sudo tee /usr/lib/systemd/system/microshift.service.d/10-auto-recovery.conf > /dev/null <<'EOF'
[Unit]
OnFailure=microshift-auto-recovery.service
StartLimitIntervalSec=25s
EOF
Increase the StartLimitIntervalSec value from the default 10s to a larger value for slower systems. A value that is too low can result in systemd never marking the microshift systemd service as failed, which means that the OnFailure= service does not get triggered.
Create the microshift-auto-recovery.service file by running the following command:
$ sudo tee /usr/lib/systemd/system/microshift-auto-recovery.service > /dev/null <<'EOF'
[Unit]
Description=MicroShift auto-recovery

[Service]
Type=oneshot
ExecStart=/usr/bin/microshift-auto-recovery

[Install]
WantedBy=multi-user.target
EOF
Create the microshift-auto-recovery script by running the following command:
$ sudo tee /usr/bin/microshift-auto-recovery > /dev/null <<'EOF'
#!/usr/bin/env bash
set -xeuo pipefail

# If greenboot uses a non-default file for clearing boot_counter, use boot_success instead.
if grep -q "/boot/grubenv" /usr/libexec/greenboot/greenboot-grub2-set-success; then
    if grub2-editenv - list | grep -q ^boot_success=0; then
        echo "Greenboot didn't decide the system is healthy after staging a new deployment."
        echo "Quitting to not interfere with the process"
        exit 0
    fi
else
    if grub2-editenv - list | grep -q ^boot_counter=; then
        echo "Greenboot didn't decide the system is healthy after staging a new deployment."
        echo "Quitting to not interfere with the process"
        exit 0
    fi
fi

# Restore the most recent backup, clear the failed state, and start MicroShift again.
/usr/bin/microshift restore --auto-recovery /var/lib/microshift-auto-recovery
/usr/bin/systemctl reset-failed microshift
/usr/bin/systemctl start microshift

echo "DONE"
EOF
Make the script executable by running the following command:
$ sudo chmod +x /usr/bin/microshift-auto-recovery
Reload the system configuration by running the following command:
$ sudo systemctl daemon-reload
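After reloading the configuration, you can check that systemd picked up the drop-in file. The following is a minimal sketch, assuming a hypothetical helper named auto_recovery_is_wired. It reads the OnFailure property of microshift.service with systemctl show and looks for the recovery unit in the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when microshift.service lists
# microshift-auto-recovery.service in its OnFailure= property.
auto_recovery_is_wired() {
    # The property value is a space-separated list of unit names;
    # split it into lines and look for an exact match.
    systemctl show --property=OnFailure --value microshift.service \
        | tr ' ' '\n' \
        | grep -qx 'microshift-auto-recovery.service'
}
```

A zero exit status from auto_recovery_is_wired indicates that the drop-in is active. A nonzero status usually means that the 10-auto-recovery.conf file is missing, is in a different directory, or that systemctl daemon-reload has not run yet.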
2.3.2. Using automatic recovery with RHEL for Edge
As a use case, consider the following example situation in which you want to automate the auto-recovery process for Red Hat Enterprise Linux for Edge (RHEL for Edge) systems that use systemd in the blueprint file.
You must include the entire auto-recovery process for RHEL for Edge systems that use systemd in the blueprint file.
Prerequisites
- You installed Podman.
- You installed the command-line composer-cli tool.
Procedure
- Optional: Because the composer-cli tool can only create files in the /etc directory, package your files into an RPM that you include in the blueprint. Use the following example to create your blueprint file:
[[customizations.directories]]
path = "/etc/systemd/system/microshift.service.d"

[[customizations.directories]]
path = "/etc/bin"

[[customizations.files]]
path = "/etc/systemd/system/microshift.service.d/10-auto-recovery.conf"
data = """
[Unit]
OnFailure=microshift-auto-recovery.service
"""

[[customizations.files]]
path = "/etc/systemd/system/microshift-auto-recovery.service"
data = """
[Unit]
Description=MicroShift auto-recovery

[Service]
Type=oneshot
ExecStart=/etc/bin/microshift-auto-recovery

[Install]
WantedBy=multi-user.target
"""

[[customizations.files]]
path = "/etc/bin/microshift-auto-recovery"
mode = "0755"
data = """
#!/usr/bin/env bash
set -xeuo pipefail

# If greenboot uses a non-default file for clearing boot_counter, use boot_success instead.
if grep -q "/boot/grubenv" /usr/libexec/greenboot/greenboot-grub2-set-success; then
    if grub2-editenv - list | grep -q ^boot_success=0; then
        echo "Greenboot didn't decide the system is healthy after staging a new deployment."
        echo "Quitting to not interfere with the process"
        exit 0
    fi
else
    if grub2-editenv - list | grep -q ^boot_counter=; then
        echo "Greenboot didn't decide the system is healthy after staging a new deployment."
        echo "Quitting to not interfere with the process"
        exit 0
    fi
fi

# Restore the most recent backup, clear the failed state, and start MicroShift again.
/usr/bin/microshift restore --auto-recovery /var/lib/microshift-auto-recovery
/usr/bin/systemctl reset-failed microshift
/usr/bin/systemctl start microshift

echo "DONE"
"""
- For the next steps, see Preparing for image building.
2.3.3. Using automatic recovery in image mode for RHEL systems
Image mode for RHEL is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
As a use case, consider the following example situation in which you want to automate the auto-recovery process for image mode for Red Hat Enterprise Linux (RHEL) systems that use the systemd service.
You must include the entire auto-recovery process for image mode for RHEL systems that use systemd in the container file.
Prerequisites
- You created a Containerfile as instructed in Building the bootc image.
- You created the 10-auto-recovery.conf and microshift-auto-recovery.service files as explained in the "Using automatic recovery in RPM systems" section.
  Important
  The location of the 10-auto-recovery.conf and microshift-auto-recovery.service files must be relative to the Containerfile.
  For example, if the path to the Containerfile is /home/microshift/my-build/Containerfile, the systemd files need to be adjacent for proper embedding. The following paths are correct for this example:
  - /home/microshift/my-build/auto-rec/10-auto-recovery.conf
  - /home/microshift/my-build/auto-rec/microshift-auto-recovery.service
  - /home/microshift/my-build/auto-rec/microshift-auto-recovery
- You created the microshift-auto-recovery script as explained in the "Using automatic recovery in RPM systems" section.
Procedure
Use the following example snippet to update the container file that you use to prepare the image mode for RHEL image.
RUN mkdir -p /usr/lib/systemd/system/microshift.service.d
COPY ./auto-rec/10-auto-recovery.conf /usr/lib/systemd/system/microshift.service.d/10-auto-recovery.conf
COPY ./auto-rec/microshift-auto-recovery.service /usr/lib/systemd/system/
COPY ./auto-rec/microshift-auto-recovery /usr/bin/
RUN chmod +x /usr/bin/microshift-auto-recovery
Important
Podman uses the host subscription information and repositories inside the container when building the container image. If the rhocp and fast-datapath repositories are not available on the host, the build fails.
Rebuild your local bootc image by running the following image build command:
$ PULL_SECRET=~/.pull-secret.json
$ USER_PASSWD=<your_redhat_user_password>
$ IMAGE_NAME=microshift-4.18-bootc
$ sudo podman build --authfile "${PULL_SECRET}" -t "${IMAGE_NAME}" \
    --build-arg USER_PASSWD="${USER_PASSWD}" \
    -f Containerfile
Note
Secrets are used during the image build in the following ways:
- The podman --authfile argument is required to pull the base rhel-bootc:9.4 image from the registry.redhat.io registry.
- The build USER_PASSWD argument is used to set a password for the redhat user.
Verification
Verify that the local bootc image was created by running the following command:
$ sudo podman images "${IMAGE_NAME}"
Example output
REPOSITORY                       TAG     IMAGE ID      CREATED        SIZE
localhost/microshift-4.18-bootc  latest  193425283c00  2 minutes ago  2.31 GB
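Listing the image confirms that the build finished, but not that the auto-recovery files were embedded. The following is a minimal sketch of an additional check, assuming that the image can be started as an ordinary container with its default command overridden. The image_has_auto_recovery_files function is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the three auto-recovery files inside the image.
# Assumes the image runs as a regular container and that "ls" replaces its
# default command; the check fails if any of the files is missing.
image_has_auto_recovery_files() {
    local image=$1
    sudo podman run --rm "$image" ls -l \
        /usr/lib/systemd/system/microshift.service.d/10-auto-recovery.conf \
        /usr/lib/systemd/system/microshift-auto-recovery.service \
        /usr/bin/microshift-auto-recovery
}
```

For example, image_has_auto_recovery_files "${IMAGE_NAME}" prints one line per file and returns a nonzero exit status if a COPY instruction did not place a file at the expected destination.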