Chapter 4. Restoring the PostgreSQL database for Red Hat Edge Manager on Red Hat OpenShift Container Platform
If you lose data or need to recover the control plane database on Red Hat OpenShift Container Platform, restore the flightctl PostgreSQL database and related state by using the backup artifacts and methods that match your backup scope. This topic walks through scaling down workloads, running flightctl-restore, and bringing services back up. The exact database restore commands that you run before flightctl-restore depend on your backup format and tooling.
4.1. Prerequisites
- Cluster access to the Kubernetes cluster that hosts the Red Hat Edge Manager deployment.
- The OpenShift project (Kubernetes namespace) where you installed the Red Hat Edge Manager Helm chart. The examples in this topic use the placeholder rhem-chart-namespace; substitute your real namespace name everywhere it appears. Production deployments often use a single namespace for all Red Hat Edge Manager workloads.
- Kubernetes tools: kubectl installed and configured with administrative permissions.
- Flight Control CLI: flightctl and flightctl-restore available locally or in your restore environment.
- Backup artifacts: access to the database backup files required for recovery.
- Optional verification tools: redis-cli and pg_isready for validating service readiness and data integrity.
Database restore steps before you run flightctl-restore must match your backup strategy (for example, pg_restore, psql, or volume-level recovery). Follow your organization’s runbooks together with the outline below.
4.2. Restore procedure
1. Verify that the flightctl-restore version matches the Red Hat Edge Manager server version.

   Run the following command for the flightctl version:

   ```
   flightctl version
   ```

   Run the following command for the flightctl-restore version:

   ```
   flightctl-restore version
   ```

   Confirm that the server version and the restore version are the same:

   ```
   echo "Server version: $(flightctl version)"
   echo "Restore version: $(flightctl-restore version)"
   ```

   Important: The server version and the restore version must match exactly. If they do not match, update your flightctl-restore binary to align with the server version before proceeding.
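   If you script this check, a minimal sketch such as the following stops the procedure early on a mismatch. It assumes that both binaries are on your PATH and that each prints a single version string; adjust the comparison if your builds print multi-line output:

   ```
   # Sketch: abort early if the two versions differ (assumes single-line output)
   server_version=$(flightctl version)
   restore_version=$(flightctl-restore version)

   if [ "$server_version" != "$restore_version" ]; then
       echo "Version mismatch: server='$server_version' restore='$restore_version'" >&2
       exit 1
   fi
   echo "Versions match: $server_version"
   ```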
2. Scale down the Red Hat Edge Manager services to avoid data conflicts during the restore process. On Red Hat OpenShift Container Platform, all of these workloads run in a single OpenShift project; use the same namespace value in every command (the placeholder rhem-chart-namespace stands for that project; replace it with yours):

   ```
   # Replace rhem-chart-namespace with your OpenShift project (one namespace for all deployments below)
   kubectl scale deployment flightctl-api --replicas=0 -n rhem-chart-namespace
   kubectl scale deployment flightctl-worker --replicas=0 -n rhem-chart-namespace
   kubectl scale deployment flightctl-periodic --replicas=0 -n rhem-chart-namespace
   kubectl scale deployment flightctl-alert-exporter --replicas=0 -n rhem-chart-namespace
   kubectl scale deployment flightctl-alertmanager-proxy --replicas=0 -n rhem-chart-namespace
   ```
   Wait for the pods to terminate, then verify:

   ```
   # Same namespace as above
   kubectl get pods -n rhem-chart-namespace
   ```
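   If you prefer to block rather than re-run kubectl get pods by hand, a simple polling loop works; this sketch matches pods by the deployment name prefixes used above:

   ```
   # Sketch: poll until no flightctl workload pods remain
   while kubectl get pods -n rhem-chart-namespace --no-headers 2>/dev/null \
         | grep -qE 'flightctl-(api|worker|periodic|alert)'; do
       echo "Waiting for pods to terminate..."
       sleep 5
   done
   echo "All scaled-down pods are gone"
   ```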
3. Restore the flightctl PostgreSQL database from your existing backup.

   After all Red Hat Edge Manager services are scaled down, restore the database by using the method that matches your backup strategy:

   - Target database: Restore the PostgreSQL database instance named flightctl.
   - Supported methods: Use your preferred recovery procedure, such as pg_restore, psql (for SQL dumps), or infrastructure-level volume snapshots.
   - Verification: Confirm that the database is fully accessible and that the integrity of the restored data is verified before proceeding.

   Important: The specific restoration commands depend on your backup strategy and tooling. Ensure that the database is fully restored and consistent before you continue.
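   As one illustration only, if your backup is a pg_dump custom-format archive, the restore might look like the following sketch. The file name flightctl-backup.dump and the flightctl_app user are assumptions for illustration; your own backup runbook takes precedence:

   ```
   # Sketch: restore a pg_dump custom-format archive over a temporary port forward.
   # The backup file name and database user are assumptions; substitute your own,
   # and export PGPASSWORD with the app user password if prompted.
   kubectl port-forward svc/flightctl-db 5432:5432 -n rhem-chart-namespace &
   PF_PID=$!

   pg_restore --host=localhost --port=5432 --username=flightctl_app \
       --dbname=flightctl --clean --if-exists flightctl-backup.dump

   # Confirm that the database answers before you continue
   pg_isready -h localhost -p 5432

   kill $PF_PID
   ```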
4. Retrieve the database and KV store credentials (same namespace as the scale commands):

   ```
   DB_APP_PASSWORD=$(kubectl get secret flightctl-db-app-secret -n rhem-chart-namespace -o jsonpath='{.data.userPassword}' | base64 -d)
   echo "Database password retrieved successfully"

   KV_PASSWORD=$(kubectl get secret flightctl-kv-secret -n rhem-chart-namespace -o jsonpath='{.data.password}' | base64 -d)
   ```
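   Both values must be non-empty before you continue; a quick check such as the following catches a wrong secret name or namespace early:

   ```
   # Fail fast if either credential failed to decode
   if [ -z "$DB_APP_PASSWORD" ] || [ -z "$KV_PASSWORD" ]; then
       echo "Missing credentials: check the secret names and the namespace" >&2
       exit 1
   fi
   echo "Database and KV store credentials retrieved"
   ```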
5. Set up port forwarding for the database and the KV store. Use separate terminal sessions for each port forward, or run them in the background.

   Forward the database service:

   ```
   # Forward database port (run in a separate terminal or in the background)
   kubectl port-forward svc/flightctl-db 5432:5432 -n rhem-chart-namespace &
   DB_PORT_FORWARD_PID=$!

   # Verify database connectivity (if available)
   pg_isready -h localhost -p 5432
   ```
   Forward the KV store service:

   ```
   # Forward KV store port (run in a separate terminal or in the background)
   kubectl port-forward svc/flightctl-kv 6379:6379 -n rhem-chart-namespace &
   KV_PORT_FORWARD_PID=$!

   # Verify KV store connectivity (if available)
   REDISCLI_AUTH="$KV_PASSWORD" redis-cli -h localhost -p 6379 ping
   ```

6. Run the restore command, using environment variables for the database and KV store passwords:

   ```
   DB_PASSWORD="$DB_APP_PASSWORD" KV_PASSWORD="$KV_PASSWORD" ./bin/flightctl-restore
   ```

   Monitor the restore output for errors or completion messages.
7. Stop the port-forward processes when the restore finishes:

   ```
   kill $DB_PORT_FORWARD_PID $KV_PORT_FORWARD_PID
   ```

   If you ran the port forwards in separate terminals instead, stop them with Ctrl+C in those terminals.
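   If you prefer not to track the PIDs by hand, a shell trap guarantees that both port forwards stop even when the restore fails partway; a sketch of steps 5 to 7 combined:

   ```
   # Sketch: run the port forwards and the restore in one script; the trap
   # kills both forwards on any exit, successful or not
   kubectl port-forward svc/flightctl-db 5432:5432 -n rhem-chart-namespace &
   DB_PF=$!
   kubectl port-forward svc/flightctl-kv 6379:6379 -n rhem-chart-namespace &
   KV_PF=$!
   trap 'kill $DB_PF $KV_PF 2>/dev/null' EXIT

   DB_PASSWORD="$DB_APP_PASSWORD" KV_PASSWORD="$KV_PASSWORD" ./bin/flightctl-restore
   ```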
8. Restart the Red Hat Edge Manager services. Scale the deployments back to their normal replica counts in the same OpenShift project as in step 2:

   ```
   # Replace rhem-chart-namespace with your OpenShift project (same single namespace for every command)
   kubectl scale deployment flightctl-api --replicas=1 -n rhem-chart-namespace
   kubectl scale deployment flightctl-worker --replicas=1 -n rhem-chart-namespace
   kubectl scale deployment flightctl-periodic --replicas=1 -n rhem-chart-namespace
   kubectl scale deployment flightctl-alert-exporter --replicas=1 -n rhem-chart-namespace
   kubectl scale deployment flightctl-alertmanager-proxy --replicas=1 -n rhem-chart-namespace

   kubectl get deployments -n rhem-chart-namespace
   kubectl get pods -n rhem-chart-namespace
   ```
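   Rather than eyeballing the pod list, you can wait for each rollout to complete; this sketch reuses the deployment names from the scale commands:

   ```
   # Sketch: block until every deployment reaches its desired replica count
   for d in flightctl-api flightctl-worker flightctl-periodic \
            flightctl-alert-exporter flightctl-alertmanager-proxy; do
       kubectl rollout status deployment/"$d" -n rhem-chart-namespace --timeout=300s
   done
   ```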
4.3. After you restore
When the restore finishes and Red Hat Edge Manager workloads are healthy again in your Red Hat OpenShift Container Platform namespace, validate the control plane and plan for device reconciliation. Restoring the database changes what the service knows about devices; edge devices must reconnect and compare their live state to the restored specifications.
4.3.1. Operational follow-up
- Re-run any validation steps from Testing backups in the backup topic if you need a structured checklist.
- Document deviations, incidents, and command variants in your runbooks so the next restore follows the same path.
4.3.2. Post-restore device status changes
After a successful restore, devices move through automatic status transitions while they reconnect and reconcile with the restored control plane data.
- AwaitingReconnect: Devices are always placed in AwaitingReconnect first. The service waits for each device to report its current state again. Spec reconciliation for those devices remains paused until they reconnect.
- Enrollment requests and post-restore approval: Devices approved after the restored backup was taken do not exist after the restore and must be approved again. After the restore:
  - Devices created from a restored enrollment request are placed in AwaitingReconnect and follow the normal AwaitingReconnect behavior.
  - Devices without an enrollment request before the backup, with a non-zero deployed specification version, are placed in AwaitingReconnect and follow the normal AwaitingReconnect behavior.
  - Devices without an enrollment request before the backup, with a zero specification version, move to normal status.
- ConflictPaused: After a device reconnects and reports its current state, the service compares the specification stored in the restored backup with the device-reported version. If the restored backup specification is older (for example, the device moved forward while backups lagged), the device can enter ConflictPaused. Rendering of new specifications stops for that device until an operator resolves the mismatch. Human review is required before you force the configuration forward.
- Normal operation: When the restored specification and the device-reported state are compatible, the device returns to normal operational statuses (for example, online or updating) and usual reconciliation resumes.
4.3.2.1. Monitor device status
Use the flightctl CLI to see which devices need attention:

```
# List all devices
flightctl get devices

# List devices that are waiting to reconnect
flightctl get devices --field-selector=status.summary.status=AwaitingReconnect

# List devices that are paused on a specification conflict
flightctl get devices --field-selector=status.summary.status=ConflictPaused
```
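If you want a rough progress count while devices reconnect, standard shell tools work on the listings above; this sketch assumes that flightctl get devices prints one device per line after a header row:

```
# Sketch: count devices in each post-restore state (assumes one device per
# line plus a header row in the output)
for s in AwaitingReconnect ConflictPaused; do
    n=$(flightctl get devices --field-selector=status.summary.status=$s \
        | tail -n +2 | wc -l)
    echo "$s: $n device(s)"
done
```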
4.3.2.2. Resolve ConflictPaused devices
- Review the specification source: if the device belongs to a fleet, inspect the fleet template and selector; if not, inspect the device spec directly. Review labels and ownership to confirm how the restored specification applies to the device.
- When you are confident that the restored specification is what you want, resume the device or a group of devices. Replace example-device with your device resource name and adjust selectors to match your environment:

  ```
  # Resume a single device
  flightctl resume device example-device

  # Resume all devices that match a label selector
  flightctl resume device --selector="environment=production"
  ```

  Use additional flightctl resume device options that your deployment supports (for example, field selectors) if you need to resume many devices in bulk, as in the sketch that follows.
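For example, if your flightctl version accepts the same --field-selector flag on resume that it accepts on get devices (an assumption; confirm with flightctl resume device --help), you could resume every ConflictPaused device in one command:

```
# Assumption: resume supports --field-selector like `flightctl get devices`;
# verify with `flightctl resume device --help` before relying on this
flightctl resume device --field-selector=status.summary.status=ConflictPaused
```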