Chapter 4. Recovering applications with RWO storage
Applications that use ReadWriteOnce (RWO) storage have a known behavior described in this Kubernetes issue. Because of this issue, if there is a data zone failure, any application pods in that zone mounting RWO volumes (for example, cephrbd-based volumes) are stuck with Terminating status after 6-8 minutes and are not re-created on the active zone without manual intervention.
Check the OpenShift Container Platform nodes with a status of NotReady. There may be an issue that prevents the nodes from communicating with the OpenShift control plane. However, the nodes may still be performing I/O operations against Persistent Volumes (PVs).
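To identify the affected nodes, you can list the nodes and look for NotReady in the STATUS column. This is a minimal sketch that assumes the oc client is already logged in to the affected cluster:

$ oc get nodes
$ oc get nodes | grep NotReady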
If two pods are concurrently writing to the same RWO volume, there is a risk of data corruption. Ensure that processes on the NotReady node are either terminated or blocked until they are terminated.
Example solutions:
- Use an out-of-band management system to power off a node, with confirmation, to ensure process termination (see the sketch after this list).
- Withdraw a network route that is used by nodes at a failed site to communicate with storage.
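For example, if the failed node exposes a baseboard management controller (BMC) that supports IPMI, a power-off with confirmation might look like the following sketch. The BMC address and credentials are placeholders, and your hardware may use a different out-of-band interface (for example, Redfish or a vendor-specific tool):

$ ipmitool -I lanplus -H <BMC_ADDRESS> -U <USER> -P <PASSWORD> chassis power off
$ ipmitool -I lanplus -H <BMC_ADDRESS> -U <USER> -P <PASSWORD> chassis power status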
Note: Before restoring service to the failed zone or nodes, confirm that all the pods with PVs have terminated successfully.
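One way to confirm this is to check for pods that are still scheduled to the failed node and for volume attachments that still reference it. This is a sketch; <NODE_NAME> is a placeholder for the failed node:

$ oc get pods --all-namespaces -o wide --field-selector spec.nodeName=<NODE_NAME>
$ oc get volumeattachments | grep <NODE_NAME>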
To get the Terminating pods to re-create on the active zone, you can either force delete the pod or delete the finalizer on the associated PV. Once one of these two actions is completed, the application pod should re-create on the active zone and successfully mount its RWO storage.
- Force deleting the pod
Force deletions do not wait for confirmation from the kubelet that the pod has been terminated.
$ oc delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
<PODNAME>
- Is the name of the pod.
<NAMESPACE>
- Is the project namespace.
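For illustration, using the noobaa-db-pg-0 pod and the openshift-storage namespace from the example output later in this chapter, the commands might look like the following (substitute your own pod and namespace):

$ oc get pods -n openshift-storage | grep Terminating
$ oc delete pod noobaa-db-pg-0 --grace-period=0 --force --namespace openshift-storage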
- Deleting the finalizer on the associated PV
Find the associated PV for the Persistent Volume Claim (PVC) that is mounted by the Terminating pod and delete the finalizer using the oc patch command.
$ oc patch -n openshift-storage pv/<PV_NAME> -p '{"metadata":{"finalizers":[]}}' --type=merge
<PV_NAME>
- Is the name of the PV.
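If you already know the PVC name, one way to look up the bound PV is to read the volumeName field of the claim. This is a sketch; <PVC_NAME> and <NAMESPACE> are placeholders:

$ oc get pvc <PVC_NAME> -n <NAMESPACE> -o jsonpath='{.spec.volumeName}'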
An easy way to find the associated PV is to describe the Terminating pod. If you see a multi-attach warning, it should have the PV names in the warning (for example, pvc-0595a8d2-683f-443b-aee0-6e547f5f5a7c).
$ oc describe pod <PODNAME> --namespace <NAMESPACE>
<PODNAME>
- Is the name of the pod.
<NAMESPACE>
- Is the project namespace.
Example output:
[...]
Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           4m5s  default-scheduler        Successfully assigned openshift-storage/noobaa-db-pg-0 to perf1-mz8bt-worker-d2hdm
  Warning  FailedAttachVolume  4m5s  attachdetach-controller  Multi-Attach error for volume "pvc-0595a8d2-683f-443b-aee0-6e547f5f5a7c" Volume is already exclusively attached to one node and can't be attached to another
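Using the PV name from the example output above, the finalizer patch and a follow-up check that the pod is re-created on a node in the active zone might look like the following. This is illustrative only; substitute your own PV, pod, and namespace:

$ oc patch -n openshift-storage pv/pvc-0595a8d2-683f-443b-aee0-6e547f5f5a7c -p '{"metadata":{"finalizers":[]}}' --type=merge
$ oc get pods -n openshift-storage -o wide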