Chapter 7. Known issues
This section describes the known issues in Red Hat OpenShift Data Foundation 4.13.
7.1. Disaster recovery
Failover action reports RADOS block device image mount failed on the pod with RPC error still in use
Failing over a disaster recovery (DR) protected workload might result in pods using the volume on the failover cluster to be stuck in reporting RADOS block device (RBD) image is still in use. This prevents the pods from starting up for a long duration (upto several hours).
Failover action reports RADOS block device image mount failed on the pod with RPC error fsck
Failing over a disaster recovery (DR) protected workload may result in pods not starting with volume mount errors that state the volume has file system consistency check (fsck) errors. This prevents the workload from failing over to the failover cluster.
Creating an application namespace for the managed clusters
Application namespace needs to exist on RHACM managed clusters for disaster recovery (DR) related pre-deployment actions and hence is pre-created when an application is deployed at the RHACM hub cluster. However, if an application is deleted at the hub cluster and its corresponding namespace is deleted on the managed clusters, they reappear on the managed cluster.
Workaround:
openshift-dr
maintains a namespacemanifestwork
resource in the managed cluster namespace at the RHACM hub. These resources need to be deleted after the application deletion. For example, as a cluster administrator, execute the following command on the hub cluster:oc delete manifestwork -n <managedCluster namespace> <drPlacementControl name>-<namespace>-ns-mw
.
RBD mirror scheduling is getting stopped for some images
The Ceph manager daemon gets blocklisted due to different reasons, which causes the scheduled RBD mirror snapshot from being triggered on the cluster where the image(s) are primary. All RBD images that are mirror enabled (hence DR protected) do not list a schedule when examined using
rbd mirror snapshot schedule status -p ocs-storagecluster-cephblockpool
, and hence are not actively mirrored to the peer site.Workaround: Restart the Ceph manager deployment, on the managed cluster where the images are primary, to overcome the blocklist against the currently running instance, this can be done by scaling down and then later scaling up the ceph manager deployment as follows:
$ oc -n openshift-storage scale deployments/rook-ceph-mgr-a --replicas=0 $ oc -n openshift-storage scale deployments/rook-ceph-mgr-a --replicas=1
Result: Images that are DR enabled and denoted as primary on a managed cluster start reporting mirroring schedules when examined using
rbd mirror snapshot schedule status -p ocs-storagecluster-cephblockpool
ceph df
reports an invalid MAX AVAIL value when the cluster is in stretch modeWhen a crush rule for a Red Hat Ceph Storage cluster has multiple "take" steps, the
ceph df
report shows the wrong maximum available size for the map. The issue will be fixed in an upcoming release.
Ceph does not recognize the global IP assigned by Globalnet
Ceph does not recognize global IP assigned by Globalnet, so disaster recovery solution cannot be configured between clusters with overlapping service CIDR using Globalnet. Due to this disaster recovery solution does not work when service
CIDR
overlaps.
Both the DRPCs protect all the persistent volume claims created on the same namespace
The namespaces that host multiple disaster recovery (DR) protected workloads, protect all the persistent volume claims (PVCs) within the namespace for each DRPlacementControl resource in the same namespace on the hub cluster that does not specify and isolate PVCs based on the workload using its
spec.pvcSelector
field.This results in PVCs, that match the DRPlacementControl
spec.pvcSelector
across multiple workloads. Or, if the selector is missing across all workloads, replication management to potentially manage each PVC multiple times and cause data corruption or invalid operations based on individual DRPlacementControl actions.Workaround: Label PVCs that belong to a workload uniquely, and use the selected label as the DRPlacementControl
spec.pvcSelector
to disambiguate which DRPlacementControl protects and manages which subset of PVCs within a namespace. It is not possible to specify thespec.pvcSelector
field for the DRPlacementControl using the user interface, hence the DRPlacementControl for such applications must be deleted and created using the command line.Result: PVCs are no longer managed by multiple DRPlacementControl resources and do not cause any operation and data inconsistencies.
MongoDB pod is in
CrashLoopBackoff
because of permission errors reading data incephrbd
volumeThe OpenShift projects across different managed clusters have different security context constraints (SCC), which specifically differ in the specified UID range and/or
FSGroups
. This leads to certain workload pods and containers failing to start post failover or relocate operations within these projects, due to filesystem access errors in their logs.Workaround: Ensure workload projects are created on all managed clusters with the same project-level SCC labels, allowing them to use the same filesystem context when failed over or relocated. Pods will no longer fail post-DR actions on filesystem-related access errors.
Application is stuck in Relocating state during relocate
Multicloud Object Gateway allowed multiple persistent volume (PV) objects of the same name or namespace to be added to the S3 store on the same path. Due to this, Ramen does not restore the PV because it detected multiple versions pointing to the same
claimRef
.Workaround: Use S3 CLI or equivalent to clean up the duplicate PV objects from the S3 store. Keep only the one that has a timestamp closer to the failover or relocate time.
Result: The restore operation will proceed to completion and the failover or relocate operation proceeds to the next step.
PeerReady
state is set totrue
when a workload is failed over or relocated to the peer cluster until the cluster from where it was failed over or relocated from is cleaned upAfter a disaster recovery (DR) action is initiated, the
PeerReady
condition is initially set totrue
for the duration when the workload is failed over or relocated to the peer cluster. After this it is set tofalse
until the cluster from where it was failed over or relocated from is cleaned up for future actions. A user looking atDRPlacementControl
status conditions for future actions may recognize this intermediatePeerReady
state as a peer is ready for action and perform the same. This will result in the operation pending or failing and may require user intervention to recover from.Workaround: Examine both
Available
andPeerReady
states before performing any actions. Both should betrue
for a healthy DR state for the workload. Actions performed when both states are true will result in the requested operation progressing
Disaster recovery workloads remain stuck when deleted
When deleting a workload from a cluster, the corresponding pods might not terminate with events such as
FailedKillPod
. This might cause delay or failure in garbage collecting dependent DR resources such as thePVC
,VolumeReplication
, andVolumeReplicationGroup
. It would also prevent a future deployment of the same workload to the cluster as the stale resources are not yet garbage collected.Workaround: Reboot the worker node on which the pod is currently running and stuck in a terminating state. This results in successful pod termination and subsequently related DR API resources are also garbage collected.
Blocklisting can lead to Pods stuck in an error state
Blocklisting due to either network issues or a heavily overloaded or imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in
CreateContainerError
with the message:Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system.
Workaround: Reboot the node to which these pods are scheduled and failing by following these steps:
- Cordon and then drain the node having the issue
- Reboot the node having the issue
Uncordon the node having the issue
Application failover hangs in
FailingOver
state when the managed clusters are on different versions of OpenShift Container Platform and OpenShift Data FoundationDisaster Recovery solution with OpenShift Data Foundation 4.13 protects and restores persistent volume claim (PVC) data in addition to the persistent volume (PV) data. If the primary cluster is on an older OpenShift Data Foundation version and the target cluster is updated to 4.13 then the failover will be stuck as the S3 store will not have the PVC data.
Workaround: When upgrading the Disaster Recovery clusters, the primary cluster must be upgraded first and then the post-upgrade steps must be run.
When DRPolicy is applied to mutiple applications under same namespace, volume replication group is not created
When a DRPlacementControl (DRPC) is created for applications that are co-located with other applications in the namespace, the DRPC has no label selector set for the applications. If any subsequent changes are made to the label selector, the validating admission webhook in the OpenShift Data Foundation Hub controller rejects the changes.
Workaround: Until the admission webhook is changed to allow such changes, the DRPC
validatingwebhookconfigurations
can be patched to remove the webhook:$ oc patch validatingwebhookconfigurations vdrplacementcontrol.kb.io-lq2kz --type=json --patch='[{"op": "remove", "path": "/webhooks"}]'
Deletion of certain managed resources associated with subscription-based workload
During hub recovery, OpenShift Data Foudation encounters a known issue with Red Hat Advanced Cluster Management version 2.7.4 (or higher) where certain managed resources associated with the subscription-based workload might be unintentionally deleted. Currently, this issue does not have any known workaround.
Peer ready status shown as
Unknown
after hub recoveryAfter the zone failure and hub recovery, occasionally, the peer ready status of the subscription and application set applications in their disaster recovery placement control (DRPC) is shown as
Unknown
. This is a cosmetic issue and does not impact the regular functionality of Ramen and is limited to the visual appearance of the DRPC output when viewed using theoc
command.Workaround: Use the YAML output to know the correct status:
$ oc get drpc -o yaml
7.1.1. DR upgrade
This section describes the issues and workarounds related to upgrading Red Hat OpenShift Data Foundation from version 4.12 to 4.13 in disaster recovery environment.
Failover or relocate is blocked for workloads that existed prior to upgrade
OpenShift Data Foundation Disaster Recovery solution protects persistent volume claim (PVC) data in addition to the persistent volume (PV) data. A failover or a relocate is blocked for workloads that existed prior to the upgrade as they do not have the PVC data backed up.
Workaround:
Perform the following steps on the primary cluster for each workload to ensure that the OpenShift Data Foundation Disaster Recovery solution backs up the PVC data in addition to PV data:
Edit the volume replication group (VRG) status:
$ oc edit --subresource=status vrg -n <namespace> <name>
-
Set the ClusterDataProtected status to
False
for eachvrg.Status.ProtectedPVCs
conditions. This populates the S3 store with applications PVCs. - Restart the Ramen pod by scaling it down to 0 and back to 1.
-
Ensure that the S3 store is populated with the application PVCs by waiting for the
ClusterDataProtected
status of all thevrg.Status.ProtectedPVCs
conditions to turn toTRUE
again.
When you want to initiate the failover or relocate operation, perform the following step:
Clear the
status.preferredDecision.ClusterNamespace
field by editing the DRPC status subresource before initiating a failover or a relocate operation (if not already edited ):$ oc edit --subresource=status drpc -n <namespace> <name>
Incorrect value cached
status.preferredDecision.ClusterNamespace
When OpenShift Data Foundation is upgraded from version 4.12 to 4.13, the disaster recovery placement control (DRPC) might have incorrect value cached in
status.preferredDecision.ClusterNamespace
. As a result, the DRPC incorrectly enters theWaitForFencing
PROGRESSION instead of detecting that the failover is already complete. The workload on the managed clusters is not affected by this issue.Workaround:
-
To identify the affected DRPCs, check for any DRPC that is in the state
FailedOver
as CURRENTSTATE and are stuck in theWaitForFencing
PROGRESSION. To clear the incorrect value edit the DRPC subresource and delete the line,
status.PreferredCluster.ClusterNamespace
:$ oc edit --subresource=status drpc -n <namespace> <name>
To verify the DRPC status, check if the PROGRESSION is in
COMPLETED
state andFailedOver
as CURRENTSTATE.
-
To identify the affected DRPCs, check for any DRPC that is in the state
7.2. Ceph
Poor performance of the stretch clusters on CephFS
Workloads with many small metadata operations might exhibit poor performance because of the arbitrary placement of metadata server (MDS) on multi-site Data Foundation clusters.
SELinux relabelling issue with a very high number of files
When attaching volumes to pods in Red Hat OpenShift Container Platform, the pods sometimes do not start or take an excessive amount of time to start. This behavior is generic and it is tied to how SELinux relabelling is handled by the Kubelet. This issue is observed with any filesystem based volumes having very high file counts. In OpenShift Data Foundation, the issue is seen when using CephFS based volumes with a very high number of files. There are different ways to workaround this issue. Depending on your business needs you can choose one of the workarounds from the knowledgebase solution https://access.redhat.com/solutions/6221251.
Ceph becomes unresponsive during net split between data zones
In a two-site stretch cluster with arbiter, during netsplit between data zones, the Rook operator throws an error stating
failed to get ceph status
. This happens because when netsplit is induced to zone1 and zone2, zone1 and arbiter, the zone1 is cut off from the cluster for about 20 minutes. However, it is found that the error thrown in the rook-operator log is not relevant and OpenShift Data Foundation operates normally in netsplit situation.
CephOSDCriticallyFull
andCephOSDNearFull
alerts not firing as expectedThe alerts
CephOSDCriticallyFull
andCephOSDNearFull
do not fire as expected becauseceph_daemon
value format has changed in Ceph provided metrics and these alerts rely on the old value format.
7.3. CSI Driver
Automatic flattening of snapshots does not work
When there is a single common parent RBD PVC, if volume snapshot, restore, and delete snapshot are performed in a sequence more than 450 times, it is further not possible to take volume snapshot or clone of the common parent RBD PVC.
To workaround this issue, instead of performing volume snapshot, restore, and delete snapshot in a sequence, you can use PVC to PVC clone to completely avoid this issue.
If you hit this issue, contact customer support to perform manual flattening of the final restore PVCs to continue to take volume snapshot or clone of the common parent PVC again.
7.4. OpenShift Data Foundation console
Unable to initiate failover of applicationSet based applications after hub recovery
When a primary managed cluster is down after hub recovery, it is not possible to use the user interface (UI) to trigger failover of
applicationSet
based applications if its last action wasRelocate
.Workaround: To trigger failover from the command-line interface(CLI), set the
DRPC.spec.action
field toFailover
as follows:$ oc edit drpc -n openshift-gitops app-placement-drpc
[...] spec action: Failover [...]
Topology view does not work for external mode
In external mode, the nodes are not labelled with OCS label. As a result, the topology view cannot show nodes at the first level.