Chapter 7. Known issues
This section describes the known issues in Red Hat OpenShift Data Foundation 4.11.
7.1. Disaster recovery
Creating an application namespace for the managed clusters
The application namespace needs to exist on RHACM managed clusters for disaster recovery (DR) related pre-deployment actions, and so it is pre-created when an application is deployed at the RHACM hub cluster. However, if the application is deleted at the hub cluster and its corresponding namespace is deleted on the managed clusters, the namespace reappears on the managed clusters.
Workaround: openshift-dr maintains a namespace manifestwork resource in the managed cluster namespace at the RHACM hub. These resources need to be deleted after the application deletion. For example, as a cluster administrator, execute the following command on the hub cluster:
$ oc delete manifestwork -n <managedCluster namespace> <drPlacementControl name>-<namespace>-ns-mw
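For example, with a hypothetical managed cluster namespace of cluster1 and a DRPlacementControl named busybox-drpc that protected the busybox-sample namespace (both names are illustrative), you could first list the ManifestWork resources and then delete the one created for the namespace:
$ oc get manifestwork -n cluster1
$ oc delete manifestwork -n cluster1 busybox-drpc-busybox-sample-ns-mw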
Failover action reports RADOS block device image mount failed on the pod with RPC error still in use
Failing over a disaster recovery (DR) protected workload might result in pods that use the volume on the failover cluster becoming stuck, reporting that the RADOS block device (RBD) image is still in use. This prevents the pods from starting for a long duration (up to several hours).
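To confirm that a pod is affected, you can check its events for the RBD mount error; the pod and namespace names below are placeholders:
$ oc describe pod <pod-name> -n <application-namespace>
$ oc get events -n <application-namespace> --field-selector involvedObject.name=<pod-name>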
Failover action reports RADOS block device image mount failed on the pod with RPC error fsck
Failing over a disaster recovery (DR) protected workload might result in pods failing to start with volume mount errors stating that the volume has file system consistency check (fsck) errors. This prevents the workload from failing over to the failover cluster.
Relocation fails when failover and relocate are performed within a few minutes of each other
When the user starts relocating an application from one cluster to another before the PeerReady condition status is TRUE, that is, before the peer (target cluster) is in a clean state, the relocation stalls forever. The PeerReady condition status can be checked in the DRPC YAML or by running the following oc command:
$ oc get drpc -o yaml -n <application-namespace>
where <application-namespace> is the namespace where the workloads for the application are deployed.
Workaround: Change the DRPC .Spec.Action back to Failover, and wait until the PeerReady condition status is TRUE. After applying the workaround, change the Action to Relocate, and the relocation will take effect.
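A minimal sketch of the workaround with oc patch, assuming the DRPlacementControl is named <drpc-name> and that the action is stored in the spec.action field (verify the exact field in your DRPC YAML before patching):
$ oc patch drpc <drpc-name> -n <application-namespace> --type merge -p '{"spec":{"action":"Failover"}}'
$ oc get drpc <drpc-name> -n <application-namespace> -o jsonpath='{.status.conditions[?(@.type=="PeerReady")].status}'
Once the PeerReady condition reports True, switch the action back:
$ oc patch drpc <drpc-name> -n <application-namespace> --type merge -p '{"spec":{"action":"Relocate"}}'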
Users can set the Sync schedule to zero minutes while creating a DR Policy, which reports 'Sync' as the replication policy and gets validated on a Regional-DR setup
The DRPolicyList page uses the sync interval value to display the replication type. If it is set to zero, the replication type is considered Sync (synchronous) for metro as well as regional clusters. This confuses users because the backend considers the policy Async even when the user interface shows the scheduling type as Sync.
Workaround: Fetch the Ceph FSID from the DRCluster CR status to decide between sync and async.
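A hedged sketch of that check: list the DRCluster resources on the hub cluster and inspect their status for the Ceph FSID (the exact status field name may vary by version, so review the full YAML). In a typical setup, matching FSIDs on both clusters indicate a shared Ceph cluster and therefore synchronous (Metro-DR) replication, while different FSIDs indicate asynchronous (Regional-DR) replication:
$ oc get drcluster
$ oc get drcluster <drcluster-name> -o yaml | grep -i fsid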
Deletion of the Application deletes the pods but not PVCs
When deleting an application from the RHACM console, the DRPC does not get deleted. Because the DRPC is not deleted, the VRG and the VR are not deleted either. If the VRG/VR is not deleted, the PVC finalizer list is not cleaned up, causing the PVC to stay in a Terminating state.
Workaround: Manually delete the DRPC on the hub cluster using the following command:
$ oc delete drpc <name> -n <namespace>
Result:
- DRPC deletes the VRG
- VRG deletes VR
- VR removes its finalizer from the PVC’s finalizer list
- VRG removes its finalizer from the PVC’s finalizer list
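To verify the cleanup on the managed cluster, assuming <namespace> is the application namespace (placeholder), you can confirm that the VolumeReplicationGroup is gone and that the PVC finalizers have been cleared:
$ oc get volumereplicationgroups -n <namespace>
$ oc get pvc -n <namespace> -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'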
Both the DRPCs protect all the persistent volume claims created in the same namespace
In a namespace that hosts multiple disaster recovery (DR) protected workloads, every DRPlacementControl resource on the hub cluster that does not isolate PVCs by workload using its spec.pvcSelector field protects all the persistent volume claims (PVCs) within that namespace. When PVCs match the spec.pvcSelector of multiple DRPlacementControl resources, or when the selector is missing across all workloads, replication management can manage each PVC multiple times and cause data corruption or invalid operations based on the individual DRPlacementControl actions.
Workaround: Label the PVCs that belong to a workload uniquely, and use that label as the DRPlacementControl spec.pvcSelector to disambiguate which DRPlacementControl protects and manages which subset of PVCs within a namespace. It is not possible to specify the spec.pvcSelector field for the DRPlacementControl using the user interface, hence the DRPlacementControl for such applications must be deleted and created using the command line, as sketched below.
Result: PVCs are no longer managed by multiple DRPlacementControl resources and do not cause any operational or data inconsistencies.
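A minimal sketch of the workaround, assuming the workload's PVCs carry a hypothetical label appname=busybox-sample and that the recreated DRPlacementControl otherwise copies the fields of the existing one (the apiVersion shown is an assumption; confirm the exact group and version on your cluster first, for example with oc api-resources):
$ oc label pvc <pvc-name> -n <application-namespace> appname=busybox-sample
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPlacementControl
metadata:
  name: <drpc-name>
  namespace: <application-namespace>
spec:
  pvcSelector:
    matchLabels:
      appname: busybox-sample
  # Remaining spec fields (DR policy reference, placement reference, and so on)
  # should be copied from the DRPlacementControl that is being recreated.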
RBD mirror scheduling is getting stopped for some images
The Ceph manager daemon gets blocklisted for various reasons, which prevents the scheduled RBD mirror snapshots from being triggered on the cluster where the image(s) are primary. All RBD images that are mirror enabled (and hence DR protected) do not list a schedule when examined using rbd mirror snapshot schedule status -p ocs-storagecluster-cephblockpool, and hence are not actively mirrored to the peer site.
Workaround: Restart the Ceph manager deployment on the managed cluster where the images are primary to overcome the blocklist against the currently running instance. This can be done by scaling the Ceph manager deployment down and then back up, as follows:
$ oc -n openshift-storage scale deployments/rook-ceph-mgr-a --replicas=0
$ oc -n openshift-storage scale deployments/rook-ceph-mgr-a --replicas=1
Result: Images that are DR enabled and denoted as primary on a managed cluster start reporting mirroring schedules when examined using rbd mirror snapshot schedule status -p ocs-storagecluster-cephblockpool.
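A hedged way to run that check from the cluster, assuming the rook-ceph-tools (toolbox) deployment is enabled in the openshift-storage namespace:
$ oc -n openshift-storage exec -it deploy/rook-ceph-tools -- rbd mirror snapshot schedule status -p ocs-storagecluster-cephblockpool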
Ceph does not recognize the global IP assigned by Globalnet
Ceph does not recognize the global IP assigned by Globalnet, so the disaster recovery solution cannot be configured between clusters with overlapping service CIDRs using Globalnet. As a result, the disaster recovery solution does not work when the service CIDRs overlap.
Volume replication group deletion is stuck on a fresh volume replication created during deletion, which is stuck as the persistent volume claim cannot be updated with a finalizer
Due to a bug in the disaster recovery (DR) reconciler, during deletion of the internal VolumeReplicationGroup resource on a managed cluster from where a workload was failed over or relocated, an attempt is made to protect a persistent volume claim (PVC). The resulting cleanup operation does not complete and is reported through the PeerReady condition on the DRPlacementControl for the application.
This results in the application that was failed over or relocated not being able to be relocated or failed over again, because the DRPlacementControl resource reports its PeerReady condition as false.
Workaround: Before applying the workaround, determine whether the cause is a PVC being protected during VolumeReplicationGroup deletion, as follows:
Ensure that the VolumeReplicationGroup (VRG) resource in the workload namespace, on the managed cluster from where the workload was relocated or failed over, has the following values:
- VRG metadata.deletionTimestamp is non-zero
- VRG spec.replicationState is Secondary
List the VolumeReplication resources in the same workload namespace, and ensure that each resource has the following values:
- metadata.generation is set to 1
- spec.replicationState is set to Secondary
- The VolumeReplication resource reports no status
For each VolumeReplication resource in the above state, the corresponding PVC resource (as seen in the VR spec.dataSource field) should have metadata.deletionTimestamp set to non-zero.
To recover, remove the following finalizers (see the sketch after the Result):
- volumereplicationgroups.ramendr.openshift.io/vrg-protection from the VRG resource
- volumereplicationgroups.ramendr.openshift.io/pvc-vr-protection from the respective PVC resources
Result: DRPlacementControl at the hub cluster reports the PeerReady condition as true and enables further workload relocation or failover actions. (BZ#2116605)
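A hedged sketch of the finalizer removal, using placeholder resource names; inspect the current finalizers first and remove only the two entries named above (for example with oc edit), rather than clearing the finalizer list wholesale:
$ oc get volumereplicationgroups <vrg-name> -n <workload-namespace> -o jsonpath='{.metadata.finalizers}'
$ oc edit volumereplicationgroups <vrg-name> -n <workload-namespace>
$ oc get pvc <pvc-name> -n <workload-namespace> -o jsonpath='{.metadata.finalizers}'
$ oc edit pvc <pvc-name> -n <workload-namespace>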
MongoDB pod is in CrashLoopBackoff because of permission errors reading data in the cephrbd volume
The OpenShift projects across different managed clusters have different security context constraints (SCC), which specifically differ in the specified UID range and/or FSGroups. This leads to certain workload pods and containers failing to start after failover or relocate operations within these projects, with filesystem access errors in their logs.
Workaround: Ensure that workload projects are created on all managed clusters with the same project-level SCC labels, allowing them to use the same filesystem context when failed over or relocated. Pods then no longer fail on filesystem-related access errors after DR actions.
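A hedged way to compare the project-level security context settings across the managed clusters; in a typical OpenShift installation the effective UID range and supplemental groups are recorded as openshift.io/sa.scc.* annotations on the namespace, so the values should match on both clusters:
$ oc get namespace <workload-namespace> -o yaml | grep 'openshift.io/sa.scc'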
While failing over to the secondary cluster, some PVCs remain in the primary cluster
Before Kubernetes v1.23, the Kubernetes control plane never cleaned up the PVCs created for StatefulSets; that was left to the cluster administrator or to a software operator managing the StatefulSets. Because of this, the PVCs of a StatefulSet are left untouched when its Pods are deleted, which prevents Ramen from failing an application back to its original cluster.
Workaround: If the workload uses StatefulSets, then do the following before failing back or relocating to another cluster:
- Run oc get drpc -n <namespace> -o wide. If the PeerReady column shows TRUE, you can proceed with the failback or relocation. Otherwise, do the following on the peer cluster:
- Run oc get pvc -n <namespace>.
- For each bound PVC in that namespace that belongs to the StatefulSet, run oc delete pvc <pvcname> -n <namespace> (see the sketch after this procedure).
- Once all PVCs are deleted, the Volume Replication Group (VRG) transitions to Secondary and then gets deleted.
- Run oc get drpc -n <namespace> -o wide again. After a few seconds to a few minutes, the PeerReady column changes to TRUE. Then you can proceed with the failback or relocation.
Result: The peer cluster is cleaned up and ready for the new 'Action'. (BZ#2118270)
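A hedged sketch of the PVC cleanup step, assuming the StatefulSet's PVCs carry a common label such as app=<statefulset-app-label> (a hypothetical label; StatefulSet PVC names otherwise follow the <volumeClaimTemplate-name>-<statefulset-name>-<ordinal> pattern):
$ oc get pvc -n <namespace> -l app=<statefulset-app-label>
$ oc delete pvc -n <namespace> -l app=<statefulset-app-label>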
Application is stuck in Relocating state during failback
Multicloud Object Gateway allows multiple persistent volume (PV) objects of the same name or namespace to be added to the S3 store on the same path. Because of this, Ramen does not restore the PV, as it detects multiple versions pointing to the same claimRef.
Workaround: Use an S3 CLI or equivalent to clean up the duplicate PV objects from the S3 store. Keep only the one whose timestamp is closer to the failover or relocate time.
Result: The restore operation proceeds to completion, and the failover or relocate operation proceeds to the next step.
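A hedged sketch of that cleanup with the AWS CLI, where the Multicloud Object Gateway S3 endpoint, bucket name, and object path are placeholders for the values configured in your DR setup:
$ aws s3 ls s3://<bucket-name>/<pv-object-prefix>/ --recursive --endpoint-url https://<mcg-s3-endpoint>
$ aws s3 rm s3://<bucket-name>/<path-to-duplicate-pv-object> --endpoint-url https://<mcg-s3-endpoint>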
Application is stuck in a FailingOver state when a zone is down
At the time of a failover or relocate, if none of the S3 stores is reachable, the failover or relocate process hangs. If the OpenShift DR logs indicate that the S3 store is not reachable, troubleshoot and get the S3 store operational so that OpenShift DR can proceed with the failover or relocate operation.
ceph df reports an invalid MAX AVAIL value when the cluster is in stretch mode
When a CRUSH rule for a Red Hat Ceph Storage cluster has multiple "take" steps, the ceph df report shows the wrong maximum available size for the map. The issue will be fixed in an upcoming release.
7.2. Multicloud Object Gateway
rook-ceph-operator-config ConfigMap is not updated when OpenShift Container Storage is upgraded from version 4.5 to a later version
ocs-operator uses the rook-ceph-operator-config ConfigMap to configure rook-ceph-operator behaviors; however, it only creates the ConfigMap once and does not reconcile it afterwards. As a result, the default values for the product are not updated as they evolve.
Workaround: Administrators can manually change the rook-ceph-operator-config values.
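A minimal sketch of the manual edit, assuming the default openshift-storage namespace:
$ oc -n openshift-storage edit configmap rook-ceph-operator-config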
Storage cluster and storage system
ocs-storagecluster is in an error state for a few minutes when installing the storage system
During storage cluster creation, there is a small window of time during which it appears in an error state before moving to a successful or ready state. This is an intermittent state, so it usually resolves by itself and becomes successful or ready.
Workaround: Wait and watch the status messages or logs for more information.
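A hedged way to watch the state transition, assuming the default openshift-storage namespace:
$ oc get storagecluster -n openshift-storage -w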
7.3. CephFS
Ceph OSD snap trimming is no longer blocked by a running scrub
Previously, OSD snap trimming, once blocked by a running scrub, was not restarted. As a result, no trimming was performed until an OSD reset. This release fixes the handling so that trimming is restarted if it was blocked by a scrub, and snap trimming works as expected.
Poor performance of the stretch clusters on CephFS
Workloads with many small metadata operations might exhibit poor performance because of the arbitrary placement of metadata server (MDS) on multi-site OpenShift Data Foundation clusters.
Restoring snapshot fails with size constraint when the parent PVC is expanded
If a new restored persistent volume claim (PVC) is created from a volume snapshot with the same size as the volume snapshot, the restore fails if the parent PVC was resized after the volume snapshot was taken and before the new restored PVC was created.
Workaround: You can use any one of the following workarounds (see the sketch after this list):
- Do not resize the parent PVC if you have any volume snapshot created from it and you have a plan to restore the volume snapshot to a new PVC.
- Create a restored PVC of the same size as the parent PVC.
- If the restored PVC is already created and is in the pending state, delete the PVC and recreate it with the same size as the parent PVC.
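A minimal sketch of a restored PVC whose requested size matches the current (expanded) size of the parent PVC; the names, storage class, and size are placeholders, and the parent's current size can be read first:
$ oc get pvc <parent-pvc-name> -n <application-namespace> -o jsonpath='{.spec.resources.requests.storage}'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <restored-pvc-name>
  namespace: <application-namespace>
spec:
  storageClassName: <same-storage-class-as-parent-pvc>
  dataSource:
    name: <volume-snapshot-name>
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: <current-parent-pvc-size>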
7.4. OpenShift Data Foundation operator
PodSecurityViolation alert starts to fire when the OpenShift Data Foundation operator is installed
OpenShift introduced Pod Security Admission to enforce security restrictions on Pods when they are scheduled. In OpenShift 4.11, Pod Security Admission generates audit and warn events while enforcing the privileged level (the same as 4.10).
As a result, you will see warnings in events because the openshift-storage namespace does not have the required enforcement labels for Pod Security Admission. (BZ#2110628)
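A hedged way to inspect the Pod Security Admission labels on the namespace (pod-security.kubernetes.io/* are the standard upstream label keys):
$ oc get namespace openshift-storage --show-labels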