Chapter 7. Known issues
This section describes known issues in Red Hat OpenShift Data Foundation 4.9.
odf-operator is missing when OpenShift Container Storage is upgraded from version 4.8 to 4.9
Currently, while upgrading the ocs-operator, if you change the channel in the OpenShift Container Storage subscription without installing the odf-operator, the cluster will only have OpenShift Data Foundation and Multicloud Object Gateway (MCG) installed, and the odf-operator will be missing from the cluster.
Workaround: Install the odf-operator from the graphical user interface (GUI) or backend. Ensure that the subscription name is odf-operator if you create it via the backend.
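If you create the subscription from the backend, a minimal sketch of the Subscription resource is shown below; the channel, catalog source, and approval values are assumptions and should be verified against your cluster's catalog:

    $ cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: odf-operator              # the subscription name must be odf-operator
      namespace: openshift-storage
    spec:
      channel: stable-4.9             # assumed channel for OpenShift Data Foundation 4.9
      name: odf-operator
      source: redhat-operators        # assumed catalog source
      sourceNamespace: openshift-marketplace
      installPlanApproval: Automatic  # assumed approval strategy
    EOF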
Multicloud Object Gateway insecure storage account does not support TLS 1.2
Multicloud Object Gateway (MCG) does not support a Microsoft Azure storage account configured with Transport Layer Security (TLS) 1.2. As a result, you cannot create the default backing store or any new backing store on a storage account that enforces a TLS 1.2-only policy.
Critical alert notification is sent after installation of arbiter storage cluster, when Ceph object user for cephobjectstore fails to be created during storage cluster reinstallation
In a storage cluster containing a CephCluster and one or more CephObjectStores, if the CephCluster resource is deleted before all of the CephObjectStore resources are fully deleted, the Rook Operator can still keep connection details about the CephObjectStores in memory. If the same CephCluster and CephObjectStores are re-created, the CephObjectStores might enter a Failed state.
To avoid this issue, delete the CephObjectStores completely before removing the CephCluster. If you do not want to wait for the CephObjectStores to be deleted, restart the Rook Operator (by deleting the Operator Pod) after the uninstall to avoid the issue. If you are actively experiencing this issue, restart the Rook Operator to resolve it; this clears the Operator’s memory of stale CephObjectStore connection details.
(BZ#1974344)
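For example, assuming the default openshift-storage namespace and the standard app=rook-ceph-operator pod label, the cleanup and restart might look like the following sketch:

    # Delete the CephObjectStore resources completely before removing the CephCluster
    $ oc delete cephobjectstore <objectstore-name> -n openshift-storage

    # Restart the Rook Operator by deleting its pod; the replacement pod starts
    # with a clean in-memory state
    $ oc delete pod -l app=rook-ceph-operator -n openshift-storage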
Poor performance of stretch clusters on CephFS
Workloads with many small metadata operations might exhibit poor performance because of the arbitrary placement of metadata server (MDS) on multi-site OpenShift Data Foundation clusters.
rook-ceph-operator-config ConfigMap is not updated when OpenShift Container Storage is upgraded from version 4.5 to another version
ocs-operator uses the rook-ceph-operator-config ConfigMap to configure rook-ceph-operator behaviors; however, it only creates the ConfigMap once and then does not reconcile it. As a result, the default values for the product are not updated as they evolve.
Workaround: Administrators can manually change the rook-ceph-operator-config values.
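For example, administrators can edit the ConfigMap directly or patch a single key; the ROOK_LOG_LEVEL key and value below are only illustrative:

    $ oc edit configmap rook-ceph-operator-config -n openshift-storage

    # Or patch one setting (illustrative key and value):
    $ oc patch configmap rook-ceph-operator-config -n openshift-storage \
        --type merge -p '{"data":{"ROOK_LOG_LEVEL":"DEBUG"}}'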
Automate the creation of cephobjectstoreuser for object bucket claim metrics collector
Currently, the object bucket claim (OBC) metrics collection fails because the ocs-metrics-exporter expects a Ceph object store user named prometheus-user.
Workaround: Manually create prometheus-user and provide the appropriate permissions after the storage cluster creation. For more information, refer to the Prerequisites section of the Knowledge Base article https://access.redhat.com/articles/6541861.
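One possible way to create the user is with a CephObjectStoreUser resource, as sketched below; the object store name is an assumption (the default is typically ocs-storagecluster-cephobjectstore), and the exact permissions the user requires are described in the Knowledge Base article:

    $ cat <<EOF | oc apply -f -
    apiVersion: ceph.rook.io/v1
    kind: CephObjectStoreUser
    metadata:
      name: prometheus-user
      namespace: openshift-storage
    spec:
      store: ocs-storagecluster-cephobjectstore   # assumed default object store name
      displayName: prometheus-user
    EOF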
StorageCluster and StorageSystem ocs-storagecluster are in error state for a few minutes when installing StorageSystem
During StorageCluster creation, there is a small window of time where it appears in an error state before moving on to a successful or ready state. This is an intermittent but expected behavior that usually resolves itself.
Workaround: Wait and watch status messages or logs for more information.
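For example, you can watch the resource status and inspect events while the installation settles; the resource name ocs-storagecluster matches the default name used in this issue:

    $ oc get storagesystem,storagecluster -n openshift-storage -w
    $ oc describe storagecluster ocs-storagecluster -n openshift-storage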
Tenant config does not override backendpath if the key is specified in upper case
Key Management Service (KMS) provider options set in a Tenant's namespace are more advanced than the key/value settings that the OpenShift Container Storage user interface supports. As a result, the configuration options for KMS providers set in a Tenant's namespace need to be formatted in camel case instead of upper case. This can be confusing for users who have access to both the KMS provider configuration in the openshift-storage namespace and the configuration in a Tenant's namespace, because the options in the openshift-storage namespace are in upper case, whereas the options in the Tenant's namespace are in camel case.
Workaround: Use camel case formatting for the KMS provider options.
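As a sketch of the expected formatting, and assuming the tenant override ConfigMap is named ceph-csi-kms-config (an assumption; verify the name for your deployment), the backend path key is written in camel case rather than upper case:

    $ cat <<EOF | oc apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ceph-csi-kms-config      # assumed tenant override ConfigMap name
      namespace: <tenant-namespace>
    data:
      vaultBackendPath: "secret/"    # camel case key, not VAULT_BACKEND_PATH
    EOF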
Deleting a protected application that has been failed over and later relocated does not delete the RADOS block device image on the secondary or failover site
Deleting a disaster recovery (DR) protected workload may leak RADOS block device (RBD) images on the secondary DR cluster. The deleted images would then occupy space on the secondary cluster. To resolve this issue, use a toolbox pod to detect and clean up the images on the secondary cluster that are no longer in use for DR protection. This workaround ensures space reclamation on the secondary cluster.
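A sketch of that cleanup from the toolbox pod, assuming the rook-ceph-tools deployment is enabled and the default ocs-storagecluster-cephblockpool pool name; confirm that an image is genuinely unused before removing it:

    $ oc rsh -n openshift-storage deployment/rook-ceph-tools

    # List RBD images in the pool and check whether an image still has watchers
    sh-4.4$ rbd ls -p ocs-storagecluster-cephblockpool
    sh-4.4$ rbd status ocs-storagecluster-cephblockpool/<image-name>

    # Remove an image that is confirmed to be unused
    sh-4.4$ rbd rm ocs-storagecluster-cephblockpool/<image-name>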
Failover action reports RADOS block device image mount failed on the pod with RPC error still in use
Failing over a disaster recovery (DR) protected workload may result in pods that use the volume on the failover cluster being stuck reporting that the RADOS block device (RBD) image is still in use. This prevents the pods from starting up for a long duration (up to several hours).
Relocate action results in PVCs in Terminating state and workload not moved to the preferred cluster
Relocating a disaster recovery (DR) protected workload can result in the workload not stopping on the current primary cluster and the PVCs of the workload remaining in the Terminating state. This prevents the pods and PVCs from being relocated to the preferred cluster. To recover, perform a failover action to move the workload to the preferred cluster. The workload is recovered on the preferred cluster, but the action may include a loss of data because it is a failover.
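For reference, the failover is typically triggered through the workload's DRPlacementControl resource; the following is only a sketch, and the resource, field names, and values should be verified against the disaster recovery documentation for your version:

    $ oc edit drplacementcontrol <drpc-name> -n <application-namespace>

    # In the spec, set the target cluster and the action, for example:
    #   spec:
    #     failoverCluster: <preferred-cluster-name>
    #     action: Failover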
Failover action reports RADOS block device image mount failed on the pod with RPC error fsck
Failing over a disaster recovery (DR) protected workload may result in pods not starting with volume mount errors that state the volume has file system consistency check (fsck) errors. This prevents the workload from failing over to the failover cluster.
Overprovision Level Policy Control does not support custom storage class
OpenShift Data Foundation limits the storage classes allowed in overprovision-control to Ceph sub-types only. As a result, if a user-defined storage class is used in overprovision-control, the StorageCluster CRD is marked as invalid and that storage class cannot be used with overprovision-control.
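As an illustration only, an overprovision control entry in the StorageCluster spec must reference a Ceph-backed storage class such as the default ocs-storagecluster-ceph-rbd; the field names in this sketch are assumptions and may differ between versions:

    $ oc edit storagecluster ocs-storagecluster -n openshift-storage

    # Illustrative entry (field names are a sketch):
    #   spec:
    #     overprovisionControl:
    #     - capacity: 100Gi
    #       storageClassName: ocs-storagecluster-ceph-rbd   # must be a Ceph sub-type class
    #       quotaName: rbd-overprovision-quota
    #       selector:
    #         labels:
    #           matchLabels:
    #             storagequota: enabled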