Chapter 7. Known issues
This section describes known issues in Red Hat OpenShift Container Storage 4.8.
Arbiter nodes cannot be labelled with the OpenShift Container Storage node label
Arbiter nodes are treated as valid non-arbiter nodes if they are labelled with the OpenShift Container Storage node label, cluster.ocs.openshift.io/openshift-storage. This makes the placement of non-arbiter resources unpredictable. To work around this issue, do not label the arbiter nodes with the OpenShift Container Storage node label, so that only arbiter resources are placed on the arbiter nodes.
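For example, you can list the nodes that currently carry the storage label and remove it from an arbiter node that was labelled by mistake; <arbiter-node> is a placeholder:
$ oc get nodes -l cluster.ocs.openshift.io/openshift-storage
# Remove the storage label from the arbiter node (the trailing "-" removes the label)
$ oc label node <arbiter-node> cluster.ocs.openshift.io/openshift-storage-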
Ceph status is HEALTH_WARN after disk replacement
After disk replacement, the warning 1 daemons have recently crashed is displayed even though all OSD pods are up and running, and the Ceph status is reported as HEALTH_WARN instead of HEALTH_OK. To work around this issue, rsh into the ceph-tools pod and silence the warning; the Ceph health then returns to HEALTH_OK.
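The following is a minimal sketch of that workaround, assuming the default openshift-storage namespace and the rook-ceph-tools deployment name:
# List the recorded crash reports from the toolbox (ceph-tools) pod
$ oc rsh -n openshift-storage deploy/rook-ceph-tools ceph crash ls
# Archive all crash reports to silence the warning
$ oc rsh -n openshift-storage deploy/rook-ceph-tools ceph crash archive-all
# Verify that Ceph health is back to HEALTH_OK
$ oc rsh -n openshift-storage deploy/rook-ceph-tools ceph health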
Monitoring spec is reset in the CephCluster resource
The monitoring spec becomes empty whenever ocs-operator is restarted or during an upgrade. This has no impact on functionality, but if you look for the monitoring endpoint details, you will find them empty.
To resolve this issue, update the rook-ceph-external-cluster-details secret after upgrading from 4.7 to 4.8 so that the details of all endpoints (such as the comma-separated IP addresses of the active and standby MGRs) are written to the MonitoringEndpoint data key. This also helps avoid future problems caused by differences in the number of endpoints between fresh and upgraded clusters.
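As an illustration, assuming the default openshift-storage namespace, you can inspect the secret and then edit it to add the endpoint details under the MonitoringEndpoint data key; the exact layout of the secret data may differ in your cluster:
# Inspect the current contents of the secret
$ oc get secret rook-ceph-external-cluster-details -n openshift-storage -o yaml
# Edit the secret and update the entry carrying the MonitoringEndpoint data key
# with the comma-separated IP addresses of the active and standby MGRs
$ oc edit secret rook-ceph-external-cluster-details -n openshift-storage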
Issue with noobaa-db-pg-0
The noobaa-db-pg-0 pod does not migrate to other nodes when its hosting node goes down. Because migration of the noobaa-db-pg-0 pod is blocked, NooBaa does not work while that node is down.
If a plugin pod is deleted, the data becomes inaccessible until a node restart takes place
This issue occurs because the netns of the mount is destroyed when the csi-cephfsplugin pod is restarted, which results in csi-cephfsplugin locking up all mounted volumes. The issue is seen only in clusters with Multus enabled.
To resolve the issue, restart the node on which the csi-cephfsplugin pod was restarted after the deletion.
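One way to restart the affected node gracefully, assuming <node-name> is the node on which the csi-cephfsplugin pod was restarted, is the usual cordon, drain, and reboot sequence:
# Prevent new workloads from landing on the node and evict the existing ones
$ oc adm cordon <node-name>
$ oc adm drain <node-name> --ignore-daemonsets --delete-emptydir-data
# Reboot the node from a debug shell
$ oc debug node/<node-name> -- chroot /host systemctl reboot
# After the node is back, allow scheduling again
$ oc adm uncordon <node-name>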
Encryption passphrase is stored in the source KMS for restoring volumes from the snapshot
When the parent and the restored PVC have different StorageClasses with different backend paths in the KMS settings, the restored PVC goes into the Bound state and the encryption passphrase is created in the backend path of the KMS settings taken from the snapshot. The restored PVC can then fail to attach to a Pod, because the checks for the encryption passphrase use the settings linked from the second StorageClass path, where the encryption passphrase cannot be found in the backend path.
To prevent this issue, always use the same KMS settings for PVCs when creating snapshots and restoring them.
Keys are still listed in Vault after deleting encrypted PVCs while using the kv-v2 secret engine
HashiCorp Vault's key-value store v2 soft-deletes stored keys: the contents can be recovered as long as the metadata of the deleted key is not removed in a separate step. When key-value v2 storage is used for secrets in HashiCorp Vault, deleting volumes does not remove the metadata of the encryption passphrase from the KMS, which makes it possible to restore the encryption passphrase at a later time. These partially deleted keys are not automatically cleaned up by the KMS.
You can resolve this issue by manually deleting the metadata of the removed keys. Any key that has deletion_time set in its metadata would have been fully deleted with key-value storage v1, but is kept available with v2.
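For example, assuming <backend-path> and <key-name> are the kv-v2 mount and key used for the deleted PVC, you can confirm the soft deletion and remove the remaining metadata:
# A populated deletion_time in the output indicates a soft-deleted key
$ vault kv metadata get <backend-path>/<key-name>
# Permanently remove the key, including all versions and metadata
$ vault kv metadata delete <backend-path>/<key-name>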
Restore Snapshot/Clone operations with a size greater than the parent PVC result in an endless loop
Ceph CSI does not support restoring a snapshot or creating clones with a size greater than the parent PVC. Therefore, Restore Snapshot/Clone operations with a greater size result in an endless loop. To work around this issue, delete the pending PVC. To get a larger PVC, complete one of the following based on the operation you are using, as shown in the sketch after this list:
- If using Snapshots, restore the existing snapshot to create a volume of the same size as the parent PVC, then attach it to a pod and expand the PVC to the required size. For more information, see Volume snapshots.
- If using Clone, clone the parent PVC to create a volume of the same size as the parent PVC, then attach it to a pod and expand the PVC to the required size. For more information, see Volume cloning.
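The following is a minimal sketch of the snapshot path; the PVC name restored-pvc, the snapshot name, the ocs-storagecluster-ceph-rbd storage class, and the 10Gi/20Gi sizes are placeholders or assumptions, and the clone path is analogous (use the parent PVC as the dataSource):
# Restore the snapshot to a new PVC of the SAME size as the parent (10Gi here)
$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
spec:
  storageClassName: ocs-storagecluster-ceph-rbd
  dataSource:
    name: <snapshot-name>
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
# After attaching the PVC to a pod, expand it to the required size (20Gi here)
$ oc patch pvc restored-pvc -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'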
PVC restored from a snapshot or cloned from a thick-provisioned PVC is not thick provisioned
When the snapshot of a thick-provisioned PVC is restored using a storage class with thick provisioning enabled, the restored volume is not thick provisioned. The restored PVC reaches the Bound state without thick provisioning. This can only be fixed when RHCS 5.x is used; older Ceph versions do not support copying of zero-filled data blocks (used for thick provisioning).
Currently, for RHCS 4.x based deployments, PVC cloning and snapshot restoring of thick-provisioned volumes is a known limitation; the newly created volumes are thin-provisioned.
Deleting the pending PVC and the RBD provisioner leader pod while thick provisioning is in progress leaves a stale image and OMAP metadata
While an RBD PVC is being thick provisioned, the Persistent Volume Claim (PVC) is in the Pending state. If the RBD provisioner leader and the PVC itself are deleted, the RBD image and OMAP metadata are not deleted.
To avoid this issue, do not delete the PVC while thick provisioning is in progress.
Provisioning attempts do not stop once the storage cluster utilization reaches 85%, even after deleting the PVC
If the storage cluster utilization reaches 85% while an RBD thick PVC is being provisioned, the provisioning attempt does not stop automatically, and the RBD image is not deleted even after the pending PVC is deleted.
The best approach is not to start provisioning if the requested size exceeds the available storage, as illustrated below.
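For example, the raw and per-pool available capacity can be checked from the toolbox pod before a thick-provisioned volume is requested (assuming the default rook-ceph-tools deployment in the openshift-storage namespace):
# Review MAX AVAIL per pool and the overall available raw capacity
$ oc rsh -n openshift-storage deploy/rook-ceph-tools ceph df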
Keys for OSDs in the Vault are not deleted during uninstall when kv-v2 is used
Key encryption keys are only soft-deleted from Vault during cluster deletion when the Vault K/V secret engine is version 2. Because the metadata is still visible, any version of a key can be retrieved and the deletion effectively undone. If this causes inconvenience, the keys can still be deleted manually using the vault command with the destroy argument.
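For example, assuming a kv-v2 backend and placeholder paths, the retained versions can be destroyed and the metadata removed so that the key no longer appears in listings:
# Permanently destroy the soft-deleted version(s) of the key
$ vault kv destroy -versions=1 <backend-path>/<osd-key-name>
# Optionally remove the metadata as well
$ vault kv metadata delete <backend-path>/<osd-key-name>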
Deletion of CephBlockPool gets stuck and blocks the creation of new pools
In a Multus-enabled cluster, the Rook Operator does not have the network annotations and therefore does not have access to the OSD network. This means that when rbd commands are run during pool cleanup, they hang because they cannot contact the OSDs. The workaround is to delete the CephBlockPool manually using the toolbox.
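The following is a sketch of the manual cleanup, assuming the default openshift-storage namespace and rook-ceph-tools deployment; <pool-name> is a placeholder, and deleting a pool requires that mon_allow_pool_delete is enabled in the cluster:
# Delete the underlying Ceph pool from the toolbox pod
$ oc rsh -n openshift-storage deploy/rook-ceph-tools \
    ceph osd pool rm <pool-name> <pool-name> --yes-i-really-really-mean-it
# If the CephBlockPool resource remains stuck in deletion, remove its finalizers
$ oc patch cephblockpool <pool-name> -n openshift-storage \
    --type merge -p '{"metadata":{"finalizers":null}}'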
Device replacement action cannot be performed through the user interface for an encrypted OpenShift Container Storage cluster
On an encrypted OpenShift Container Storage cluster, the discovery result CR reports the device backing a Ceph OSD (Object Storage Daemon) differently from the device reported in the Ceph alerts. When clicking the alert, the user is presented with a Disk not found message. Due to this mismatch, the console UI cannot enable the disk replacement option for an OpenShift Container Storage user. To work around this issue, use the CLI procedure for failed device replacement in the Replacing Devices guide.
False alert for PVCs with volumeMode as Block
Due to a change in Kubernetes, there is a regression in a Prometheus alert of OpenShift Container Platform. This change impacts the following:
Alert: KubePersistentVolumeFillingUp
PVCs: PVCs in volumeMode: Block
Matching regular expression in the namespaces: "(openshift-.*|kube-.*|default|logging)"
Metric: kubelet_volume_stats_available_bytes
As a result, the metric kubelet_volume_stats_available_bytes reports the available size as 0 from the time of PVC creation, and a false KubePersistentVolumeFillingUp alert is fired for all PVCs in volumeMode: Block in namespaces that match the regular expression "(openshift-.*|kube-.*|default|logging)". This affects all PVCs created for the OSD device sets of OpenShift Container Storage deployed in internal and internal-attached modes, on infrastructures such as Amazon Web Services, VMware, bare metal, and so on. It also affects customer workload PVCs.
Currently, there is no workaround available for this issue until it is fixed in an upcoming minor release of OpenShift Container Platform 4.8.z. Hence, address any alert regarding storage capacity raised by OpenShift Container Storage promptly and with urgency.
Critical alert notification is sent after installation of an arbiter storage cluster when the Ceph object user for the cephobjectstore fails to be created during storage cluster reinstallation
In a storage cluster containing a CephCluster and one or more CephObjectStores, if the CephCluster resource is deleted before all of the CephObjectStore resources are fully deleted, the Rook Operator can still keep connection details about the CephObjectStore(s) in memory. If the same CephCluster and CephObjectStore(s) are re-created, the CephObjectStore(s) may enter the Failed state.
To avoid this issue, delete the CephObjectStore(s) completely before removing the CephCluster.
- If you do not wish to wait for the CephObjectStore(s) to be deleted, restarting the Rook Operator (by deleting the Operator pod, as shown in the example after this list) after uninstall avoids the issue.
- If you are actively experiencing this issue, restarting the Rook Operator resolves it by clearing the Operator's memory of old CephObjectStore connection details.
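For example, assuming the default openshift-storage namespace and the standard app=rook-ceph-operator label, the Operator pod can be deleted and is recreated automatically:
# Restart the Rook Operator by deleting its pod
$ oc delete pod -n openshift-storage -l app=rook-ceph-operator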