Chapter 6. Bug fixes
This section describes the notable bug fixes introduced in Red Hat OpenShift Data Foundation 4.13.
6.1. Multicloud Object Gateway
Reconcile of disableLoadBalancerService field is ignored in OpenShift Data Foundation operator
Previously, any change to the disableLoadBalancerService field for Multicloud Object Gateway (MCG) was overridden due to the OpenShift Data Foundation operator reconciliation.
With this fix, reconcile of the disableLoadBalancerService field is ignored in the OpenShift Data Foundation operator and, as a result, any value set for this field in the NooBaa CR is retained and not overridden.
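For example, a minimal sketch of setting this field in the NooBaa CR; the CR name and namespace follow a default OpenShift Data Foundation installation and may differ in your environment:
apiVersion: noobaa.io/v1alpha1
kind: NooBaa
metadata:
  name: noobaa                      # default NooBaa CR created by the operator
  namespace: openshift-storage      # assumed default namespace
spec:
  # When true, MCG exposes its endpoints through ClusterIP services instead of
  # a LoadBalancer service. With this fix, the operator no longer overrides
  # this value during reconciliation.
  disableLoadBalancerService: true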
Performance improvement for non-optimized database-related flows on deletions
Previously, non-optimized database-related flows on deletions caused Multicloud Object Gateway CPU usage to spike and performance to degrade in mass delete scenarios, for example, when reclaiming a deleted object bucket claim (OBC).
With this fix, indexes for the bucket reclaimer process are optimized, a new index is added to the database to speed up the database cleaner flows, and bucket reclaimer changes are introduced to work on batches of objects.
OpenShift generated certificates used for MCG internal flows to avoid errors
Previously, some of the Multicloud Object Gateway (MCG) internal flows failed with errors that resulted in failed client operations. This was due to the use of self-signed certificates for internal communication between MCG components.
With this fix, an OpenShift Container Platform generated certificate is used for internal communication between MCG components, thereby avoiding the errors in the internal flows.
Metric for number of bytes used by Multicloud Object Gateway bucket
Previously, there was no metric to show the number of bytes used by a Multicloud Object Gateway bucket.
With this fix, a new metric, NooBaa_bucket_used_bytes, is added, which shows the number of bytes used by a Multicloud Object Gateway bucket.
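As an illustration only, a hedged PrometheusRule sketch that alerts on the new metric; the rule name, threshold, and labels are assumptions and are not part of the product:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mcg-bucket-usage            # hypothetical rule name
  namespace: openshift-storage      # assumed default namespace
spec:
  groups:
  - name: mcg-bucket-usage
    rules:
    - alert: MCGBucketUsageHigh     # hypothetical alert
      # NooBaa_bucket_used_bytes reports the number of bytes used by an MCG bucket.
      expr: 'NooBaa_bucket_used_bytes > 100 * 1024 * 1024 * 1024'
      for: 30m
      labels:
        severity: warning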
Public access disabled for Microsoft Azure blob storage
Previously, the default container created in Microsoft Azure had public access enabled, which caused security concerns.
With this fix, the default container is created without public access enabled, which means AllowBlobPublicAccess is set to false.
Multicloud Object Gateway buckets are deleted even when replication rules are set
Previously, if replication rules were set for a Multicloud Object Gateway bucket, the bucket was not considered eligible for deletion and, as a result, such buckets were never deleted.
With this fix, the replication rules on a bucket are updated when the bucket is being deleted and, as a result, the bucket is deleted.
Database init container ownership replaced with Kubernetes FSGroup
Previously, Multicloud Object Gateway (MCG) failed to come up and serve when the init container for the MCG database (DB) pod failed to change ownership.
With this fix, the DB init container ownership change is replaced with Kubernetes FSGroup. (BZ#2115616)
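The change relies on the standard Kubernetes fsGroup mechanism instead of an ownership-changing init container. A minimal sketch of the pattern; the group ID, image, and volume names are illustrative and are not the values used by the MCG DB pod:
apiVersion: v1
kind: Pod
metadata:
  name: db-example                  # illustrative pod, not the actual MCG DB pod
spec:
  securityContext:
    # Kubernetes changes the group ownership of the mounted volume to this
    # group, so no init container is needed to fix ownership.
    fsGroup: 10001
  containers:
  - name: db
    image: registry.example.com/postgresql:latest   # placeholder image
    volumeMounts:
    - name: data
      mountPath: /var/lib/pgsql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: db-data            # illustrative PVC name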
6.2. CephFS
cephfs-top is able to display more than 100 clients
Previously, when you tried to load more than 100 clients in cephfs-top, in a few instances it showed a blank screen and went into a hung state because cephfs-top could not accommodate the clients in the display due to little or no space. Because the clients were displayed based on x_coord_map calculations, cephfs-top could not accommodate more clients in the display.
This issue is fixed as a part of another BZ in Ceph, where ncurses scrolling and a new way of displaying clients were introduced in cephfs-top. The x_coord_map calculation was also dropped. As a result, cephfs-top now displays 200 or more clients.
6.3. Ceph container storage interface (CSI)
RBD Filesystem PVC expands even when the StagingTargetPath is missing
Previously, the RADOS block device (RBD) Filesystem persistent volume claim (PVC) expansion was not successful when the StagingTargetPath was missing in the NodeExpandVolume remote procedure call (RPC) and Ceph CSI was not able to get the device details to expand.
With this fix, Ceph CSI goes through all the mount references to identify the StagingTargetPath where the RBD image is mounted. As a result, the RBD Filesystem PVC expands successfully even when the StagingTargetPath is missing.
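Expansion itself is requested in the standard Kubernetes way by increasing spec.resources.requests.storage on the existing PVC. A sketch of the PVC after the increase, assuming a hypothetical PVC name and a StorageClass that allows volume expansion:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-fs-pvc                  # hypothetical PVC backed by an RBD filesystem volume
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ocs-storagecluster-ceph-rbd   # default RBD StorageClass name; may differ
  resources:
    requests:
      storage: 20Gi                 # increased from the original size to trigger NodeExpandVolume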
Default memory and CPU resource limit increased
Previously, odf-csi-addons-operator had a low memory resource limit and, as a result, the odf-csi-addons-operator pod was OOMKilled (out of memory).
With this fix, the default memory and CPU resource limits have been increased and odf-csi-addons-operator OOMKills are no longer observed.
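The resource limit itself has the usual Kubernetes shape. A minimal sketch of what such a stanza looks like on a container spec, with purely illustrative values rather than the defaults shipped in 4.13:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: odf-csi-addons-operator     # managed by OLM; shown only to illustrate the stanza
  namespace: openshift-storage
spec:
  selector:
    matchLabels:
      app: odf-csi-addons-operator
  template:
    metadata:
      labels:
        app: odf-csi-addons-operator
    spec:
      containers:
      - name: manager
        image: registry.example.com/odf-csi-addons-operator:latest   # placeholder image
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m               # illustrative values, not the shipped defaults
            memory: 512Mi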
6.4. OpenShift Data Foundation operator
Two separate routes for secure and insecure ports
Previously, HTTP request failures occurred because the route ended up using the secure port, as the port was not defined in the RGW service for its OpenShift route.
With this fix, the insecure port for the existing OpenShift route for RGW is defined properly and a new route with the secure port is created, thereby avoiding the HTTP request failures. Now, two routes are available for RGW: the existing route uses the insecure port and the new separate route uses the secure port.
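A hedged sketch of what the additional secure route could look like; the route name, service name, port name, and TLS termination type are assumptions based on typical internal-mode naming and may differ in your deployment:
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: ocs-storagecluster-cephobjectstore-secure   # assumed name of the new secure route
  namespace: openshift-storage
spec:
  to:
    kind: Service
    name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore   # typical internal-mode RGW service name
  port:
    targetPort: https               # assumed name of the secure RGW port on the service
  tls:
    termination: reencrypt          # assumed termination type for the secure port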
Reflects correct state of the Ceph cluster in external mode
Previously, when OpenShift Data Foundation was deployed in external mode with a Ceph cluster, negative conditions, such as the storage cluster ExternalClusterStateConnected condition, were not cleared from the storage cluster even when the associated Ceph cluster was in a good state.
With this fix, the negative conditions are removed from the storage cluster when the Ceph cluster is in a positive state, thereby reflecting the correct state of the Ceph cluster.
nginx configurations are added through the ConfigMap
Previously, when IPv6 was disabled at the node’s kernel level, the IPv6 listen directive of the nginx configuration for the odf-console pod gave an error. As a result, OpenShift Data Foundation was stuck with odf-console not available and the odf-console pod in CrashLoopBackOff errors.
With this fix, all the nginx configurations are added through the ConfigMap created by the odf-operator.
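A minimal sketch of the idea, assuming a hypothetical ConfigMap name, port, and certificate paths; it is not the actual ConfigMap created by the odf-operator. The commented-out IPv6 listen directive is the line that fails when IPv6 is disabled at the node’s kernel level:
apiVersion: v1
kind: ConfigMap
metadata:
  name: odf-console-nginx-conf      # hypothetical name
  namespace: openshift-storage
data:
  nginx.conf: |
    events {}
    http {
      server {
        listen 9001 ssl;            # IPv4 listener for the console plugin
        # listen [::]:9001 ssl;     # IPv6 listener; omitted when IPv6 is disabled on the node
        ssl_certificate     /var/serving-cert/tls.crt;    # illustrative paths
        ssl_certificate_key /var/serving-cert/tls.key;
        location / {
          root /opt/app-root/src;
        }
      }
    }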
6.5. OpenShift Data Foundation console
User interface correctly passes the PVC name to the CR
Previously, while creating a NamespaceStore in the user interface (UI) using a file system, the UI would pass the entire persistent volume claim (PVC) object to the CR instead of just the PVC name that is required to be passed to the CR’s spec.nsfs.pvcName field. As a result, an error was seen in the UI.
With this fix, only the PVC name is passed to the CR instead of the entire PVC object.
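For reference, a hedged sketch of a filesystem-backed NamespaceStore CR; the CR name, PVC name, and subPath value are illustrative:
apiVersion: noobaa.io/v1alpha1
kind: NamespaceStore
metadata:
  name: fs-namespacestore           # illustrative name
  namespace: openshift-storage
spec:
  type: nsfs
  nsfs:
    # Only the PVC name goes here; the UI previously sent the whole PVC object.
    pvcName: cephfs-backed-pvc      # illustrative PVC name
    subPath: nsfs-data              # illustrative sub-path inside the PVC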
Refresh popup is shown when OpenShift Data Foundation is upgraded
Previously, when OpenShift Data Foundation was upgraded, OpenShift Container Platform did not show the Refresh popup because it was not aware of the changes. OpenShift did not check for changes in the version field of the plugin-manifest.json file present in the odf-console pod.
With this fix, OpenShift Container Platform and OpenShift Data Foundation are configured to poll the manifest for the OpenShift Data Foundation user interface. Based on a change in the version, a Refresh popup is shown.
6.6. Rook
StorageClasses are created even if the RGW endpoint is not reachable
Previously, in an OpenShift Data Foundation external mode deployment, if the RADOS gateway (RGW) endpoints were not reachable and Rook failed to configure the CephObjectStore, the creation of the RADOS block device (RBD) and CephFS StorageClasses also failed because these were tightly coupled in the Python script, create-external-cluster-resources.py.
With this fix, the issues in the Python script are fixed to make separate calls instead of failing or showing errors, and the StorageClasses are created.