Chapter 4. Bug fixes
This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.
4.1. The ceph-ansible Utility
osd_scenario: lvm now works when deploying Ceph in containers
Previously, the lvm installation scenario did not work when deploying a Ceph cluster in containers. With this update, the osd_scenario: lvm installation method is supported as expected in this situation.
The --limit mdss option now creates CephFS pools as expected
Previously, when deploying the Metadata Server (MDS) nodes by using Ansible and the --limit mdss option, Ansible did not create the Ceph File System (CephFS) pools. This bug has been fixed, and Ansible creates the CephFS pools as expected.
Ceph Ansible no longer fails if network interface names include dashes
When ceph-ansible makes an inventory of network interfaces, any dash (-) in an interface name must be converted to an underscore (_) before the name can be used. In some cases this conversion did not occur and the Ceph installation failed. With this update to Red Hat Ceph Storage, all dashes in the names of network interfaces are converted in the facts, and installation completes successfully.
Ansible now sets container and service names that correspond with OSD numbers
When containerized Ceph OSDs were deployed with the ceph-ansible utility, the resulting container names and service names of the OSDs did not correspond in any way to the OSD number and were thus difficult to find and use. With this update, ceph-ansible has been improved to set container and service names that correspond with OSD numbers. Note that this change does not affect existing deployed OSDs.
Expanding clusters deployed with osd_scenario: lvm works
Previously, the ceph-ansible utility could not expand a cluster that was deployed by using the osd_scenario: lvm option. The underlying source code has been modified, and clusters deployed with osd_scenario: lvm can be expanded as expected.
Ansible now stops and disables the iSCSI gateway services when purging the Ceph iSCSI gateway
Previously, the ceph-ansible utility did not stop and disable the Ceph iSCSI gateway services when using the purge-iscsi-gateways.yml playbook. Consequently, the services had to be stopped manually. The playbook has been improved, and the iSCSI services are now stopped and disabled as expected when purging the iSCSI gateway.
The values passed into devices in osds.yml are now validated
Previously, in the osds.yml file of the Ansible playbook, the values passed into the devices parameter were not validated. This caused errors when ceph-disk, parted, or other device preparation tools failed to operate on devices that did not exist. It also caused errors if the number of values passed into the dedicated_devices parameter was not equal to the number of values passed into devices. With this update, the values are validated as expected, and none of the above errors occur.
Purging clusters using ceph-ansible deletes logical volumes as expected
When using the ceph-ansible utility to purge a cluster whose OSDs were deployed with the ceph-volume utility, the logical volumes were not deleted. This behavior caused logical volumes to remain in the system after the purge process completed. This bug has been fixed, and purging clusters using ceph-ansible deletes logical volumes as expected.
The --limit osds option now works as expected
Previously, an attempt to add OSDs by using the --limit osds option failed on container setup. The underlying source code has been modified, and adding OSDs with --limit osds works as expected.
Increased CPU CGroup limit for containerized Ceph Object Gateway
The default CPU CGroup limit for containerized Ceph Object Gateway (RGW) was very low and has been increased with this update to be more reasonable for typical Hard Disk Drive (HDD) production environments. However, consider evaluating what limit to set for the site’s configuration and workload. To customize the limit, adjust the ceph_rgw_docker_cpu_limit parameter in the Ansible group_vars/rgws.yml file.
SSL works as expected with containerized Ceph Object Gateways
Previously, the SSL configuration in containerized Ceph Object Gateways did not work because the Certificate Authority (CA) certificate was only added to the TLS bundle on the hypervisor and was not propagated to the Ceph Object Gateway container due to missing container bind mounts on the /etc/pki/ca-trusted/ directory. This bug has been fixed, and SSL works as expected with containerized Ceph Object Gateways.
The rolling-upgrade.yml playbook now restarts all OSDs as expected
Due to a bug in a regular expression, the rolling-upgrade.yml playbook did not restart OSDs that used Non-volatile Memory Express devices. The regular expression has been fixed, and rolling-upgrade.yml now restarts all OSDs as expected.
4.2. Ceph Management Dashboard
The OSD node details are now displayed in the Host OSD Breakdown panel as expected
Previously, in the Red Hat Ceph Storage Dashboard, the Host OSD Breakdown information was not displayed on the OSD Node Detail panel under the All OSD Overview section. With this update, the underlying issue has been fixed, and the OSD node details are displayed as expected.
4.3. Ceph File System
The Ceph Metadata Server no longer allows recursive stat rctime to go backwards
Previously, the Ceph Metadata Server used the client’s time to update rctime. Because client time may not be synchronized with the MDS, the inode rctime could go backwards. The underlying source code has been modified, and the Ceph Metadata Server no longer allows recursive stat rctime to go backwards.
The ceph-fuse client no longer indicates incorrect recursive change time
Previously, the ceph-fuse client did not update change time when file content was modified. Consequently, incorrect recursive change time was indicated. With this update, the bug has been fixed, and the client now indicates the correct change time.
The Ceph MDS no longer allows dumping of cache larger than 1 GB
Previously, if you attempted to dump a Ceph Metadata Server (MDS) cache with a size of around 1 GB or larger, the MDS could terminate unexpectedly. With this update, the MDS no longer allows dumping of a cache of that size, so the MDS no longer terminates unexpectedly in the described situation.
When Monitors cannot reach an MDS, they no longer incorrectly mark its rank as damaged
Previously, when Monitors evicted and fenced an unreachable Metadata Server (MDS), the MDS signaled that its rank was damaged due to improper handling of blacklist errors. Consequently, the Monitors incorrectly marked the rank as damaged, and the file system became unavailable because of one or more damaged ranks. In this release, blacklist errors are handled properly, and the Monitors no longer incorrectly mark the rank as damaged.
The reconnect timeout for MDS clients has been extended
When the Metadata Server (MDS) daemon was handling a large number of reconnecting clients with a huge number of capabilities to aggregate, the reconnect timeout was reached. Consequently, the MDS rejected clients that attempted to reconnect. With this update, the reconnect timeout has been extended, and MDS now handles reconnecting clients as expected in the described situation.
Shrinking large MDS cache no longer causes the MDS daemon to appear to hang
Previously, an attempt to shrink a large Metadata Server (MDS) cache caused the primary MDS daemon to become unresponsive. Consequently, Monitors removed the unresponsive MDS and a standby MDS became the primary MDS. With this update, shrinking large MDS cache no longer causes the primary MDS daemon to hang.
4.4. Ceph Manager Plugins
HDD and SSD devices can now be mixed when accessing the /osd endpoint
Previously, the Red Hat Ceph Storage RESTful API did not handle clusters where HDD and SSD devices were mixed, and accessing the /osd endpoint returned an error. With this update, the OSD traversal algorithm has been improved to handle this scenario as expected.
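As an illustration only, the following minimal Python sketch queries the /osd endpoint of the RESTful API; the host name, port, user name, and API key are placeholder assumptions and must match your own restful module configuration.

```python
# Minimal sketch, not a definitive example: host, port, user name, and API key
# are placeholders for values from your own restful module configuration.
import requests

resp = requests.get(
    "https://mgr-node.example.com:8003/osd",      # /osd endpoint of the RESTful API
    auth=("admin", "API-KEY-CREATED-FOR-THE-RESTFUL-MODULE"),
    verify=False,                                 # assumes a self-signed certificate
)
resp.raise_for_status()

# One entry is returned per OSD, whether the cluster mixes HDD and SSD devices or not.
for osd in resp.json():
    print(osd)
```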
4.5. The ceph-volume Utility
ceph-volume does not break custom-named clusters
When using a custom storage cluster name other than ceph, the OSDs could not start after a reboot. With this update, ceph-volume provisions OSDs in a way that allows them to boot properly when a custom name is used.
Despite this fix, Red Hat does not support clusters with custom names. This is because the upstream Ceph project removed support for custom names in the Ceph OSD, Monitor, Manager, and Metadata server daemons. The Ceph project removed this support because it added complexities to systemd unit files. This fix was created before the decision to remove support for custom cluster names was made.
4.6. Containers
Deploying encrypted OSDs in containers by using ceph-disk works as expected
When attempting to deploy a containerized OSD by using the ceph-disk and dmcrypt utilities, the container process failed to start because the OSD ID could not be found in the mounts table. With this update, the OSD ID is correctly determined, and the container process no longer fails.
4.7. Object Gateway
CivetWeb was rebased to upstream version 1.10 and the enable_keep_alive CivetWeb option works as expected
When using the Ceph Object Gateway with the CivetWeb front end, the CivetWeb connections timed out even though the enable_keep_alive option was enabled. Consequently, S3 clients that did not reconnect or retry were not reliable. With this update, CivetWeb has been rebased, and the enable_keep_alive option works as expected. As a result, CivetWeb connections no longer time out in this case.
In addition, the new CivetWeb version introduces stricter header checks. This new behavior can cause certain return codes to change because invalid requests are detected sooner. For example, in the previous version CivetWeb returned the 403 Forbidden error on an invalid HTTP request, but the new version returns the 400 Bad Request error instead.
Red Hat Ceph Storage passes the Swift Tempest test in the RefStack 15.0 toolset
Various improvements have been made to the Ceph Object Gateway Swift service. As a result, when configured correctly, Red Hat Ceph Storage 3.2, which includes the ceph-12.2.8 package, passes the Swift Tempest tempest.api.object_storage test suite with the exception of the test_container_synchronization test case. Red Hat Ceph Storage includes a different synchronization model, multisite operations, for users who require that feature.
Mounting the NFS Ganesha file server in a containerized IPv6 cluster no longer fails
When a containerized IPv6 Red Hat Ceph Storage cluster with an nfs-ganesha-rgw daemon was deployed by using the ceph-ansible utility, an attempt to mount the NFS Ganesha file server on a client failed with the Connection Refused error. Consequently, I/O requests were unable to run. This update fixes the default configuration for IPv6 connections, and mounting the NFS Ganesha server works as expected in this case.
Stale lifecycle configuration data of deleted buckets no longer persists in OMAP
Previously, in the Ceph Object Gateway (RGW), incorrect key formatting in the RGWDeleteLC::execute() function caused bucket lifecycle configuration metadata to persist after the deletion of the corresponding bucket. This caused stale lifecycle configuration data to persist in OMAP, consuming space. With this update, the correct name for the lifecycle object is used in RGWDeleteLC::execute(), and the lifecycle configuration is removed as expected when the corresponding bucket is removed.
The Keystone credentials were moved to an external file
When using the Keystone identity service to authenticate a Ceph Object Gateway user, the Keystone credentials were set as plain text in the Ceph configuration file. With this update, the Keystone credentials are configured in an external file that only the Ceph user can read.
Wildcard policies match objects with colons in the name
Previously, colons in an object name caused an error in a matching function, which prevented wildcards from matching beyond a colon. In this release, wildcard policies match object names that contain colons as expected.
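As a hedged illustration, the following boto3 sketch applies a wildcard bucket policy and uploads an object whose key contains colons; the endpoint URL, bucket name, object key, and principal are hypothetical and assume S3 credentials are already configured for boto3.

```python
# Illustrative sketch only: the endpoint, bucket, key, and principal are hypothetical.
import json
import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/readonly"]},
        "Action": "s3:GetObject",
        # The trailing wildcard now also matches keys that contain colons.
        "Resource": "arn:aws:s3:::mybucket/reports*",
    }],
}
s3.put_bucket_policy(Bucket="mybucket", Policy=json.dumps(policy))

# An object name with colons is matched by the wildcard above.
s3.put_object(Bucket="mybucket", Key="reports:2019:q1.csv", Body=b"data")
```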
Lifecycle rules with multiple tag filters are no longer rejected
Due to a bug in lifecycle rule processing, an attempt to install lifecycle rules with multiple tag filters was rejected and the InvalidRequest error message was returned. With this update, other rule forms are used, and lifecycle rules with multiple tag filters are no longer rejected.
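For illustration, the following boto3 sketch installs a lifecycle rule with two tag filters, the kind of request that was previously rejected with InvalidRequest; the endpoint, bucket, tags, and retention period are hypothetical.

```python
# Illustrative sketch only: endpoint, bucket, tag values, and retention are hypothetical.
import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

s3.put_bucket_lifecycle_configuration(
    Bucket="mybucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-tagged-temporary-data",
            "Status": "Enabled",
            # Two tag filters combined with "And"; previously this was rejected.
            "Filter": {"And": {"Tags": [
                {"Key": "class", "Value": "temporary"},
                {"Key": "project", "Value": "demo"},
            ]}},
            "Expiration": {"Days": 7},
        }]
    },
)
```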
An object can no longer be deleted when a bucket or user policy with DENY s3:DeleteObject exists
Previously, an object could be deleted even if a bucket or user policy with DENY s3:DeleteObject existed, because an incorrect value was returned by the method that evaluates policies. In this release, the correct value is returned, and the DENY s3:DeleteObject policy is enforced as expected.
The Ubuntu nfs_ganesha package did not install the systemd unit file properly
When running systemctl enable nfs-ganesha, the following error was printed: Failed to execute operation: No such file or directory. This was because the nfs-ganesha-lock.service file was not created properly. With this release, the file is created properly and the nfs-ganesha service can be enabled successfully.
(BZ#1660063)
The Ceph Object Gateway supports a string as a delimiter
Invalid logic was used to find and project a delimiter sequence greater than one character. This was causing the Ceph Object Gateway to fail any request with a string as the delimiter, returning an invalid utf-8 character message. The logic handling the delimiter has been replaced by an 8-bit shift-carry equivalent. As a result, a string delimiter will work correctly. Red Hat has only tested this against the US-ASCII character set.
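The following boto3 sketch shows a listing request with a multi-character string delimiter of the kind that previously failed; the endpoint, bucket, and delimiter are hypothetical.

```python
# Illustrative sketch only: endpoint, bucket, and delimiter are hypothetical.
import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

# A multi-character (string) delimiter previously failed with "invalid utf-8 character".
resp = s3.list_objects_v2(Bucket="mybucket", Delimiter="::")
for prefix in resp.get("CommonPrefixes", []):
    print("prefix:", prefix["Prefix"])
for obj in resp.get("Contents", []):
    print("object:", obj["Key"])
```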
Mapping NFS exports to Object Gateway tenant user IDs works as expected
Previously, the NFS server for the Ceph Object Gateway (nfs-ganesha) did not correctly map Object Gateway tenants into their correct namespace. As a consequence, an attempt to map an NFS export onto the Ceph Object Gateway with a tenanted user ID silently failed; the account could authenticate and NFS mounts could succeed, but the namespace did not contain buckets and objects. This bug has been fixed, and tenanted mappings are now set correctly. As a result, NFS exports can now be mapped to Object Gateway tenant user IDs, and buckets and objects are visible as expected in the described situation.
An attempt to get the bucket ACL for a non-existent bucket returns an error as expected
Previously, an attempt to get the bucket Access Control List (ACL) for a non-existent bucket by calling the GetBucketAcl() function returned a result instead of returning a NoSuchBucket error. This bug has been fixed, and the NoSuchBucket error is returned in the aforementioned situation.
(BZ#1667142)
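As a short illustration, the following boto3 sketch requests the ACL of a bucket that does not exist and checks for the NoSuchBucket error code; the endpoint and bucket name are hypothetical.

```python
# Illustrative sketch only: endpoint and bucket name are hypothetical.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

try:
    s3.get_bucket_acl(Bucket="bucket-that-does-not-exist")
except ClientError as err:
    # With this fix, a missing bucket is reported as an error instead of a result.
    print(err.response["Error"]["Code"])   # expected: NoSuchBucket
```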
The log level for gc_iterate_entries has been changed to 10
Previously, the log level for the gc_iterate_entries log message was set to 0. As a consequence, OSD log files included unnecessary information and could grow significantly. With this update, the log level for gc_iterate_entries has been changed to 10.
Garbage collection no longer consumes bandwidth without making forward progress
Previously, some underlying bugs prevented garbage collection (GC) from making forward progress. Specifically, the marker was not always being advanced, GC was unable to process entries with zero-length chains, and the truncated flag was not always being set correctly. This caused GC to consume bandwidth without making any forward progress, thereby not freeing up disk space, slowing down other cluster work, and allowing OMAP entries related to GC to continue to increase. With this update, the underlying bugs have been fixed, and GC is able to make progress as expected freeing up disk space and OMAP entries.
The radosgw-admin utility no longer gets stuck and creates high read operations when creating greater than 999 buckets per user
An issue with a limit check caused the radosgw-admin utility to never finish when creating 1,000 or more buckets per user. This problem has been fixed, and radosgw-admin no longer gets stuck or creates high read operations.
LDAP authentication is available again
Previously, a logic error caused LDAP authentication checks to be skipped. Consequently, the LDAP authentication was not available. With this update, the checks for a valid LDAP authentication setup and credentials have been fixed, and LDAP authentication is available again.
(BZ#1687800)
NFS Ganesha no longer aborts when an S3 object name contains a // sequence
Previously, the NFS server for the Ceph Object Gateway (RGW NFS) would abort when an S3 object name contained a // sequence. With this update, RGW NFS ignores such a sequence as expected and no longer aborts.
(BZ#1687970)
Expiration time is calculated the same as S3
Previously, the Ceph Object Gateway computed an object’s relative lifecycle expiration rules from the time of creation, rather than rounding to midnight UTC as AWS does. This could cause the following error: botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the PutBucketLifecycleConfiguration operation: 'Date' must be at midnight GMT. Expiration is now rounded to midnight UTC for greater AWS compatibility.
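For illustration, the following boto3 sketch installs a date-based expiration rule set at midnight UTC, the form AWS requires; the endpoint, bucket, prefix, and date are hypothetical.

```python
# Illustrative sketch only: endpoint, bucket, prefix, and date are hypothetical.
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

s3.put_bucket_lifecycle_configuration(
    Bucket="mybucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-old-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            # 'Date' must be midnight GMT; relative rules are likewise rounded to midnight UTC.
            "Expiration": {"Date": datetime(2020, 1, 1, tzinfo=timezone.utc)},
        }]
    },
)
```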
Operations waiting for resharding to complete are able to complete after resharding
Previously, when using dynamic resharding, some operations that were waiting for resharding to complete failed to finish afterward. This was due to code changes that made the Ceph Object Gateway automatically clean up bucket index shards that were no longer used. While this reduced storage demands and eliminated the need for manual clean-up, the process removed one source of an identifier needed for operations to complete after resharding. The code has been updated so that the identifier is retrieved from a different source after resharding, and operations that require it can now complete.
radosgw-admin bi put now sets the correct mtime time stamp
Previously, the radosgw-admin bi put command did not set the mtime time stamp correctly. This bug has been fixed.
Ceph Object Gateway lifecycle works properly after a bucket is resharded
Previously, after a bucket was resharded using the dynamic resharding feature, a lifecycle policy applied to the bucket did not complete, and the policy failed to update the bucket. With this update to Red Hat Ceph Storage, a lifecycle policy is properly applied after a bucket is resharded.
The RGW server no longer returns the incorrect S3 error code NoSuchKey when asked to return non-existent CORS rules
Previously, the Ceph Object Gateway (RGW) server returned the incorrect S3 error code NoSuchKey when asked to return non-existent CORS rules. This caused the s3cmd tool and other programs to misbehave. With this update, the RGW server returns NoSuchCORSConfiguration in this case, and the s3cmd tool and other programs that expect this error behave correctly.
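The following boto3 sketch illustrates the corrected error code when a bucket has no CORS rules; the endpoint and bucket name are hypothetical.

```python
# Illustrative sketch only: endpoint and bucket name are hypothetical.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

try:
    s3.get_bucket_cors(Bucket="mybucket")    # bucket exists but has no CORS rules
except ClientError as err:
    # The gateway now answers NoSuchCORSConfiguration rather than NoSuchKey.
    print(err.response["Error"]["Code"])     # expected: NoSuchCORSConfiguration
```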
Decrypting multipart uploads was corrupting data
When doing multipart uploads with SSE-C, the part size could be a value that was not a multiple of the 4K encryption block size. Although the multipart uploads were encrypted correctly, the decryption process failed to account for the part boundaries and returned corrupted data. With this release, the decryption process correctly handles the part boundaries when using SSE-C. As a result, all encrypted multipart uploads can be successfully decrypted.
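As a hedged illustration, the following boto3 sketch performs an SSE-C multipart upload with a part size that is not 4 KiB-aligned and then downloads the object again; the endpoint, bucket, file names, and part size are hypothetical.

```python
# Illustrative sketch only: endpoint, bucket, file names, and part size are hypothetical.
import os

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

sse = {
    "SSECustomerAlgorithm": "AES256",
    "SSECustomerKey": os.urandom(32),        # 256-bit customer-provided key
}

# Force a multipart upload with a part size that is not a multiple of 4 KiB.
cfg = TransferConfig(multipart_threshold=8 * 1024 * 1024,
                     multipart_chunksize=5 * 1024 * 1024 + 123)

s3.upload_file("big-file.bin", "mybucket", "big-file.bin",
               ExtraArgs=sse, Config=cfg)

# Decryption now honours the part boundaries, so the download matches the original data.
s3.download_file("mybucket", "big-file.bin", "big-file.copy", ExtraArgs=sse)
```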
4.8. Object Gateway Multisite
Redundant multi-site replication sync errors were moved to debug level 10
A few multi-site replication sync errors were logged multiple times at log level 0 and consumed extra space in logs. This update moves the redundant messages to debug level 10 to hide them from the log.
Buckets with false entries can now be deleted as expected
Previously, bucket indices could include "false entries" that did not represent actual objects and that resulted from a prior bug. Consequently, during the process of deleting such buckets, encountering a false entry caused the process to stop and return an error code. With this update, when a false entry is encountered, Ceph ignores it, and deleting buckets with false entries works as expected.
Datalogs are now trimmed regularly as expected
Due to a regression in decoding of the JSON format of data sync status objects, automated datalog trimming logic was unable to query the sync status of its peer zones. Consequently, the datalog trimming process did not progress. This update fixes the JSON decoding and adds more regression test coverage for log trimming. As a result, datalogs are now trimmed regularly as expected.
Objects are now synced correctly in versioning-suspended buckets
Due to a bug in multi-site sync of versioning-suspended buckets, certain object versioning attributes were overwritten with incorrect values. Consequently, the objects failed to sync and attempted to retry endlessly, blocking further sync progress. With this update, the sync process no longer overwrites versioning attributes. In addition, any broken attributes are now detected and repaired. As a result, objects are synced correctly in versioning-suspended buckets.
radosgw-admin sync status now shows timestamps for the master zone
Previously, in Ceph Object Gateway multisite, running radosgw-admin sync status on the master zone did not show timestamps, which made it difficult to tell if data sync was making progress. This bug has been fixed, and timestamps are shown as expected.
Synchronizing a multi-site Ceph Object Gateway was getting stuck
When recovering versioned objects, other operations were unable to finish. These stuck operations were caused by removing all of the expired user.rgw.olh.pending extended attributes (xattrs) at once on those versioned objects. Another bug caused too many user.rgw.olh.pending xattrs to be written to the recovering versioned objects. With this release, expired xattrs are removed in batches instead of all at once. As a result, versioned objects recover correctly and other operations can proceed normally.
A multi-site Ceph Object Gateway was not trimming the data and bucket index logs
Configuring zones for a multi-site Ceph Object Gateway without setting the sync_from_all option caused the data and bucket index logs not to be trimmed. With this release, the automated trimming process only consults the synchronization status of peer zones that are configured to synchronize. As a result, the data and bucket index logs are trimmed properly.
4.9. RADOS
A PG repair no longer sets the storage cluster to a warning state
Previously, a placement group (PG) undergoing repair was considered a damaged PG, which placed the storage cluster into a warning state. With this release, repairing a PG does not place the storage cluster into a warning state.
The ceph-mgr daemon no longer crashes after starting the balancer module in automatic mode
Previously, due to a CRUSH bug, invalid mappings were created. When an invalid mapping was encountered in the _apply_upmap function, the code caused a segmentation fault. With this release, the code has been updated to check that the values are within the expected range. If they are not, the invalid values are ignored.
RocksDB compaction no longer exhausts free space of BlueFS
Previously, the balancing of free space between the main storage and the storage for RocksDB, which is managed by BlueFS, happened only while write operations were underway. Consequently, BlueFS returned an ENOSPC error when RocksDB compaction was triggered right before a long interval without write operations. With this update, the code has been modified to periodically check the free space balance even if no write operations are ongoing, so that compaction no longer exhausts the free space of BlueFS.
PGs per OSD limits have been increased
In some situations, such as widely varying disk sizes, the default limit on placement groups (PGs) per OSD could prevent PGs from going active. These limits have been increased by default to make this situation less likely.
Ceph installation no longer fails when FIPS mode is enabled
Previously, installing Red Hat Ceph Storage using the ceph-ansible utility failed at TASK [ceph-mon : create monitor initial keyring] when FIPS mode was enabled. To resolve this bug, the symmetric cipher cryptographic key is now wrapped with a one-shot wrapping key before it is used to instantiate the cipher. This allows Red Hat Ceph Storage to install normally when FIPS mode is enabled.
Slow request messages have been re-added to the OSD logs
Previously, slow request messages were removed from the OSD logs, which made debugging harder. This update re-adds these warnings to the OSD logs.
Force backfill and recovery preempt a lower priority backfill or recovery
Previously, force backfill or force recovery did not preempt an already running recovery or backfill process. As a consequence, although force backfill or recovery set the priority to the maximum value, the recovery process for placement groups (PGs) already running at a lower priority finished first. With this update, force backfill and recovery preempt lower-priority backfill or recovery processes.
Ceph Manager no longer crashes when two or more Ceph Object Gateway daemons use the same name
Previously, when two or more Ceph Object Gateway daemons used the same name in a cluster, Ceph Manager terminated unexpectedly. The underlying source code has been modified, and Ceph Manager no longer crashes in the described scenario.
A race condition was causing threads to deadlock with the standby ceph-mgr daemon
Some threads could hit a race condition when acquiring a local lock and the Python global interpreter lock, causing each thread to deadlock: a thread holding one of the locks needed to acquire the other, but could not. In this release, the code has been fixed to close the window in which the race condition can occur by changing the location of the lock acquisition and releasing the appropriate locks. As a result, the threads no longer deadlock, and progress can be made.
An OSD daemon no longer crashes when a block device has read errors
Previously, an OSD daemon would crash when a block device had read errors, because the daemon expected only a general EIO error code, not the more specific errors the kernel generates. With this release, low-level errors are mapped to EIO, resulting in an OSD daemon not crashing because of an unrecognized error code.
Read retries no longer cause the client to hang after a failed sync read
Previously, when an OSD daemon failed to sync read an object, the length of the object to be read was set to 0. This caused the read retry to incorrectly read the entire object. The underlying code has been fixed, and the read retry uses the correct length and does not cause the client to hang.
(BZ#1682966)
4.10. Block Devices (RBD)
The python-rbd list_snaps() method no longer segfaults after an error
This issue was discovered with OpenStack Cinder Backup when rados_connect_timeout was set. Normally the timeout is not enabled. If the cluster was highly loaded, the timeout could be reached, causing the segfault. With this update to Red Hat Ceph Storage, if the timeout is reached, a segfault no longer occurs.
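For illustration, the following minimal Python sketch lists snapshots through the python-rbd bindings with rados_connect_timeout set; the pool name, image name, and timeout value are hypothetical assumptions.

```python
# Illustrative sketch only: pool name, image name, and timeout value are hypothetical.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf",
                      conf={"rados_connect_timeout": "15"})
cluster.connect()
ioctx = cluster.open_ioctx("rbd")
image = rbd.Image(ioctx, "image1")
try:
    # list_snaps() no longer segfaults if the connect timeout is reached under load.
    for snap in image.list_snaps():
        print(snap["id"], snap["name"])
finally:
    image.close()
    ioctx.close()
    cluster.shutdown()
```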