Chapter 6. Asynchronous errata updates


This section describes the bug fixes, known issues, and enhancements of the z-stream releases.

6.1. Red Hat Ceph Storage 6.1z7

Red Hat Ceph Storage release 6.1z7 is now available. The bug fixes that are included in the update are listed in the [advisory links] advisories.

6.1.1. Enhancements

6.1.1.1. Ceph File System

New clone creation no longer slows down due to parallel clone limit

Previously, upon reaching the limit of parallel clones, the rest of the clones would queue up, slowing down the cloning.

With this enhancement, when the limit of parallel clones is reached, new clone creation requests are rejected instead of being queued. This feature is enabled by default but can be disabled.
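
A minimal sketch of disabling this behavior, assuming it is controlled by the mgr/volumes option snapshot_clone_no_wait (verify the option name for your release):

# ceph config set mgr mgr/volumes/snapshot_clone_no_wait false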

Bugzilla:2196829

The Python librados supports iterating object omap key/values

Previously, the iteration would break whenever a binary/unicode key was encountered.

With this release, the Python librados supports iterating object omap key/values with unicode or binary keys and the iteration continues as expected.

Bugzilla:2232161

6.1.1.2. Ceph Object Gateway

Improved temporary file placement and error messages for the /usr/bin/rgw-restore-bucket-index tool

Previously, the /usr/bin/rgw-restore-bucket-index tool only placed temporary files into the /tmp directory. This could lead to issues if the directory ran out of space, resulting in the following error message being emitted: "ln: failed to access '/tmp/rgwrbi-object-list.XXX': No such file or directory".

With this enhancement, users can specify the directory in which to place temporary files by using the -t command-line option. Additionally, if the specified directory is full, users now receive an error message describing the problem: "ERROR: the temporary directory's partition is full, preventing continuation".
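
For example, to place temporary files on a partition with more free space, such as /var/tmp (the bucket name is a placeholder and is assumed to be passed as the final argument):

# rgw-restore-bucket-index -t /var/tmp <bucket-name>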

Bugzilla:2270322

S3 requests are no longer cut off in the middle of transmission during shutdown

Previously, some clients saw S3 requests being cut off in the middle of transmission when the Ceph Object Gateway process shut down without waiting for them to complete.

With this enhancement, the Ceph Object Gateway can be configured to wait, for the duration defined in the rgw_exit_timeout_secs parameter, for all outstanding S3 requests to complete before the process exits. By default, the wait is up to 120 seconds, and this value is configurable. During this time, new S3 requests are not accepted. This configuration is off by default.
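
A minimal sketch of tuning the timeout, assuming the option is applied to the Ceph Object Gateway daemons through the standard configuration interface:

# ceph config set client.rgw rgw_exit_timeout_secs 120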

Note

In containerized deployments, an additional extra_container_args parameter configuration of --stop-timeout=120 (or the value of the rgw_exit_timeout_secs parameter, if not default) is also necessary.
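
The following sketch shows one way to add the flag through a service specification; the service_id value is a placeholder and the timeout should match the rgw_exit_timeout_secs value:

# cat << EOF > rgw-spec.yaml
service_type: rgw
service_id: foo
extra_container_args:
  - "--stop-timeout=120"
EOF
# ceph orch apply -i rgw-spec.yaml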

Bugzilla:2298708

6.1.1.3. RADOS

New mon_cluster_log_level configuration option to control the cluster log level verbosity for external entities

Previously, debug verbosity logs were sent to all external logging systems regardless of their level settings. As a result, the /var/ filesystem would rapidly fill up.

With this enhancement, the mon_cluster_log_file_level and mon_cluster_log_to_syslog_level configuration options have been removed. From this release, use only the new generic mon_cluster_log_level configuration option to control the cluster log level verbosity for the cluster log file and all external entities.
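
For example, to limit the cluster log verbosity for the log file and all external entities to info:

# ceph config set mon mon_cluster_log_level info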

Bugzilla:2053021

6.2. Red Hat Ceph Storage 6.1z6

Red Hat Ceph Storage release 6.1z6 is now available. The bug fixes that are included in the update are listed in the RHBA-2024:2631 and RHSA-2024:2633 advisories.

6.3. Red Hat Ceph Storage 6.1z5

Red Hat Ceph Storage release 6.1z5 is now available. The bug fixes that are included in the update are listed in the RHBA-2024:1580 and RHBA-2024:1581 advisories.

6.3.1. Enhancements

6.3.1.1. The Ceph Ansible utility

All bootstrap CLI parameters are now available for use in the cephadm-ansible module

Previously, only a subset of the bootstrap CLI parameters was available, which limited the module's usage.

With this enhancement, all bootstrap CLI parameters are available for use in the cephadm-ansible module.

Bugzilla:2269514

6.3.1.2. RBD Mirroring

RBD diff-iterate now executes locally if exclusive lock is available

Previously, when diffing against the beginning of time (fromsnapname == NULL) in fast-diff mode (whole_object == true with the fast-diff image feature enabled and valid), RBD diff-iterate was not guaranteed to execute locally.

With this enhancement, the rbd_diff_iterate2() API performance improvement is implemented and RBD diff-iterate is now guaranteed to execute locally if the exclusive lock is available. This brings a dramatic performance improvement for QEMU live disk synchronization and backup use cases, assuming the fast-diff image feature is enabled.
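
For example, a whole-object diff against the beginning of time (no --from-snap argument) now runs locally when the exclusive lock is available; the pool and image names are placeholders:

# rbd diff --whole-object mypool/myimage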

Bugzilla:2259053

6.3.1.3. Ceph File System

Snapshot scheduling support is now provided for subvolumes

With this enhancement, snapshot scheduling support is provided for subvolumes. All snapshot scheduling commands accept the --subvol and --group arguments to refer to the appropriate subvolumes and subvolume groups. If a subvolume is specified without a subvolume group argument, the default subvolume group is used. Also, a valid path does not need to be specified when referring to subvolumes; a placeholder string is sufficient due to the nature of the argument parsing employed.

Example

# ceph fs snap-schedule add - 15m --subvol sv1 --group g1
# ceph fs snap-schedule status - --subvol sv1 --group g1

Bugzilla:2243783

6.4. Red Hat Ceph Storage 6.1z4

Red Hat Ceph Storage release 6.1z4 is now available. The bug fixes that are included in the update are listed in the RHBA-2024:0747 advisory.

6.4.1. Enhancements

6.4.1.1. Ceph File System

MDS dynamic metadata balancer is off by default

With this enhancement, the MDS dynamic metadata balancer is off by default to prevent the poor balancer behavior that could fragment trees in undesirable or unintended ways simply because the max_mds file system setting was increased.

Operators must turn on the balancer explicitly to use it.

Bugzilla:2255435

The resident segment size perf counter in the MDS is tracked with a higher priority

With this enhancement, the MDS resident segment size (RSS) perf counter is tracked with a higher priority, so that callers, such as Rook, can consume its value to generate useful warnings and act on the MDS RSS size accordingly.

Bugzilla:2256731

ceph auth commands give a message when permissions in MDS are incorrect

With this enhancement, permissions in the MDS capability must start with r, rw, *, or all. As a result, the ceph auth commands, such as ceph auth add, ceph auth caps, ceph auth get-or-create, and ceph auth get-or-create-key, generate a clear message when the permissions in the MDS caps are incorrect.
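
For example, the following command creates a client key with valid MDS permissions; the client name, path, and pool are placeholders:

# ceph auth get-or-create client.user1 mon 'allow r' mds 'allow rw path=/dir1' osd 'allow rw pool=cephfs_data'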

Bugzilla:2218189

6.4.1.2. Ceph Object Gateway

The radosgw-admin bucket stats command prints bucket versioning

With this enhancement, the radosgw-admin bucket stats command prints the versioning status for buckets as enabled or off, since versioning can be enabled or disabled after bucket creation.
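
For example, to check the versioning status of a bucket (the bucket name is a placeholder):

# radosgw-admin bucket stats --bucket=<bucket-name>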

Bugzilla:2256364

6.5. Red Hat Ceph Storage 6.1z3

Red Hat Ceph Storage release 6.1z3 is now available. The bug fixes that are included in the update are listed in the RHSA-2023:7740 advisory.

6.5.1. Enhancements

6.5.1.1. Ceph File System

The snap schedule module now supports a new retention specification

With this enhancement, users can define a new retention specification to retain the number of snapshots.

For example, if a user defines a retention of 50 snapshots, irrespective of the snapshot creation cadence, the number of snapshots retained is 1 less than the specified maximum because pruning happens after a new snapshot is created. In this case, 49 snapshots are retained so that there is a margin of 1 snapshot that can be created on the file system in the next iteration without breaching the system-configured limit of mds_max_snaps_per_dir.
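
A minimal sketch of such a retention specification, assuming the count-based specification uses the n spec and that the path is a placeholder:

# ceph fs snap-schedule retention add /some/path n 50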

Note
Configure the mds_max_snaps_per_dir setting and snapshot scheduling carefully to avoid unintentional deactivation of snapshot schedules due to the file system returning a "Too many links" error if the mds_max_snaps_per_dir limit is breached.

Bugzilla:2227807

Laggy clients are now evicted only if there are no laggy OSDs

Previously, monitoring performance dumps from the MDS would sometimes show that the OSDs were laggy (objecter.op_laggy and objecter.osd_laggy), causing laggy clients that could not flush dirty data for cap revokes.

With this enhancement, if defer_client_eviction_on_laggy_osds is set to true and a client gets laggy because of a laggy OSD, then client eviction does not take place until the OSDs are no longer laggy.
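
For example, to enable this behavior, assuming the option is set on the MDS daemons through the standard configuration interface:

# ceph config set mds defer_client_eviction_on_laggy_osds true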

Bugzilla:2228065

6.5.1.2. Ceph Object Gateway

rgw-restore-bucket-index tool can now restore the bucket indices for versioned buckets

With this enhancement, the rgw-restore-bucket-index tool now works as broadly as possible, with the ability to restore the bucket indices for un-versioned as well as for versioned buckets.

Bugzilla:2182385

6.5.1.3. NFS Ganesha

NFS Ganesha version updated to V5.6

With this updated version of NFS Ganesha, the following issues have been fixed:

  • The FSAL state_free function called by free_state did not actually free.
  • CEPH: Fixed cmount_path.
  • CEPH: The client_oc true setting was broken; it is now forced to false.

Bugzilla:2249958

6.5.1.4. RADOS

New reports available for sub-events for delayed operations

Previously, slow operations were marked as delayed but without a detailed description.

With this enhancement, you can view the detailed descriptions of delayed sub-events for operations.

Bugzilla:2240838

Turning the noautoscale flag on or off now retains each pool's original autoscale mode configuration

Previously, the pg_autoscaler did not persist each pool's autoscale mode configuration when the noautoscale flag was set. Due to this, after turning the noautoscale flag on or off, the user had to set the autoscale mode for each pool again.

With this enhancement, the pg_autoscaler module persists individual pool configuration for the autoscaler mode after the noautoscale flag is set.
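
For example, the autoscaler can now be paused and resumed cluster-wide without losing the per-pool autoscale mode settings:

# ceph osd pool set noautoscale
# ceph osd pool unset noautoscale
# ceph osd pool get noautoscale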

Bugzilla:2241201

BlueStore instance cannot be opened twice

Previously, when using containers, it was possible to create unrelated inodes that targeted the same block device, for example with mknod, causing multiple containers to think that they had exclusive access.

With this enhancement, reinforced advisory locking with the O_EXCL open flag dedicated for block devices is now implemented, thereby improving the protection against running an OSD twice at the same time on one block device.

Bugzilla:2239449

6.6. Red Hat Ceph Storage 6.1z2

Red Hat Ceph Storage release 6.1z2 is now available. The bug fixes that are included in the update are listed in the RHSA-2023:5693 advisory.

6.6.1. Enhancements

6.6.1.1. Ceph Object Gateway

Additional features and enhancements are added to rgw-gap-list and rgw-orphan-list scripts to enhance end-users' experience

With this enhancement, a number of features have been added to the rgw-gap-list and rgw-orphan-list scripts to improve the end-user experience, including internal checks, more command-line options, and enhanced output.

Bugzilla:2228242

Realm, zone group, and/or zone can be specified when running the rgw-restore-bucket-index command

Previously, the tool could only work with the default realm, zone group, and zone.

With this enhancement, realm, zone group, and/or zone can be specified when running the rgw-restore-bucket-index command. Three additional command-line options are added, as shown in the example after this list:

  • "-r <realm>"
  • "-g <zone group>"
  • "-z <zone>"

Bugzilla:2183926

6.6.1.2. Multi-site Ceph Object Gateway

Original multipart uploads can now be identified in multi-site configurations

Previously, a data corruption bug that affected multipart uploads with server-side encryption in multi-site configurations was fixed in the 6.1z1 release.

With this enhancement, a new tool, radosgw-admin bucket resync encrypted multipart, can be used to identify these original multipart uploads. The LastModified timestamp of any identified object is incremented by 1ns to cause peer zones to replicate it again. For multi-site deployments that make any use of server-side encryption, it is recommended to run this command against every bucket in every zone after all zones have been upgraded.
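
For example, to run the resync against one bucket (the bucket name is a placeholder, passed with the standard --bucket option):

# radosgw-admin bucket resync encrypted multipart --bucket=<bucket-name>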

Bugzilla:2227842

6.6.1.3. Ceph Dashboard

Dashboard host loading speed is improved and pages now load faster

Previously, large clusters with five or more hosts had a linear increase in load time on the hosts page and the main page.

With this enhancement, the dashboard host loading speed is improved and pages now load orders of magnitude faster.

Bugzilla:2220922

6.7. Red Hat Ceph Storage 6.1z1

Red Hat Ceph Storage release 6.1z1 is now available. The bug fixes that are included in the update are listed in the RHBA-2023:4473 advisory.

6.7.1. Enhancements

6.7.1.1. Ceph File System

Switch the unfair Mutex lock to fair mutex

Previously, the implementations of the Mutex, for example std::mutex in C++, would not guarantee fairness and would not guarantee that the lock would be acquired by threads in the order they called lock(). In most cases this worked well, but in an overloaded case, the client request handling thread and the submit thread could keep acquiring the submit_mutex for long periods, causing MDLog::trim() to get stuck. That meant the MDS daemons would fill journal logs into the metadata pool, but could not trim the expired segments in time.

With this enhancement, the unfair Mutex lock is switched to a fair mutex and all the submit_mutex waiters are woken up one by one in FIFO order.

Bugzilla:2158304

6.7.1.2. Ceph Object Gateway

The bucket listing feature enables the rgw-restore-bucket-index tool to complete reindexing

Previously, the rgw-restore-bucket-index tool would restore the bucket's index only partially, until the next user listed the bucket. Due to this, the bucket's statistics would report incorrectly until the reindexing completed.

With this enhancement, the bucket listing feature is added, which enables the tool to complete the reindexing so that the bucket statistics are reported correctly. Additionally, a small change to the build process is added that does not affect end users.

Bugzilla:2182456

Lifecycle transition no longer fails for objects with modified metadata

Previously, setting an ACL on an existing object would change its mtime, which caused lifecycle transition to fail for such objects.

With this fix, unless it is a copy operation, the object’s mtime remains unchanged while modifying just the object metadata, such as setting ACL or any other attributes.

Bugzilla:2213801

Blocksize is changed to 4K

Previously, Ceph Object Gateway GC processing would consume excessive time due to the use of a 1K blocksize when consuming the GC queue. This caused slower processing of large GC queues.

With this fix, blocksize is changed to 4K, which has accelerated the processing of large GC queues.

Bugzilla:2212446

Object map for the snapshot accurately reflects the contents of the snapshot

Previously, due to an implementation defect, a stale snapshot context would be used when handling a write-like operation. Due to this, the object map for the snapshot was not guaranteed to accurately reflect the contents of the snapshot in case the snapshot was taken without quiescing the workload. In differential backup and snapshot-based mirroring, use cases with object-map and/or fast-diff features enabled, the destination image could get corrupted.

With this fix, the implementation defect is fixed and everything works as expected.

Bugzilla:2216186

6.7.1.3. The Cephadm Utility

public_network parameter can now have configuration options, such as global or mon

Previously, in cephadm, the public_network parameter was always set as a part of the mon configuration section during a cluster bootstrap without providing any configuration option to alter this behavior.

With this enhancement, you can specify the configuration section, such as global or mon, for the public_network parameter during cluster bootstrap by using the Ceph configuration file.
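
A minimal sketch of placing public_network in the global section of an initial configuration file that is passed to bootstrap; the network and monitor IP are placeholders:

# cat << EOF > initial-ceph.conf
[global]
public_network = 10.0.0.0/24
EOF
# cephadm bootstrap --mon-ip 10.0.0.25 --config initial-ceph.conf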

Bugzilla:2156919

The Cephadm commands that are run on the host from the cephadm Manager module now have timeouts

Previously, one of the Cephadm commands would occasionally hang indefinitely, and it was difficult for users to notice and resolve the issue.

With this release, timeouts are introduced in the Cephadm commands that are run on the host from the Cephadm mgr module. Users are now alerted with a health warning about eventual failure if one of the commands hangs. The timeout is configurable with the mgr/cephadm/default_cephadm_command_timeout setting, and defaults to 900 seconds.
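
For example, to raise the timeout from the default of 900 seconds to 1200 seconds:

# ceph config set mgr mgr/cephadm/default_cephadm_command_timeout 1200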

Bugzilla:2151908

cephadm support for CA signed keys is implemented

Previously, CA signed keys worked in deployment setups in Red Hat Ceph Storage 5, although this support was accidental and untested, and it broke with the changes from Red Hat Ceph Storage 5 to Red Hat Ceph Storage 6.

With this enhancement, cephadm support for CA signed keys is implemented. Users can now use CA signed keys rather than typical public keys for the SSH authentication scheme.

Bugzilla:2182941

6.7.2. Known issues

6.7.2.1. Multi-site Ceph Object Gateway

Deleting objects in versioned buckets causes statistics mismatch

Because versioned buckets have a mix of current and non-current objects, deleting objects might cause bucket and user statistics discrepancies between local and remote sites. This does not cause object leaks on either site, only a statistics mismatch.

Bugzilla:1871333

Multi-site replication may stop during upgrade

Multi-site replication may stop if clusters are on different versions during the process of an upgrade. Suspend sync until both clusters are upgraded to the same version.

Bugzilla:2178909
