Chapter 3. New Features and Enhancements


This section lists all major updates, enhancements, and new features introduced in this release of Red Hat Ceph Storage.

The main features added by this release are:

Compression on-wire with msgr2 protocol is now available

With this release, in addition to encryption on wire, compression on wire is also supported to secure network operations within the storage cluster.

See the Encryption and key management section in the Red Hat Ceph Storage Data Security and Hardening Guide for more details.
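
For illustration only, on-wire compression is controlled through messenger configuration options. The option names shown here, ms_osd_compress_mode and ms_compress_secure, are assumptions drawn from upstream Ceph and should be verified against this release:

# Enable on-wire compression for OSD messenger connections (assumed option name)
ceph config set osd ms_osd_compress_mode force
# Allow compression to be used together with on-wire encryption (assumed option name)
ceph config set global ms_compress_secure true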

Python notifications are more efficient

Previously, the Ceph Manager issued some notifications that no modules consumed, which caused unnecessary work.

With this release, the NotifyType parameter is introduced. Each module now annotates which events it consumes, for example NotifyType.mon_map, NotifyType.osd_map, and the like. As a consequence, only events that modules ask for are queued, and events that no modules consume are not issued. Because of these changes, Python notifications are now more efficient.

The changes to pg_num are limited

Previously, if drastic changes were made to pg_num that outpaced pgp_num, the user could hit the per-osd placement group limits and cause errors.

With this release, the changes to pg_num are limited to avoid the issue with per-osd placement group limits.

New pg_progress item is created to avoid dumping all placement group statistics for progress updates

Previously, the pg_dump item included unnecessary fields that wasted CPU when copied to Python. This tended to lead to long ClusterState::lock hold times, leading to long ms_dispatch delays and generally slowing the processes.

With this release, a new pg_progress item is created to dump only the fields that mgr tasks or the progress module need.

The mgr_ip is no longer re-fetched

Previously, the mgr_ip had to be re-fetched during the lifetime of an active Ceph Manager module.

With this release, the mgr_ip does not change during the lifetime of an active Ceph Manager module, so there is no need to call back into the Ceph Manager to re-fetch it.

QoS in the Ceph OSD is based on the mClock algorithm, by default

Previously, the scheduler defaulted to the Weighted Priority Queue (WPQ). Quality of service (QoS) based on the mClock algorithm was in an experimental phase and was not yet recommended for production.

With this release, the mClock based operation queue enables QoS controls to be applied to Ceph OSD specific operations, such as client input and output (I/O) and recovery or backfill, as well as other background operations, such as pg scrub, snap trim, and pg deletion. The allocation of resources to each of the services is based on the input and output operations per second (IOPS) capacity of each Ceph OSD and is achieved using built-in mClock profiles.

Also, this release includes the following enhancements:

  • Hands-off automated baseline performance measurements for the OSDs determine the Ceph OSD IOPS capacity, with safeguards to fall back to a default capacity when an unrealistic measurement is detected.
  • Setting sleep throttles for background tasks is eliminated.
  • Higher default values for the recovery and max backfill options, with the ability to override them using an override flag.
  • Configuration sets based on mClock profiles hide the complexity of tuning mClock and Ceph parameters.

See The mClock OSD scheduler section in the Red Hat Ceph Storage Administration Guide for more details.
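
As an illustrative sketch, an mClock profile can be selected with the ceph config command. The osd_mclock_profile option and the high_client_ops value shown here are assumptions based on the mClock documentation and should be verified for this release:

# Prioritize client I/O over recovery and other background operations (assumed option and value)
ceph config set osd osd_mclock_profile high_client_ops
# Confirm the active profile
ceph config get osd osd_mclock_profile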

WORM compliance certification is now supported

Red Hat now supports WORM compliance certification.

See the Enabling object lock for S3 section for more details.
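
For illustration, object lock is enabled through the standard S3 API, for example with the AWS CLI. The bucket name, endpoint, and retention values below are placeholders:

# Create a bucket with object lock enabled (placeholder bucket and endpoint)
aws --endpoint-url http://rgw.example.com:8080 s3api create-bucket --bucket worm-bucket --object-lock-enabled-for-bucket
# Apply a default COMPLIANCE retention period of 365 days (placeholder values)
aws --endpoint-url http://rgw.example.com:8080 s3api put-object-lock-configuration --bucket worm-bucket --object-lock-configuration '{"ObjectLockEnabled": "Enabled", "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 365}}}'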

Set rate limits on users and buckets

With this release, you can set rate limits on users and buckets based on the operations in a Red Hat Ceph Storage cluster. See the Rate limits for ingesting data section for more details.
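
As a sketch of the workflow, rate limits are typically managed with the radosgw-admin ratelimit subcommands. The user ID, bucket name, and limit values below are placeholders, and the exact options should be checked against the linked documentation:

# Set and enable a per-user rate limit (placeholder values)
radosgw-admin ratelimit set --ratelimit-scope=user --uid=testuser --max-read-ops=1024 --max-write-ops=512
radosgw-admin ratelimit enable --ratelimit-scope=user --uid=testuser
# Set and enable a per-bucket rate limit (placeholder values)
radosgw-admin ratelimit set --ratelimit-scope=bucket --bucket=mybucket --max-read-ops=2048
radosgw-admin ratelimit enable --ratelimit-scope=bucket --bucket=mybucket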

New librbd plugin named Persistent Write Log Cache to reduce latency

With this release, the new librbd plugin named Persistent Write Log Cache (PWL) provides a persistent, fault-tolerant write-back cache targeted at SSD devices. It greatly reduces latency and also improves performance at low io_depths. This cache uses a log-ordered write-back design that maintains checkpoints internally, so that writes that get flushed back to the cluster are always crash consistent. Even if the client cache is lost entirely, the disk image is still consistent, but the data will appear to be stale.
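
As a minimal sketch, the cache is enabled through RBD client configuration. The option names shown here, rbd_plugins, rbd_persistent_cache_mode, and rbd_persistent_cache_path, are assumptions from the upstream persistent write-back cache documentation, and the cache path is a placeholder:

# Load the persistent write log cache plugin for RBD clients (assumed option names)
ceph config set client rbd_plugins pwl_cache
# Use the SSD-backed cache mode and point it at a local SSD file system (placeholder path)
ceph config set client rbd_persistent_cache_mode ssd
ceph config set client rbd_persistent_cache_path /mnt/pwl-cache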

Ceph File System (CephFS) now supports high availability asynchronous replication for snapshots

Previously, only one cephfs-mirror daemon could be deployed per storage cluster, so CephFS supported only asynchronous replication of snapshot directories without high availability.

With this release, multiple cephfs-mirror daemons can be deployed on two or more nodes to achieve concurrency in snapshot synchronization, thereby providing high availability.

See the Ceph File System mirroring section in the Red Hat Ceph Storage File System Guide for more details.
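
For illustration, additional cephfs-mirror daemons can be requested through the orchestrator placement. The daemon count and host names below are placeholders:

# Deploy two cephfs-mirror daemons on two hosts for high availability (placeholder hosts)
ceph orch apply cephfs-mirror --placement="2 host01 host02"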

BlueStore is upgraded to V3

With this release, the BlueStore object store is upgraded to V3. It includes the following two features:

  • The allocation metadata is removed from RocksDB, and a full destage of the allocator object is now performed with the OSD allocation.
  • With cache age binning, older onodes might be assigned a lower priority than the hot workload data. See the Ceph BlueStore section for more details.

Use cephadm to manage operating system tuning profiles

With this release, you can use cephadm to create and manage operating system tuning profiles for better performance of the Red Hat Ceph Storage cluster. See the Managing operating system tuning profiles with `cephadm` section for more details.
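
As a sketch of the workflow, a tuning profile is described in a YAML specification and applied with ceph orch. The profile name, hosts, and sysctl setting below are placeholders, and the tuned-profile subcommand should be verified against the linked documentation:

profile_name: latency-profile
placement:
  hosts:
    - host01
    - host02
settings:
  vm.swappiness: "10"

# Apply the profile from the specification file (placeholder file name)
ceph orch tuned-profile apply -i latency-profile.yaml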

A direct upgrade from Red Hat Ceph Storage 5 to Red Hat Ceph Storage 7 will be available

For upgrade planning awareness, a direct upgrade from Red Hat Ceph Storage 5 to Red Hat Ceph Storage 7 (N=2) will be available.

The new cephfs-shell option is introduced to mount a filesystem by name

Previously, cephfs-shell could only mount the default filesystem.

With this release, a CLI option is added in cephfs-shell that allows the mounting of a different filesystem by name, that is, something analogous to the mds_namespace= or fs= options for kclient and ceph-fuse.

Day-2 tasks can now be performed through the Ceph Dashboard

With this release, a user can perform day-2 tasks, that is, tasks that require daily or weekly action, in the Ceph Dashboard. This enhancement improves the Dashboard’s assessment capabilities and customer experience, and strengthens its usability and maturity. In addition, new on-screen elements are included to help and guide the user in retrieving additional information to complete a task.

3.1. The Cephadm utility

Users can now rotate the authentication key for Ceph daemons

For security reasons, some users might desire to occasionally rotate the authentication key used for daemons in the storage cluster.

With this release, the ability to rotate the authentication key for Ceph daemons using the ceph orch daemon rotate-key DAEMON_NAME command is introduced. For MDS, OSD, and MGR daemons, this does not require a daemon restart. However, other daemons, such as Ceph Object Gateway daemons, might require a restart to switch to the new key.
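
For example, to rotate the key for a single daemon, where the daemon name is a placeholder:

ceph orch daemon rotate-key mgr.host01.abcdef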

Bugzilla:1783271

Bootstrap logs are now logged to STDOUT

With this release, to reduce potential errors, bootstrap logs are now logged to STDOUT instead of STDERR in successful bootstrap scenarios.

Bugzilla:1932764

Ceph Object Gateway zonegroup can now be specified in the specification used by the orchestrator

Previously, the orchestrator could handle setting the realm and zone for the Ceph Object Gateway. However, setting the zonegroup was not supported.

With this release, users can specify a rgw_zonegroup parameter in the specification that is used by the orchestrator. Cephadm sets the zonegroup for Ceph Object Gateway daemons deployed from the specification.
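
For illustration, the zonegroup can be set alongside the realm and zone in a Ceph Object Gateway specification. The names and placement below are placeholders:

service_type: rgw
service_id: myrgw
placement:
  hosts:
    - rgw-host1
spec:
  rgw_realm: myrealm
  rgw_zonegroup: myzonegroup
  rgw_zone: myzone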

Bugzilla:2016288

ceph orch daemon add osd now reports if the hostname specified for deploying the OSD is unknown

Previously, because the ceph orch daemon add osd command gave no output, users would not notice if the hostname was incorrect, and Cephadm would silently discard the command.

With this release, the ceph orch daemon add osd command reports to the user if the hostname specified for deploying the OSD is unknown.

Bugzilla:2016949

cephadm shell command now reports the image being used for the shell on startup

Previously, users would not always know which image was being used for the shell. This would affect the packages that were used for commands being run within the shell.

With this release, the cephadm shell command reports the image used for the shell on startup. Because users can see the container image being used and when that image was created as the shell starts up, they now know which packages are being used within the shell.

Bugzilla:2029714

Cluster logs under /var/log/ceph are now deleted

With this release, to better clean up the node as part of removing the Ceph cluster from that node, cluster logs under /var/log/ceph are deleted when the cephadm rm-cluster command is run. The cluster logs are removed as long as --keep-logs is not passed to the rm-cluster command.

Note

If the cephadm rm-cluster command is run on a host that is part of a still existing cluster, the host is managed by Cephadm, and the Cephadm mgr module is still enabled and running, then Cephadm might immediately start deploying new daemons, and more logs could appear.
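
For example, to remove the cluster from a host while preserving the cluster logs, where the FSID is a placeholder:

cephadm rm-cluster --fsid FSID --force --keep-logs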

Bugzilla:2036063

Better error handling when daemon names are passed to ceph orch restart command

Previously, in cases where the daemon passed to ceph orch restart command was a haproxy or keepalived daemon, it would return a traceback. This made it unclear to users if they had made a mistake or Cephadm had failed in some other way.

With this release, better error handling is introduced to identify when users pass a daemon name to the ceph orch restart command instead of the expected service name. Upon encountering a daemon name, Cephadm reports the error and requests the user to check ceph orch ls for valid services to pass.

Bugzilla:2080926

Users can now create a Ceph Object Gateway realm, zone, and zonegroup using the ceph rgw realm bootstrap -i rgw_spec.yaml command

With this release, to streamline the process of setting up the Ceph Object Gateway on a Red Hat Ceph Storage cluster, users can create a Ceph Object Gateway realm, zone, and zonegroup using the ceph rgw realm bootstrap -i rgw_spec.yaml command. The specification file should be modeled similar to the one that is used to deploy Ceph Object Gateway daemons using the orchestrator. The command then creates the realm, zone, and zonegroup, and passes the specification on to the orchestrator, which then deploys the Ceph Object Gateway daemons.

Example

rgw_realm: myrealm
rgw_zonegroup: myzonegroup
rgw_zone: myzone
placement:
  hosts:
   - rgw-host1
   - rgw-host2
spec:
  rgw_frontend_port: 5500

Bugzilla:2109224

crush_device_class and location fields are added to OSD specifications and host specifications respectively

With this release, the crush_device_class field is added to the OSD specifications, and the location field, referring to the initial crush location of the host, is added to host specifications. If a user sets the location field in a host specification, cephadm runs ceph osd crush add-bucket with the hostname and the given location to add the host as a bucket in the crush map. For OSDs, they are set with the given crush_device_class in the crush map upon creation.

Note

This is only for OSDs that were created based on the specification with the field set. It does not affect the already deployed OSDs.
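
For illustration, the location field can be set in a host specification as follows. The host name, address, and bucket value are placeholders:

service_type: host
hostname: host01
addr: 192.168.0.11
location:
  rack: rack1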

Bugzilla:2124441

Users can enable Ceph Object Gateway manager module

With this release, the Ceph Object Gateway manager module is available and can be turned on with the ceph mgr module enable rgw command. This gives users access to the functionality of the Ceph Object Gateway manager module, such as the ceph rgw realm bootstrap and ceph rgw realm tokens commands.

Bugzilla:2133802

Users can enable additional metrics for node-exporter daemons

With this release, additional metrics can be enabled for node-exporter daemons deployed by Cephadm by using the extra_entrypoint_args field. This allows users to further customize their node-exporter deployments without requiring explicit support for each individual option.

Example

service_type: node-exporter
service_name: node-exporter
placement:
  label: "node-exporter"
extra_entrypoint_args:
- "--collector.textfile.directory=/var/lib/node_exporter/textfile_collector2"
---

Bugzilla:2142431

Users can set the crush location for a Ceph Monitor to replace tiebreaker monitors

With this release, users can set the crush location for a monitor deployed on a host. It should be assigned in the mon specification file.

Example

service_type: mon
service_name: mon
placement:
  hosts:
  - host1
  - host2
  - host3
spec:
  crush_locations:
    host1:
    - datacenter=a
    host2:
    - datacenter=b
    - rack=2
    host3:
    - datacenter=a

This is primarily added to make replacing a tiebreaker monitor daemon in stretch clusters deployed by Cephadm more feasible. Without this change, users would have to manually edit the files written by Cephadm to deploy the tiebreaker monitor, because the tiebreaker monitor is not allowed to join without declaring its crush location.

Bugzilla:2149533

crush_device_class can now be specified per path in an OSD specification

With this release, to allow users more flexibility with crush_device_class settings when deploying OSDs through Cephadm, crush_device_class can be specified per path inside an OSD specification. Per-path crush_device_class values can also be provided along with a service-wide crush_device_class for the OSD service. In that case, the service-wide crush_device_class is considered the default, and the path-specified settings take priority.

Example

service_type: osd
service_id: osd_using_paths
placement:
  hosts:
    - Node01
    - Node02
crush_device_class: hdd
spec:
  data_devices:
    paths:
    - path: /dev/sdb
      crush_device_class: ssd
    - path: /dev/sdc
      crush_device_class: nvme
    - /dev/sdd
  db_devices:
    paths:
    - /dev/sde
  wal_devices:
    paths:
    - /dev/sdf

Bugzilla:2151189

Cephadm now raises a specific health warning UPGRADE_OFFLINE_HOST when the host goes offline during upgrade

Previously, when upgrades failed due to a host going offline, a generic UPGRADE_EXCEPTION health warning would be raised that was too ambiguous for users to understand.

With this release, when an upgrade fails due to a host being offline, Cephadm raises a specific health warning - UPGRADE_OFFLINE_HOST, and the issue is now made transparent to the user.

Bugzilla:2152963

Cephadm no longer logs all command output to cephadm.log when --verbose is not passed

Previously, some Cephadm commands, such as gather-facts, would spam the log with massive amounts of command output every time they were run. In some cases, it was once per minute.

With this release, Cephadm no longer logs all command output to cephadm.log when --verbose is not passed. The cephadm.log is now easier to read because most of the output previously written is no longer present.

Bugzilla:2180110

3.2. Ceph Dashboard

A new metric is added for OSD blocklist count

With this release, a new metric, ceph_cluster_osd_blocklist_count, is added to the Ceph Dashboard so that a corresponding alert can be configured.
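
As an illustration, a Prometheus alerting rule can be defined on the new metric. The alert name, threshold, and duration below are arbitrary placeholders:

groups:
- name: osd-blocklist
  rules:
  - alert: CephOSDBlocklistCountHigh
    expr: ceph_cluster_osd_blocklist_count > 50
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High number of blocklisted clients in the cluster"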

Bugzilla:2067709

Introduction of ceph-exporter daemon

With this release, the ceph-exporter daemon is introduced to collect and expose performance counters of all Ceph daemons as Prometheus metrics. It is deployed on each node of the cluster so that it remains performant in large-scale clusters.

Bugzilla:2076709

Support force promote for RBD mirroring through Dashboard

Previously, although RBD mirror promote/demote was implemented on the Ceph Dashboard, there was no option to force promote.

With this release, support for force promoting RBD mirroring through Ceph Dashboard is added. If the promotion fails on the Ceph Dashboard, the user is given the option to force the promotion.

Bugzilla:2133341

Support for collecting and exposing the labeled performance counters

With this release, support for collecting and exposing the labeled performance counters of Ceph daemons as Prometheus metrics with labels is introduced.

Bugzilla:2146544

3.3. Ceph File System

cephfs-top limitation is increased for more client loading

Previously, due to a limitation in the cephfs-top utility, fewer than 100 clients could be loaded at a time, the display could not be scrolled, and the utility hung if more clients were loaded.

With this release, cephfs-top users can scroll vertically as well as horizontally. This enables cephfs-top to load nearly 10,000 clients. Users can scroll through the loaded clients and view them on the screen.

Bugzilla:2138793

Users now have the option to sort clients based on the fields of their choice in cephfs-top

With this release, users have the option to sort the clients based on the fields of their choice in cephfs-top and also limit the number of clients displayed. This enables the user to analyze the metrics based on the order of fields as required.

Bugzilla:2158689

Non-head omap entries are now included in the omap entry count

Previously, non-head snapshotted entries were not taken into account when deciding whether to merge or split a directory fragment. Due to this, the number of omap entries in a directory object could exceed a certain limit and result in cluster warnings.

With this release, non-head omap entries are included in the number of omap entries when deciding to merge or split a directory fragment to never exceed the limit.

Bugzilla:2159294

3.4. Ceph Object Gateway

Objects replicated from another zone now return the header

With this release, in a multi-site configuration, objects that have been replicated from another zone return the header x-amz-replication-status=REPLICA, to allow multi-site users to identify if the object was replicated locally or not.
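
For example, the replication status can be checked with a HEAD request from any S3 client. With the AWS CLI, the header is surfaced as the ReplicationStatus field in the response; the bucket, key, and endpoint below are placeholders:

aws --endpoint-url http://rgw.example.com:8080 s3api head-object --bucket mybucket --key myobject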

Bugzilla:1467648

Support for AWS PublicAccessBlock

With this release, Ceph Object Storage supports the AWS public access block S3 APIs such as PutPublicAccessBlock.
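
For illustration, a public access block configuration can be applied with the AWS CLI. The bucket name and endpoint below are placeholders:

aws --endpoint-url http://rgw.example.com:8080 s3api put-public-access-block --bucket mybucket --public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"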

Bugzilla:2064260

Swift object storage dialect now includes support for SHA-256 and SHA-512 digest algorithms

Previously, OpenStack Swift added support for these digest algorithms in 2022, but the Ceph Object Gateway had not implemented them.

With this release, the Ceph Object Gateway’s Swift object storage dialect includes support for the SHA-256 and SHA-512 digest algorithms in tempurl operations. The Ceph Object Gateway can now correctly handle tempurl operations from recent OpenStack Swift clients.

Bugzilla:2105950

3.5. Multi-site Ceph Object Gateway

Bucket notifications are sent when an object is synced to a zone

With this release, bucket notifications are sent when an object is synced to a zone, to allow external systems to receive information about the zone syncing status at the object level. The following bucket notification event types are added: s3:ObjectSynced:* and s3:ObjectSynced:Created. When configured with the bucket notification mechanism, a notification event is sent from the synced Ceph Object Gateway upon the successful sync of an object.

Note

Both the topics and the notification configuration should be created separately in each zone from which you want the notification events to be sent.

Bugzilla:2053347

Disable per-bucket replication when zones replicate by default

With this release, the ability to disable per-bucket replication when the zones replicate by default, using the multi-site sync policy, is introduced so that selected buckets can opt out of replication.

Bugzilla:2132554
