Chapter 3. New features


This section lists all major updates, enhancements, and new features introduced in this release of Red Hat Ceph Storage.

3.1. The Cephadm utility

Users can now configure various NFS options in idmap.conf

With this enhancement, NFS options, such as "Domain", "Nobody-User", and "Nobody-Group", can now be configured in idmap.conf.
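
The following is a minimal sketch of the kind of idmap.conf settings involved; the domain and account names are placeholder values:

[General]
Domain = example.com

[Mapping]
Nobody-User = nobody
Nobody-Group = nobody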

Bugzilla:2068026

Client IP restriction is now possible over the new haproxy protocol mode for NFS

Previously, client IP restriction did not work in setups that used HAProxy in front of NFS.

With this enhancement, Cephadm-deployed NFS supports the haproxy protocol. If users add enable_haproxy_protocol: True to both their ingress and NFS specifications, or pass --ingress-mode haproxy-protocol to the ceph nfs cluster create command, the NFS daemon makes use of the haproxy protocol.
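
A minimal sketch of the specification approach is shown below, assuming a hypothetical NFS cluster named mynfs with placeholder ports and virtual IP; the same enable_haproxy_protocol flag is set in both specifications:

service_type: ingress
service_id: nfs.mynfs
placement:
  count: 1
spec:
  backend_service: nfs.mynfs
  frontend_port: 2049
  monitor_port: 9000
  virtual_ip: 10.0.0.10/24       # placeholder VIP
  enable_haproxy_protocol: true
---
service_type: nfs
service_id: mynfs
placement:
  count: 1
spec:
  port: 12049                    # backend NFS port behind the ingress
  enable_haproxy_protocol: true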

Bugzilla:2068030

Users must now enter a username and password to access the Grafana API URL

Previously, anyone who could connect to the Grafana API URL would have access to it without needing any credentials.

With this enhancement, Cephadm-deployed Grafana is set up with a username and password that users must provide to access the Grafana API URL.

Bugzilla:2079815

Ingress service with NFS backend can now be set up to use only keepalived to create a virtual IP (VIP) for the NFS daemon to bind to, without the HAProxy layer involved

With this enhancement, an ingress service with an NFS backend can be set up to use only keepalived to create a virtual IP for the NFS daemon to bind to, without the HAProxy layer involved. This is useful in cases where the NFS daemon is moved around, because clients do not need to use a different IP to connect to it.

Cephadm deploys keepalived to set up the VIP and then has the NFS daemon bind to that VIP. This can also be set up using the NFS module through the ceph nfs cluster create command with the flags --ingress --ingress-mode keepalive-only --virtual-ip <VIP>.

The specification file looks as follows:

service_type: ingress
service_id: nfs.nfsganesha
service_name: ingress.nfs.nfsganesha
placement:
  count: 1
  label: foo
spec:
  backend_service: nfs.nfsganesha
  frontend_port: 12049
  monitor_port: 9049
  virtual_ip: 10.8.128.234/24
  virtual_interface_networks: 10.8.128.0/24
  keepalive_only: true

Note that the specification includes the keepalive_only: true setting.

The corresponding NFS specification looks as follows:

networks:
    - 10.8.128.0/21
service_type: nfs
service_id: nfsganesha
placement:
  count: 1
  label: foo
spec:
  virtual_ip: 10.8.128.234
  port: 2049

Note that the NFS specification includes a virtual_ip field, which must match the virtual IP in the ingress specification.

Bugzilla:2089167

The HAProxy daemon binds to its front-end port only on the VIP created by the accompanying keepalived

With this enhancement, the Cephadm-deployed HAProxy daemon binds its front-end port only to the VIP created by the accompanying keepalived, rather than to 0.0.0.0. This allows other services, such as an NFS daemon, to bind to port 2049 on other IPs on the same node.

Bugzilla:2176297

HAProxy health check interval for the ingress service is now customizable

Previously, in some cases, the default two-second health check interval was too frequent and caused unnecessary traffic.

With this enhancement, the HAProxy health check interval for the ingress service is customizable. When an ingress specification includes the health_check_interval field, the HAProxy configuration that Cephadm generates for each HAProxy daemon of the service uses that value as the health check interval.

Ingress specification file:

service_type: ingress
service_id: rgw.my-rgw
placement:
  hosts: ['ceph-mobisht-7-1-07lum9-node2', 'ceph-mobisht-7-1-07lum9-node3']
spec:
  backend_service: rgw.my-rgw
  virtual_ip: 10.0.208.0/22
  frontend_port: 8000
  monitor_port: 1967
  health_check_interval: 3m

Valid units for the interval are:

  • us: microseconds
  • ms: milliseconds
  • s: seconds
  • m: minutes
  • h: hours
  • d: days

Bugzilla:2199129

Grafana now binds to an IP within a specific network on a host, rather than always binding to 0.0.0.0

With this enhancement, when a Grafana specification file includes both a networks section, listing the network that Grafana should bind an IP on, and only_bind_port_on_networks: true in the spec section, Cephadm configures the Grafana daemon to bind to an IP within that network rather than to 0.0.0.0. This enables users to use the same port that Grafana uses for another service, but on a different IP on the host. If the specification update does not cause the daemons to be moved, ceph orch redeploy grafana can be run to pick up the changed settings.

Grafana specification file:

service_type: grafana
service_name: grafana
placement:
  count: 1
networks:
- 192.168.122.0/24
spec:
  anonymous_access: true
  protocol: https
  only_bind_port_on_networks: true

Bugzilla:2233659

All bootstrap CLI parameters are now made available for usage in the cephadm-ansible module

Previously, only a subset of the bootstrap CLI parameters were available and it was limiting the module usage.

With this enhancement, all bootstrap CLI parameters are made available for usage in the cephadm-ansible module.
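
The following is a minimal sketch of a playbook task that uses the cephadm_bootstrap module; the monitor IP, network, and dashboard credentials are placeholder values, and the available parameters mirror the bootstrap CLI flags for the installed cephadm-ansible version:

- name: Bootstrap the first cluster node
  cephadm_bootstrap:
    mon_ip: 192.168.122.10              # placeholder monitor IP
    cluster_network: 192.168.122.0/24   # placeholder cluster network
    dashboard_user: admin               # placeholder dashboard credentials
    dashboard_password: changeme
    allow_fqdn_hostname: true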

Bugzilla:2246266

Prometheus scrape configuration is added to the nfs-ganesha exporter

With this enhancement, the Prometheus scrape configuration is added for the nfs-ganesha exporter. The metrics exposed by the nfs-ganesha Prometheus exporter are scraped into the Prometheus instance running in Ceph, where they can then be consumed by Grafana dashboards.

Bugzilla:2263898

Prometheus now binds to an IP within a specific network on a host, rather than always binding to 0.0.0.0

With this enhancement, when a Prometheus specification file includes both a networks section, listing the network that Prometheus should bind an IP on, and only_bind_port_on_networks: true in the spec section, Cephadm configures the Prometheus daemon to bind to an IP within that network rather than to 0.0.0.0. This enables users to use the same port that Prometheus uses for another service, but on a different IP on the host. If the specification update does not cause the daemons to be moved, ceph orch redeploy prometheus can be run to pick up the changed settings.

Prometheus specification file:

service_type: prometheus
service_name: prometheus
placement:
  count: 1
networks:
- 10.0.208.0/22
spec:
  only_bind_port_on_networks: true

Bugzilla:2264812

Users can now mount snapshots (exports within .snap directory)

With this enhancement, users can mount snapshots (exports within the .snap directory) and browse them in read-only mode. NFS exports created with the NFS MGR module now include the cmount_path setting (this cannot be configured and should be left as "/"), which allows snapshots to be mounted.
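
For example, a hypothetical read-only mount of such an export; the virtual IP, export path, and mount point are placeholder values:

# Mount the export read-only and browse its snapshots under the .snap directory
mount -t nfs -o ro,nfsvers=4.1 10.0.0.10:/cephfs-export /mnt/export
ls /mnt/export/.snap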

Bugzilla:2245261

Zonegroup hostnames can now be set using the specification file provided in the ceph rgw realm bootstrap command

With this release, continuing the automation of the Ceph Object Gateway multi-site setup, users can now set zonegroup hostnames through the initial specification file passed to the ceph rgw realm bootstrap command, instead of requiring additional steps.

For example,

zonegroup_hostnames:
- host1
- host2

If users add the above section to the "specification" section of the Ceph Object Gateway specification file passed in the realm bootstrap command, Cephadm automatically adds those hostnames to the zonegroup defined in the specification after the Ceph Object Gateway module finishes creating the realm, zonegroup, and zone. Note that this may take a few minutes to occur, depending on what other activity the Cephadm module is currently completing.
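
A minimal sketch of such a specification is shown below, assuming hypothetical realm, zonegroup, zone, and host names, with the zonegroup_hostnames block placed in the specification section as described above:

rgw_realm: myrealm
rgw_zonegroup: myzonegroup
rgw_zone: myzone
placement:
  hosts:
  - host1
  - host2
spec:
  rgw_frontend_port: 8080      # placeholder frontend port
  zonegroup_hostnames:
  - host1
  - host2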

3.2. Ceph Dashboard

CephFS snapshot schedules management on the Ceph dashboard

Previously, CephFS snapshot schedules could only be managed through the command-line interface.

With this enhancement, CephFS snapshot schedules can be listed, created, edited, activated, deactivated, and removed from the Ceph dashboard.

Bugzilla:2264145

Ceph dashboard now supports NFSv3-based exports

With this enhancement, support is enabled for NFSv3-based export management in the Ceph dashboard.

Bugzilla:2267763

Ability to manage Ceph users for CephFS is added

With this enhancement, the ability to manage the Ceph users for CephFS is added. This provides the ability to manage the users' permissions for volumes, subvolume groups, and subvolumes from the File System view.

Bugzilla:2271110

A new API endpoint for multi-site sync status is added

Previously, multi-site sync status was available only via the CLI command.

With this enhancement, multi-site sync status is available through an API in the Ceph dashboard. The new API endpoint for multi-site sync status is api/rgw/multisite/sync_status.
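
For example, the endpoint can be queried with a dashboard API token; the host, port, and credentials below are placeholder values:

# Obtain an authentication token from the dashboard API
curl -k -X POST "https://ceph-dashboard.example.com:8443/api/auth" \
  -H "Accept: application/vnd.ceph.api.v1.0+json" \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "secret"}'

# Use the returned token to query the multi-site sync status
curl -k -X GET "https://ceph-dashboard.example.com:8443/api/rgw/multisite/sync_status" \
  -H "Accept: application/vnd.ceph.api.v1.0+json" \
  -H "Authorization: Bearer <token>"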

Bugzilla:2258951

Improved monitoring of NVMe-oF gateway

With this enhancement, to improve monitoring of the NVMe-oF gateway, metrics from the gateway's embedded Prometheus exporter are scraped, and alerts are added based on the emitted metrics.

Bugzilla:2276038

CephFS clone management in Ceph dashboard

With this enhancement, CephFS clone management functionality is provided in the Ceph dashboard. Users can create and delete subvolume clones through the Ceph dashboard.

Bugzilla:2264142

CephFS snapshot management in Ceph dashboard

With this enhancement, CephFS snapshot management functionality is provided in the Ceph dashboard. Users can create and delete subvolume snapshots through the Ceph dashboard.

Bugzilla:2264141

Labeled Performance Counters per user/bucket

With this enhancement, users can not only obtain information on the operations happening per Ceph Object Gateway node, but can also view the Ceph Object Gateway performance counters per-user and per-bucket in the Ceph dashboard.

Labeled Sync Performance Counters into Prometheus

With this enhancement, users can gather real-time information from Prometheus about the replication health between zones for increased observability of the Ceph Object Gateway multi-site sync operations.

Add and edit bucket in Ceph dashboard

With this enhancement, as part of the Ceph Object Gateway improvements to the Ceph dashboard, the capability to add, list, and edit buckets from the Ceph dashboard is added, including the following bucket properties:

  • ACL (Public, Private)
  • Tags (adding/removing)

Add, List, Delete, and Apply bucket policies in Ceph dashboard

With this enhancement, as part of the Ceph Object Gateway improvements to the Ceph dashboard, the capability to add, list, delete, and apply bucket policies from the Ceph dashboard is added.

3.3. Ceph File System

MDS dynamic metadata balancer is off by default

Previously, when the max_mds file system setting was increased, poor balancer behavior could fragment directory trees in undesirable ways.

With this enhancement, the MDS dynamic metadata balancer is off by default. Operators must turn the balancer on explicitly to use it.

Bugzilla:2227309

CephFS supports quiescing of subvolumes or directory trees

Previously, multiple clients would interleave reads and writes across a consistent snapshot barrier where out-of-band communication existed between clients. This communication led to clients wrongly believing they had reached a checkpoint that is mutually recoverable via a snapshot.

With this enhancement, CephFS supports quiescing of subvolumes or directory trees to enable the execution of crash-consistent snapshots. Clients are now forced to quiesce all I/O before the MDS executes the snapshot. This enforces a checkpoint across all clients of the subtree.

Bugzilla:2235753

MDS Resident Segment Size (RSS) performance counter is tracked with a higher priority

With this enhancement, the MDS Resident Segment Size performance counter is tracked with a higher priority to allow callers to consume its value to generate useful warnings. This allows Rook to identify the MDS RSS size and act accordingly.

Bugzilla:2256560

Laggy clients are now evicted only if there are no laggy OSDs

Previously, performance dumps from the MDS would sometimes show that the OSDs were laggy (objecter.op_laggy and objecter.osd_laggy), which in turn caused laggy clients whose dirty data could not be flushed for cap revokes.

With this enhancement, if the defer_client_eviction_on_laggy_osds option is set to true and a client becomes laggy because of a laggy OSD, then client eviction does not take place until the OSDs are no longer laggy.

Bugzilla:2260003

cephfs-mirror daemon exports snapshot synchronization performance counters via perf dump command

The ceph-mds daemon exports per-client performance counters as part of the already existing perf dump command.

Bugzilla:2264177

A new dump dir command is introduced to dump the directory information

With this enhancement, the dump dir command is introduced to dump the directory information and print the output.

Bugzilla:2269687

Snapshot scheduling support for subvolumes

With this enhancement, snapshot scheduling support is provided for subvolumes. All snapshot scheduling commands accept --subvol and --group arguments to refer to the appropriate subvolumes and subvolume groups. If a subvolume is specified without a subvolume group argument, the default subvolume group is assumed. Also, a valid path need not be specified when referring to subvolumes; a placeholder string is sufficient due to the nature of the argument parsing employed.

Example

# ceph fs snap-schedule add - 15m --subvol sv1 --group g1
# ceph fs snap-schedule status - --subvol sv1 --group g1

Bugzilla:2238537

Ceph commands that add or modify MDS caps give an explanation about why the MDS caps passed by the user were rejected

Previously, Ceph commands that add or modify MDS caps printed "Error EINVAL: mds capability parse failed, stopped at 'allow w' of 'allow w'".

With this enhancement, the commands give an explanation about why the MDS caps passed by the user were rejected and print Error EINVAL: Permission flags in MDS caps must start with 'r' or 'rw' or be '*' or 'all'.
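
For example, a caps update that uses valid permission flags; the client name and the mon and osd caps are placeholder values:

# MDS permission flags must start with 'r' or 'rw', or be '*' or 'all'
ceph auth caps client.example mds 'allow rw' mon 'allow r' osd 'allow rw'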

Bugzilla:2247586

3.4. Ceph Object Gateway

Admin interface is now added to manage bucket notification

Previously, the S3 REST APIs were used to manage bucket notifications. However, if an admin wanted to override them, there was no easy way to do that with the radosgw-admin tool.

With this enhancement, an admin interface with the following commands is added to manage bucket notifications:

radosgw-admin notification get --bucket <bucket name> --notification-id <notification id>

radosgw-admin notification list --bucket <bucket name>

radosgw-admin notification rm --bucket <bucket name> [--notification-id <notification id>]

Bugzilla:2130292

RGW labeled user and bucket operation counters are now in different sections when the ceph counter dump is run

Previously, all RGW labeled operation counters were in the rgw_op section of the output of the ceph counter dump command, but would have either a user label or a bucket label.

With this enhancement, RGW labeled user and bucket operation counters are in rgw_op_per_user or rgw_op_per_bucket sections respectively when the ceph counter dump command is executed.
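
For example, the new section names can be inspected from the command output; the jq filter is only an illustrative way to print the top-level keys:

# List the top-level sections of the labeled counter output
ceph counter dump | jq 'keys'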

Bugzilla:2265574

Users can now place temporary files into a directory using the -t command-line option

Previously, the /usr/bin/rgw-restore-bucket-index tool just used /tmp and that directory sometimes did not have enough free space to hold all the temporary files.

With this enhancement, users can specify a directory into which the temporary files are placed by using the -t command-line option. The tool notifies users if it runs out of space, so they know what adjustments to make before re-running it. Users can also periodically check whether the tool's temporary files have exhausted the available space on the file system where the temporary files reside.

Bugzilla:2267715

Copying of encrypted objects using copy-object APIs is now supported

Previously, in the Ceph Object Gateway, copying of encrypted objects using copy-object APIs was unsupported since the inception of its server-side encryption support.

With this enhancement, copying of encrypted objects using copy-object APIs is supported and workloads that rely on copy-object operations can also use server-side encryption.

Bugzilla:2149450

A new Ceph Object Gateway admin-ops capability is added to allow reading user metadata but not their associated authorization keys

With this enhancement, a new Ceph Object Gateway admin-ops capability is added to allow reading Ceph Object Gateway user metadata but not their associated authorization keys. This reduces the privileges of automation and reporting tools and prevents them from impersonating users or viewing their keys.

Bugzilla:2112325

Cloud Transition: add new supported S3-compatible platforms

With this release, to be able to move object storage to the cloud or other on-premise S3 endpoints, the current lifecycle transition and storage class model is extended. S3-compatible platforms, such as IBM Cloud Object Store (COS) and IBM Storage Ceph are now supported for the cloud archival feature.

NFS with RGW backend

With this release, NFS with the Ceph Object Gateway backend is generally available again with the existing functionality.

3.5. Multi-site Ceph Object Gateway

A retry mechanism is introduced in the radosgw-admin sync status command

Previously, when multi-site sync sent requests to a remote zone, it used a round-robin strategy to choose one of its zone endpoints. If that endpoint was not available, the HTTP client logic used by the radosgw-admin sync status command did not provide a retry mechanism and thus reported an input/output error.

With this enhancement, a retry mechanism is introduced in the sync status command: if the chosen endpoint is unavailable, a different endpoint is selected to serve the request.

Bugzilla:1995152

NewerNoncurrentVersions, ObjectSizeGreaterThan, and ObjectSizeLessThan filters are added to the lifecycle

With this enhancement, support for the NewerNoncurrentVersions, ObjectSizeGreaterThan, and ObjectSizeLessThan filters is added to the lifecycle.
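
A minimal sketch of a lifecycle rule using these filters is shown below; the bucket name, endpoint, prefix, and thresholds are placeholder values.

lifecycle.json:

{
  "Rules": [
    {
      "ID": "noncurrent-and-size-filter",
      "Status": "Enabled",
      "Filter": {
        "And": {
          "Prefix": "logs/",
          "ObjectSizeGreaterThan": 1024,
          "ObjectSizeLessThan": 65536
        }
      },
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 30,
        "NewerNoncurrentVersions": 5
      }
    }
  ]
}

The rule can then be applied with any S3-compatible client, for example:

aws --endpoint-url http://rgw.example.com:8080 s3api put-bucket-lifecycle-configuration \
  --bucket mybucket --lifecycle-configuration file://lifecycle.json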

Bugzilla:2172162

User S3 replication APIs are now supported

With this enhancement, user S3 replication APIs are now supported. With these APIs, users can set replication policies at the bucket level. The API is extended to include additional parameters to specify source and destination zone names.

Bugzilla:2279461

Bucket Granular Sync Replication GA (Part 3)

With this release, the ability to replicate a bucket or a group of buckets to a different Red Hat Ceph Storage cluster is added with bucket granular support. The usability requirements are the same as for Ceph Object Gateway multi-site.

3.6. RADOS

Setting the noautoscale flag on/off retains each pool’s original autoscale mode configuration

Previously, the pg_autoscaler did not persist in each pool’s autoscale mode configuration when the noautoscale flag was set. Due to this, whenever the noautoscale flag was set, the autoscale mode had to be set for each pool repeatedly.

With this enhancement, the pg_autoscaler module persists individual pool configuration for the autoscaler mode after the noautoscale flag is set. Setting the noautoscale flag on/off still retains each pool’s original autoscale mode configuration.
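
For example, assuming a hypothetical pool named mypool:

# Set a per-pool autoscale mode, then toggle the global flag
ceph osd pool set mypool pg_autoscale_mode off
ceph osd pool set noautoscale
ceph osd pool unset noautoscale
# mypool retains its original pg_autoscale_mode of 'off' after the flag is toggled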

Bugzilla:2136766

reset_purged_snaps_last OSD command is introduced

With this enhancement, the reset_purged_snaps_last OSD command is introduced to resolve cases in which the purged_snaps keys (PSN) are missing in the OSD but exist in the monitor. The purged_snaps_last value is zeroed and, as a result, the monitor shares all of its purged_snaps information with the OSD on the next boot.
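
The command is applied per OSD; the following is a hypothetical invocation, assuming it is issued through ceph tell against osd.0:

# Zero purged_snaps_last on osd.0 so the monitor resends its purged_snaps data on the next boot
ceph tell osd.0 reset_purged_snaps_last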

Bugzilla:2251188

BlueStore’s RocksDB compression enabled

With this enhancement, to ensure that the metadata (especially OMAP) takes less space, RocksDB configuration is modified to enable internal compression of its data.

As a result:

  • database size is smaller
  • write amplification during compaction is smaller
  • average I/O is higher
  • CPU usage is higher

Bugzilla:2253313

OSD is now more resilient to fatal corruption

Previously, the special OSD-layer "superblock" object could be overwritten because it was located at the beginning of the disk, resulting in fatal corruption.

With this enhancement, the OSD "superblock" is made redundant: its on-disk location is no longer fixed and a copy of it is stored in the database. The OSD is now more resilient to fatal corruption.

Bugzilla:2079897

3.7. RADOS Block Devices (RBD)

Improved rbd_diff_iterate2() API performance

Previously, RBD diff-iterate was not guaranteed to execute locally if exclusive lock was available when diffing against the beginning of time (fromsnapname == NULL) in fast-diff mode (whole_object == true with fast-diff image feature enabled and valid).

With this enhancement, rbd_diff_iterate2() API performance is improved, thereby increasing the performance for QEMU live disk synchronization and backup use cases, where the fast-diff image feature is enabled.

Bugzilla:2258997
