Chapter 6. Bug fixes
This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.
6.1. The Ceph Ansible utility
Alertmanager does not log errors when self-signed or untrusted certificates are used
Previously, when using untrusted CA certificates, Alertmanager generated many errors in the logs.
With this release, ceph-ansible can set the insecure_skip_verify parameter to true in the alertmanager.yml file when alertmanager_dashboard_api_no_ssl_verify: true is set in the group_vars/all.yml file. As a result, when self-signed or untrusted certificates are used, Alertmanager no longer logs those errors and works as expected.
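For example, a minimal group_vars/all.yml entry for this setting could look like the following sketch; the variable name comes from this fix and the comment text is illustrative:

    # group_vars/all.yml
    # Skip TLS certificate verification for the Alertmanager API when
    # self-signed or untrusted certificates are in use.
    alertmanager_dashboard_api_no_ssl_verify: true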
Use a fully-qualified domain name (FQDN) when HTTPS is enabled in a multi-site configuration
Previously, in a multi-site Ceph configuration, ceph-ansible would not differentiate between HTTP and HTTPS and set the zone endpoints with the IP address instead of the host name when HTTPS was enabled.
With this release, ceph-ansible uses the fully-qualified domain name (FQDN) instead of the IP address when HTTPS is enabled, so the zone endpoints are set with the FQDN and match the TLS certificate CN.
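As a hedged illustration, the resulting zone configuration now carries the FQDN in its endpoint list; the zone name and host name below are placeholders:

    radosgw-admin zone get --rgw-zone=us-east-1
        ...
        "endpoints": [
            "https://rgw-node1.example.com:443"
        ],
        ...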
Add the --pids-limit parameter set to -1 for podman and 0 for docker in the systemd file to start the container
Previously, the limits on the number of processes allowed to run in containers, 2048 for podman and 4096 for docker, were not sufficient for some containers, which needed to start more processes than these limits allowed.
With this release, you can remove the limit on the maximum number of processes that can be started by setting the --pids-limit parameter to -1 for podman and to 0 for docker in the systemd unit files. As a result, the containers start even if you customize them to run more internal processes than the default limits allow.
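For reference, a minimal sketch of how the flag appears in a podman-based systemd unit; the service name and image are illustrative, and only the --pids-limit flag is the point of this fix:

    # /etc/systemd/system/ceph-osd@.service (fragment, illustrative)
    [Service]
    ExecStart=/usr/bin/podman run --rm --net=host \
        --pids-limit=-1 \
        --name=ceph-osd-%i \
        registry.redhat.io/rhceph/rhceph-4-rhel8:latest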
ceph-ansible pulls the monitoring container images in a dedicated task behind the proxy
Previously, ceph-ansible would not pull the monitoring container images, such as Alertmanager, Prometheus, node-exporter, and Grafana, in a dedicated task and would instead pull the images when the systemd service was started.
With this release, ceph-ansible pulls the monitoring container images in a dedicated task and supports pulling them behind a proxy.
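A minimal sketch of the proxy settings in group_vars/all.yml; the variable names follow the ceph-ansible sample file and should be verified against your version, and the proxy URL is a placeholder:

    # group_vars/all.yml
    ceph_docker_http_proxy: http://proxy.example.com:3128
    ceph_docker_https_proxy: http://proxy.example.com:3128
    ceph_docker_no_proxy: "localhost,127.0.0.1"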
The ceph-ansible playbook creates the radosgw system user and works as expected
Previously, the ceph-ansible playbook failed to create the radosgw system user and failed to deploy the dashboard when rgw_instances was set at the host_vars or group_vars level in a multi-site deployment. Because this variable is not set on the Ceph Monitor nodes, which is where the tasks are delegated, the playbook failed.
With this release, ceph-ansible checks all the Ceph Object Gateway instances that are defined and sets a boolean fact recording whether at least one instance has rgw_zonemaster set to true. The radosgw system user is created and the playbook works as expected.
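For context, a reduced sketch of an rgw_instances definition in host_vars; the instance, realm, zone group, and zone names are placeholders, and the full set of keys depends on your ceph-ansible version:

    # host_vars/<rgw-node>.yml (illustrative)
    rgw_instances:
      - instance_name: 'rgw0'
        rgw_realm: 'myrealm'
        rgw_zonegroup: 'myzonegroup'
        rgw_zone: 'myzone'
        rgw_zonemaster: true
        rgw_zonesecondary: false
        radosgw_frontend_port: 8080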
The Ansible playbook does not fail when used with the --limit option
Previously, the dashboard_server_addr parameter was unset when the Ansible playbook was run with the --limit option, and the playbook would fail if the play target did not match the Ceph Manager hosts in a non-collocated scenario.
With this release, you must set the dashboard_server_addr parameter on the Ceph Manager nodes, and the playbook works as expected.
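A minimal sketch, assuming a Ceph Manager host named mgr01 and an OSD-only play target; the file names, address, and playbook name are illustrative:

    # host_vars/mgr01.yml
    dashboard_server_addr: 192.168.122.10

    # A limited run no longer fails in a non-collocated scenario:
    ansible-playbook site-container.yml --limit osds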
6.2. Ceph Management Dashboard
The “Client Connection” panel is replaced with “MGRs” on the Grafana dashboard
Previously, the “Client Connection” panel displayed Ceph File System information, which was not meaningful.
With this release, "MGRs" replaces the "Client Connection" panel and displays the count of the active and standby Ceph Managers.
(BZ#1992178)
The Red Hat Ceph Storage Dashboard displays the values for disk IOPS
Previously, the Red Hat Ceph Storage Dashboard would not display the Ceph OSD disk performance in the Hosts tab.
With this release, the Red Hat Ceph Storage Dashboard displays the expected information about the Ceph OSDs, host details, and the Grafana graphs.
(BZ#1992246)
6.3. The Ceph Volume utility
The add-osd.yml playbook no longer fails while creating new OSDs
Previously, the add-osd.yml playbook would fail when new OSDs were added using ceph-ansible. This was due to a limitation in ceph-volume lvm batch, which does not allow the addition of new OSDs in non-interactive mode.
With this release, the --yes and --report options are not passed to the command-line interface and the add-osd.yml playbook works as expected when creating new OSDs.
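For context, ceph-ansible creates OSDs through the ceph-volume lvm batch interface; a hedged illustration of its non-interactive use, with device names as placeholders:

    # Show what the batch run would create without making any changes:
    ceph-volume lvm batch --report /dev/sdb /dev/sdc
    # Create the OSDs without an interactive confirmation prompt:
    ceph-volume lvm batch --yes /dev/sdb /dev/sdc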
6.4. Ceph Object Gateway
The rgw_bucket_quota_soft_threshold parameter is disabled
Previously, the Ceph Object Gateway fetched utilization information from the bucket index if the cached utilization reached rgw_bucket_quota_soft_threshold, causing a high number of operations on the bucket index and slower requests.
This release removes the rgw_bucket_quota_soft_threshold parameter and uses the cached stats, resulting in better performance even when the quota limit is almost reached.
(BZ#1965314)
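For context, a hedged sketch of a bucket quota workflow whose checks now rely on cached statistics; the user ID, bucket name, and limit are placeholders:

    radosgw-admin quota set --quota-scope=bucket --uid=testuser --max-objects=10000
    radosgw-admin quota enable --quota-scope=bucket --uid=testuser
    # Cached bucket statistics are used when evaluating the quota:
    radosgw-admin bucket stats --bucket=testbucket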
The radosgw-admin datalog trim command does not crash while trimming a marker
Previously, the radosgw-admin datalog trim command would crash, due to a logic error, when trimming a marker in the current generation.
This release fixes the logic error, and log trimming occurs without the radosgw-admin datalog trim command crashing.
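A hedged example of the trimming invocation; the shard ID is a placeholder, the marker must come from your own data log listing, and the exact flags can vary between Ceph releases:

    radosgw-admin datalog trim --shard-id=5 --end-marker=<marker>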
6.5. Ceph Manager plugins
The cluster health changes are no longer committed to persistent storage
Previously, rapid changes to the health of the storage cluster caused excessive logging to the ceph.audit.log.
With this release, the health_history is not logged to the ceph.audit.log and cluster health changes are no longer committed to persistent storage.
(BZ#2004738)