Chapter 4. Bug fixes
This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.
4.1. Ceph Manager plugins
Python tasks no longer wait for the GIL
Previously, the Ceph manager daemon held the Python global interpreter lock (GIL) during some RPCs with the Ceph MDS, which starved other Python tasks that were waiting for the GIL.
With this fix, the GIL is released during all libcephfs/librbd calls and other Python tasks can acquire the GIL normally.
4.2. The Cephadm utility
cephadm can detect a duplicated hostname and no longer adds the same host to a cluster
Previously, cephadm would consider a host with a shortname and a host with its FQDN as two separate hosts, causing the same host to be added twice to a cluster.
With this fix, cephadm now recognizes when a host shortname and an FQDN refer to the same host, and does not add the host to the system again.
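The comparison described above can be pictured with a short Python sketch. It assumes that two names refer to the same host when their first DNS label matches. The function names are hypothetical and this is not cephadm's actual implementation.

```python
def short_name(hostname: str) -> str:
    """Return the first DNS label of a hostname, lowercased."""
    return hostname.strip().lower().split(".", 1)[0]


def is_same_host(a: str, b: str) -> bool:
    """Assumption for illustration: equal short names mean the same host."""
    return short_name(a) == short_name(b)


def add_host(inventory: list[str], hostname: str) -> bool:
    """Add hostname unless an equivalent entry exists; return True if added."""
    if any(is_same_host(hostname, existing) for existing in inventory):
        return False
    inventory.append(hostname)
    return True
```

With an inventory that already contains host01, calling add_host with host01.example.com leaves the inventory unchanged.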
cephadm no longer reports that a non-existing label is removed from the host
Previously, in cephadm, there was no check to verify that a label existed before removing it from a host. Due to this, the ceph orch host label rm command would report that a label was removed from the host even when the label did not exist, for example, because it was misspelled.
With this fix, the command now tells the user clearly whether the specified label was successfully removed.
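A minimal Python sketch of the existence check follows. The function name and the message wording are hypothetical, chosen only to illustrate reporting a different result when the label is absent. They do not reproduce the actual output of ceph orch host label rm.

```python
def remove_label(host_labels: set[str], label: str) -> str:
    """Remove a label if present and report what actually happened."""
    if label not in host_labels:
        # Nothing to remove: say so instead of claiming success.
        return f"label '{label}' not found on host; nothing removed"
    host_labels.remove(label)
    return f"removed label '{label}' from host"
```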
The keepalive daemons communicate and enter the main/primary state
Previously, keepalive configurations were populated with IPs that matched the host IP reported by the ceph orch host ls command. As a result, if the VIP was configured on a different subnet than the listed host IP, the keepalive daemons could not communicate, causing the keepalive daemons to enter a primary state.
With this fix, the IPs of keepalive peers in the keepalive configuration are now chosen to match the subnet of the VIP. The keepalive daemons can now communicate even if the VIP is in a different subnet than the host IP reported by the ceph orch host ls command. In this case, only one keepalive daemon enters the primary state.
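The subnet matching can be sketched with Python's standard ipaddress module. The function name and the sample addresses are illustrative assumptions, and the sketch does not reflect how cephadm generates the keepalive configuration internally.

```python
import ipaddress
from typing import Optional


def pick_peer_ip(vip_cidr: str, host_ips: list[str]) -> Optional[str]:
    """Return the first host IP that lies in the same subnet as the VIP."""
    vip_network = ipaddress.ip_interface(vip_cidr).network
    for ip in host_ips:
        if ipaddress.ip_address(ip) in vip_network:
            return ip
    # No address of this host shares the VIP subnet.
    return None
```

For a VIP of 10.10.2.100/24 and a host with the addresses 192.168.1.5 and 10.10.2.21, the sketch selects 10.10.2.21 rather than the first listed address.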
Stopped crash daemons now have the correct state
Previously, when a crash daemon stopped, the return code gave an error state, rather than the expected stopped state, causing systemd to think that the service had failed.
With this fix, the return code gives the expected stopped state.
HA proxy now binds to the frontend port on the VIP
Previously, in Cephadm, multiple ingress services could not be deployed on the same host with the same frontend port as the port binding occurred across all host networks.
With this fix, multiple ingress services can now be present on the same host with the same frontend port as long as the services use different VIPs and different monitoring ports are set for the ingress service in the specification.
4.3. Ceph File System
User-space Ceph File System (CephFS) works as expected post upgrade
Previously, the user-space CephFS client would sometimes crash during a cluster upgrade. This would occur due to stale feature bits on the MDS side that were held on the user-space side.
With this fix, the user-space CephFS client has updated MDS feature bits, which allows clients to work as expected after a cluster upgrade.
Blocklist and evict client for large session metadata
Previously, large client metadata buildup in the MDS would sometimes cause the MDS to switch to read-only mode.
With this fix, the client that is causing the buildup is blocklisted and evicted, allowing the MDS to work as expected.
Deadlocks no longer occur between the unlink and reintegration requests
Previously, commits that fixed an async dirop bug introduced a regression, causing deadlocks between the unlink and reintegration requests.
With this fix, the old commits are reverted and there is no longer a deadlock between unlink and reintegration requests.
Client always sends a caps revocation acknowledgement to the MDS daemon
Previously, if a client released the caps and removed the inode while an MDS daemon was sending it a caps revocation request, the client would drop the request directly, but the MDS daemon would still wait for a caps revocation acknowledgement from the client. Due to this, even when there was no need for caps revocation, the MDS daemon would continue waiting for an acknowledgement from the client, causing a warning in the MDS daemon health status.
With this fix, the client always sends a caps revocation acknowledgement to the MDS daemon, even when the inode no longer exists, and the MDS daemon no longer stays stuck.
MDS locks are obtained in the correct order
Previously, the MDS would acquire metadata tree locks in the wrong order, causing create and getattr RPC requests to deadlock.
With this fix, locks are obtained in the correct order in MDS and the requests no longer deadlock.
Sending split_realms information is skipped from CephFS MDS
Previously, the split_realms information would be incorrectly sent from the CephFS MDS, and kclient could not decode it correctly. Due to this, the clients would ignore the split_realms information and treat it as a corrupted snaptrace.
With this fix, split_realms information is not sent to kclient and no crashes take place.
Snapshot data is no longer lost after setting writing flags
Previously, in clients, if the writing flag was set to ‘1’ when the Fb caps were used, the capsnap creation would be skipped in case of any dirty caps and the existing capsnap would be reused, which is incorrect. Due to this, two consecutive snapshots would be overwritten and lose data.
With this fix, the writing flags are correctly set and no snapshot data is lost.
Thread renaming no longer fails
Previously, in a few rare cases, if another thread tried to look up the dst dentry during a rename, it could get an inconsistent result in which both the src dentry and the dst dentry linked to the same inode simultaneously. Due to this, the rename request would fail because two different dentries were linked to the same inode.
With this fix, the thread waits for the renaming action to finish and everything works as expected.
Revocation requests no longer get stuck
Previously, if clients released the corresponding caps and sent out the cap update request with the old seq before the revoke request, which increases the seq, was sent out, the MDS would miss checking the seq values and the cap calculation. Due to this, the revocation requests would be stuck indefinitely and warnings would be raised about clients not responding to the revocation requests.
With this fix, an acknowledgement is always sent for revocation requests and they no longer get stuck.
Errors are handled gracefully in MDLog::_recovery_thread
Previously, a write would fail if the MDS was already blocklisted due to the fs fail issued by the QA tests. For instance, the QA test test_rebuild_moved_file (tasks/data-scan) would fail for this reason.
With this fix, the write failures are gracefully handled in MDLog::_recovery_thread.
Ceph client now verifies the cause of lagging before sending out an alarm
Previously, Ceph would sometimes send out false alerts warning of laggy OSDs, for example, X client(s) laggy due to laggy OSDs. These alerts were sent out without verifying that the lagginess was actually due to the OSD, and not due to some other cause.
With this fix, the X client(s) laggy due to laggy OSDs message is only sent out if both some clients and an OSD are laggy.
4.4. Ceph Dashboard
Grafana panels for performance of daemons in the Ceph Dashboard now show correct data
Previously, the exporter labels were not compatible with the queries used in the Grafana dashboard. Due to this, the Grafana panels for Ceph daemon performance in the Ceph Dashboard were empty.
With this fix, the label names are made compatible with the Grafana dashboard queries and the Grafana panels for performance of daemons show correct data.
Edit layering and deep-flatten features disabled on the Dashboard
Previously, in the Ceph dashboard, it was possible to edit the layering and deep-flatten features, which are immutable, resulting in the error rbd: failed to update image features: (22) Invalid argument.
With this fix, editing the layering and deep-flatten features is disabled and everything works as expected.
ceph_daemon label is added to the labeled performance counters in Ceph exporter
Previously, in Ceph exporter, the ceph_daemon label was not added to the labeled performance counters.
With this fix, the ceph_daemon label is added to the labeled performance counters in Ceph exporter. The ceph_daemon label is now present on all Ceph daemon performance metrics, and the instance_id label is present on Ceph Object Gateway performance metrics.
Protecting snapshot is enabled only if layering for its parent image is enabled
Previously, protecting snapshot was enabled even if layering was disabled for its parent image. This caused errors when trying to protect the snapshot of an image for which layering was disabled.
With this fix, protecting snapshot is disabled if layering for an image is disabled. Protecting snapshot is enabled only if layering for its parent image is enabled.
Newly added host details are now visible on the cluster expansion review page
Previously, users could not see the information about the hosts that were added in the previous step.
With this fix, hosts that were added in the previous step are now visible on the cluster expansion review page.
Ceph Object Gateway page now loads properly on the Ceph dashboard
Previously, an incorrect regex matching caused the dashboard to break when trying to load the Ceph Object Gateway page. The Ceph Object Gateway page would not load with specific configurations, such as rgw_frontends set to beast port=80 ssl_port=443.
With this fix, the regex matching in the codebase is updated and the Ceph Object Gateway page loads without any issues.
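To illustrate the kind of matching involved, the following Python sketch extracts the port and ssl_port values from a frontend string such as the one above. The pattern and the function name are assumptions made for this example. They are not the regex used in the dashboard codebase, and the sketch only handles plain numeric port values.

```python
import re

# Match "port=<n>" or "ssl_port=<n>" as whole keys; the lookbehind keeps
# the "port" inside "ssl_port" from being matched a second time.
PORT_PATTERN = re.compile(r"(?<!\w)(ssl_port|port)=(\d+)")


def parse_frontend_ports(frontends: str) -> dict[str, int]:
    """Return the numeric port options found in an rgw_frontends value."""
    return {key: int(value) for key, value in PORT_PATTERN.findall(frontends)}
```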
4.5. Ceph Object Gateway
Ceph Object Gateway daemon no longer crashes where phoneNumbers.addr is NULL
Previously, due to a syntax error, the query select * from s3object[*].phonenumbers where phoneNumbers.addr is NULL; would cause the Ceph Object Gateway daemon to crash.
With this fix, the wrong syntax is identified and reported, and no longer causes the daemon to crash.
Ceph Object Gateway daemon no longer crashes with cast( trim) queries
Previously, the Ceph Object Gateway daemon would crash on the query select cast( trim( leading 132140533849470.72 from _3 ) as float) from s3object; because trim skipped type checking.
With this fix, the type is checked, and a wrong type is identified and reported instead of causing the daemon to crash.
Ceph Object Gateway daemon no longer crashes with “where” clause in an s3select JSON query
Previously, due to a syntax error, an s3select JSON query with a “where” clause would cause the Ceph Object Gateway daemon to crash.
With this fix, the wrong syntax is identified and reported, and no longer causes the daemon to crash.
Ceph Object Gateway daemon no longer crashes with s3 select phonenumbers.type query
Previously, due to a syntax error, the query select phonenumbers.type from s3object[*].phonenumbers; would cause the Ceph Object Gateway daemon to crash.
With this fix, the wrong syntax is identified and reported, and no longer causes the daemon to crash.
Ceph Object Gateway daemon validates arguments and no longer crashes
Previously, due to an operator with missing arguments, the daemon would crash when trying to access the nonexistent arguments.
With this fix, the daemon validates the number of arguments per operator and no longer crashes.
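The validation step can be sketched in Python as a lookup of the allowed argument count before any argument is accessed. The arity table, the operator entries, and the QueryError class are hypothetical and are used only to illustrate the idea. The actual s3select engine is not written in Python and its internal checks may differ.

```python
# Hypothetical arity table: operator name -> (minimum, maximum) argument count.
ARITY = {
    "substring": (2, 3),
    "date_diff": (3, 3),
    "utcnow": (0, 0),
}


class QueryError(Exception):
    """Raised for a malformed query instead of crashing the process."""


def validate_call(operator: str, args: list) -> None:
    """Reject a call whose argument count is outside the allowed range."""
    name = operator.lower()
    if name not in ARITY:
        raise QueryError(f"unknown operator: {operator}")
    low, high = ARITY[name]
    if not low <= len(args) <= high:
        raise QueryError(
            f"{operator} expects between {low} and {high} arguments, got {len(args)}"
        )
```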
Ceph Object Gateway daemon no longer crashes with the trim command
Previously, the Ceph Object Gateway daemon would crash on the query select trim(LEADING '1' from '111abcdef111') from s3object; because trim skipped type checking.
With this fix, the type is checked, and a wrong type is identified and reported instead of causing the daemon to crash.
Ceph Object Gateway daemon no longer crashes if a big value is entered
Previously, because the entered value was too large, the query select DATE_DIFF(SECOND, utcnow(),date_add(year,1111111111111111111, utcnow())) from s3object; would cause the Ceph Object Gateway daemon to crash.
With this fix, the crash is identified and an error is reported.
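The general pattern of reporting an out-of-range date calculation as an error can be shown with a small Python analogue. The function names are hypothetical and the sketch relies on Python's own datetime limits, so it illustrates the error-handling approach rather than the behavior of the s3select engine.

```python
from datetime import datetime
from typing import Optional, Tuple


def add_years(moment: datetime, years: int) -> datetime:
    """Shift a timestamp by a number of years; raises when out of range."""
    return moment.replace(year=moment.year + years)


def safe_add_years(moment: datetime, years: int) -> Tuple[Optional[datetime], Optional[str]]:
    """Return (result, None) on success or (None, error message) on failure."""
    try:
        return add_years(moment, years), None
    except (ValueError, OverflowError) as exc:
        # Report the problem to the caller instead of letting it propagate.
        return None, f"date-time calculation out of range: {exc}"
```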
Ceph Object Gateway now parses the CSV objects without processing failures
Previously, Ceph Object Gateway failed to properly parse CSV objects. When the process failed, the requests would stop without a proper error message.
With this fix, the CSV parser works as expected and processes the CSV objects with no failures.
Object version instance IDs beginning with a hyphen are restored
Previously, when restoring the index on a versioned bucket, object versions with an instance ID beginning with a hyphen would not be properly restored into the bucket index.
With this fix, instance IDs beginning with a hyphen are now recognized and restored into the bucket index, as expected.
Multi-delete function notifications work as expected
Previously, due to internal errors, such as a race condition in the code, the Ceph Object Gateway would crash or react unexpectedly when multi-delete functions were performed and the notifications were set for bucket deletions.
With this fix, notifications for multi-delete function work as expected.
RADOS object multipart upload workflows complete properly
Previously, in some cases, RADOS objects that were part of a multipart upload workflow and had been created on a previous upload would cause certain parts to not complete or to stop in the middle of the upload.
With this fix, all parts upload correctly, once the multipart upload workflow is complete.
Users belonging to a different tenant than the bucket owner can now manage notifications
Previously, a user that belonged to a different tenant than the bucket owner was not able to manage notifications. For example, modify, get, or delete.
With this fix, any user with the correct permissions can manage the notifications for the buckets.
Ability to perform NFS setattr on buckets is removed
Previously, changing the attributes stored on a bucket exported as an NFS directory triggered an inconsistency in the Ceph Object Gateway bucket information cache. Due to this, subsequent accesses to the bucket via NFS failed.
With this fix, the ability to perform NFS setattr on buckets is removed and attempts to perform NFS setattr on a bucket, for example, chown on the directory, have no effect.
This might change in future releases.
Testing for reshardable bucket layouts is added to prevent crashes
Previously, with the added bucket layout code to enable dynamic bucket resharding with multi-site, there was no check to verify that the bucket layout supported resharding during dynamic, immediate, or scheduled resharding. Due to this, the Ceph Object Gateway daemon would crash in case of dynamic bucket resharding and the radosgw-admin command would crash in case of immediate or scheduled resharding.
With this fix, a test for reshardable bucket layouts is added and the crashes no longer occur. When immediate and scheduled resharding occurs, an error message is displayed. When dynamic bucket resharding occurs, the bucket is skipped.
The user modify --placement-id command can now be used with an empty --storage-class argument
Previously, if the --storage-class argument was not used when running the user modify --placement-id command, the command would fail.
With this fix, the --storage-class argument can be left empty without causing the command to fail.
Initialization now only unregisters watches that were previously registered
Previously, in some cases, an error in initialization could cause an attempt to unregister a watch that was never registered. This would result in some command line tools crashing unpredictably.
With this fix, only previously registered watches are unregistered.
Multi-site replication now maintains consistent states between zones and prevents overwriting deleted objects
Previously, a race condition in multi-site replication would allow objects that should be deleted to be copied back from another site, resulting in an inconsistent state between zones. As a result, the zone which is receiving the workload ends up with some objects which should be deleted still present.
With this fix, a custom header is added to pass the destination zone’s trace string and is then checked against the object’s replication trace. If there is a match, a 304 response is returned, preventing the full sync from overwriting a deleted object.
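The check can be sketched in Python as follows. Representing the object's replication trace as a set of zone trace strings is an assumption made for this example, and the function and parameter names are hypothetical.

```python
def sync_fetch_status(object_trace: set[str], dest_zone_trace: str) -> int:
    """Return the HTTP status for a sync fetch from the destination zone.

    A 304 response tells the destination that the object has already
    passed through it, so full sync must not copy the object back.
    """
    if dest_zone_trace in object_trace:
        return 304
    return 200
```

If the object's trace already contains the destination zone's trace string, the fetch is answered with 304 and the object that was deleted on the destination is not recreated.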
The memory footprint of Ceph Object Gateway has significantly been reduced
Previously, in some cases, a memory leak associated with Lua scripting integration caused excessive RGW memory growth.
With this fix, the leak is fixed and the memory footprint for Ceph Object Gateway is significantly reduced.
Bucket index performance no longer impacted during versioned object operations
Previously, in some cases, space leaks would occur and reduce bucket index performance. This was caused by a race condition related to updates of object logical head (OLH), which relates to versioned bucket current version calculations during updates.
With this fix, logic errors in OLH update operations are fixed and space is no longer being leaked during versioned object operations.
Delete markers are working correctly with the LC rule
Previously, an optimization attempted to reuse a sal object handle. Due to this, delete markers were not being generated as expected.
With this fix, the change to re-use sal object handle for get-object-attributes is reverted and delete markers are created correctly.
SQL engine no longer causes Ceph Object Gateway crash with illegal calculations
Previously, in some cases, the SQL engine would throw an exception that was not handled, causing a Ceph Object Gateway crash. This was caused due to an illegal SQL calculation of a date-time operation.
With this fix, the exception is handled with an emitted error message, instead of crashing.
The select trim (LEADING '1' from '111abcdef111') from s3object; query now works when capitals are used in query
Previously, if LEADING or TRAILING were written in all capitals, the string would not be read properly, causing a float type to be referred to as a string type and leading to wrong output.
With this fix, type checking is introduced before completing the query, and LEADING and TRAILING work whether they are written in capitals or in lower case.
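A Python sketch of case-insensitive keyword handling combined with a type check is shown below. The function name and its signature are hypothetical, and the sketch mirrors the described behavior rather than the s3select implementation.

```python
def sql_trim(mode: str, chars: str, value: str) -> str:
    """Trim chars from value; mode is LEADING, TRAILING, or BOTH in any case."""
    if not isinstance(value, str) or not isinstance(chars, str):
        # Type check before doing any work, as the fix describes.
        raise TypeError("trim expects string operands")
    keyword = mode.lower()
    if keyword == "leading":
        return value.lstrip(chars)
    if keyword == "trailing":
        return value.rstrip(chars)
    if keyword == "both":
        return value.strip(chars)
    raise ValueError(f"unsupported trim mode: {mode}")
```

Under these assumptions, sql_trim("LEADING", "1", "111abcdef111") and sql_trim("leading", "1", "111abcdef111") both return abcdef111.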
JSON parsing now works for select _1.authors.name from s3object[*] limit 1 query
Previously, an anonymous array given in the select _1.authors.name from s3object[*] limit 1 query would give the wrong value output.
With this fix, JSON parsing works, even if an anonymous array is provided to the query.
4.6. Multi-site Ceph Object Gateway
Client no longer resets the connection for an incorrect Content-Length header field value
Previously, when returning an error page to the client, for example, for a 404 or 403 condition, the </body> and </html> closing tags were missing, although their presence was accounted for in the response's Content-Length header field value. Due to this, depending on the client, the TCP connection between the client and the Rados Gateway would be closed by an RST packet from the client on account of the incorrect Content-Length header field value, instead of by a FIN packet as under normal circumstances.
With this fix, the </body> and </html> closing tags are sent to the client under all the required conditions. The value of the Content-Length header field correctly represents the length of data sent to the client, and the client no longer resets the connection because of an incorrect Content-Length value.
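The invariant behind this fix is that Content-Length must equal the number of body bytes actually sent. A minimal Python sketch that derives the header from the encoded body, closing tags included, is shown below. The function name and the page markup are illustrative assumptions and do not reproduce the gateway's error page.

```python
def build_error_response(status: int, reason: str) -> bytes:
    """Build a raw HTTP error response whose Content-Length matches the body."""
    body = (
        f"<html><head><title>{status} {reason}</title></head>"
        f"<body><h1>{status} {reason}</h1></body></html>"
    ).encode("utf-8")
    headers = (
        f"HTTP/1.1 {status} {reason}\r\n"
        "Content-Type: text/html\r\n"
        # Computed from the bytes that are sent, never from a separate estimate.
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + body
```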
Sync notifications are sent with the correct object size
Previously, when an object was synced between zones, and sync notifications were configured, the notification was sent with zero as the size of the object.
With this fix, sync notifications are sent with the correct object size.
Multi-site sync properly filters and checks according to allowed zones and filters
Previously, when using the multi-site sync policy, certain commands, such as radosgw-admin sync status, would not filter restricted zones or empty sync group names. The lack of filtering caused the output of these commands to be misleading.
With this fix, restricted zones are no longer checked or reported and empty sync group names are filtered out of the status results.
4.7. RADOS
The ceph version command no longer returns the empty version list
Previously, if the MDS daemon was not deployed in the cluster, the ceph version command returned an empty version list for MDS daemons, which suggested a version inconsistency. This list should not be shown if the daemon is not deployed in the cluster.
With this fix, the daemon version information is skipped if the daemon version map is empty and the ceph version command returns the version information only for the Ceph daemons which are deployed in the cluster.
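The skipping logic amounts to dropping daemon types whose version map is empty. A minimal Python sketch follows. The function name, the dictionary layout, and the version strings in the usage note are assumptions for illustration and are not the command's actual output format.

```python
def summarize_versions(version_maps: dict[str, dict[str, int]]) -> dict[str, dict[str, int]]:
    """Keep only daemon types that report at least one version."""
    return {
        daemon_type: versions
        for daemon_type, versions in version_maps.items()
        if versions  # an empty map means the daemon type is not deployed
    }
```

Given a mon entry with one reported version and an mds entry with an empty map, the result contains only the mon entry.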
ms_osd_compression_algorithm now displays the correct value
Previously, an incorrect value in ms_osd_compression_algorithm caused a list of algorithms to be displayed instead of the single default value.
With this fix, only the default value is displayed for ms_osd_compression_algorithm.
MGR no longer disconnects from the cluster without retries
Previously, during network issues, the MGR would disconnect from the cluster without retries and the authentication of monclient would fail.
With this fix, retries are added in scenarios where hunting and connection would both fail.
Increased timeout retry value for client_mount_timeout
Previously, due to the mishandling of the client_mount_timeout configurable, the timeout for authenticating a client to monitors could reach up to 10 retries, disregarding its high default value of 5 minutes.
With this fix, the previous single-retry behavior of the configurable is restored and the authentication timeout works as expected.
4.8. RBD Mirroring
Demoted mirror snapshot is removed following the promotion of the image
Previously, due to an implementation defect, the demoted mirror snapshots would not be removed following the promotion of the image, whether on the secondary image or on the primary image. Due to this, demoted mirror snapshots would pile up and consume storage space.
With this fix, the implementation defect is fixed and the appropriate demoted mirror snapshot is removed following the promotion of the image.
Non-primary images are now deleted when the primary image is deleted
Previously, a race condition in the rbd-mirror daemon image replayer prevented a non-primary image from being deleted when the primary was deleted. Due to this, the non-primary image remained and continued to consume storage space.
With this fix, the rbd-mirror image replayer is modified to eliminate the race condition. Non-primary images are now deleted when the primary image is deleted.
The librbd client correctly propagates the block-listing error to the caller
Previously, when the rbd_support module’s RADOS client was block-listed, the module’s mirror_snapshot_schedule handler would not always shut down correctly. The handler’s librbd client would not propagate the block-list error, thereby stalling the handler’s shutdown. This led to the failure of the mirror_snapshot_schedule handler and the rbd_support module to automatically recover from repeated client block-listing. The rbd_support module stopped scheduling mirror snapshots after its client was repeatedly block-listed.
With this fix, the race in the librbd client between its exclusive lock acquisition and handling of block-listing is fixed. This allows the librbd client to propagate the block-listing error correctly to the caller, for example, the mirror_snapshot_schedule handler, while waiting to acquire an exclusive lock. The mirror_snapshot_schedule handler and the rbd_support module automatically recover from repeated client block-listing.