Chapter 6. Known Issues
This section documents known issues found in this release of Red Hat Ceph Storage.
Ansible does not properly handle unresponsive tasks
Certain tasks, for example adding monitors with the same host name, cause the ceph-ansible utility to become unresponsive. Currently, there is no timeout after which an unresponsive task is marked as failed. (BZ#1313935)
Certain image features are not supported with the RBD kernel module
The following image features are not supported with the current version of the RADOS Block Device (RBD) kernel module (krbd) that is included in Red Hat Enterprise Linux 7.4:
- object-map
- deep-flatten
- journaling
- fast-diff
RBDs may be created with these features enabled. As a consequence, an attempt to map such RBDs by running the rbd map command fails.
To work around this issue, disable the unsupported features by setting the rbd_default_features = 1 option in the Ceph configuration file for kernel RBDs, or dynamically disable them by running the following command:
rbd feature disable <image> <feature>
This issue is a limitation only in kernel RBDs, and the features work as expected with user-space RBDs.
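As an illustrative sketch, the configuration-file approach might look like the following, with the option added on the client nodes that map kernel RBDs (placing it in the [client] section is an assumption; it can also be set globally):
[client]
rbd_default_features = 1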
NFS Ganesha does not show bucket size or number of blocks
NFS Ganesha, the NFS interface of the Ceph Object Gateway, lists buckets as directories. However, the interface always shows that the directory size and the number of blocks are 0, even if some data is written to the buckets. (BZ#1359408)
An LDAP user can access buckets created by a local RGW user with the same name
The RADOS Object Gateway (RGW) does not differentiate between a local RGW user and an LDAP user with the same name. As a consequence, the LDAP user can access the buckets created by the local RGW user.
To work around this issue, use different names for RGW and LDAP users. (BZ#1361754)
The GNU tar utility currently cannot extract archives directly into the Ceph Object Gateway NFS mounted file systems
The current version of the GNU tar utility makes overlapping write operations when extracting files. This behavior breaks the strict sequential write restriction in the current version of the Ceph Object Gateway NFS. In addition, GNU tar reports these errors in the usual way, but by default it continues extracting files after reporting them. As a result, the extracted files can contain incorrect data.
To work around this problem, use alternate programs to copy file hierarchies into the Ceph Object Gateway NFS. Recursive copying by using the cp -r command works correctly. Non-GNU archive utilities might be able to correctly extract the tar archives, but none have been verified. (BZ#1418606)
Old zone group name is sometimes displayed alongside the new one
In a multi-site configuration, when a zone group is renamed, other zones can in some cases continue to display the old zone group name in the output of the radosgw-admin zonegroup list command.
To work around this issue:
- Verify that the new zone group name is present on each cluster (see the example after this list).
- Remove the old zone group name:
$ rados -p .rgw.root rm zonegroups_names.<old-name>
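For the verification step, the zone group names known to a cluster can be listed by running the following command on each site:
$ radosgw-admin zonegroup list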
Failover and failback cause data sync issues in multi-site environments
In environments using the Ceph Object Gateway multi-site feature, failover and failback cause data sync to stall: the radosgw-admin sync status command reports that data sync is behind for an extended period of time.
To work around this issue, run the radosgw-admin data sync init command and restart the Ceph Object Gateways. (BZ#1459967)
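A minimal sketch of this workaround, assuming systemd-managed gateways and a placeholder zone name:
# radosgw-admin data sync init --source-zone=<source-zone>
# systemctl restart ceph-radosgw.target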
It is not possible to remove directories stored on S3 versioned buckets by using rm
The mechanism that is used to check for non-empty directories prior to unlinking them works incorrectly in combination with the Ceph Object Gateway Simple Storage Service (S3) versioned buckets. As a consequence, directory trees on versioned buckets cannot be recursively removed with a command such as rm -rf. To work around this problem, remove any objects in versioned buckets by using the S3 interface. (BZ#1489301)
Deleting directories that contain symbolic links is slow
An attempt to delete directories and subdirectories on a Ceph File System that include a number of hard links by using the rm -rf command is significantly slower than deleting directories that do not contain any hard links. (BZ#1491246)
Resized LUNs are not immediately visible to initiators when using the iSCSI gateway
When using the iSCSI gateway, resized logical unit numbers (LUNs) are not immediately visible to initiators. This means the initiators are not able to see the additional space allocated to a LUN. To work around this issue, restart the iSCSI gateway after resizing a LUN to expose it to the initiators, or always add new LUNs when increasing storage capacity. All targets must be updated before the initiators can use the new space. (BZ#1492342)
The Ceph Object Gateway requires applications to write sequentially
The Ceph Object Gateway requires applications to write sequentially from offset 0 to the end of a file. Attempting to write out of order causes the upload operation to fail. To work around this issue, use utilities like cp, cat, or rsync when copying files into NFS space. Always mount with the sync option. (BZ#1492589)
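For example, a mount sketch that includes the sync option, assuming an NFS v4.1 export and placeholder host and mount point names:
# mount -t nfs -o nfsvers=4.1,sync <ganesha-host>:/ /mnt/<mount-point>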
The Expiration, Days S3 Lifecycle parameter cannot be set to 0
The Ceph Object Gateway does not accept the value of 0 for the Expiration, Days Lifecycle configuration parameter. Consequently, setting the expiration to 0 cannot be used to trigger the background delete operation of objects.
To work around this problem, delete objects directly. (BZ#1493476)
Load on MDS daemons is not always balanced fairly or evenly in multiple active MDS configurations
In certain cases, the MDS balancers offload too much metadata to another active daemon or none at all. (BZ#1494256)
User space issues make df calculations less accurate for kernel client users
User space improvements in df calculations have been accepted in the upstream kernel, but have not yet been packaged downstream. The df command reports more accurate free space data when a Ceph File System is mounted with the ceph-fuse utility. When mounted with the kernel client, df reports the same, less accurate data as in previous versions. To work around this problem, kernel client users can use the ceph df command and examine the relevant data pools to determine free space more accurately. (BZ#1494987)
An iSCSI initiator can send more than max_data_area_mb worth of data when a Ceph cluster is under heavy load, causing a temporary performance drop
When a Ceph cluster is under heavy load, an iSCSI initiator might send more data than specified by the max_data_area_mb parameter. Once the max_data_area_mb limit has been reached, the target_core_user module returns queue full statuses for commands. The initiators might not fairly retry these commands, and they can hit initiator-side timeouts and be failed in the multipath layer. The multipath layer then retries the commands on another path while other commands are still being executed on the original path. This causes a temporary performance drop, and in some extreme cases in Linux environments the multipathd daemon can terminate unexpectedly.
If the multipathd daemon crashes, restart it manually:
# systemctl restart multipathd
The Ceph iSCSI gateway only supports clusters named "ceph"
The Ceph iSCSI gateway expects the default cluster name, that is, "ceph". If a cluster uses a different name, the Ceph iSCSI gateway does not properly connect to the cluster. To work around this problem, use the default cluster name, or manually copy the content of the /etc/ceph/<cluster-name>.conf file to the /etc/ceph/ceph.conf file, in addition to the associated keyrings. (BZ#1502021)
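A sketch of the copy step on an iSCSI gateway node; the keyring file name shown here is an assumption and varies by deployment:
# cp /etc/ceph/<cluster-name>.conf /etc/ceph/ceph.conf
# cp /etc/ceph/<cluster-name>.client.admin.keyring /etc/ceph/ceph.client.admin.keyring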
The stat command returns ID: 0 for CephFS FUSE clients
When a Ceph File System (CephFS) is mounted as a File System in User Space (FUSE) client, the stat command outputs ID: 0 instead of a proper ID. (BZ#1502384)
Having more than one path from an initiator to an iSCSI gateway is not supported
In the iSCSI gateway, tcmu-runner might return the same inquiry and Asymmetric Logical Unit Access (ALUA) information for all iSCSI sessions to a target port group. This can cause the initiator or multipath layer to use the incorrect port info to reference the internal structures for paths and devices, which can result in failures, failover and failback failing, or incorrect multipath and SCSI log or tool output. Therefore, having more than one iSCSI session from an initiator to an iSCSI gateway is not supported. (BZ#1502740)
Incorrect number of tcmu-runner daemons reported after iSCSI target LUNs fail and recover
After iSCSI target Logical Unit Numbers (LUNs) recover from a failure, the ceph -s command in certain cases outputs an incorrect number of tcmu-runner daemons. (BZ#1503411)
The tcmu-runner daemon does not clean up its blacklisted entries upon recovery
When the path fails over from the Active/Optimized to the Active/Non-Optimized path, or vice versa on a failback, the old target is blacklisted to prevent stale writes from occurring. These blacklist entries are not cleaned up after the tcmu-runner daemon recovers from being blacklisted, resulting in extraneous blacklisted clients until the entries expire after one hour. (BZ#1503692)
delete_website_configuration cannot be enabled by setting the bucket policy DeleteBucketWebsite
In the Ceph Object Gateway, a user cannot enable delete_website_configuration on a bucket even when a bucket policy has been written granting them the S3:DeleteBucketWebsite permission.
To work around this issue, use other methods of granting the permission, for example, admin operations, the bucket owner, or an ACL. (BZ#1505400)
During a data rebalance of a Ceph cluster, the system might report degraded objects
Under certain circumstances, such as when an OSD is marked out, the number of degraded objects reported during a data rebalance of a Ceph cluster can be too high, in some cases implying a problem where none exists. (BZ#1505457)
The iSCSI gateway can fail to scan or setup LUNs
When using the iSCSI gateway, the Linux initiators can return kzalloc failures due to buffers being too large. In addition, the VMware ESX initiators can return READ_CAP failures because they cannot copy the data. As a consequence, the iSCSI gateway fails to scan or set up Logical Unit Numbers (LUNs), find or rediscover devices, and add the devices back after path failures. (BZ#1505942)
The RESTful API commands do not work as expected
The RESTful plug-in provides an API to interact with a Ceph cluster. Currently, the API fails to change the pgp_num parameter. In addition, it indicates a failure when changing the pg_num parameter, despite pg_num being changed as expected. (BZ#1506102)
Adding LVM-based OSDs fails on clusters with names other than "ceph"
An attempt to install a new Ceph cluster or add OSDs by using the osd_scenario: lvm parameter fails on clusters that use names other than the default "ceph". To work around this problem on new clusters, use the default cluster name ("ceph"). (BZ#1507943)
The iSCSI gwcli utility does not support hyphens in pool or image names
It is not possible to create a disk using a pool or image name that includes hyphens ("-") by using the iSCSI gwcli utility. (BZ#1508451)
Ansible creates unused systemd unit files
When installing the Ceph Object Gateway by using the ceph-ansible utility, ceph-ansible creates systemd unit files on the Ceph Object Gateway host corresponding to all Object Gateway instances located on other hosts. However, only the unit file that corresponds to the hostname of the Ceph Object Gateway host is active. The remaining unit files appear inactive, but this does not have any impact on the Ceph Object Gateways. (BZ#1508460)
The nfs-server service must be disabled on the NFS Ganesha node
When the nfs-server service is running on the NFS Ganesha node, an attempt to start the NFS Ganesha instance after its installation fails. To work around this issue, ensure that nfs-server is stopped and disabled on the NFS Ganesha node before installing NFS Ganesha. To do so:
# systemctl disable nfs-server
# systemctl stop nfs-server
Assigning LUNs and hosts to a hostgroup using the iSCSI gwcli utility prevents access to the LUNs upon reboot of the iSCSI gateway host
After assigning Logical Unit Numbers (LUNs) and hosts to a hostgroup by using the iSCSI gwcli utility, if the iSCSI gateway host is rebooted, the LUN mappings are not properly restored for the hosts. This issue prevents access to the LUNs. (BZ#1508695)
nfs-ganesha.service fails to start after a crash or a process kill of NFS Ganesha
When the NFS Ganesha process terminates unexpectedly or is killed, the nfs-ganesha.service unit fails to start as expected. (BZ#1508876)
The ms_async_affinity_cores option does not work
The ms_async_affinity_cores option is not implemented. Specifying it in the Ceph configuration file does not have any effect. (BZ#1509130)
Ansible fails to install clusters that use custom group names in the Ansible inventory file
When the default values of the mon_group_name and osd_group_name parameters are changed in the all.yml file, Ansible fails to install a Ceph cluster. To avoid this issue, do not use custom group names in the Ansible inventory file; keep the default mon_group_name and osd_group_name values. (BZ#1509201)
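For reference, an inventory sketch that keeps the default group names; the host names are placeholders:
[mons]
monitor-node1

[osds]
osd-node1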
The lvm installation scenario does not work when deploying Ceph in containers
It is not possible to use the osd_scenario: lvm installation method to install a Ceph cluster in containers. (BZ#1509230)
Compression ratio might not be the same on the destination site as on the source site
When data synced from the source to destination site is compressed, the compression ratio on the destination site might not be the same as on the source site. (BZ#1509266)
ceph log last does not display the exact number of specified lines
The ceph log last <number> command shows the specified number of lines from the cluster log and the cluster audit log, by default located at /var/log/ceph/<cluster-name>.log and /var/log/ceph/<cluster-name>.audit.log. Currently, the command does not display the exact number of specified lines. To work around this problem, use the tail -<number> <log-file> command. (BZ#1509374)
ceph-ansible does not properly check for running containers
In an environment where the Docker application is not preinstalled, the ceph-ansible utility fails to deploy a Ceph Storage Cluster because it tries to restart ceph-mgr containers when deploying the ceph-mon role. This attempt fails because the ceph-mgr container is not deployed yet. In addition, the docker ps command returns the following error:
either you don't have docker-client or docker-client-common installed
Because ceph-ansible only checks whether the output of docker ps exists, and not its content, ceph-ansible misinterprets this result as a running container. When the ceph-ansible handler is run later during Monitor deployment, the script it executes fails because no ceph-mgr container is found.
To work around this problem, make sure that Docker is installed before using ceph-ansible. For details, see the Getting Docker in RHEL 7 section in the Getting Started with Containers guide for Red Hat Enterprise Linux Atomic Host 7. (BZ#1510555)
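As a sketch, on Red Hat Enterprise Linux 7 the installation might look like the following, assuming the Extras repository is enabled:
# yum install docker
# systemctl enable docker
# systemctl start docker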
Object leaking can occur after using radosgw-admin bucket rm --purge-objects
In the Ceph Object Gateway, the radosgw-admin bucket rm --purge-objects command is supposed to remove all objects from a bucket. However, in some cases, some of the objects are left in the bucket. This is caused by the RGWRados::gc_aio_operate() operation being abandoned on shutdown. To work around this problem, remove the leftover objects by using the rados rm command. (BZ#1514007)
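A sketch of the cleanup; the pool and object names are placeholders, and identifying leftover objects by the bucket marker prefix is an assumption about your layout:
# rados -p <data-pool> ls | grep <bucket-marker> # identify leftover objects
# rados -p <data-pool> rm <object-name>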
The Red Hat Ceph Storage Dashboard cannot monitor iSCSI gateway nodes
The cephmetrics-ansible playbook does not install required Red Hat Ceph Storage Dashboard packages on iSCSI gateway nodes. As a consequence, the Red Hat Ceph Storage Dashboard cannot monitor the iSCSI gateways, and the "iSCSI Overview" dashboard is empty. (BZ#1515153)
Ansible fails to upgrade NFS Ganesha nodes
Ansible fails to upgrade NFS Ganesha nodes because the rolling_update.yml playbook searches for the /var/log/ganesha/ directory, which does not exist. Consequently, the upgrade process terminates with the following error message:
"msg": "file (/var/log/ganesha) is absent, cannot continue"
To work around this problem, create the /var/log/ganesha/ directory manually. (BZ#1518666)
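For example:
# mkdir -p /var/log/ganesha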
The --limit mdss option does not create CephFS pools
When deploying the Metadata Server nodes by using Ansible and the --limit mdss option, Ansible does not create the Ceph File System (CephFS) pools. To work around this problem, do not use --limit mdss. (BZ#1518696)
Manual and dynamic resharding sometimes hangs
In the Ceph Object Gateway (RGW), manual and dynamic resharding hangs on a bucket that has versioning enabled. (BZ#1535474)
Resharding a bucket that has ACLs set alters the bucket ACL
In the Ceph Object Gateway (RGW), resharding a bucket that has an access control list (ACL) set alters the bucket ACL. (BZ#1536795)
Rebooting all Ceph nodes simultaneously will cause an authentication error
When performing a simultaneous reboot of all the Ceph nodes in the storage cluster, a client.admin authentication error occurs when issuing any Ceph-related commands from the command-line interface. To work around this issue, avoid rebooting all Ceph nodes simultaneously. (BZ#1544808)
Purging a containerized Ceph installation using NVMe disks fails
When attempting to purge a containerized Ceph installation using NVMe disks, the purge fails because there are a few places where NVMe disk naming is not taken into account. (BZ#1547999)
When using the rolling_update.yml playbook to upgrade to Red Hat Ceph Storage 3.0, or from version 3.0 to other zStream releases of 3.0, users who use CephFS must manually upgrade the MDS cluster
Currently, the Metadata Server (MDS) cluster does not have built-in versioning or file system flags to support seamless upgrades of the MDS nodes without potentially causing assertions or other faults due to incompatible messages or other functional differences. For this reason, during any cluster upgrade it is necessary to first reduce the number of active MDS nodes for a file system to one, so that two active MDS nodes do not communicate with different versions. Further, it is also necessary to take standbys offline, because any new CompatSet flags will propagate via the MDSMap to all MDS nodes and cause older MDS nodes to suicide.
To upgrade the MDS cluster:
- Reduce the number of ranks to 1:
ceph fs set <fs_name> max_mds 1
- Deactivate all non-zero ranks, from the highest rank to the lowest, waiting for each MDS to finish stopping:
ceph mds deactivate <fs_name>:<n>
ceph status # wait for the MDS to finish stopping
- Take all standbys offline using systemctl:
systemctl stop ceph-mds.target
ceph status # confirm that only one MDS is online and is active
- Upgrade the single active MDS and restart the daemon using systemctl:
systemctl restart ceph-mds.target
- Upgrade and start the standby daemons.
- Restore the previous max_mds value for your cluster:
ceph fs set <fs_name> max_mds <old_max_mds>
For steps on how to upgrade the MDS cluster in a container, refer to the Updating Red Hat Ceph Storage deployed as a Container Image Knowledgebase article. (BZ#1550026)
Adding a new Ceph Manager node will fail when using the Ansible limit option
Adding a new Ceph Manager to an existing storage cluster by using the Ansible limit option tries to copy the Ceph Manager's keyring without generating it first. This causes the Ansible playbook to fail, and the new Ceph Manager node is not configured properly. To work around this issue, do not use the limit option while running the Ansible playbook; the newly generated keyring is then copied successfully. (BZ#1552210)
For Red Hat Ceph Storage deployments running within containers, adding a new OSD will cause the new OSD daemon to continuously restart
Adding a new OSD to an existing Ceph Storage Cluster running within a container restarts the new OSD daemon every 5 minutes. As a result, the storage cluster will not achieve a HEALTH_OK state. Currently, there is no workaround for this issue. This does not affect already running OSD daemons. (BZ#1552699)
Reducing the number of active MDS daemons on CephFS can cause kernel client I/O to hang
Reducing the number of active Metadata Server (MDS) daemons on a Ceph File System (CephFS) may cause kernel client I/O to hang. If this happens, kernel clients are unable to connect to MDS ranks greater than or equal to max_mds. To work around this issue, raise max_mds to be greater than the highest rank. (BZ#1559749)
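For example, using the placeholder convention from the rest of this chapter:
# ceph fs set <fs_name> max_mds <new_max_mds> # choose a value greater than the highest rank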
Adding iSCSI gateways using the gwcli tool returns an error
Attempting to add an iSCSI gateway by using the gwcli tool returns the following error:
package validation checks - OS version is unsupported
To work around this issue, add iSCSI gateways with the skipchecks=true parameter. (BZ#1561415)
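As a sketch, when run from the gateways section of the target in the gwcli shell, where the gateway name and IP address are placeholders:
create <gateway-name> <gateway-ip> skipchecks=true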
Initiating the ceph-ansible playbook to expand the cluster sometimes fails on nodes with NVMe disks
When osd_auto_discovery is set to true, initiating the ceph-ansible playbook to expand the cluster causes the playbook to fail on nodes with NVMe disks because it tries to reconfigure disks that are already being used by existing OSDs. This makes it impossible to add a new daemon collocating with an existing OSD that uses NVMe disks when osd_auto_discovery is set to true. To work around this issue, configure the new daemon on a new node for which osd_auto_discovery is not set to true, and use the --limit parameter when initiating the playbook to expand the cluster. (BZ#1561438)
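A sketch of the expansion run, assuming the site.yml playbook in the /usr/share/ceph-ansible directory; substitute the playbook and node name used in your environment:
# cd /usr/share/ceph-ansible
# ansible-playbook site.yml --limit <new-osd-node>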
The shrink-osd playbook cannot shrink some OSDs
The shrink-osd Ansible playbook does not support shrinking OSDs backed by an NVMe drive. (BZ#1561456)
tcmu-runner sometimes logs error messages
The tcmu-runner daemon might sporadically log messages such as Async lock drop or Could not break lock. These logs can be ignored if they do not repeat more often than once per hour. If the messages occur often, this can indicate a network path issue between one or more iSCSI initiators and the iSCSI targets and should be investigated. (BZ#1564084)
Sometimes the shrink-mon Ansible playbook fails to remove a monitor from the monmap
The shrink-mon Ansible playbook will sometimes fail to remove a monitor from the monmap even though the playbook completes its run successfully. The cluster status shows the monitor intended for deletion as down. To work around this issue, launch the shrink-mon playbook again to remove the same monitor, or remove the monitor from the monmap manually. (BZ#1564117)
It is not possible to expand a cluster when using the osd_scenario: lvm option
ceph-ansible is not idempotent when deploying OSDs using ceph-volume and the lvm_volumes configuration option. Therefore, if you deploy a cluster using the lvm osd_scenario option, you cannot expand the cluster. To work around this issue, remove existing OSDs from the lvm_volumes configuration option so that the playbook does not try to recreate them when deploying new OSDs. Cluster expansion then succeeds as expected and creates the new OSDs. (BZ#1564214)
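For illustration, an lvm_volumes entry for the new OSDs might look like the following sketch, where the logical volume and volume group names are placeholders and entries for existing OSDs have been removed:
osd_scenario: lvm
lvm_volumes:
  - data: <data-lv>
    data_vg: <data-vg>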
Upgrading a node in a Ceph cluster installed with ceph-test packages requires ceph_test = true in the /etc/ansible/hosts file
When using the ceph-ansible rolling_update.yml playbook to upgrade a Ceph node in a RHEL cluster that was installed with the ceph-test packages, set ceph_test = true in the /etc/ansible/hosts file for each node that has the ceph-test package installed:
[mons]
mon_node1 ceph_test=true

[osds]
osd_node1 ceph_test=true
This is not applicable to client and MDS nodes. (BZ#1564232)
The shrink-osd.yml playbook currently has no support for removing OSDs created by ceph-volume
The shrink-osd.yml playbook assumes all OSDs were created by ceph-disk. As a result, OSDs deployed using ceph-volume cannot be shrunk. (BZ#1564444)
Increasing max_mds from 1 to 2 sometimes causes CephFS to be in a degraded state
When increasing max_mds from 1 to 2, if the Metadata Server (MDS) daemon stays in the starting/resolve state for a long period of time, then restarting the MDS daemon leads to an assert. This causes the Ceph File System (CephFS) to be in a degraded state. (BZ#1566016)
Mounting the nfs-ganesha file server on a client sometimes fails
Mounting the nfs-ganesha file server on a client fails with Connection Refused when a containerized IPv6 Red Hat Ceph Storage cluster with an nfs-ganesha-rgw daemon is deployed using the ceph-ansible playbook. I/O is then unable to run. (BZ#1566082)
Client I/O sometimes fails for CephFS FUSE clients
Client I/O sometimes fails for Ceph File System (CephFS) File System in User Space (FUSE) clients with a transport endpoint shutdown error due to an assert in the FUSE service. To work around this issue, unmount and then remount CephFS FUSE, and then restart the client I/O. (BZ#1567030)
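A sketch of the remount, assuming a ceph-fuse mount at /mnt/cephfs and a placeholder Monitor address:
# fusermount -u /mnt/cephfs
# ceph-fuse -m <monitor-host>:6789 /mnt/cephfs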
The DataDog monitoring utility returns "HEALTH_WARN" even though the cluster is healthy
The DataDog monitoring utility uses the overall_status field to determine the health of a cluster. However, overall_status is deprecated in Red Hat Ceph Storage 3.0 in favor of the status field and therefore always returns the HEALTH_WARN error message. Consequently, DataDog reports HEALTH_WARN even in cases when the cluster is healthy.