7.3. Troubleshooting 2-site stretch cluster with Arbiter


Administrators can use this information to diagnose and fix problems in a 2-site stretch cluster with arbiter environment.

Problem

After a complete zone failure and recovery, the workload pods are sometimes stuck in the ContainerCreating state with any of the following errors:

  • MountDevice failed to create newCsiDriverClient: driver name openshift-storage.rbd.csi.ceph.com not found in the list of registered CSI drivers
  • MountDevice failed for volume <volume_name> : rpc error: code = Aborted desc = an operation with the given Volume ID <volume_id> already exists
  • MountVolume.SetUp failed for volume <volume_name> : rpc error: code = Internal desc = staging path <path> for volume <volume_id> is not a mountpoint

Resolution

If the workload pods are stuck with any of the errors listed above, apply the workaround that matches the workload type:

  • For ceph-fs workloads stuck in ContainerCreating:

    1. Restart the nodes where the stuck pods are scheduled
    2. Delete these stuck pods
    3. Verify that the new pods are running
  • For ceph-rbd workloads stuck in ContainerCreating that do not recover on their own after some time:

    1. Restart the csi-rbd plugin pods on the nodes where the stuck pods are scheduled
    2. Verify that the new pods are running
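
The workarounds above can be sketched as a small Python helper that recognizes the three error messages and prints the corresponding steps as oc commands. This is an illustrative sketch, not part of the product: the function names, the openshift-storage namespace, and the app=csi-rbdplugin label selector are assumptions that may not match your deployment, so confirm them with `oc get pods -n openshift-storage --show-labels` before running anything. How a node is restarted depends on the platform, so that step is left as a manual note. The last ceph-rbd command, which checks the workload pod itself, is one reading of "verify that the new pods are running".

```python
import re
import shlex
from typing import List

# Signatures taken from the three error messages listed in the Problem section.
KNOWN_ERROR_PATTERNS = [
    re.compile(r"not found in the list of registered CSI drivers"),
    re.compile(r"an operation with the given Volume ID .+ already exists"),
    re.compile(r"staging path .+ for volume .+ is not a mountpoint"),
]

# Assumptions: verify both values against your own cluster.
CSI_NAMESPACE = "openshift-storage"
RBD_PLUGIN_LABEL = "app=csi-rbdplugin"


def is_known_mount_error(message: str) -> bool:
    """Return True if a pod event message matches one of the listed errors."""
    return any(p.search(message) for p in KNOWN_ERROR_PATTERNS)


def workaround_steps(storage_type: str, node: str, pod: str, namespace: str) -> List[str]:
    """Return the workaround for one stuck pod as printable steps.

    storage_type is "cephfs" or "rbd" (hypothetical labels for this sketch).
    Nothing is executed; the caller reviews and runs the commands.
    """
    if storage_type == "cephfs":
        return [
            # Step 1 is manual: the restart procedure is platform-specific.
            f"# restart node {node} using your platform's usual procedure",
            shlex.join(["oc", "delete", "pod", pod, "-n", namespace]),
            shlex.join(["oc", "get", "pods", "-n", namespace, "-o", "wide"]),
        ]
    if storage_type == "rbd":
        selector = ["-n", CSI_NAMESPACE, "-l", RBD_PLUGIN_LABEL]
        return [
            # Deleting the plugin pod on the node lets its controller recreate it.
            shlex.join(["oc", "delete", "pod", *selector,
                        "--field-selector", f"spec.nodeName={node}"]),
            shlex.join(["oc", "get", "pods", *selector, "-o", "wide"]),
            shlex.join(["oc", "get", "pod", pod, "-n", namespace]),
        ]
    raise ValueError(f"unknown storage type: {storage_type!r}")


if __name__ == "__main__":
    # Hypothetical node, pod, and namespace names.
    for step in workaround_steps("rbd", "worker-1", "app-0", "demo"):
        print(step)
```

The helper only builds strings, which keeps the destructive steps (node restart, pod deletion) under the administrator's control.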