|
Name:CephMonVersionMismatch
Message:There are multiple versions of storage services running.
Description:There are {{ $value }} different versions of Ceph Mon components running.
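Severity:Warning
Resolution:Fix
Procedure:Inspect the user interface and logs, and verify whether an update is in progress.
- If an update is in progress, this alert is temporary.
- If an update is not in progress, restart the upgrade process.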
|
|
Name:CephOSDVersionMismatch
Message:There are multiple versions of storage services running.
Description:There are {{ $value }} different versions of Ceph OSD components running.
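Severity:Warning
Resolution:Fix
Procedure:Inspect the user interface and logs, and verify whether an update is in progress.
- If an update is in progress, this alert is temporary.
- If an update is not in progress, restart the upgrade process.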
|
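The two version-mismatch alerts above report, through {{ $value }}, how many different versions of a component are running. The following is a minimal Python sketch of that count, not the product's actual rule. The daemon-to-version mapping, the sample version strings, and both function names are hypothetical and used only for illustration.

```python
def count_distinct_versions(daemon_versions):
    """Return the number of different version strings among the daemons."""
    return len(set(daemon_versions.values()))


def version_mismatch(daemon_versions):
    """True when more than one version is running (assumed alert condition)."""
    return count_distinct_versions(daemon_versions) > 1


# Hypothetical sample data: three monitors, one not yet upgraded.
mons = {"mon.a": "16.2.10", "mon.b": "16.2.10", "mon.c": "16.2.7"}
print(count_distinct_versions(mons), version_mismatch(mons))
```

During an upgrade the count is expected to stay above 1 until every daemon has been restarted on the new version, which is why the procedure treats the alert as temporary in that case.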
|
Name:CephClusterCriticallyFull
Message:Storage cluster is critically full and needs immediate expansion
Description:Storage cluster utilization has crossed 85%.
Severity:Critical
Resolution:Fix
Procedure:Remove unnecessary data or expand the cluster.
|
|
Name:CephClusterNearFull
Message:Storage cluster is nearing full. Expansion is required.
Description:Storage cluster utilization has crossed 75%.
Severity:Warning
Resolution:Fix
Procedure:Remove unnecessary data or expand the cluster.
|
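The two capacity alerts above differ only in the utilization threshold named in their descriptions (75% and 85%). The sketch below maps a utilization figure to the most severe matching alert name. It is an illustration under stated assumptions: the function name and its byte-count parameters are hypothetical, and "has crossed" is read here as strictly greater than the threshold.

```python
NEAR_FULL_PERCENT = 75.0        # from the CephClusterNearFull description
CRITICALLY_FULL_PERCENT = 85.0  # from the CephClusterCriticallyFull description


def capacity_alert(used_bytes, total_bytes):
    """Return the most severe capacity alert name for the given usage, or None."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    utilization = 100.0 * used_bytes / total_bytes
    # Assumption: "has crossed" means strictly above the threshold.
    if utilization > CRITICALLY_FULL_PERCENT:
        return "CephClusterCriticallyFull"
    if utilization > NEAR_FULL_PERCENT:
        return "CephClusterNearFull"
    return None


print(capacity_alert(80, 100))
```

Returning only the most severe name is a simplification made for this sketch; a real monitoring stack may fire both alerts at once.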
|
Name:NooBaaBucketErrorState
Message:A NooBaa Bucket Is In Error State
Description:A NooBaa bucket {{ $labels.bucket_name }} is in error state for more than 6m
Severity:Warning
Resolution:Workaround
Procedure:Resolving NooBaa Bucket Error State
|
|
Name:NooBaaNamespaceResourceErrorState
Message:A NooBaa Namespace Resource Is In Error State
Description:A NooBaa namespace resource {{ $labels.namespace_resource_name }} is in error state for more than 5m
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Error State
|
|
Name:NooBaaNamespaceBucketErrorState
Message:A NooBaa Namespace Bucket Is In Error State
Description:A NooBaa namespace bucket {{ $labels.bucket_name }} is in error state for more than 5m
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Error State
|
|
Name:NooBaaBucketExceedingQuotaState
Message:A NooBaa Bucket Is In Exceeding Quota State
Description:A NooBaa bucket {{ $labels.bucket_name }} is exceeding its quota - {{ printf "%0.0f" $value }}% used
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Exceeding Quota State
|
|
Name:NooBaaBucketLowCapacityState
Message:A NooBaa Bucket Is In Low Capacity State
Description:A NooBaa bucket {{ $labels.bucket_name }} is using {{ printf "%0.0f" $value }}% of its capacity
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Capacity or Quota State
|
|
Name:NooBaaBucketNoCapacityState
Message:A NooBaa Bucket Is In No Capacity State
Description:A NooBaa bucket {{ $labels.bucket_name }} is using all of its capacity
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Capacity or Quota State
|
|
Name:NooBaaBucketReachingQuotaState
Message:A NooBaa Bucket Is In Reaching Quota State
Description:A NooBaa bucket {{ $labels.bucket_name }} is using {{ printf "%0.0f" $value }}% of its quota
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Capacity or Quota State
|
|
Name:NooBaaResourceErrorState
Message:A NooBaa Resource Is In Error State
Description:A NooBaa resource {{ $labels.resource_name }} is in error state for more than 6m
Severity:Warning
Resolution:Workaround
Procedure:Resolving NooBaa Bucket Error State
|
|
Name:NooBaaSystemCapacityWarning100
Message:A NooBaa System Approached Its Capacity
Description:A NooBaa system approached its capacity, usage is at 100%
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Capacity or Quota State
|
|
Name:NooBaaSystemCapacityWarning85
Message:A NooBaa System Is Approaching Its Capacity
Description:A NooBaa system is approaching its capacity, usage is more than 85%
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Capacity or Quota State
|
|
Name:NooBaaSystemCapacityWarning95
Message:A NooBaa System Is Approaching Its Capacity
Description:A NooBaa system is approaching its capacity, usage is more than 95%
Severity:Warning
Resolution:Fix
Procedure:Resolving NooBaa Bucket Capacity or Quota State
|
|
Name:CephMdsMissingReplicas
Message:Insufficient replicas for storage metadata service.
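Description:Minimum required replicas for storage metadata service not available. Might affect the working of the storage cluster.
Severity:Warning
Resolution:Contact Red Hat support
Procedure:
- Check for alerts and operator status.
- If the issue cannot be identified, contact Red Hat support.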
|
|
Name:CephMgrIsAbsent
Message:Storage metrics collector service is no longer available.
Description:Ceph Manager has disappeared from Prometheus target discovery.
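Severity:Critical
Resolution:Contact Red Hat support
Procedure:Inspect the user interface and logs, and verify whether an update is in progress.
- If an update is in progress, this alert is temporary.
- If an update is not in progress, restart the upgrade process.
- After the upgrade completes, check for alerts and operator status.
- If the issue persists or cannot be identified, contact Red Hat support.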
|
|
Name:CephNodeDown
Message:Storage node {{ $labels.node }} went down
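Description:Storage node {{ $labels.node }} went down. Please check the node immediately.
Severity:Critical
Resolution:Contact Red Hat support
Procedure:
- Check which node stopped functioning and determine the cause. Take appropriate actions to recover the node. If the node cannot be recovered: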
|
|
Name:CephClusterErrorState
Message:Storage cluster is in error state
Description:Storage cluster is in error state for more than 10m.
Severity:Critical
Resolution:Contact Red Hat support
Procedure:
|
|
Name:CephClusterWarningState
Message:Storage cluster is in degraded state
Description:Storage cluster is in warning state for more than 10m.
Severity:Warning
Resolution:Contact Red Hat support
Procedure:
|
|
Name:CephDataRecoveryTakingTooLong
Message:Data recovery is slow
Description:Data recovery has been active for too long.
Severity:Warning
Resolution:Contact Red Hat support
|
|
Name:CephOSDDiskNotResponding
Message:Disk not responding
Description:Disk device {{ $labels.device }} not responding, on host {{ $labels.host }}.
Severity:Critical
Resolution:Contact Red Hat support
|
|
Name:CephOSDDiskUnavailable
Message:Disk not accessible
Description:Disk device {{ $labels.device }} not accessible on host {{ $labels.host }}.
Severity:Critical
Resolution:Contact Red Hat support
|
|
Name:CephPGRepairTakingTooLong
Message:Self heal problems detected
Description:Self heal operations taking too long.
Severity:Warning
Resolution:Contact Red Hat support
|
|
Name:CephMonHighNumberOfLeaderChanges
Message:Storage Cluster has seen many leader changes recently.
Description:'Ceph Monitor "{{ $labels.job }}": instance {{ $labels.instance }} has seen {{ $value | printf "%.2f" }} leader changes per minute recently.'
Severity:Warning
Resolution:Contact Red Hat support
|
|
Name:CephMonQuorumAtRisk
Message:Storage quorum at risk
Description:Storage cluster quorum is low.
Severity:Critical
Resolution:Contact Red Hat support
|
|
Name:ClusterObjectStoreState
Message:Cluster Object Store is in unhealthy state. Please check Ceph cluster health.
Description:Cluster Object Store is in unhealthy state for more than 15s. Please check Ceph cluster health.
Severity:Critical
Resolution:Contact Red Hat support
Procedure:
|
|
Name:CephOSDFlapping
Message:Storage daemon osd.x has restarted 5 times in the last 5 minutes. Please check the pod events or Ceph status to find out the cause.
Description:Storage OSD restarts more than 5 times in 5 minutes.
Severity:Critical
Resolution:Contact Red Hat support
|
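The CephOSDFlapping condition above ("more than 5 times in 5 minutes") is a sliding-window count. The sketch below shows one way to express it, assuming restart timestamps arrive in non-decreasing order, in seconds. The class and method names are hypothetical, and this is not how the alert is actually evaluated by the monitoring stack.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60  # "in 5 minutes", from the alert description
MAX_RESTARTS = 5         # "more than 5 times", from the alert description


class RestartWindow:
    """Tracks restart timestamps for one daemon inside a sliding window."""

    def __init__(self):
        self._times = deque()

    def record(self, timestamp):
        """Record a restart and return True if the daemon counts as flapping."""
        self._times.append(timestamp)
        # Drop restarts that fall outside the window ending at `timestamp`.
        while self._times and self._times[0] <= timestamp - WINDOW_SECONDS:
            self._times.popleft()
        return len(self._times) > MAX_RESTARTS


window = RestartWindow()
# Hypothetical data: one restart every 30 seconds.
results = [window.record(t) for t in range(0, 180, 30)]
print(results)
```

With one restart every 30 seconds, the sixth restart is the first to exceed the limit inside a single window.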
|
Name:OdfPoolMirroringImageHealth
Message:Mirroring image(s) (PV) in the pool <pool-name> are in Warning state for more than 1m. Mirroring might not work as expected.
Description:Disaster recovery is failing for one or a few applications.
Severity:Warning
Resolution:Contact Red Hat support
|
|
Name:OdfMirrorDaemonStatus
Message:Mirror daemon is unhealthy.
Description:Disaster recovery is failing for the entire cluster. Mirror daemon is in unhealthy status for more than 1m. Mirroring on this cluster is not working as expected.
Severity:Critical
Resolution:Contact Red Hat support
|