第 12 章 日志故障排除
12.1. 查看 OpenShift Logging 状态
您可以查看 Red Hat OpenShift Logging Operator 的状态以及多个 OpenShift Logging 组件的状态。
12.1.1. 查看 Red Hat OpenShift Logging Operator 的状态
您可以查看 Red Hat OpenShift Logging Operator 的状态。
先决条件
- 必须安装 OpenShift Logging 和 Elasticsearch。
流程
进入
openshift-logging
项目。$ oc project openshift-logging
查看 OpenShift Logging 状态:
获取 OpenShift Logging 状态:
$ oc get clusterlogging instance -o yaml
输出示例
apiVersion: logging.openshift.io/v1 kind: ClusterLogging .... status: 1 collection: logs: fluentdStatus: daemonSet: fluentd 2 nodes: fluentd-2rhqp: ip-10-0-169-13.ec2.internal fluentd-6fgjh: ip-10-0-165-244.ec2.internal fluentd-6l2ff: ip-10-0-128-218.ec2.internal fluentd-54nx5: ip-10-0-139-30.ec2.internal fluentd-flpnn: ip-10-0-147-228.ec2.internal fluentd-n2frh: ip-10-0-157-45.ec2.internal pods: failed: [] notReady: [] ready: - fluentd-2rhqp - fluentd-54nx5 - fluentd-6fgjh - fluentd-6l2ff - fluentd-flpnn - fluentd-n2frh logstore: 3 elasticsearchStatus: - ShardAllocationEnabled: all cluster: activePrimaryShards: 5 activeShards: 5 initializingShards: 0 numDataNodes: 1 numNodes: 1 pendingTasks: 0 relocatingShards: 0 status: green unassignedShards: 0 clusterName: elasticsearch nodeConditions: elasticsearch-cdm-mkkdys93-1: nodeCount: 1 pods: client: failed: notReady: ready: - elasticsearch-cdm-mkkdys93-1-7f7c6-mjm7c data: failed: notReady: ready: - elasticsearch-cdm-mkkdys93-1-7f7c6-mjm7c master: failed: notReady: ready: - elasticsearch-cdm-mkkdys93-1-7f7c6-mjm7c visualization: 4 kibanaStatus: - deployment: kibana pods: failed: [] notReady: [] ready: - kibana-7fb4fd4cc9-f2nls replicaSets: - kibana-7fb4fd4cc9 replicas: 1
12.1.1.1. 情况消息示例
以下是来自 OpenShift Logging 实例的 Status.Nodes
部分的一些情况消息示例。
类似于以下内容的状态消息表示节点已超过配置的低水位线,并且没有分片将分配给此节点:
输出示例
nodes: - conditions: - lastTransitionTime: 2019-03-15T15:57:22Z message: Disk storage usage for node is 27.5gb (36.74%). Shards will be not be allocated on this node. reason: Disk Watermark Low status: "True" type: NodeStorage deploymentName: example-elasticsearch-clientdatamaster-0-1 upgradeStatus: {}
类似于以下内容的状态消息表示节点已超过配置的高水位线,并且分片将重新定位到其他节点:
输出示例
nodes: - conditions: - lastTransitionTime: 2019-03-15T16:04:45Z message: Disk storage usage for node is 27.5gb (36.74%). Shards will be relocated from this node. reason: Disk Watermark High status: "True" type: NodeStorage deploymentName: cluster-logging-operator upgradeStatus: {}
类似于以下内容的状态消息表示 CR 中的 Elasticsearch 节点选择器与集群中的任何节点都不匹配:
输出示例
Elasticsearch Status: Shard Allocation Enabled: shard allocation unknown Cluster: Active Primary Shards: 0 Active Shards: 0 Initializing Shards: 0 Num Data Nodes: 0 Num Nodes: 0 Pending Tasks: 0 Relocating Shards: 0 Status: cluster health unknown Unassigned Shards: 0 Cluster Name: elasticsearch Node Conditions: elasticsearch-cdm-mkkdys93-1: Last Transition Time: 2019-06-26T03:37:32Z Message: 0/5 nodes are available: 5 node(s) didn't match node selector. Reason: Unschedulable Status: True Type: Unschedulable elasticsearch-cdm-mkkdys93-2: Node Count: 2 Pods: Client: Failed: Not Ready: elasticsearch-cdm-mkkdys93-1-75dd69dccd-f7f49 elasticsearch-cdm-mkkdys93-2-67c64f5f4c-n58vl Ready: Data: Failed: Not Ready: elasticsearch-cdm-mkkdys93-1-75dd69dccd-f7f49 elasticsearch-cdm-mkkdys93-2-67c64f5f4c-n58vl Ready: Master: Failed: Not Ready: elasticsearch-cdm-mkkdys93-1-75dd69dccd-f7f49 elasticsearch-cdm-mkkdys93-2-67c64f5f4c-n58vl Ready:
类似于以下内容的状态消息表示请求的 PVC 无法绑定到 PV:
输出示例
Node Conditions: elasticsearch-cdm-mkkdys93-1: Last Transition Time: 2019-06-26T03:37:32Z Message: pod has unbound immediate PersistentVolumeClaims (repeated 5 times) Reason: Unschedulable Status: True Type: Unschedulable
类似于以下内容的状态消息表示无法调度 Fluentd Pod,因为节点选择器与任何节点都不匹配:
输出示例
Status: Collection: Logs: Fluentd Status: Daemon Set: fluentd Nodes: Pods: Failed: Not Ready: Ready:
12.1.2. 查看 OpenShift Logging 组件的状态
您可以查看多个 OpenShift Logging 组件的状态。
先决条件
- 必须安装 OpenShift Logging 和 Elasticsearch。
流程
进入
openshift-logging
项目。$ oc project openshift-logging
查看 OpenShift Logging 环境的状态:
$ oc describe deployment cluster-logging-operator
输出示例
Name: cluster-logging-operator .... Conditions: Type Status Reason ---- ------ ------ Available True MinimumReplicasAvailable Progressing True NewReplicaSetAvailable .... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal ScalingReplicaSet 62m deployment-controller Scaled up replica set cluster-logging-operator-574b8987df to 1----
查看 OpenShift Logging 副本集的状态:
获取副本集的名称:
输出示例
$ oc get replicaset
输出示例
NAME DESIRED CURRENT READY AGE cluster-logging-operator-574b8987df 1 1 1 159m elasticsearch-cdm-uhr537yu-1-6869694fb 1 1 1 157m elasticsearch-cdm-uhr537yu-2-857b6d676f 1 1 1 156m elasticsearch-cdm-uhr537yu-3-5b6fdd8cfd 1 1 1 155m kibana-5bd5544f87 1 1 1 157m
获取副本集的状态:
$ oc describe replicaset cluster-logging-operator-574b8987df
输出示例
Name: cluster-logging-operator-574b8987df .... Replicas: 1 current / 1 desired Pods Status: 1 Running / 0 Waiting / 0 Succeeded / 0 Failed .... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal SuccessfulCreate 66m replicaset-controller Created pod: cluster-logging-operator-574b8987df-qjhqv----