第 5 章操作节点

5.1. 查看和列出 OpenShift Container Platform 集群中的节点

您可以列出集群中的所有节点，以获取节点的相关信息，如状态、年龄、内存用量和其他详情。

在执行节点管理操作时，CLI 与代表实际节点主机的节点对象交互。主控机（master）使用来自节点对象的信息执行健康检查，以此验证节点。

5.1.1. 关于列出集群中的所有节点

您可以获取集群中节点的详细信息。

以下命令列出所有节点：

$ oc get nodes

以下示例是具有健康节点的集群：

$ oc get nodes

输出示例

NAME                   STATUS    ROLES     AGE       VERSION
master.example.com     Ready     master    7h        v1.19.0
node1.example.com      Ready     worker    7h        v1.19.0
node2.example.com      Ready     worker    7h        v1.19.0

以下示例是具有一个不健康节点的集群：

$ oc get nodes

输出示例

NAME                   STATUS                      ROLES     AGE       VERSION
master.example.com     Ready                       master    7h        v1.20.0
node1.example.com      NotReady,SchedulingDisabled worker    7h        v1.20.0
node2.example.com      Ready                       worker    7h        v1.20.0

触发 NotReady 状态的条件在本节中显示。

-o wide 选项提供有关节点的附加信息。

$ oc get nodes -o wide

输出示例

NAME                STATUS   ROLES    AGE    VERSION           INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                      KERNEL-VERSION                 CONTAINER-RUNTIME
master.example.com  Ready    master   171m   v1.20.0+39c0afe   10.0.129.108   <none>        Red Hat Enterprise Linux CoreOS 48.83.202103210901-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.21.0-30.rhaos4.8.gitf2f339d.el8-dev
node1.example.com   Ready    worker   72m    v1.20.0+39c0afe   10.0.129.222   <none>        Red Hat Enterprise Linux CoreOS 48.83.202103210901-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.21.0-30.rhaos4.8.gitf2f339d.el8-dev
node2.example.com   Ready    worker   164m   v1.20.0+39c0afe   10.0.142.150   <none>        Red Hat Enterprise Linux CoreOS 48.83.202103210901-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.21.0-30.rhaos4.8.gitf2f339d.el8-dev

以下命令列出一个节点的相关信息：

$ oc get node <node>

例如：

$ oc get node node1.example.com

输出示例

NAME                   STATUS    ROLES     AGE       VERSION
node1.example.com      Ready     worker    7h        v1.20.0

以下命令提供有关特定节点的更多详细信息，包括发生当前状况的原因：

$ oc describe node <node>

例如：

$ oc describe node node1.example.com

输出示例

Name:               node1.example.com 1
Roles:              worker 2
Labels:             beta.kubernetes.io/arch=amd64   3
                    beta.kubernetes.io/instance-type=m4.large
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2a
                    kubernetes.io/hostname=ip-10-0-140-16
                    node-role.kubernetes.io/worker=
Annotations:        cluster.k8s.io/machine: openshift-machine-api/ahardin-worker-us-east-2a-q5dzc  4
                    machineconfiguration.openshift.io/currentConfig: worker-309c228e8b3a92e2235edd544c62fea8
                    machineconfiguration.openshift.io/desiredConfig: worker-309c228e8b3a92e2235edd544c62fea8
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 13 Feb 2019 11:05:57 -0500
Taints:             <none>  5
Unschedulable:      false
Conditions:                 6
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:07:09 -0500   KubeletReady                 kubelet is posting ready status
Addresses:   7
  InternalIP:   10.0.140.16
  InternalDNS:  ip-10-0-140-16.us-east-2.compute.internal
  Hostname:     ip-10-0-140-16.us-east-2.compute.internal
Capacity:    8
 attachable-volumes-aws-ebs:  39
 cpu:                         2
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      8172516Ki
 pods:                        250
Allocatable:
 attachable-volumes-aws-ebs:  39
 cpu:                         1500m
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      7558116Ki
 pods:                        250
System Info:    9
 Machine ID:                              63787c9534c24fde9a0cde35c13f1f66
 System UUID:                             EC22BF97-A006-4A58-6AF8-0A38DEEA122A
 Boot ID:                                 f24ad37d-2594-46b4-8830-7f7555918325
 Kernel Version:                          3.10.0-957.5.1.el7.x86_64
 OS Image:                                Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)
 Operating System:                        linux
 Architecture:                            amd64
 Container Runtime Version:               cri-o://1.16.0-0.6.dev.rhaos4.3.git9ad059b.el8-rc2
 Kubelet Version:                         v1.19.0
 Kube-Proxy Version:                      v1.19.0
PodCIDR:                                  10.128.4.0/24
ProviderID:                               aws:///us-east-2a/i-04e87b31dc6b3e171
Non-terminated Pods:                      (13 in total)  10
  Namespace                               Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                               ----                                   ------------  ----------  ---------------  -------------
  openshift-cluster-node-tuning-operator  tuned-hdl5q                            0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-dns                           dns-default-l69zr                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-image-registry                node-ca-9hmcg                          0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-ingress                       router-default-76455c45c-c5ptv         0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-machine-config-operator       machine-config-daemon-cvqw9            20m (1%)      0 (0%)      50Mi (0%)        0 (0%)
  openshift-marketplace                   community-operators-f67fh              0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-monitoring                    alertmanager-main-0                    50m (3%)      50m (3%)    210Mi (2%)       10Mi (0%)
  openshift-monitoring                    grafana-78765ddcc7-hnjmm               100m (6%)     200m (13%)  100Mi (1%)       200Mi (2%)
  openshift-monitoring                    node-exporter-l7q8d                    10m (0%)      20m (1%)    20Mi (0%)        40Mi (0%)
  openshift-monitoring                    prometheus-adapter-75d769c874-hvb85    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-multus                        multus-kw8w5                           0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-sdn                           ovs-t4dsn                              100m (6%)     0 (0%)      300Mi (4%)       0 (0%)
  openshift-sdn                           sdn-g79hg                              100m (6%)     0 (0%)      200Mi (2%)       0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         380m (25%)   270m (18%)
  memory                      880Mi (11%)  250Mi (3%)
  attachable-volumes-aws-ebs  0            0
Events:     11
  Type     Reason                   Age                From                      Message
  ----     ------                   ----               ----                      -------
  Normal   NodeHasSufficientPID     6d (x5 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  6d                 kubelet, m01.example.com  Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  6d (x6 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    6d (x6 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientDisk    6d (x6 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientDisk
  Normal   NodeHasSufficientPID     6d                 kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientPID
  Normal   Starting                 6d                 kubelet, m01.example.com  Starting kubelet.
 ...

1: 节点的名称。
2: 节点的角色，可以是 master 或 worker。
3: 应用到节点的标签。
4: 应用到节点的注解。
5: 应用到节点的污点。
6: 节点条件和状态。conditions 小节列出了 Ready、PIDPressure、PIDPressure、MemoryPressure、DiskPressure 和 OutOfDisk 状态。本节稍后将描述这些条件。
7: 节点的 IP 地址和主机名。
8: pod 资源和可分配的资源。
9: 节点主机的相关信息。
10: 节点上的 pod。
11: 节点报告的事件。

在显示的节点信息中，本节显示的命令输出中会出现以下节点状况：

表 5.1. 节点状况
状况	描述
`Ready`	如果为 `true`，节点处于健康状态，并可以接受 pod。如果为 `false`，则节点处于不健康的状态，不接受 pod。如果为 `unknown`，代表节点控制器在 `node-monitor-grace-period` 时间内（默认为 40 秒）还没有收到来自节点的心跳信号。
`DiskPressure`	如果为 `true`，代表磁盘容量较低。
`MemoryPressure`	如果为 `true`，代表节点内存较低。
`PIDPressure`	如果为 `true`，代表节点上的进程太多。
`OutOfDisk`	如果为 `true`，代表节点上的可用空间不足，无法添加新 pod。
`NetworkUnavailable`	如果为 `true`，代表节点的网络不会被正确配置。
`NotReady`	如果为`true`，代表一个底层组件（如容器运行时或网络）遇到了问题或尚未配置。
`SchedulingDisabled`	无法通过调度将 Pod 放置到节点上。

5.1.2. 列出集群中某一节点上的 pod

您可以列出特定节点上的所有 pod。

流程

列出一个或多个节点上的所有或选定 pod：

$ oc describe node <node1> <node2>

例如：

$ oc describe node ip-10-0-128-218.ec2.internal

列出选定节点上的所有或选定 pod：

$ oc describe --selector=<node_selector>

$ oc describe node  --selector=kubernetes.io/os

或者：

$ oc describe -l=<pod_selector>

$ oc describe node -l node-role.kubernetes.io/worker

列出特定节点上的所有 pod，包括终止的 pod：

$ oc get pod --all-namespaces --field-selector=spec.nodeName=<nodename>

5.1.3. 查看节点上的内存和 CPU 用量统计

您可以显示节点的用量统计，这些统计信息为容器提供了运行时环境。这些用量统计包括 CPU、内存和存储的消耗。

先决条件

您必须有 cluster-reader 权限才能查看用量统计。
必须安装 Metrics 才能查看用量统计。

流程

查看用量统计：

$ oc adm top nodes

输出示例

NAME                                   CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
ip-10-0-12-143.ec2.compute.internal    1503m        100%      4533Mi          61%
ip-10-0-132-16.ec2.compute.internal    76m          5%        1391Mi          18%
ip-10-0-140-137.ec2.compute.internal   398m         26%       2473Mi          33%
ip-10-0-142-44.ec2.compute.internal    656m         43%       6119Mi          82%
ip-10-0-146-165.ec2.compute.internal   188m         12%       3367Mi          45%
ip-10-0-19-62.ec2.compute.internal     896m         59%       5754Mi          77%
ip-10-0-44-193.ec2.compute.internal    632m         42%       5349Mi          72%

查看具有标签的节点的用量统计信息：
```
$ oc adm top node --selector=''
```
您必须选择过滤所基于的选择器（标签查询）。支持 =、== 和 !=。

第 5 章操作节点

5.1. 查看和列出 OpenShift Container Platform 集群中的节点

5.1.1. 关于列出集群中的所有节点

5.1.2. 列出集群中某一节点上的 pod

5.1.3. 查看节点上的内存和 CPU 用量统计

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Red Hat legal and privacy links

Red Hat legal and privacy links

第 5 章 操作节点

5.1. 查看和列出 OpenShift Container Platform 集群中的节点

5.1.1. 关于列出集群中的所有节点

5.1.2. 列出集群中某一节点上的 pod

5.1.3. 查看节点上的内存和 CPU 用量统计

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Red Hat legal and privacy links

Red Hat legal and privacy links

第 5 章操作节点