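1.6. About node status during updates
If you make changes to a machine config pool (MCP) that result in a new machine config, for example by using a `MachineConfig` or `KubeletConfig` object, you can get detailed information about the progress of the node updates by using the machine config node custom resource. This information is useful if issues arise during the update and you need to troubleshoot a node.
The `MachineConfigNode` custom resource lets you monitor the progress of individual node updates as they move through the update phases. The custom resource reports where the node is in the update process, which phases have completed, and which phases remain.
The node update process consists of the following phases and sub-phases, which the machine config node custom resource tracks and which are explained in more detail later in this section: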
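- Update Prepared. The MCO stops the configuration drift monitoring process and verifies that the newly created machine configuration can be applied to the node.
- Update Executed. The MCO cordons and drains the node and applies the new machine configuration to the node files and operating system, as needed. It contains the following sub-phases:
  - Cordoned
  - Drained
  - AppliedFilesAndOS
- PinnedImageSetsProgressing. The MCO is performing the steps needed to pin and pre-load container images.
- PinnedImageSetsDegraded. The pinned image process failed. You can view the reason for the failure by using the `oc describe machineconfignode` command, as described later in this section.
- NodeDegraded. The node update failed. You can view the reason for the failure by using the `oc describe machineconfignode` command, as described later in this section.
- Update Post update action. The MCO reloads CRI-O, as needed.
- Rebooted Node. The MCO reboots the node, as needed.
- Update Complete. The MCO uncordons the node, updates the node state to the cluster, and resumes producing node metrics. It contains the following sub-phase:
  - Uncordoned
- Updated. The MCO completed a node update, and the current config version of the node equals the desired updated version.
- Resumed. The MCO restarted the configuration drift monitoring process, and the node returns to an operational state.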
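As the update moves through these phases, you can query the `MachineConfigNode` custom resource, which reports one of the following conditions for each phase:
- True. The phase is complete on that node.
- False. The phase has not yet started or will not be executed on that node.
- Unknown. The phase is either being executed on that node or has an error. If the phase has an error, you can use the `oc describe machineconfignodes` command for more information, as described later in this section.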
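The three condition values can be read phase by phase for a single node. The following is a minimal sketch, not an official procedure: it assumes `oc` is logged in to the cluster and that the conditions are stored under `.status.conditions` with `type` and `status` keys, as in the `oc describe` example later in this section. The function names `phase_meaning` and `show_node_phases` are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map a condition value to the meaning given in the list above.
phase_meaning() {
  case "$1" in
    True)    echo "complete" ;;
    False)   echo "not started or not applicable" ;;
    Unknown) echo "in progress or errored" ;;
    *)       echo "unrecognized value" ;;
  esac
}

# Hypothetical helper: print each phase of one node with its condition and meaning.
# Assumption: conditions live under .status.conditions with .type and .status keys.
show_node_phases() {
  oc get machineconfignode "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{"\n"}{end}' |
  while read -r phase status; do
    printf '%-28s %-8s %s\n' "$phase" "$status" "$(phase_meaning "$status")"
  done
}
```

After sourcing the snippet, `show_node_phases <machine_config_node_name>` prints one line per phase. A phase that reports `Unknown` is the one to inspect further with `oc describe machineconfignode`.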
For example, consider a cluster with newly created machine configs:
$ oc get machineconfig
Example output
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
# ...
rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2 3.5.0 6d15h
rendered-master-a386c2d1550b927d274054124f58be68 c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2 3.5.0 7m26s
# ...
rendered-worker-01f27f752eb84eba917450e43636b210 c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2 3.5.0 6d15h
rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2 3.5.0 7m26s
# ...
You can watch as the nodes are updated with the new machine config:
$ oc get machineconfignodes
Example output
NAME POOLNAME DESIREDCONFIG CURRENTCONFIG UPDATED AGE
ci-ln-ds73n5t-72292-9xsm9-master-0 master rendered-master-a386c2d1550b927d274054124f58be68 rendered-master-a386c2d1550b927d274054124f58be68 True 27M
ci-ln-ds73n5t-72292-9xsm9-master-1 master rendered-master-a386c2d1550b927d274054124f58be68 rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 False 27M
ci-ln-ds73n5t-72292-9xsm9-master-2 master rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 True 27M
ci-ln-ds73n5t-72292-9xsm9-worker-a-2d8tz worker-cnf rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 True 20M
ci-ln-ds73n5t-72292-9xsm9-worker-b-gw5sd worker rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 rendered-worker-01f27f752eb84eba917450e43636b210 False 20M
ci-ln-ds73n5t-72292-9xsm9-worker-c-t227w worker rendered-worker-01f27f752eb84eba917450e43636b210 rendered-worker-01f27f752eb84eba917450e43636b210 True 19M
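| Field | Meaning |
|---|---|
| `NAME` | The name of the node. |
| `POOLNAME` | The name of the machine config pool associated with that node. |
| `DESIREDCONFIG` | The name of the new machine config that the node updates to. |
| `CURRENTCONFIG` | The name of the current machine config on that node. |
| `UPDATED` | Indicates whether the node has been updated, reported as a condition value. In the preceding example output, nodes whose current config matches the desired config report `True`, and nodes still moving to the desired config report `False`. |
| `AGE` | The age of the machine config node since creation. The age does not change when the associated node is updated. |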
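When a cluster has many nodes, it can help to reduce the default output to only the nodes that are not yet updated. The following sketch filters the plain `oc get machineconfignodes` output with `awk`; the function name `pending_nodes` is a hypothetical helper, and the sketch assumes the column headers shown in the example output above.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get machineconfignodes` output on stdin and
# print each node whose UPDATED column is not True, with its desired config.
pending_nodes() {
  awk '
    # Header row: remember the position of every column by name.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Data rows: keep only nodes that are not reported as updated.
    $(col["UPDATED"]) != "True" {
      print $(col["NAME"]), "->", $(col["DESIREDCONFIG"])
    }
  '
}
```

A typical invocation would be `oc get machineconfignodes | pending_nodes`. Looking up columns by header name rather than by fixed position keeps the filter working if the output is requested with `-o wide`.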
You can use the `-o wide` flag to display additional information about the update:
$ oc get machineconfignodes -o wide
Example output
NAME POOLNAME DESIREDCONFIG CURRENTCONFIG UPDATED AGE UPDATEPREPARED UPDATEEXECUTED UPDATEPOSTACTIONCOMPLETE UPDATECOMPLETE RESUMED UPDATEDFILESANDOS CORDONEDNODE DRAINEDNODE REBOOTEDNODE UNCORDONEDNODE
ci-ln-ds73n5t-72292-9xsm9-master-0 master rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 True 27M False False False False False False False False False False
ci-ln-ds73n5t-72292-9xsm9-master-1 master rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 True 27M False False False False False False False False False False
ci-ln-ds73n5t-72292-9xsm9-master-2 master rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 rendered-master-23cf200e4ee97daa6e39fdce24c9fb67 True 27M False False False False False False False False False False
ci-ln-ds73n5t-72292-9xsm9-worker-a-2d8tz worker-cnf rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 True 20M False False False False False False False False False False
ci-ln-ds73n5t-72292-9xsm9-worker-b-gw5sd worker rendered-worker-f351f6947f15cd0380514f4b1c89f8f2 rendered-worker-01f27f752eb84eba917450e43636b210 False 20M True True Unknown False False True True True Unknown False
ci-ln-ds73n5t-72292-9xsm9-worker-c-t227w worker rendered-worker-01f27f752eb84eba917450e43636b210 rendered-worker-01f27f752eb84eba917450e43636b210 True 19M False False False False False False False False False False
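In addition to the fields defined in the previous table, the `-o wide` output displays the following fields:
| Phase name | Definition |
|---|---|
| `UPDATEPREPARED` | Indicates if the MCO is preparing to update the node. |
| `UPDATEEXECUTED` | Indicates if the MCO has completed the body of the update on the node. |
| `UPDATEPOSTACTIONCOMPLETE` | Indicates if the MCO has executed the post-update actions on the node. |
| `UPDATECOMPLETE` | Indicates if the MCO has completed the update on the node. |
| `RESUMED` | Indicates if the node has resumed normal processes. |
| `UPDATEDFILESANDOS` | Indicates if the MCO has updated the node files and operating system. |
| `CORDONEDNODE` | Indicates if the MCO has marked the node as unschedulable. |
| `DRAINEDNODE` | Indicates if the MCO has drained the node. |
| `REBOOTEDNODE` | Indicates if the MCO has restarted the node. |
| `UNCORDONEDNODE` | Indicates if the MCO has marked the node as schedulable. |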
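Because `Unknown` marks a phase that is either in progress or has an error, listing the `Unknown` columns per node is a quick way to see where each update currently stands. The following sketch post-processes the `-o wide` output with `awk`; `unknown_phases` is a hypothetical helper name, and the column layout is assumed to match the example output above.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get machineconfignodes -o wide` output on stdin
# and print, per node, the columns whose value is Unknown.
unknown_phases() {
  awk '
    # Header row: remember each column name by position.
    NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
    {
      list = ""
      # Skip column 1 (the node name) and collect every Unknown column.
      for (i = 2; i <= NF; i++)
        if ($i == "Unknown")
          list = list (list == "" ? "" : ",") name[i]
      if (list != "") print $1 ": " list
    }
  '
}
```

A typical invocation would be `oc get machineconfignodes -o wide | unknown_phases`. Nodes that report no `Unknown` value produce no output, so an empty result suggests that no phase is currently in progress or in error.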
For more details on the update status, you can use the `oc describe machineconfignode` command:
$ oc describe machineconfignode/<machine_config_node_name>
Example output
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigNode
metadata:
  creationTimestamp: "2025-04-28T18:40:29Z"
  generation: 3
  name: <machine_config_node_name>
# ...
spec:
  configVersion:
    desired: rendered-master-34f96af2e41acb615410b97ce1c819e6
  node:
    name: ci-ln-921r7qk-72292-kxv95-master-0
  pool:
    name: master
status:
  conditions:
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: All pinned image sets complete
    reason: AsExpected
    status: "False"
    type: PinnedImageSetsProgressing
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the UpdatePrepared phase
    reason: NotYetOccurred
    status: "False"
    type: UpdatePrepared
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the UpdateExecuted phase
    reason: NotYetOccurred
    status: "False"
    type: UpdateExecuted
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the UpdatePostActionComplete phase
    reason: NotYetOccurred
    status: "False"
    type: UpdatePostActionComplete
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: 'Action during update to rendered-master-34f96af2e41acb615410b97ce1c819e6:
      Uncordoned Node as part of completing upgrade phase'
    reason: Uncordoned
    status: "False"
    type: UpdateComplete
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: 'Action during update to rendered-master-34f96af2e41acb615410b97ce1c819e6:
      In desired config . Resumed normal operations.'
    reason: Resumed
    status: "False"
    type: Resumed
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the Drained phase
    reason: NotYetOccurred
    status: "False"
    type: Drained
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the AppliedFilesAndOS phase
    reason: NotYetOccurred
    status: "False"
    type: AppliedFilesAndOS
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the Cordoned phase
    reason: NotYetOccurred
    status: "False"
    type: Cordoned
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the RebootedNode phase
    reason: NotYetOccurred
    status: "False"
    type: RebootedNode
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: Node ci-ln-921r7qk-72292-kxv95-master-0 Updated
    reason: Updated
    status: "True"
    type: Updated
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: 'Action during update to rendered-master-34f96af2e41acb615410b97ce1c819e6:
      UnCordoned node. The node is reporting Unschedulable = false'
    reason: UpdateCompleteUncordoned
    status: "False"
    type: Uncordoned
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the NodeDegraded phase
    reason: NotYetOccurred
    status: "False"
    type: NodeDegraded
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: All is good
    reason: AsExpected
    status: "False"
    type: PinnedImageSetsDegraded
  configVersion:
    current: rendered-master-34f96af2e41acb615410b97ce1c819e6
    desired: rendered-master-34f96af2e41acb615410b97ce1c819e6
  observedGeneration: 4
where:
- `metadata.name` is the `MachineConfigNode` object name.
- `spec.configVersion.desired` is the new machine configuration. The MCO sets this field after it validates the machine config in the `UPDATEPREPARED` phase; the status then adds the new configuration.
- `status.configVersion.current` is the current machine config on the node.
1.6.1. Checking node status during updates
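During the update of a machine config pool (MCP), you can monitor the progress of all of the nodes in your cluster by using the `oc get machineconfignodes` and `oc describe machineconfignodes` commands. These commands provide information that can be helpful if issues arise during the update and you need to troubleshoot a node.
For more information on the meaning of these fields, see "About checking machine config node status."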
Procedure
View the update status of all nodes in the cluster, including the current and desired machine configurations, by running the following command:
$ oc get machineconfignodes
Example output
NAME POOLNAME DESIREDCONFIG CURRENTCONFIG UPDATED AGE
ci-ln-mdb23yt-72292-kzdsg-master-0 master rendered-master-f21b093d20f68a7c06f922ed3ea5fbc8 rendered-master-1abc053eec29e6c945670f39d6dc8afa False 27M
ci-ln-mdb23yt-72292-kzdsg-master-1 master rendered-master-1abc053eec29e6c945670f39d6dc8afa rendered-master-1abc053eec29e6c945670f39d6dc8afa True 27M
ci-ln-mdb23yt-72292-kzdsg-master-2 master rendered-master-1abc053eec29e6c945670f39d6dc8afa rendered-master-1abc053eec29e6c945670f39d6dc8afa True 27M
ci-ln-mdb23yt-72292-kzdsg-worker-a-gfqjr worker rendered-worker-d0130cd74e9e576d7ba78ce166272bfb rendered-worker-8f61bf839898a4487c3b5263a430e94a False 20M
ci-ln-mdb23yt-72292-kzdsg-worker-b-gknq4 worker rendered-worker-8f61bf839898a4487c3b5263a430e94a rendered-worker-8f61bf839898a4487c3b5263a430e94a True 20M
ci-ln-mdb23yt-72292-kzdsg-worker-c-mffrx worker rendered-worker-8f61bf839898a4487c3b5263a430e94a rendered-worker-8f61bf839898a4487c3b5263a430e94a True 19M
View all machine config node status fields for the nodes in the cluster by running the following command:
$ oc get machineconfignodes -o wide
Example output
NAME POOLNAME DESIREDCONFIG CURRENTCONFIG UPDATED AGE UPDATEPREPARED UPDATEEXECUTED UPDATEPOSTACTIONCOMPLETE UPDATECOMPLETE RESUMED UPDATEDFILESANDOS CORDONEDNODE DRAINEDNODE REBOOTEDNODE UNCORDONEDNODE
ci-ln-g6dr34b-72292-g9btv-master-0 master rendered-master-d4e122320b351cdbe1df59ddb63ddcfc rendered-master-6f2064fcb36d2a914de5b0c660dc49ff False 27M True Unknown False False False Unknown False False False False
ci-ln-g6dr34b-72292-g9btv-master-1 master rendered-master-6f2064fcb36d2a914de5b0c660dc49ff rendered-master-6f2064fcb36d2a914de5b0c660dc49ff True 27M False False False False False False False False False False
ci-ln-g6dr34b-72292-g9btv-master-2 master rendered-master-6f2064fcb36d2a914de5b0c660dc49ff rendered-master-6f2064fcb36d2a914de5b0c660dc49ff True 27M False False False False False False False False False False
ci-ln-g6dr34b-72292-g9btv-worker-a-sjh5r worker rendered-worker-671b88c8c569fa3f60dc1a27cf9c91f2 rendered-worker-d5534cb730e5e108905fc285c2a42b6c False 20M True Unknown False False False Unknown False False False False
ci-ln-g6dr34b-72292-g9btv-worker-b-xthbz worker rendered-worker-d5534cb730e5e108905fc285c2a42b6c rendered-worker-d5534cb730e5e108905fc285c2a42b6c True 20M False False False False False False False False False False
ci-ln-g6dr34b-72292-g9btv-worker-c-gnpd6 worker rendered-worker-d5534cb730e5e108905fc285c2a42b6c rendered-worker-d5534cb730e5e108905fc285c2a42b6c True 19M False False False False False False False False False False
Check the update status of nodes in a specific machine config pool by running the following command:
$ oc get machineconfignodes $(oc get machineconfignodes -o json | jq -r '.items[]|select(.spec.pool.name=="<pool_name>")|.metadata.name')
where:
<pool_name> specifies the name of the machine config pool.
Example output
NAME POOLNAME DESIREDCONFIG CURRENTCONFIG UPDATED AGE
ci-ln-g6dr34b-72292-g9btv-worker-a-sjh5r worker rendered-worker-d5534cb730e5e108905fc285c2a42b6c rendered-worker-d5534cb730e5e108905fc285c2a42b6c True 20M
ci-ln-g6dr34b-72292-g9btv-worker-b-xthbz worker rendered-worker-d5534cb730e5e108905fc285c2a42b6c rendered-worker-faf6b50218a8bbce21f1370866283de5 False 20M
ci-ln-g6dr34b-72292-g9btv-worker-c-gnpd6 worker rendered-worker-faf6b50218a8bbce21f1370866283de5 rendered-worker-faf6b50218a8bbce21f1370866283de5 True 19M
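If `jq` is not available on the workstation, a similar per-pool view can be approximated by filtering the table output on its `POOLNAME` column. This is a sketch under that assumption, not a replacement for the documented command; `nodes_in_pool` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get machineconfignodes` output on stdin and
# keep the header plus the rows whose POOLNAME column equals the given pool.
nodes_in_pool() {
  awk -v pool="$1" '
    # Header row: locate the POOLNAME column, then print the header unchanged.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "POOLNAME") c = i; print; next }
    # Data rows: print only exact pool-name matches.
    $c == pool
  '
}
```

A typical invocation would be `oc get machineconfignodes | nodes_in_pool worker`. The comparison is an exact match, so a custom pool such as `worker-cnf` is not included when filtering for `worker`.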
Check the update status of an individual node by running the following command:
$ oc describe machineconfignode/<node_name>
Example output
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigNode
metadata:
  creationTimestamp: "2025-04-28T18:52:16Z"
  generation: 3
  name: ci-ln-921r7qk-72292-kxv95-worker-a-zmxrr
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: ci-ln-921r7qk-72292-kxv95-worker-a-zmxrr
    uid: e548a8d1-4f16-42cd-9234-87ac5aede6c1
  resourceVersion: "62331"
  uid: 11d96e07-582d-4569-a84a-9d8c5229a551
spec:
  configVersion:
    desired: rendered-worker-1930ca7433b7f0153286a3f04e4cb57b
  node:
    name: ci-ln-921r7qk-72292-kxv95-worker-a-zmxrr
  pool:
    name: worker
status:
  conditions:
# ...
    lastTransitionTime: 2025-04-23T14:55:31Z
    message: Update Compatible. Post Cfg Actions: [] Drain Required: true
    reason: UpdatePrepared
    status: True
    type: UpdatePrepared
# ...
    lastTransitionTime: 2025-04-23T14:55:31Z
    message: Draining node. The drain will not be complete until desired drainer drain-rendered-worker-1930ca7433b7f0153286a3f04e4cb57b matches current drainer uncordon-rendered-worker-a9673968884f1ea42c26edcd914af907
    reason: UpdateExecutedDrained
    status: True
    type: Drained
# ...
    lastTransitionTime: 2025-04-23T14:55:31Z
    message: Cordoned node. The node is reporting Unschedulable = true
    reason: UpdateExecutedCordoned
    status: True
    type: Cordoned
# ...
  - lastTransitionTime: "2025-04-28T18:52:16Z"
    message: This node has not yet entered the NodeDegraded phase
    reason: NotYetOccurred
    status: "False"
    type: NodeDegraded
# ...
  configVersion:
    current: rendered-worker-8110974a5cea69dff5b263237b58abd8
    desired: rendered-worker-1930ca7433b7f0153286a3f04e4cb57b
  observedGeneration: 4
  pinnedImageSets:
  - desiredGeneration: 1
    name: worker-pinned-images
# ...