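12.5. Installing a primary control plane node on a healthy cluster
This procedure describes how to install a primary control plane node on a healthy OpenShift Container Platform cluster.
If the cluster is unhealthy, additional operations are required before it can be managed. For more information, see Installing a primary control plane node on an unhealthy cluster.
Prerequisites
Procedure
Review and approve CSRs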
Review the CertificateSigningRequests (CSRs):
$ oc get csr | grep Pending
Example output
csr-5sd59 8m19s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none> Pending
csr-xzqts 10s kubernetes.io/kubelet-serving system:node:worker-6 <none> Pending
Approve all pending CSRs:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
Important: You must approve the CSRs to complete the installation.
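To review which CSRs would be approved before approving them, the pending names can be pulled out of the plain `oc get csr` output. The following is a minimal sketch, not part of the documented procedure: the function name `pending_csr_names` is hypothetical, and it assumes the condition is the last column of each row, as in the example output above.

```shell
#!/bin/sh
# pending_csr_names (hypothetical helper): read `oc get csr` text output on
# stdin and print the name (first column) of every row whose last column is
# exactly "Pending". The header row is skipped naturally because its last
# column is not "Pending".
pending_csr_names() {
  awk '$NF == "Pending" { print $1 }'
}

# Possible usage against a live cluster (not executed here):
#   oc get csr | pending_csr_names
#   oc get csr | pending_csr_names | xargs --no-run-if-empty oc adm certificate approve
```

Listing the names first gives a chance to confirm that each request belongs to the node being added before it is approved.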
Confirm that the primary node is in the Ready state:
$ oc get nodes
Note: When the cluster runs with a functional Machine API, the etcd-operator requires a Machine custom resource (CR) that references the new node.
Link the Machine CR with the BareMetalHost and Node:
Create the BareMetalHost CR with a unique .metadata.name value:
$ oc create -f <filename>
Apply the BareMetalHost CR:
$ oc apply -f <filename>
Create the Machine CR with a unique .machine.name value:
$ oc create -f <filename>
Apply the Machine CR:
$ oc apply -f <filename>
Link the BareMetalHost, Machine, and Node by using the link-machine-and-node.sh script:
$ bash link-machine-and-node.sh custom-master3 worker-5
Confirm the etcd members. Run etcdctl inside a remote shell on the etcd pod:
$ oc rsh -n openshift-etcd etcd-worker-2
etcdctl member list -w table
Confirm that the etcd-operator configuration applies to all nodes:
$ oc get clusteroperator etcd
Example output
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
etcd 4.11.5 True False False 5h54m
Confirm the etcd-operator health:
$ oc rsh -n openshift-etcd etcd-worker-0
etcdctl endpoint health
Example output
192.168.111.26 is healthy: committed proposal: took = 11.297561ms
192.168.111.25 is healthy: committed proposal: took = 13.892416ms
192.168.111.28 is healthy: committed proposal: took = 11.870755ms
Confirm node health:
$ oc get Nodes
Confirm the ClusterOperators health:
$ oc get ClusterOperators
Confirm the ClusterVersion:
$ oc get ClusterVersion
Example output
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.5 True False 5h57m Cluster version is 4.11.5
Remove the old control plane node:
Delete the BareMetalHost CR:
$ oc delete bmh -n openshift-machine-api custom-master3
Confirm that the Machine is unhealthy:
$ oc get machine -A
Delete the Machine CR:
$ oc delete machine -n openshift-machine-api test-day2-1-6qv96-master-0
machine.machine.openshift.io "test-day2-1-6qv96-master-0" deleted
Confirm that the Node CR is removed:
$ oc get nodes
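After the old node is removed, it can be useful to repeat the ClusterOperators check from earlier in this procedure. The sketch below is an illustration rather than a documented tool: the function name `unhealthy_operators` is hypothetical, and it assumes the tabular layout shown in the `oc get clusteroperator etcd` example (a header row, with every row having a non-empty VERSION column so that fields line up).

```shell
#!/bin/sh
# unhealthy_operators (hypothetical helper): read `oc get clusteroperators`
# text output on stdin, including its header row, and print the name of every
# operator that is not AVAILABLE=True, PROGRESSING=False, DEGRADED=False.
# Column positions are taken from the header, so the header must be present.
unhealthy_operators() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "AVAILABLE")   a = i
        if ($i == "PROGRESSING") p = i
        if ($i == "DEGRADED")    d = i
      }
      next
    }
    $a != "True" || $p != "False" || $d != "False" { print $1 }
  '
}

# Possible usage against a live cluster (not executed here):
#   oc get clusteroperators | unhealthy_operators
```

Empty output means every listed operator reported the expected state. Rows with a blank VERSION column would shift the fields, so treat the result as a quick filter and confirm with the full `oc get ClusterOperators` table.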
Check the etcd-operator logs to confirm the status of the etcd cluster:
$ oc logs -n openshift-etcd-operator etcd-operator-8668df65d-lvpjf
Example output
E0927 07:53:10.597523 1 base_controller.go:272] ClusterMemberRemovalController reconciliation failed: cannot remove member: 192.168.111.23 because it is reported as healthy but it doesn't have a machine nor a node resource
Remove the physical machine to allow the etcd-operator to reconcile the cluster members, then check the members and endpoint health from a remote shell on an etcd pod:
$ oc rsh -n openshift-etcd etcd-worker-2
etcdctl member list -w table; etcdctl endpoint health
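The endpoint health output from the final check can be evaluated with a small filter. This is a minimal sketch under stated assumptions: the function name `all_endpoints_healthy` is hypothetical, and it assumes each endpoint is reported on its own line containing the phrase "is healthy", as in the example output earlier in this procedure.

```shell
#!/bin/sh
# all_endpoints_healthy (hypothetical helper): read `etcdctl endpoint health`
# output on stdin. Returns 0 only when at least one non-empty line was read
# and every non-empty line contains " is healthy". Lines that do not match
# are printed so the failing endpoints are visible; the return code is then 1.
all_endpoints_healthy() {
  awk '
    NF == 0 { next }
    { seen++ }
    !/ is healthy/ { bad++; print }
    END { exit (seen > 0 && bad == 0) ? 0 : 1 }
  '
}

# Possible usage (not executed here), with the etcdctl output saved to a file:
#   all_endpoints_healthy < endpoint-health.txt && echo "etcd endpoints healthy"
```

Treating empty input as a failure avoids reporting success when the etcdctl command itself produced no output.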