4.5. Managing control plane machines
Control plane machine sets provide management capabilities for control plane machines that are similar to what compute machine sets provide for compute machines. The availability and initial status of control plane machine sets on your cluster depend on your cloud provider and the version of OpenShift Container Platform that you installed. For more information, see Getting started with control plane machine sets.
4.5.1. Adding a control plane node to a cluster
When you install a cluster on bare-metal infrastructure, you can manually scale up to 4 or 5 control plane nodes. The examples in this procedure use node-5 as the new control plane node.
Prerequisites
- You have installed a healthy cluster with at least three control plane nodes.
- You have created a single control plane node that you want to add to the cluster as a postinstallation task.
Procedure
Retrieve pending certificate signing requests (CSRs) for the new control plane node by entering the following command:

$ oc get csr | grep Pending

Approve all pending CSRs for the control plane node by entering the following command:

$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve

Important: You must approve the CSRs to complete the installation.
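New CSRs can keep arriving while the node joins, so the approval step above is sometimes repeated. The following is a minimal sketch of a polling loop, not part of the documented procedure; the `pending_csrs` helper name is hypothetical, and it assumes the default tabular `oc get csr` output where the CONDITION column is last.

```shell
# Hypothetical helper: filter tabular `oc get csr` output down to the
# names of CSRs whose last (CONDITION) column reads "Pending".
pending_csrs() {
  awk '$NF == "Pending" {print $1}'
}

# Example polling loop, commented out so the sketch has no side effects
# (requires a logged-in `oc` with cluster-admin rights):
# while csrs=$(oc get csr | pending_csrs); [ -n "$csrs" ]; do
#   echo "$csrs" | xargs oc adm certificate approve
#   sleep 10
# done
```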
Confirm that the control plane node is in the Ready status by entering the following command:

$ oc get nodes

Note: On installer-provisioned infrastructure, the etcd Operator relies on the Machine API to manage the control plane and ensure etcd quorum. The Machine API in turn uses Machine CRs to represent and manage the underlying control plane nodes. Create BareMetalHost and Machine CRs and link them to the Node CR of the control plane node.

Create the BareMetalHost CR with a unique .metadata.name value, as shown in the following example:

apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node-5
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: metadata
  bootMACAddress: 00:00:00:00:00:02
  bootMode: UEFI
  customDeploy:
    method: install_coreos
  externallyProvisioned: true
  online: true
  userData:
    name: master-user-data-managed
    namespace: openshift-machine-api
# ...

Apply the BareMetalHost CR by entering the following command:

$ oc apply -f <filename>

Replace <filename> with the name of the BareMetalHost CR.
Create the Machine CR with a unique .metadata.name value, as shown in the following example:

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    machine.openshift.io/instance-state: externally provisioned
    metal3.io/BareMetalHost: openshift-machine-api/node-5
  finalizers:
  - machine.machine.openshift.io
  labels:
    machine.openshift.io/cluster-api-cluster: <cluster_name>
    machine.openshift.io/cluster-api-machine-role: master
    machine.openshift.io/cluster-api-machine-type: master
  name: node-5
  namespace: openshift-machine-api
spec:
  metadata: {}
  providerSpec:
    value:
      apiVersion: baremetal.cluster.k8s.io/v1alpha1
      customDeploy:
        method: install_coreos
      hostSelector: {}
      image:
        checksum: ""
        url: ""
      kind: BareMetalMachineProviderSpec
      metadata:
        creationTimestamp: null
      userData:
        name: master-user-data-managed
# ...

Replace <cluster_name> with the name of your specific cluster, for example test-day2-1-6qv96.

Get the cluster name by running the following command:

$ oc get infrastructure cluster -o=jsonpath='{.status.infrastructureName}{"\n"}'

Apply the Machine CR by entering the following command:

$ oc apply -f <filename>

Replace <filename> with the name of the Machine CR.
Link the BareMetalHost, Machine, and Node objects by running the link-machine-and-node.sh script:

Copy the following link-machine-and-node.sh script to your local machine:

#!/bin/bash
# Credit goes to
# https://bugzilla.redhat.com/show_bug.cgi?id=1801238.
# This script will link Machine object
# and Node object. This is needed
# in order to have IP address of
# the Node present in the status of the Machine.

set -e

machine="$1"
node="$2"

if [ -z "$machine" ] || [ -z "$node" ]; then
    echo "Usage: $0 MACHINE NODE"
    exit 1
fi

node_name=$(echo "${node}" | cut -f2 -d':')

oc proxy &
proxy_pid=$!
function kill_proxy {
    kill $proxy_pid
}
trap kill_proxy EXIT SIGINT

HOST_PROXY_API_PATH="http://localhost:8001/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts"

function print_nics() {
    local ips
    local eob
    declare -a ips
    readarray -t ips < <(echo "${1}" \
        | jq '.[] | select(. | .type == "InternalIP") | .address' \
        | sed 's/"//g')

    eob=','
    for (( i=0; i<${#ips[@]}; i++ )); do
        if [ $((i+1)) -eq ${#ips[@]} ]; then
            eob=""
        fi
        cat << EOF
          {
            "ip": "${ips[$i]}",
            "mac": "00:00:00:00:00:00",
            "model": "unknown",
            "speedGbps": 10,
            "vlanId": 0,
            "pxe": true,
            "name": "eth1"
          }${eob}
EOF
    done
}

function wait_for_json() {
    local name
    local url
    local curl_opts
    local timeout

    local start_time
    local curr_time
    local time_diff

    name="$1"
    url="$2"
    timeout="$3"
    shift 3
    curl_opts="$@"
    echo -n "Waiting for $name to respond"
    start_time=$(date +%s)
    until curl -g -X GET "$url" "${curl_opts[@]}" 2> /dev/null | jq '.' 2> /dev/null > /dev/null; do
        echo -n "."
        curr_time=$(date +%s)
        time_diff=$((curr_time - start_time))
        if [[ $time_diff -gt $timeout ]]; then
            printf '\nTimed out waiting for %s' "${name}"
            return 1
        fi
        sleep 5
    done
    echo " Success!"
    return 0
}

wait_for_json oc_proxy "${HOST_PROXY_API_PATH}" 10 -H "Accept: application/json" -H "Content-Type: application/json"

addresses=$(oc get node -n openshift-machine-api "${node_name}" -o json | jq -c '.status.addresses')

machine_data=$(oc get machines.machine.openshift.io -n openshift-machine-api -o json "${machine}")
host=$(echo "$machine_data" | jq '.metadata.annotations["metal3.io/BareMetalHost"]' | cut -f2 -d/ | sed 's/"//g')

if [ -z "$host" ]; then
    echo "Machine $machine is not linked to a host yet." 1>&2
    exit 1
fi

# The address structure on the host doesn't match the node, so extract
# the values we want into separate variables so we can build the patch
# we need.
hostname=$(echo "${addresses}" | jq '.[] | select(. | .type == "Hostname") | .address' | sed 's/"//g')

set +e
read -r -d '' host_patch << EOF
{
  "status": {
    "hardware": {
      "hostname": "${hostname}",
      "nics": [
$(print_nics "${addresses}")
      ],
      "systemVendor": {
        "manufacturer": "Red Hat",
        "productName": "product name",
        "serialNumber": ""
      },
      "firmware": {
        "bios": {
          "date": "04/01/2014",
          "vendor": "SeaBIOS",
          "version": "1.11.0-2.el7"
        }
      },
      "ramMebibytes": 0,
      "storage": [],
      "cpu": {
        "arch": "x86_64",
        "model": "Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz",
        "clockMegahertz": 2199.998,
        "count": 4,
        "flags": []
      }
    }
  }
}
EOF
set -e

echo "PATCHING HOST"
echo "${host_patch}" | jq .

curl -s \
    -X PATCH \
    "${HOST_PROXY_API_PATH}/${host}/status" \
    -H "Content-type: application/merge-patch+json" \
    -d "${host_patch}"

oc get baremetalhost -n openshift-machine-api -o yaml "${host}"

Make the script executable by entering the following command:

$ chmod +x link-machine-and-node.sh

Run the script by entering the following command:

$ bash link-machine-and-node.sh node-5 node-5

Note: The first node-5 instance represents the machine, and the second represents the node.
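One way to spot-check that the link succeeded is to confirm that the Machine status now carries the node's IP addresses. The following is a minimal sketch, not part of the documented procedure; the `internal_ips` helper name and the sample address are illustrative.

```shell
# Hypothetical helper: filter "TYPE ADDRESS" pairs, as printed by the
# jsonpath template below, down to InternalIP addresses.
internal_ips() {
  awk '$1 == "InternalIP" {print $2}'
}

# Usage against the cluster (requires a logged-in `oc`):
# oc get machines.machine.openshift.io -n openshift-machine-api node-5 \
#   -o jsonpath='{range .status.addresses[*]}{.type} {.address}{"\n"}{end}' \
#   | internal_ips
```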
Verification
Confirm the etcd members from one of the pre-existing control plane nodes:

Open a remote shell session to the control plane node by entering the following command:

$ oc rsh -n openshift-etcd etcd-node-0

List the etcd members:

# etcdctl member list -w table
Check the etcd Operator configuration process until it completes by entering the following command. The expected output shows False under the PROGRESSING column.

$ oc get clusteroperator etcd

Confirm etcd health by running the following commands:
Open a remote shell session to the control plane node:

$ oc rsh -n openshift-etcd etcd-node-0

Check the endpoint health. The expected output shows that the endpoints are healthy.

# etcdctl endpoint health
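The endpoint health output can also be checked mechanically. The following is a minimal sketch, not part of the documented procedure; the `all_endpoints_healthy` helper name is hypothetical, and it assumes the plain-text `etcdctl endpoint health` format in which each healthy endpoint prints a line containing "is healthy".

```shell
# Hypothetical helper: succeed only when the input is non-empty and
# every line contains "is healthy".
all_endpoints_healthy() {
  awk 'BEGIN {n = 0} {n++; if ($0 !~ /is healthy/) bad = 1} END {exit (bad || n == 0)}'
}

# Usage inside the etcd pod:
# etcdctl endpoint health --cluster | all_endpoints_healthy && echo "etcd healthy"
```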
Verify that all nodes are ready by entering the following command. The expected output shows the Ready status next to each node entry.

$ oc get nodes

Verify that the cluster Operators are available by entering the following command. The expected output lists each Operator and shows an availability status of True next to each listed Operator.

$ oc get ClusterOperators

Verify that the cluster version is correct by entering the following command:

$ oc get ClusterVersion

Example output

NAME      VERSION     AVAILABLE   PROGRESSING   SINCE   STATUS
version   <version>   True        False         5h57m   Cluster version is <version>
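The final node and Operator checks above can be scripted for repeated use. The following is a minimal sketch, not part of the documented procedure; the helper names are hypothetical, and they assume the default tabular output of `oc get nodes` (STATUS in column 2) and `oc get clusteroperators` (AVAILABLE in column 3).

```shell
# Hypothetical helper: succeed when every node row reports STATUS Ready.
all_nodes_ready() {
  awk 'NR > 1 && $2 != "Ready" {bad = 1} END {exit bad}'
}

# Hypothetical helper: succeed when every cluster Operator row reports
# AVAILABLE True.
all_operators_available() {
  awk 'NR > 1 && $3 != "True" {bad = 1} END {exit bad}'
}

# Usage against the cluster (requires a logged-in `oc`):
# oc get nodes | all_nodes_ready \
#   && oc get clusteroperators | all_operators_available \
#   && echo "cluster healthy"
```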