8.4. 将资源移到基础架构机器集
默认情况下,您的集群中已部署了某些基础架构资源。您可以通过添加基础架构节点选择器来将其移至您创建的基础架构机器集,如下所示:
spec:
nodePlacement: 1
nodeSelector:
matchLabels:
node-role.kubernetes.io/infra: ""
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/infra
value: reserved
- effect: NoExecute
key: node-role.kubernetes.io/infra
value: reserved
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
将特定节点选择器应用到所有基础架构组件会导致 OpenShift Container Platform 使用 该标签将这些工作负载调度到具有该标签的节点。
8.4.1. 移动路由器
您可以将路由器 Pod 部署到不同的计算机器集中。默认情况下,pod 部署到 worker 节点。
先决条件
- 在 OpenShift Container Platform 集群中配置额外的计算机器集。
流程
查看路由器 Operator 的
IngressController
自定义资源:$ oc get ingresscontroller default -n openshift-ingress-operator -o yaml
命令输出类似于以下文本:
apiVersion: operator.openshift.io/v1 kind: IngressController metadata: creationTimestamp: 2019-04-18T12:35:39Z finalizers: - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller generation: 1 name: default namespace: openshift-ingress-operator resourceVersion: "11341" selfLink: /apis/operator.openshift.io/v1/namespaces/openshift-ingress-operator/ingresscontrollers/default uid: 79509e05-61d6-11e9-bc55-02ce4781844a spec: {} status: availableReplicas: 2 conditions: - lastTransitionTime: 2019-04-18T12:36:15Z status: "True" type: Available domain: apps.<cluster>.example.com endpointPublishingStrategy: type: LoadBalancerService selector: ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
编辑
ingresscontroller
资源,并更改nodeSelector
以使用infra
标签:$ oc edit ingresscontroller default -n openshift-ingress-operator
spec: nodePlacement: nodeSelector: 1 matchLabels: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
确认路由器 Pod 在
infra
节点上运行。查看路由器 Pod 列表,并记下正在运行的 Pod 的节点名称:
$ oc get pod -n openshift-ingress -o wide
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES router-default-86798b4b5d-bdlvd 1/1 Running 0 28s 10.130.2.4 ip-10-0-217-226.ec2.internal <none> <none> router-default-955d875f4-255g8 0/1 Terminating 0 19h 10.129.2.4 ip-10-0-148-172.ec2.internal <none> <none>
在本例中,正在运行的 Pod 位于
ip-10-0-217-226.ec2.internal
节点上。查看正在运行的 Pod 的节点状态:
$ oc get node <node_name> 1
- 1
- 指定从 Pod 列表获得的
<node_name>
。
输出示例
NAME STATUS ROLES AGE VERSION ip-10-0-217-226.ec2.internal Ready infra,worker 17h v1.30.3
由于角色列表包含
infra
,因此 Pod 在正确的节点上运行。
8.4.2. 移动默认 registry
您需要配置 registry Operator,以便将其 Pod 部署到其他节点。
先决条件
- 在 OpenShift Container Platform 集群中配置额外的计算机器集。
流程
查看
config/instance
对象:$ oc get configs.imageregistry.operator.openshift.io/cluster -o yaml
输出示例
apiVersion: imageregistry.operator.openshift.io/v1 kind: Config metadata: creationTimestamp: 2019-02-05T13:52:05Z finalizers: - imageregistry.operator.openshift.io/finalizer generation: 1 name: cluster resourceVersion: "56174" selfLink: /apis/imageregistry.operator.openshift.io/v1/configs/cluster uid: 36fd3724-294d-11e9-a524-12ffeee2931b spec: httpSecret: d9a012ccd117b1e6616ceccb2c3bb66a5fed1b5e481623 logging: 2 managementState: Managed proxy: {} replicas: 1 requests: read: {} write: {} storage: s3: bucket: image-registry-us-east-1-c92e88cad85b48ec8b312344dff03c82-392c region: us-east-1 status: ...
编辑
config/instance
对象:$ oc edit configs.imageregistry.operator.openshift.io/cluster
spec: affinity: podAntiAffinity: preferredDuringSchedulingIgnoredDuringExecution: - podAffinityTerm: namespaces: - openshift-image-registry topologyKey: kubernetes.io/hostname weight: 100 logLevel: Normal managementState: Managed nodeSelector: 1 node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
验证 registry pod 已移至基础架构节点。
运行以下命令,以识别 registry pod 所在的节点:
$ oc get pods -o wide -n openshift-image-registry
确认节点具有您指定的标签:
$ oc describe node <node_name>
查看命令输出,并确认
node-role.kubernetes.io/infra
列在LABELS
列表中。
8.4.3. 移动监控解决方案
监控堆栈包含多个组件,包括 Prometheus、Thanos Querier 和 Alertmanager。Cluster Monitoring Operator 管理此堆栈。要将监控堆栈重新部署到基础架构节点,您可以创建并应用自定义配置映射。
流程
编辑
cluster-monitoring-config
配置映射,并更改nodeSelector
以使用infra
标签:$ oc edit configmap cluster-monitoring-config -n openshift-monitoring
apiVersion: v1 kind: ConfigMap metadata: name: cluster-monitoring-config namespace: openshift-monitoring data: config.yaml: |+ alertmanagerMain: nodeSelector: 1 node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute prometheusK8s: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute prometheusOperator: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute metricsServer: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute kubeStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute telemeterClient: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute openshiftStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute thanosQuerier: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute monitoringPlugin: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
观察监控 pod 移至新机器:
$ watch 'oc get pod -n openshift-monitoring -o wide'
如果组件没有移到
infra
节点,请删除带有这个组件的 pod:$ oc delete pod -n openshift-monitoring <pod>
已删除 pod 的组件在
infra
节点上重新创建。
8.4.4. 移动 Vertical Pod Autoscaler Operator 组件
Vertical Pod Autoscaler Operator (VPA)由三个组件组成:推荐器(recommender)、更新器(updater)和准入控制器(admission controller)。Operator 和每个组件在 control plane 节点上的 VPA 命名空间中都有自己的 pod。您可以通过在 VPA 订阅和 VerticalPodAutoscalerController
CR 中添加节点选择器,将 VPA Operator 和组件 pod 移到基础架构节点。
以下示例显示了 VPA pod 到 control plane 节点的默认部署。
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES vertical-pod-autoscaler-operator-6c75fcc9cd-5pb6z 1/1 Running 0 7m59s 10.128.2.24 c416-tfsbj-master-1 <none> <none> vpa-admission-plugin-default-6cb78d6f8b-rpcrj 1/1 Running 0 5m37s 10.129.2.22 c416-tfsbj-master-1 <none> <none> vpa-recommender-default-66846bd94c-dsmpp 1/1 Running 0 5m37s 10.129.2.20 c416-tfsbj-master-0 <none> <none> vpa-updater-default-db8b58df-2nkvf 1/1 Running 0 5m37s 10.129.2.21 c416-tfsbj-master-1 <none> <none>
流程
通过将节点选择器添加到 VPA Operator 的
Subscription
自定义资源 (CR)中来移动 VPA Operator pod:编辑 CR:
$ oc edit Subscription vertical-pod-autoscaler -n openshift-vertical-pod-autoscaler
添加节点选择器以匹配 infra 节点上的节点角色标签:
apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: labels: operators.coreos.com/vertical-pod-autoscaler.openshift-vertical-pod-autoscaler: "" name: vertical-pod-autoscaler # ... spec: config: nodeSelector: node-role.kubernetes.io/infra: "" 1
- 1
- 指定 infra 节点的节点角色。
注意如果 infra 节点使用污点,则需要为
Subscription
CR 添加容限。例如:
apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: labels: operators.coreos.com/vertical-pod-autoscaler.openshift-vertical-pod-autoscaler: "" name: vertical-pod-autoscaler # ... spec: config: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: 1 - key: "node-role.kubernetes.io/infra" operator: "Exists" effect: "NoSchedule"
- 1
- 为 infra 节点上的污点指定容限。
通过将节点选择器添加到
VerticalPodAutoscaler
自定义资源 (CR) 来移动每个 VPA 组件:编辑 CR:
$ oc edit VerticalPodAutoscalerController default -n openshift-vertical-pod-autoscaler
添加节点选择器以匹配 infra 节点上的节点角色标签:
apiVersion: autoscaling.openshift.io/v1 kind: VerticalPodAutoscalerController metadata: name: default namespace: openshift-vertical-pod-autoscaler # ... spec: deploymentOverrides: admission: container: resources: {} nodeSelector: node-role.kubernetes.io/infra: "" 1 recommender: container: resources: {} nodeSelector: node-role.kubernetes.io/infra: "" 2 updater: container: resources: {} nodeSelector: node-role.kubernetes.io/infra: "" 3
注意如果目标节点使用污点,则需要为
VerticalPodAutoscalerController
CR 添加容限。例如:
apiVersion: autoscaling.openshift.io/v1 kind: VerticalPodAutoscalerController metadata: name: default namespace: openshift-vertical-pod-autoscaler # ... spec: deploymentOverrides: admission: container: resources: {} nodeSelector: node-role.kubernetes.io/infra: "" tolerations: 1 - key: "my-example-node-taint-key" operator: "Exists" effect: "NoSchedule" recommender: container: resources: {} nodeSelector: node-role.kubernetes.io/infra: "" tolerations: 2 - key: "my-example-node-taint-key" operator: "Exists" effect: "NoSchedule" updater: container: resources: {} nodeSelector: node-role.kubernetes.io/infra: "" tolerations: 3 - key: "my-example-node-taint-key" operator: "Exists" effect: "NoSchedule"
验证
您可以使用以下命令验证 pod 是否已移动:
$ oc get pods -n openshift-vertical-pod-autoscaler -o wide
pod 不再部署到 control plane 节点。
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES vertical-pod-autoscaler-operator-6c75fcc9cd-5pb6z 1/1 Running 0 7m59s 10.128.2.24 c416-tfsbj-infra-eastus3-2bndt <none> <none> vpa-admission-plugin-default-6cb78d6f8b-rpcrj 1/1 Running 0 5m37s 10.129.2.22 c416-tfsbj-infra-eastus1-lrgj8 <none> <none> vpa-recommender-default-66846bd94c-dsmpp 1/1 Running 0 5m37s 10.129.2.20 c416-tfsbj-infra-eastus1-lrgj8 <none> <none> vpa-updater-default-db8b58df-2nkvf 1/1 Running 0 5m37s 10.129.2.21 c416-tfsbj-infra-eastus1-lrgj8 <none> <none>
8.4.5. 移动 Cluster Resource Override Operator pod
默认情况下,Cluster Resource Override Operator 安装过程会创建一个 Operator pod,并在 clusterresourceoverride-operator
命名空间中的节点上创建两个 Cluster Resource Override pod。您可以根据需要将这些 pod 移到其他节点,如基础架构节点。
以下示例显示了 Cluster Resource Override pod 被部署到 control plane 节点,Cluster Resource Override Operator pod 被部署到 worker 节点。
集群资源覆盖 pod 示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES clusterresourceoverride-786b8c898c-9wrdq 1/1 Running 0 23s 10.128.2.32 ip-10-0-14-183.us-west-2.compute.internal <none> <none> clusterresourceoverride-786b8c898c-vn2lf 1/1 Running 0 26s 10.130.2.10 ip-10-0-20-140.us-west-2.compute.internal <none> <none> clusterresourceoverride-operator-6b8b8b656b-lvr62 1/1 Running 0 56m 10.131.0.33 ip-10-0-2-39.us-west-2.compute.internal <none> <none>
节点列表示例
NAME STATUS ROLES AGE VERSION ip-10-0-14-183.us-west-2.compute.internal Ready control-plane,master 65m v1.30.4 ip-10-0-2-39.us-west-2.compute.internal Ready worker 58m v1.30.4 ip-10-0-20-140.us-west-2.compute.internal Ready control-plane,master 65m v1.30.4 ip-10-0-23-244.us-west-2.compute.internal Ready infra 55m v1.30.4 ip-10-0-77-153.us-west-2.compute.internal Ready control-plane,master 65m v1.30.4 ip-10-0-99-108.us-west-2.compute.internal Ready worker 24m v1.30.4 ip-10-0-24-233.us-west-2.compute.internal Ready infra 55m v1.30.4 ip-10-0-88-109.us-west-2.compute.internal Ready worker 24m v1.30.4 ip-10-0-67-453.us-west-2.compute.internal Ready infra 55m v1.30.4
流程
通过将节点选择器添加到 Cluster Resource Override Operator 的
Subscription
自定义资源 (CR) 中来移动 Cluster Resource Override Operator pod。编辑 CR:
$ oc edit -n clusterresourceoverride-operator subscriptions.operators.coreos.com clusterresourceoverride
添加节点选择器以匹配您要安装 Cluster Resource Override Operator pod 的节点上的节点角色标签:
apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: clusterresourceoverride namespace: clusterresourceoverride-operator # ... spec: config: nodeSelector: node-role.kubernetes.io/infra: "" 1 # ...
- 1
- 指定您要部署 Cluster Resource Override Operator pod 的节点的角色。
注意如果 infra 节点使用污点,则需要为
Subscription
CR 添加容限。例如:
apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: clusterresourceoverride namespace: clusterresourceoverride-operator # ... spec: config: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: 1 - key: "node-role.kubernetes.io/infra" operator: "Exists" effect: "NoSchedule"
- 1
- 为 infra 节点上的污点指定容限。
通过将节点选择器添加到
ClusterResourceOverride
自定义资源(CR) 来移动 Cluster Resource Override pod:编辑 CR:
$ oc edit ClusterResourceOverride cluster -n clusterresourceoverride-operator
添加节点选择器以匹配 infra 节点上的节点角色标签:
apiVersion: operator.autoscaling.openshift.io/v1 kind: ClusterResourceOverride metadata: name: cluster resourceVersion: "37952" spec: podResourceOverride: spec: cpuRequestToLimitPercent: 25 limitCPUToMemoryPercent: 200 memoryRequestToLimitPercent: 50 deploymentOverrides: replicas: 1 1 nodeSelector: node-role.kubernetes.io/infra: "" 2 # ...
注意如果 infra 节点使用污点,则需要为
ClusterResourceOverride
CR 添加容限。例如:
apiVersion: operator.autoscaling.openshift.io/v1 kind: ClusterResourceOverride metadata: name: cluster # ... spec: podResourceOverride: spec: memoryRequestToLimitPercent: 50 cpuRequestToLimitPercent: 25 limitCPUToMemoryPercent: 200 deploymentOverrides: replicas: 3 nodeSelector: node-role.kubernetes.io/worker: "" tolerations: 1 - key: "key" operator: "Equal" value: "value" effect: "NoSchedule"
- 1
- 为 infra 节点上的污点指定容限。
验证
您可以使用以下命令验证 pod 是否已移动:
$ oc get pods -n clusterresourceoverride-operator -o wide
Cluster Resource Override pod 现在部署到 infra 节点。
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES clusterresourceoverride-786b8c898c-9wrdq 1/1 Running 0 23s 10.127.2.25 ip-10-0-23-244.us-west-2.compute.internal <none> <none> clusterresourceoverride-786b8c898c-vn2lf 1/1 Running 0 26s 10.128.0.80 ip-10-0-24-233.us-west-2.compute.internal <none> <none> clusterresourceoverride-operator-6b8b8b656b-lvr62 1/1 Running 0 56m 10.129.0.71 ip-10-0-67-453.us-west-2.compute.internal <none> <none>
其他资源