7.4. 将资源移到基础架构机器集
默认情况下,您的集群中已部署了某些基础架构资源。您可以通过添加基础架构节点选择器来将其移至您创建的基础架构机器集,如下所示:
spec:
nodePlacement: 1
nodeSelector:
matchLabels:
node-role.kubernetes.io/infra: ""
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/infra
value: reserved
- effect: NoExecute
key: node-role.kubernetes.io/infra
value: reserved
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
将特定节点选择器应用到所有基础架构组件会导致 OpenShift Container Platform 使用 该标签将这些工作负载调度到具有该标签的节点。
7.4.1. 移动路由器
您可以将路由器 pod 部署到不同的机器集中。默认情况下,pod 部署到 worker 节点。
先决条件
- 在 OpenShift Container Platform 集群中配置额外的机器集。
流程
查看路由器 Operator 的
IngressController
自定义资源:$ oc get ingresscontroller default -n openshift-ingress-operator -o yaml
命令输出类似于以下文本:
apiVersion: operator.openshift.io/v1 kind: IngressController metadata: creationTimestamp: 2019-04-18T12:35:39Z finalizers: - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller generation: 1 name: default namespace: openshift-ingress-operator resourceVersion: "11341" selfLink: /apis/operator.openshift.io/v1/namespaces/openshift-ingress-operator/ingresscontrollers/default uid: 79509e05-61d6-11e9-bc55-02ce4781844a spec: {} status: availableReplicas: 2 conditions: - lastTransitionTime: 2019-04-18T12:36:15Z status: "True" type: Available domain: apps.<cluster>.example.com endpointPublishingStrategy: type: LoadBalancerService selector: ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
编辑
ingresscontroller
资源,并更改nodeSelector
以使用infra
标签:$ oc edit ingresscontroller default -n openshift-ingress-operator
spec: nodePlacement: nodeSelector: 1 matchLabels: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
确认路由器 Pod 在
infra
节点上运行。查看路由器 Pod 列表,并记下正在运行的 Pod 的节点名称:
$ oc get pod -n openshift-ingress -o wide
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES router-default-86798b4b5d-bdlvd 1/1 Running 0 28s 10.130.2.4 ip-10-0-217-226.ec2.internal <none> <none> router-default-955d875f4-255g8 0/1 Terminating 0 19h 10.129.2.4 ip-10-0-148-172.ec2.internal <none> <none>
在本例中,正在运行的 Pod 位于
ip-10-0-217-226.ec2.internal
节点上。查看正在运行的 Pod 的节点状态:
$ oc get node <node_name> 1
- 1
- 指定从 Pod 列表获得的
<node_name>
。
输出示例
NAME STATUS ROLES AGE VERSION ip-10-0-217-226.ec2.internal Ready infra,worker 17h v1.23.0
由于角色列表包含
infra
,因此 Pod 在正确的节点上运行。
7.4.2. 移动默认 registry
您需要配置 registry Operator,以便将其 Pod 部署到其他节点。
先决条件
- 在 OpenShift Container Platform 集群中配置额外的机器集。
流程
查看
config/instance
对象:$ oc get configs.imageregistry.operator.openshift.io/cluster -o yaml
输出示例
apiVersion: imageregistry.operator.openshift.io/v1 kind: Config metadata: creationTimestamp: 2019-02-05T13:52:05Z finalizers: - imageregistry.operator.openshift.io/finalizer generation: 1 name: cluster resourceVersion: "56174" selfLink: /apis/imageregistry.operator.openshift.io/v1/configs/cluster uid: 36fd3724-294d-11e9-a524-12ffeee2931b spec: httpSecret: d9a012ccd117b1e6616ceccb2c3bb66a5fed1b5e481623 logging: 2 managementState: Managed proxy: {} replicas: 1 requests: read: {} write: {} storage: s3: bucket: image-registry-us-east-1-c92e88cad85b48ec8b312344dff03c82-392c region: us-east-1 status: ...
编辑
config/instance
对象:$ oc edit configs.imageregistry.operator.openshift.io/cluster
spec: affinity: podAntiAffinity: preferredDuringSchedulingIgnoredDuringExecution: - podAffinityTerm: namespaces: - openshift-image-registry topologyKey: kubernetes.io/hostname weight: 100 logLevel: Normal managementState: Managed nodeSelector: 1 node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
验证 registry pod 已移至基础架构节点。
运行以下命令,以识别 registry pod 所在的节点:
$ oc get pods -o wide -n openshift-image-registry
确认节点具有您指定的标签:
$ oc describe node <node_name>
查看命令输出,并确认
node-role.kubernetes.io/infra
列在LABELS
列表中。
7.4.3. 移动监控解决方案
监控堆栈包含多个组件,包括 Prometheus、Grafana 和 Alertmanager。Cluster Monitoring Operator 管理此堆栈。要将监控堆栈重新部署到基础架构节点,您可以创建并应用自定义配置映射。
流程
编辑
cluster-monitoring-config
配置映射,并更改nodeSelector
以使用infra
标签:$ oc edit configmap cluster-monitoring-config -n openshift-monitoring
apiVersion: v1 kind: ConfigMap metadata: name: cluster-monitoring-config namespace: openshift-monitoring data: config.yaml: |+ alertmanagerMain: nodeSelector: 1 node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute prometheusK8s: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute prometheusOperator: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute grafana: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute k8sPrometheusAdapter: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute kubeStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute telemeterClient: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute openshiftStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute thanosQuerier: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - key: node-role.kubernetes.io/infra value: reserved effect: NoSchedule - key: node-role.kubernetes.io/infra value: reserved effect: NoExecute
- 1
- 添加
nodeSelector
参数,并设为适用于您想要移动的组件的值。您可以根据为节点指定的值,按所示格式使用nodeSelector
或使用<key>: <value>
对。如果您在 infrasructure 节点中添加了污点,还要添加匹配的容限。
观察监控 pod 移至新机器:
$ watch 'oc get pod -n openshift-monitoring -o wide'
如果组件没有移到
infra
节点,请删除带有这个组件的 pod:$ oc delete pod -n openshift-monitoring <pod>
已删除 pod 的组件在
infra
节点上重新创建。
7.4.4. 移动 OpenShift Logging 资源
您可以配置 Cluster Logging Operator,以将用于日志记录子系统组件的 Pod(如 Elasticsearch 和 Kibana)部署到不同的节点上。您无法将 Cluster Logging Operator Pod 从其安装位置移走。
例如,您可以因为 CPU、内存和磁盘要求较高而将 Elasticsearch Pod 移到一个单独的节点上。
先决条件
- 必须安装 Red Hat OpenShift Logging 和 Elasticsearch Operator。默认情况下没有安装这些功能。
流程
编辑
openshift-logging
项目中的ClusterLogging
自定义资源(CR):$ oc edit ClusterLogging instance
apiVersion: logging.openshift.io/v1 kind: ClusterLogging ... spec: collection: logs: fluentd: resources: null type: fluentd logStore: elasticsearch: nodeCount: 3 nodeSelector: 1 node-role.kubernetes.io/infra: '' tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved redundancyPolicy: SingleRedundancy resources: limits: cpu: 500m memory: 16Gi requests: cpu: 500m memory: 16Gi storage: {} type: elasticsearch managementState: Managed visualization: kibana: nodeSelector: 2 node-role.kubernetes.io/infra: '' tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved proxy: resources: null replicas: 1 resources: null type: kibana ...
验证
要验证组件是否已移动,您可以使用 oc get pod -o wide
命令。
例如:
您需要移动来自
ip-10-0-147-79.us-east-2.compute.internal
节点上的 Kibana pod:$ oc get pod kibana-5b8bdf44f9-ccpq9 -o wide
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kibana-5b8bdf44f9-ccpq9 2/2 Running 0 27s 10.129.2.18 ip-10-0-147-79.us-east-2.compute.internal <none> <none>
您需要将 Kibana pod 移到
ip-10-0-139-48.us-east-2.compute.internal
节点,该节点是一个专用的基础架构节点:$ oc get nodes
输出示例
NAME STATUS ROLES AGE VERSION ip-10-0-133-216.us-east-2.compute.internal Ready master 60m v1.23.0 ip-10-0-139-146.us-east-2.compute.internal Ready master 60m v1.23.0 ip-10-0-139-192.us-east-2.compute.internal Ready worker 51m v1.23.0 ip-10-0-139-241.us-east-2.compute.internal Ready worker 51m v1.23.0 ip-10-0-147-79.us-east-2.compute.internal Ready worker 51m v1.23.0 ip-10-0-152-241.us-east-2.compute.internal Ready master 60m v1.23.0 ip-10-0-139-48.us-east-2.compute.internal Ready infra 51m v1.23.0
请注意,该节点具有
node-role.kubernetes.io/infra: "
label:$ oc get node ip-10-0-139-48.us-east-2.compute.internal -o yaml
输出示例
kind: Node apiVersion: v1 metadata: name: ip-10-0-139-48.us-east-2.compute.internal selfLink: /api/v1/nodes/ip-10-0-139-48.us-east-2.compute.internal uid: 62038aa9-661f-41d7-ba93-b5f1b6ef8751 resourceVersion: '39083' creationTimestamp: '2020-04-13T19:07:55Z' labels: node-role.kubernetes.io/infra: '' ...
要移动 Kibana pod,编辑
ClusterLogging
CR 以添加节点选择器:apiVersion: logging.openshift.io/v1 kind: ClusterLogging ... spec: ... visualization: kibana: nodeSelector: 1 node-role.kubernetes.io/infra: '' proxy: resources: null replicas: 1 resources: null type: kibana
- 1
- 添加节点选择器以匹配节点规格中的 label。
保存 CR 后,当前 Kibana Pod 将被终止,新的 Pod 会被部署:
$ oc get pods
输出示例
NAME READY STATUS RESTARTS AGE cluster-logging-operator-84d98649c4-zb9g7 1/1 Running 0 29m elasticsearch-cdm-hwv01pf7-1-56588f554f-kpmlg 2/2 Running 0 28m elasticsearch-cdm-hwv01pf7-2-84c877d75d-75wqj 2/2 Running 0 28m elasticsearch-cdm-hwv01pf7-3-f5d95b87b-4nx78 2/2 Running 0 28m fluentd-42dzz 1/1 Running 0 28m fluentd-d74rq 1/1 Running 0 28m fluentd-m5vr9 1/1 Running 0 28m fluentd-nkxl7 1/1 Running 0 28m fluentd-pdvqb 1/1 Running 0 28m fluentd-tflh6 1/1 Running 0 28m kibana-5b8bdf44f9-ccpq9 2/2 Terminating 0 4m11s kibana-7d85dcffc8-bfpfp 2/2 Running 0 33s
新 pod 位于
ip-10-0-139-48.us-east-2.compute.internal
节点上 :$ oc get pod kibana-7d85dcffc8-bfpfp -o wide
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kibana-7d85dcffc8-bfpfp 2/2 Running 0 43s 10.131.0.22 ip-10-0-139-48.us-east-2.compute.internal <none> <none>
片刻后,原始 Kibana Pod 将被删除。
$ oc get pods
输出示例
NAME READY STATUS RESTARTS AGE cluster-logging-operator-84d98649c4-zb9g7 1/1 Running 0 30m elasticsearch-cdm-hwv01pf7-1-56588f554f-kpmlg 2/2 Running 0 29m elasticsearch-cdm-hwv01pf7-2-84c877d75d-75wqj 2/2 Running 0 29m elasticsearch-cdm-hwv01pf7-3-f5d95b87b-4nx78 2/2 Running 0 29m fluentd-42dzz 1/1 Running 0 29m fluentd-d74rq 1/1 Running 0 29m fluentd-m5vr9 1/1 Running 0 29m fluentd-nkxl7 1/1 Running 0 29m fluentd-pdvqb 1/1 Running 0 29m fluentd-tflh6 1/1 Running 0 29m kibana-7d85dcffc8-bfpfp 2/2 Running 0 62s
其他资源
- 有关移动 OpenShift Container Platform 组件的常规说明,请参阅监控文档。