5.3. 节点主机任务


5.3.1. 弃用节点主机

无论弃用基础架构节点或应用程序节点,步骤都是一样的。

先决条件

确保有足够的容量,以便将现有 pod 从节点集合中移除。只有在删除基础架构节点时,才建议删除基础架构节点。

流程
  1. 列出所有可用的节点,以查找要弃用的节点:

    $ oc get nodes
    NAME                  STATUS                     AGE       VERSION
    ocp-infra-node-b7pl   Ready                      23h       v1.6.1+5115d708d7
    ocp-infra-node-p5zj   Ready                      23h       v1.6.1+5115d708d7
    ocp-infra-node-rghb   Ready                      23h       v1.6.1+5115d708d7
    ocp-master-dgf8       Ready,SchedulingDisabled   23h       v1.6.1+5115d708d7
    ocp-master-q1v2       Ready,SchedulingDisabled   23h       v1.6.1+5115d708d7
    ocp-master-vq70       Ready,SchedulingDisabled   23h       v1.6.1+5115d708d7
    ocp-node-020m         Ready                      23h       v1.6.1+5115d708d7
    ocp-node-7t5p         Ready                      23h       v1.6.1+5115d708d7
    ocp-node-n0dd         Ready                      23h       v1.6.1+5115d708d7
    Copy to Clipboard Toggle word wrap

    例如,本节弃用 ocp-infra-node-b7pl 基础架构节点。

  2. 描述节点及其运行的服务:

    $ oc describe node ocp-infra-node-b7pl
    Name:			ocp-infra-node-b7pl
    Role:
    Labels:			beta.kubernetes.io/arch=amd64
    			beta.kubernetes.io/instance-type=n1-standard-2
    			beta.kubernetes.io/os=linux
    			failure-domain.beta.kubernetes.io/region=europe-west3
    			failure-domain.beta.kubernetes.io/zone=europe-west3-c
    			kubernetes.io/hostname=ocp-infra-node-b7pl
    			role=infra
    Annotations:		volumes.kubernetes.io/controller-managed-attach-detach=true
    Taints:			<none>
    CreationTimestamp:	Wed, 22 Nov 2017 09:36:36 -0500
    Phase:
    Conditions:
      ...
    Addresses:		10.156.0.11,ocp-infra-node-b7pl
    Capacity:
     cpu:		2
     memory:	7494480Ki
     pods:		20
    Allocatable:
     cpu:		2
     memory:	7392080Ki
     pods:		20
    System Info:
     Machine ID:			bc95ccf67d047f2ae42c67862c202e44
     System UUID:			9762CC3D-E23C-AB13-B8C5-FA16F0BCCE4C
     Boot ID:			ca8bf088-905d-4ec0-beec-8f89f4527ce4
     Kernel Version:		3.10.0-693.5.2.el7.x86_64
     OS Image:			Employee SKU
     Operating System:		linux
     Architecture:			amd64
     Container Runtime Version:	docker://1.12.6
     Kubelet Version:		v1.6.1+5115d708d7
     Kube-Proxy Version:		v1.6.1+5115d708d7
    ExternalID:			437740049672994824
    Non-terminated Pods:		(2 in total)
      Namespace			Name				CPU Requests	CPU Limits	Memory Requests	Memory Limits
      ---------			----				------------	----------	---------------	-------------
      default			docker-registry-1-5szjs		100m (5%)	0 (0%)		256Mi (3%)0 (0%)
      default			router-1-vzlzq			100m (5%)	0 (0%)		256Mi (3%)0 (0%)
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      CPU Requests	CPU Limits	Memory Requests	Memory Limits
      ------------	----------	---------------	-------------
      200m (10%)	0 (0%)		512Mi (7%)	0 (0%)
    Events:		<none>
    Copy to Clipboard Toggle word wrap

    上面的输出显示节点正在运行两个 pod: router-1-vzlzqdocker-registry-1-5szjs。有两个基础架构节点可用于迁移这两个 pod。

    注意

    上述集群是一个高可用性集群,这意味着 routerdocker-registry 服务在所有基础架构节点上运行。

  3. 将节点标记为不可调度并撤离其所有 pod:

    $ oc adm drain ocp-infra-node-b7pl --delete-local-data
    node "ocp-infra-node-b7pl" cordoned
    WARNING: Deleting pods with local storage: docker-registry-1-5szjs
    pod "docker-registry-1-5szjs" evicted
    pod "router-1-vzlzq" evicted
    node "ocp-infra-node-b7pl" drained
    Copy to Clipboard Toggle word wrap

    如果 pod 已附加了本地存储(例如 EmptyDir),则必须提供 --delete-local-data 选项。通常,在生产环境中运行的 pod 应该只将本地存储用于临时或缓存文件,但不适用于任何重要或持久的。对于常规存储,应用程序应使用对象存储或持久性卷。在这种情况下,docker-registry pod 的本地存储为空,因为对象存储被用来存储容器镜像。

    注意

    以上操作会删除节点上运行的现有 pod。然后,根据复制控制器创建新的 pod。

    通常,每个应用都应该使用部署配置进行部署,这将利用复制控制器创建 pod。

    oc adm drain 不会删除任何不是镜像 pod 的裸机 pod(镜像 pod,或由 ReplicationControllerReplicaSetDaemonSetStatefulSet 或作业进行管理的 pod)。要做到这一点,需要 --force 选项。请注意,在此操作过程中,不会在其他节点上重新创建裸机 pod,数据可能会丢失。

    以下示例显示了 registry 的复制控制器的输出:

    $ oc describe rc/docker-registry-1
    Name:		docker-registry-1
    Namespace:	default
    Selector:	deployment=docker-registry-1,deploymentconfig=docker-registry,docker-registry=default
    Labels:		docker-registry=default
    		openshift.io/deployment-config.name=docker-registry
    Annotations: ...
    Replicas:	3 current / 3 desired
    Pods Status:	3 Running / 0 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:		deployment=docker-registry-1
    			deploymentconfig=docker-registry
    			docker-registry=default
      Annotations:		openshift.io/deployment-config.latest-version=1
    			openshift.io/deployment-config.name=docker-registry
    			openshift.io/deployment.name=docker-registry-1
      Service Account:	registry
      Containers:
       registry:
        Image:	openshift3/ose-docker-registry:v3.6.173.0.49
        Port:	5000/TCP
        Requests:
          cpu:	100m
          memory:	256Mi
        Liveness:	http-get https://:5000/healthz delay=10s timeout=5s period=10s #success=1 #failure=3
        Readiness:	http-get https://:5000/healthz delay=0s timeout=5s period=10s #success=1 #failure=3
        Environment:
          REGISTRY_HTTP_ADDR:					:5000
          REGISTRY_HTTP_NET:					tcp
          REGISTRY_HTTP_SECRET:					tyGEnDZmc8dQfioP3WkNd5z+Xbdfy/JVXf/NLo3s/zE=
          REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA:	false
          REGISTRY_HTTP_TLS_KEY:					/etc/secrets/registry.key
          OPENSHIFT_DEFAULT_REGISTRY:				docker-registry.default.svc:5000
          REGISTRY_CONFIGURATION_PATH:				/etc/registry/config.yml
          REGISTRY_HTTP_TLS_CERTIFICATE:				/etc/secrets/registry.crt
        Mounts:
          /etc/registry from docker-config (rw)
          /etc/secrets from registry-certificates (rw)
          /registry from registry-storage (rw)
      Volumes:
       registry-storage:
        Type:	EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
       registry-certificates:
        Type:	Secret (a volume populated by a Secret)
        SecretName:	registry-certificates
        Optional:	false
       docker-config:
        Type:	Secret (a volume populated by a Secret)
        SecretName:	registry-config
        Optional:	false
    Events:
      FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
      ---------	--------	-----	----			-------------	--------	------		-------
      49m		49m		1	replication-controller			Normal		SuccessfulCreate	Created pod: docker-registry-1-dprp5
    Copy to Clipboard Toggle word wrap

    输出底部的事件显示有关新 pod 创建的信息。因此,当列出所有 pod 时:

    $ oc get pods
    NAME                       READY     STATUS    RESTARTS   AGE
    docker-registry-1-dprp5    1/1       Running   0          52m
    docker-registry-1-kr8jq    1/1       Running   0          1d
    docker-registry-1-ncpl2    1/1       Running   0          1d
    registry-console-1-g4nqg   1/1       Running   0          1d
    router-1-2gshr             0/1       Pending   0          52m
    router-1-85qm4             1/1       Running   0          1d
    router-1-q5sr8             1/1       Running   0          1d
    Copy to Clipboard Toggle word wrap
  4. 现在,在已弃用节点上运行的 docker-registry-1-5szjsrouter-1-vzlzq pod 不再可用。相反,创建了两个新 pod: docker-registry-1-dprp5router-1-2gshr。如上所示,新的路由器 Pod 是 router-1-2gshr,但处于 Pending 状态。这是因为每个节点只能在一个路由器中运行,并绑定到主机的端口 80 和 443。
  5. 观察新创建的 registry pod 时,以下示例显示了 ocp-infra-node-rghb 节点上已创建了 pod,它与弃用节点的不同:

    $ oc describe pod docker-registry-1-dprp5
    Name:			docker-registry-1-dprp5
    Namespace:		default
    Security Policy:	hostnetwork
    Node:			ocp-infra-node-rghb/10.156.0.10
    ...
    Copy to Clipboard Toggle word wrap

    弃用基础架构和应用程序节点的唯一区别在于,在基础架构节点被撤离后,如果没有计划替换该节点,在基础架构节点上运行的服务可以缩减:

    $ oc scale dc/router --replicas 2
    deploymentconfig "router" scaled
    
    $ oc scale dc/docker-registry --replicas 2
    deploymentconfig "docker-registry" scaled
    Copy to Clipboard Toggle word wrap
  6. 现在,每个基础架构节点只运行一个 pod 中的一类:

    $ oc get pods
    NAME                       READY     STATUS    RESTARTS   AGE
    docker-registry-1-kr8jq    1/1       Running   0          1d
    docker-registry-1-ncpl2    1/1       Running   0          1d
    registry-console-1-g4nqg   1/1       Running   0          1d
    router-1-85qm4             1/1       Running   0          1d
    router-1-q5sr8             1/1       Running   0          1d
    
    $ oc describe po/docker-registry-1-kr8jq | grep Node:
    Node:			ocp-infra-node-p5zj/10.156.0.9
    
    $ oc describe po/docker-registry-1-ncpl2 | grep Node:
    Node:			ocp-infra-node-rghb/10.156.0.10
    Copy to Clipboard Toggle word wrap
    注意

    要提供完整的高可用性集群,至少有三个基础架构节点应当始终可用。

  7. 验证节点上的调度是否已禁用:

    $ oc get nodes
    NAME                  STATUS                     AGE       VERSION
    ocp-infra-node-b7pl   Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-infra-node-p5zj   Ready                      1d        v1.6.1+5115d708d7
    ocp-infra-node-rghb   Ready                      1d        v1.6.1+5115d708d7
    ocp-master-dgf8       Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-master-q1v2       Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-master-vq70       Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-node-020m         Ready                      1d        v1.6.1+5115d708d7
    ocp-node-7t5p         Ready                      1d        v1.6.1+5115d708d7
    ocp-node-n0dd         Ready                      1d        v1.6.1+5115d708d7
    Copy to Clipboard Toggle word wrap

    节点没有包含任何 pod:

    $ oc describe node ocp-infra-node-b7pl
    Name:			ocp-infra-node-b7pl
    Role:
    Labels:			beta.kubernetes.io/arch=amd64
    			beta.kubernetes.io/instance-type=n1-standard-2
    			beta.kubernetes.io/os=linux
    			failure-domain.beta.kubernetes.io/region=europe-west3
    			failure-domain.beta.kubernetes.io/zone=europe-west3-c
    			kubernetes.io/hostname=ocp-infra-node-b7pl
    			role=infra
    Annotations:		volumes.kubernetes.io/controller-managed-attach-detach=true
    Taints:			<none>
    CreationTimestamp:	Wed, 22 Nov 2017 09:36:36 -0500
    Phase:
    Conditions:
      ...
    Addresses:		10.156.0.11,ocp-infra-node-b7pl
    Capacity:
     cpu:		2
     memory:	7494480Ki
     pods:		20
    Allocatable:
     cpu:		2
     memory:	7392080Ki
     pods:		20
    System Info:
     Machine ID:			bc95ccf67d047f2ae42c67862c202e44
     System UUID:			9762CC3D-E23C-AB13-B8C5-FA16F0BCCE4C
     Boot ID:			ca8bf088-905d-4ec0-beec-8f89f4527ce4
     Kernel Version:		3.10.0-693.5.2.el7.x86_64
     OS Image:			Employee SKU
     Operating System:		linux
     Architecture:			amd64
     Container Runtime Version:	docker://1.12.6
     Kubelet Version:		v1.6.1+5115d708d7
     Kube-Proxy Version:		v1.6.1+5115d708d7
    ExternalID:			437740049672994824
    Non-terminated Pods:		(0 in total)
      Namespace			Name		CPU Requests	CPU Limits	Memory Requests	Memory Limits
      ---------			----		------------	----------	---------------	-------------
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      CPU Requests	CPU Limits	Memory Requests	Memory Limits
      ------------	----------	---------------	-------------
      0 (0%)	0 (0%)		0 (0%)		0 (0%)
    Events:		<none>
    Copy to Clipboard Toggle word wrap
  8. /etc/haproxy/haproxy.cfg 配置文件中的 backend 部分删除基础架构实例:

    backend router80
        balance source
        mode tcp
        server infra-1.example.com 192.168.55.12:80 check
        server infra-2.example.com 192.168.55.13:80 check
    
    backend router443
        balance source
        mode tcp
        server infra-1.example.com 192.168.55.12:443 check
        server infra-2.example.com 192.168.55.13:443 check
    Copy to Clipboard Toggle word wrap
  9. 然后,重新启动 haproxy 服务。

    $ sudo systemctl restart haproxy
    Copy to Clipboard Toggle word wrap
  10. 在所有 pod 都被逐出后,使用以下命令从集群中删除节点:

    $ oc delete node ocp-infra-node-b7pl
    node "ocp-infra-node-b7pl" deleted
    Copy to Clipboard Toggle word wrap
    $ oc get nodes
    NAME                  STATUS                     AGE       VERSION
    ocp-infra-node-p5zj   Ready                      1d        v1.6.1+5115d708d7
    ocp-infra-node-rghb   Ready                      1d        v1.6.1+5115d708d7
    ocp-master-dgf8       Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-master-q1v2       Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-master-vq70       Ready,SchedulingDisabled   1d        v1.6.1+5115d708d7
    ocp-node-020m         Ready                      1d        v1.6.1+5115d708d7
    ocp-node-7t5p         Ready                      1d        v1.6.1+5115d708d7
    ocp-node-n0dd         Ready                      1d        v1.6.1+5115d708d7
    Copy to Clipboard Toggle word wrap
注意

如需有关撤离和排空 pod 或节点的更多信息,请参阅节点维护部分。

5.3.1.1. 替换节点主机

如果需要添加节点以替代已弃用节点,请按照 将主机添加到现有集群 部分。

返回顶部
Red Hat logoGithubredditYoutubeTwitter

学习

尝试、购买和销售

社区

关于红帽文档

通过我们的产品和服务,以及可以信赖的内容,帮助红帽用户创新并实现他们的目标。 了解我们当前的更新.

让开源更具包容性

红帽致力于替换我们的代码、文档和 Web 属性中存在问题的语言。欲了解更多详情,请参阅红帽博客.

關於紅帽

我们提供强化的解决方案,使企业能够更轻松地跨平台和环境(从核心数据中心到网络边缘)工作。

Theme

© 2025 Red Hat