搜索

5.4.5. 删除 etcd 主机

download PDF

如果 etcd 主机无法恢复,将其从集群中移除。

在所有 master 主机上执行的步骤

流程
  1. 从 etcd 集群中删除其他 etcd 主机。为每个 etcd 节点运行以下命令:

    # etcdctl3 --endpoints=https://<surviving host IP>:2379
      --cacert=/etc/etcd/ca.crt
      --cert=/etc/etcd/peer.crt
      --key=/etc/etcd/peer.key member remove <failed member ID>
  2. 在每个 master 上重启 master API 服务:

    # master-restart api restart-master controller

在当前 etcd 集群中执行的步骤

流程
  1. 从集群中删除失败的主机:

    # etcdctl2 cluster-health
    member 5ee217d19001 is healthy: got healthy result from https://192.168.55.12:2379
    member 2a529ba1840722c0 is healthy: got healthy result from https://192.168.55.8:2379
    failed to check the health of member 8372784203e11288 on https://192.168.55.21:2379: Get https://192.168.55.21:2379/health: dial tcp 192.168.55.21:2379: getsockopt: connection refused
    member 8372784203e11288 is unreachable: [https://192.168.55.21:2379] are all unreachable
    member ed4f0efd277d7599 is healthy: got healthy result from https://192.168.55.13:2379
    cluster is healthy
    
    # etcdctl2 member remove 8372784203e11288 1
    Removed member 8372784203e11288 from cluster
    
    # etcdctl2 cluster-health
    member 5ee217d19001 is healthy: got healthy result from https://192.168.55.12:2379
    member 2a529ba1840722c0 is healthy: got healthy result from https://192.168.55.8:2379
    member ed4f0efd277d7599 is healthy: got healthy result from https://192.168.55.13:2379
    cluster is healthy
    1
    remove 命令需要 etcd ID,而不是主机名。
  2. 要确保 etcd 配置在 etcd 服务重启时不使用失败的主机,修改所有剩余的 etcd 主机上的 /etc/etcd/etcd.conf 文件,并在 ETCD_INITIAL_CLUSTER 变量的值中删除失败主机:

    # vi /etc/etcd/etcd.conf

    例如:

    ETCD_INITIAL_CLUSTER=master-0.example.com=https://192.168.55.8:2380,master-1.example.com=https://192.168.55.12:2380,master-2.example.com=https://192.168.55.13:2380

    成为:

    ETCD_INITIAL_CLUSTER=master-0.example.com=https://192.168.55.8:2380,master-1.example.com=https://192.168.55.12:2380
    注意

    不需要重启 etcd 服务,因为失败的主机是使用 etcdctl 被删除。

  3. 修改 Ansible 清单文件,以反映集群的当前状态,并避免在重新运行 playbook 时出现问题:

    [OSEv3:children]
    masters
    nodes
    etcd
    
    ... [OUTPUT ABBREVIATED] ...
    
    [etcd]
    master-0.example.com
    master-1.example.com
  4. 如果您使用 Flannel,请修改每个主机上 /etc/sysconfig/flanneldflanneld 服务配置并删除 etcd 主机:

    FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379
  5. 重启 flanneld 服务:

    # systemctl restart flanneld.service
Red Hat logoGithubRedditYoutubeTwitter

学习

尝试、购买和销售

社区

关于红帽文档

通过我们的产品和服务,以及可以信赖的内容,帮助红帽用户创新并实现他们的目标。

让开源更具包容性

红帽致力于替换我们的代码、文档和 Web 属性中存在问题的语言。欲了解更多详情,请参阅红帽博客.

關於紅帽

我们提供强化的解决方案,使企业能够更轻松地跨平台和环境(从核心数据中心到网络边缘)工作。

© 2024 Red Hat, Inc.