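5.4.5. Removing an etcd host
If an etcd host has failed beyond the point of restoration, remove it from the cluster.
Steps to be performed on all master hosts
Procedure
Remove each of the other etcd hosts from the etcd cluster. Run the following command for each etcd node being removed: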
# etcdctl3 --endpoints=https://<surviving host IP>:2379 --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/peer.crt --key=/etc/etcd/peer.key member remove <failed member ID>
Restart the master API service on every master:
# master-restart api
# master-restart controllers
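When several members have to be removed, it can help to generate the removal commands first and review them before running anything. The following is a minimal sketch under stated assumptions: `build_remove_cmds` is a hypothetical helper, not part of OpenShift or etcd, and it only prints commands using the same endpoint form and certificate paths as the command above.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) one "member remove" command per failed member ID.
# build_remove_cmds is a hypothetical helper name chosen for this example.
# Arguments: <surviving endpoint URL> <failed member ID>...
build_remove_cmds() {
  local endpoint="$1"
  shift
  local id
  for id in "$@"; do
    # Certificate paths are the ones used in the procedure above.
    printf 'etcdctl3 --endpoints=%s --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/peer.crt --key=/etc/etcd/peer.key member remove %s\n' \
      "$endpoint" "$id"
  done
}

# Example call (prints one command line for the given member ID):
# build_remove_cmds https://192.168.55.8:2379 8372784203e11288
```

Printing rather than executing keeps the step reviewable; each printed line can be run by hand once the member IDs have been double-checked.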
Steps to be performed in the current etcd cluster
Procedure
Remove the failed host from the cluster:
# etcdctl2 cluster-health
member 5ee217d19001 is healthy: got healthy result from https://192.168.55.12:2379
member 2a529ba1840722c0 is healthy: got healthy result from https://192.168.55.8:2379
failed to check the health of member 8372784203e11288 on https://192.168.55.21:2379: Get https://192.168.55.21:2379/health: dial tcp 192.168.55.21:2379: getsockopt: connection refused
member 8372784203e11288 is unreachable: [https://192.168.55.21:2379] are all unreachable
member ed4f0efd277d7599 is healthy: got healthy result from https://192.168.55.13:2379
cluster is healthy

# etcdctl2 member remove 8372784203e11288
Removed member 8372784203e11288 from cluster

# etcdctl2 cluster-health
member 5ee217d19001 is healthy: got healthy result from https://192.168.55.12:2379
member 2a529ba1840722c0 is healthy: got healthy result from https://192.168.55.8:2379
member ed4f0efd277d7599 is healthy: got healthy result from https://192.168.55.13:2379
cluster is healthy

The member remove command requires the etcd member ID, not the hostname.
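Because the remove command needs a member ID, the ID can be pulled out of the cluster-health output instead of being copied by hand. This is a minimal sketch, assuming the output has the same line format as the sample above (other etcd versions may print it differently); `unreachable_members` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read `etcdctl2 cluster-health` output on stdin and print the ID
# of every member reported as unreachable.
# Assumes lines of the form: "member <ID> is unreachable: [...]".
unreachable_members() {
  awk '$1 == "member" && $3 == "is" && $4 == "unreachable:" { print $2 }'
}

# Example call on a live cluster (assumes etcdctl2 is available on the host):
# etcdctl2 cluster-health | unreachable_members
```

Lines that begin with "failed to check the health of member" are ignored on purpose, so each unreachable member is reported only once.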
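To ensure that the etcd configuration does not use the failed host when the etcd service is restarted, modify the /etc/etcd/etcd.conf file on all remaining etcd hosts and remove the failed host from the value of the ETCD_INITIAL_CLUSTER variable:
# vi /etc/etcd/etcd.conf
For example: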
ETCD_INITIAL_CLUSTER=master-0.example.com=https://192.168.55.8:2380,master-1.example.com=https://192.168.55.12:2380,master-2.example.com=https://192.168.55.13:2380
becomes:
ETCD_INITIAL_CLUSTER=master-0.example.com=https://192.168.55.8:2380,master-1.example.com=https://192.168.55.12:2380
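The same edit can be scripted when there are many etcd hosts to update. The sketch below is one possible approach, not the documented method: `drop_initial_cluster_member` is a hypothetical helper that filters etcd.conf text from stdin and removes a single `<name>=<peer URL>` entry from the ETCD_INITIAL_CLUSTER line, leaving every other line untouched.

```shell
#!/usr/bin/env bash
# Sketch: remove one member entry from the ETCD_INITIAL_CLUSTER line.
# Usage: drop_initial_cluster_member <member name> < etcd.conf
drop_initial_cluster_member() {
  awk -v name="$1" '
    /^ETCD_INITIAL_CLUSTER=/ {
      # Strip the variable name, then walk the comma-separated entries.
      value = substr($0, length("ETCD_INITIAL_CLUSTER=") + 1)
      n = split(value, entries, ",")
      out = ""
      for (i = 1; i <= n; i++) {
        # Each entry looks like <member name>=<peer URL>.
        split(entries[i], kv, "=")
        if (kv[1] == name) continue
        out = (out == "" ? "" : out ",") entries[i]
      }
      print "ETCD_INITIAL_CLUSTER=" out
      next
    }
    { print }   # all other lines pass through unchanged
  '
}

# Example call: write the result to a scratch file and review it first.
# drop_initial_cluster_member master-2.example.com < /etc/etcd/etcd.conf > /tmp/etcd.conf.new
```

Writing to a scratch file and comparing it with the original before replacing /etc/etcd/etcd.conf keeps the change easy to verify. The pattern matches only the line that starts with `ETCD_INITIAL_CLUSTER=`, so variables such as ETCD_INITIAL_CLUSTER_STATE are not affected.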
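Note: Restarting the etcd service is not required, because the failed host was removed using etcdctl.
Modify the Ansible inventory file to reflect the current state of the cluster and to avoid issues when rerunning a playbook: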
[OSEv3:children]
masters
nodes
etcd

... [OUTPUT ABBREVIATED] ...

[etcd]
master-0.example.com
master-1.example.com
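A quick way to confirm that the inventory no longer lists the removed host is to print the hosts in the [etcd] group. This is a minimal sketch, assuming an INI-style inventory like the one above; `etcd_hosts_in_inventory` is a hypothetical helper name and the inventory path in the example call is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: read an INI-style Ansible inventory on stdin and print the
# host names listed under the [etcd] group.
etcd_hosts_in_inventory() {
  awk '
    # Any section header switches the flag on for [etcd] and off otherwise.
    /^\[/ { in_etcd = ($0 == "[etcd]"); next }
    # Inside [etcd], print the first field of non-blank, non-comment lines.
    in_etcd && NF && $1 !~ /^[#;]/ { print $1 }
  '
}

# Example call (the inventory path is an assumption; adjust to your setup):
# etcd_hosts_in_inventory < /etc/ansible/hosts
```

Only the first field of each line is printed, so per-host variables that follow the host name are ignored.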
If you are using Flannel, modify the flanneld service configuration located at /etc/sysconfig/flanneld on every host and remove the etcd host:
FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379
Restart the flanneld service:
# systemctl restart flanneld.service