This documentation is for a release that is no longer maintained. See the documentation for the latest supported version 3 or the latest supported version 4.
Chapter 38. Replacing a failed etcd member
If some etcd members fail, but you still have a quorum of etcd members, you can use the remaining etcd members and the data that they contain to add more etcd members without etcd or cluster downtime.
38.1. Removing a failed etcd node
Before you add a new etcd node, remove the failed one.
Procedure
From an active etcd host, remove the failed etcd node.

Stop the etcd service on the failed etcd member by removing the etcd pod definition:

# mkdir -p /etc/origin/node/pods-stopped
# mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/
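The removal command itself is not shown above; a sketch of it, assuming the v2 etcdctl flags used later in this chapter, with a hypothetical member ID taken from `member list` output and placeholder peer URLs:

```
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://master-0.example.com:2379,https://master-1.example.com:2379" \
    member remove 8372784203e11288
```

Specify only the surviving, active members in the --peers value; the member ID argument identifies the failed member to remove.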
38.2. Adding an etcd member
You can add an etcd host either by using an Ansible playbook or by manual steps.
38.2.1. Adding a new etcd host using Ansible
Procedure
In the Ansible inventory file, create a new group named [new_etcd] and add the new host. Then, add the new_etcd group as a child of the [OSEv3] group.

From the host that installed OpenShift Container Platform and hosts the Ansible inventory file, run the etcd scaleup playbook:

$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/scaleup.yml

After the playbook runs, modify the inventory file to reflect the current status by moving the new etcd host from the [new_etcd] group to the [etcd] group.

If you use Flannel, modify the flanneld service configuration on every OpenShift Container Platform host, located at /etc/sysconfig/flanneld, to include the new etcd host:

FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379,https://etcd0.example.com:2379

Restart the flanneld service:

# systemctl restart flanneld.service
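The inventory change in the first step can be sketched as follows; the host names are placeholders for your environment, and etcd0.example.com stands in for the new member:

```ini
[OSEv3:children]
masters
nodes
etcd
new_etcd

[etcd]
master-0.example.com
master-1.example.com
master-2.example.com

[new_etcd]
etcd0.example.com
```

After the scaleup playbook completes, move etcd0.example.com from the [new_etcd] group to the [etcd] group so the inventory reflects the running cluster.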
38.2.2. Manually adding a new etcd host
If you do not run etcd as static pods on master nodes, you might need to add another etcd host.
Procedure
Modify the current etcd cluster

To create the etcd certificates, run the openssl command, replacing the values with those from your environment.

Create some environment variables:

Note: The custom openssl extensions used as etcd_v3_ca_* include the $SAN environment variable as subjectAltName. See /etc/etcd/ca/openssl.cnf for more information.

Create the directory in which to store the configuration and certificates:

# mkdir -p ${PREFIX}

Create the server certificate request and sign it (server.csr and server.crt).

Create the peer certificate request and sign it (peer.csr and peer.crt).

Copy the current etcd configuration and ca.crt files from the current node as examples to modify later:

# cp /etc/etcd/etcd.conf ${PREFIX}
# cp /etc/etcd/ca.crt ${PREFIX}

While still on the surviving etcd host, add the new host to the cluster. To add additional etcd members to the cluster, you must first adjust the default localhost peer in the peerURLs value for the first member.

Get the member ID for the first member using the member list command:

# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
    member list

Ensure that you specify the URLs of only active etcd members in the --peers parameter value.
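The environment variables referenced in the steps above (${CN}, $SAN, ${PREFIX}) can be sketched as follows; the host name and IP address are hypothetical placeholders for your new member:

```shell
# Hypothetical values for the new etcd member; substitute your own.
export NEW_ETCD_HOSTNAME="etcd0.example.com"
export NEW_ETCD_IP="192.168.55.21"

# CN is used as the certificate subject and the archive name later on.
export CN="${NEW_ETCD_HOSTNAME}"
# SAN feeds the subjectAltName in the etcd_v3_ca_* openssl extensions.
export SAN="IP:${NEW_ETCD_IP}"
# PREFIX is the working directory for the generated configuration and certs.
export PREFIX="/etc/etcd/generated_certs/etcd-${CN}/"
```

The later tar and scp steps reuse ${CN} and ${PREFIX}, so keep the same shell session or re-export the values.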
Obtain the IP address where etcd listens for cluster peers:

$ ss -l4n | grep 2380

Update the value of peerURLs using the etcdctl member update command by passing the member ID and IP address obtained from the previous steps:

# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
    member update 511b7fb6cc0001 https://172.18.1.18:2380

Re-run the member list command and ensure the peer URLs no longer include localhost.
Add the new host to the etcd cluster. Note that the new host is not yet configured, so the status stays as unstarted until you configure the new host.

Warning: You must add each member and bring it online one at a time. When you add each additional member to the cluster, you must adjust the peerURLs list for the current peers. The peerURLs list grows by one for each member added. The etcdctl member add command outputs the values that you must set in the etcd.conf file as you add each member, as described in the following instructions.

In the member add command, the first argument (for example, 10.3.9.222) is a label for the etcd member. You can specify the host name, IP address, or a simple name.
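A sketch of the member add invocation, assuming the v2 etcdctl flags used elsewhere in this procedure and a hypothetical new member at 10.3.9.222:

```
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
    member add 10.3.9.222 https://10.3.9.222:2380
```

The command prints ETCD_NAME, ETCD_INITIAL_CLUSTER, and ETCD_INITIAL_CLUSTER_STATE values; these are the values you transfer into the etcd.conf file in the next step.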
Update the sample ${PREFIX}/etcd.conf file:

Replace the following values with the values generated in the previous step:

- ETCD_NAME
- ETCD_INITIAL_CLUSTER
- ETCD_INITIAL_CLUSTER_STATE

Modify the following variables with the new host IP from the output of the previous step. You can use ${NEW_ETCD_IP} as the value:

ETCD_LISTEN_PEER_URLS
ETCD_LISTEN_CLIENT_URLS
ETCD_INITIAL_ADVERTISE_PEER_URLS
ETCD_ADVERTISE_CLIENT_URLS

If you previously used the member system as an etcd node, you must overwrite the current values in the /etc/etcd/etcd.conf file.

Check the file for syntax errors or missing IP addresses, otherwise the etcd service might fail:

# vi ${PREFIX}/etcd.conf
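A sketch of the resulting ${PREFIX}/etcd.conf fragment, assuming a hypothetical new member named etcd0.example.com at 10.3.9.222; your ETCD_INITIAL_CLUSTER value lists every member, as printed by member add:

```ini
# Values taken from the etcdctl member add output:
ETCD_NAME=etcd0.example.com
ETCD_INITIAL_CLUSTER=master-0.example.com=https://172.18.1.18:2380,etcd0.example.com=https://10.3.9.222:2380
ETCD_INITIAL_CLUSTER_STATE=existing

# Values derived from the new host IP (${NEW_ETCD_IP}):
ETCD_LISTEN_PEER_URLS=https://10.3.9.222:2380
ETCD_LISTEN_CLIENT_URLS=https://10.3.9.222:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://10.3.9.222:2380
ETCD_ADVERTISE_CLIENT_URLS=https://10.3.9.222:2379
```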
On the node that hosts the installation files, update the [etcd] hosts group in the /etc/ansible/hosts inventory file. Remove the old etcd hosts and add the new ones.

Create a tgz file that contains the certificates, the sample configuration file, and the ca, and copy it to the new host:

# tar -czvf /etc/etcd/generated_certs/${CN}.tgz -C ${PREFIX} .
# scp /etc/etcd/generated_certs/${CN}.tgz ${CN}:/tmp/
Modify the new etcd host

Install iptables-services to provide iptables utilities to open the required ports for etcd:

# yum install -y iptables-services

Create the OS_FIREWALL_ALLOW firewall rules to allow etcd to communicate:

- Port 2379/tcp for clients
- Port 2380/tcp for peer communication

Note: In this example, a new chain OS_FIREWALL_ALLOW is created, which is the standard naming that the OpenShift Container Platform installer uses for firewall rules.

Warning: If the environment is hosted in an IaaS environment, modify the security groups for the instance to allow incoming traffic to those ports as well.
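A sketch of the firewall rules for the two ports, assuming the default filter table and the OS_FIREWALL_ALLOW chain convention noted above:

```
# systemctl enable iptables.service --now
# iptables -N OS_FIREWALL_ALLOW
# iptables -t filter -I INPUT -j OS_FIREWALL_ALLOW
# iptables -A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 2379 -j ACCEPT
# iptables -A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 2380 -j ACCEPT
# iptables-save | tee /etc/sysconfig/iptables
```

Saving the rules to /etc/sysconfig/iptables makes them persist across reboots of the new host.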
Install etcd:

# yum install -y etcd

Ensure version etcd-2.3.7-4.el7.x86_64 or greater is installed.

Ensure the etcd service is not running by removing the etcd pod definition:

# mkdir -p /etc/origin/node/pods-stopped
# mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/

Remove any etcd configuration and data:

# rm -Rf /etc/etcd/*
# rm -Rf /var/lib/etcd/*

Extract the certificates and configuration files:

# tar xzvf /tmp/etcd0.example.com.tgz -C /etc/etcd/

Start etcd on the new host:

# systemctl enable etcd --now
Verify that the host is part of the cluster and the current cluster health:

If you use the v2 etcd api, run the following command.

If you use the v3 etcd api, run the following command.
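Sketches of the two health checks, assuming the client certificate paths used earlier in this procedure; the endpoint list is a placeholder for your cluster. For the v2 API:

```
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://master-0.example.com:2379,https://master-1.example.com:2379,https://etcd0.example.com:2379" \
    cluster-health
```

For the v3 API, which uses different flag names:

```
# ETCDCTL_API=3 etcdctl --cert=/etc/etcd/peer.crt \
    --key=/etc/etcd/peer.key \
    --cacert=/etc/etcd/ca.crt \
    --endpoints="https://master-0.example.com:2379,https://master-1.example.com:2379,https://etcd0.example.com:2379" \
    endpoint health
```

A healthy cluster reports every listed endpoint, including the new member, as healthy.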
Modify each OpenShift Container Platform master

Modify the master configuration in the etcdClientInfo section of the /etc/origin/master/master-config.yaml file on every master. Add the new etcd host to the list of the etcd servers OpenShift Container Platform uses to store the data, and remove any failed etcd hosts.

Restart the master API service on every master:

# master-restart api
# master-restart controllers

Warning: The number of etcd nodes must be odd, so you must add at least two hosts.
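A sketch of the etcdClientInfo section after the change, assuming the example host names used in this chapter; the certificate file names are placeholders for the client credentials configured on your masters:

```yaml
etcdClientInfo:
  ca: master.etcd-ca.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
    - https://master-0.example.com:2379
    - https://master-1.example.com:2379
    - https://master-2.example.com:2379
    - https://etcd0.example.com:2379
```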
If you use Flannel, modify the flanneld service configuration located at /etc/sysconfig/flanneld on every OpenShift Container Platform host to include the new etcd host:

FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379,https://etcd0.example.com:2379

Restart the flanneld service:

# systemctl restart flanneld.service