This documentation is for a release that is no longer maintained
See documentation for the latest supported version 3 or the latest supported version 4.
Chapter 4. Replacing a master host
You can replace a failed master host.
First, remove the failed master host from your cluster, and then add a replacement master host. If the failed master host ran etcd, scale up etcd by adding etcd to the new master host.
You must complete all sections of this topic.
4.1. Deprecating a master host
Master hosts run important services, such as the OpenShift Container Platform API and controllers services. In order to deprecate a master host, these services must be stopped.
The OpenShift Container Platform API service is an active/active service, so stopping the service does not affect the environment as long as the requests are sent to a separate master server. However, the OpenShift Container Platform controllers service is an active/passive service, where the services leverage etcd to decide the active master.
Deprecating a master host in a multi-master architecture includes removing the master from the load balancer pool so that new connections do not attempt to use that master. This process depends heavily on the load balancer used. The steps below show the details of removing the master from haproxy. If OpenShift Container Platform is running on a cloud provider, or using an F5 appliance, see the specific product documents to remove the master from rotation.
Procedure
- Remove the backend section in the /etc/haproxy/haproxy.cfg configuration file. For example, if deprecating a master named master-0.example.com using haproxy, ensure that the host name no longer appears in the backend section.
- Restart the haproxy service:
  $ sudo systemctl restart haproxy
- Once the master is removed from the load balancer, disable the API and controller services:
  $ sudo systemctl disable --now atomic-openshift-master-api
  $ sudo systemctl disable --now atomic-openshift-master-controllers
- Because the master host is an unschedulable OpenShift Container Platform node, follow the steps in the Deprecating a node host section.
- Remove the master host from the [masters] and [nodes] groups in the /etc/ansible/hosts Ansible inventory file to avoid issues if running any Ansible tasks using that inventory file.
  Warning: Deprecating the first master host listed in the Ansible inventory file requires extra precautions. The /etc/origin/master/ca.serial.txt file is generated on only the first master listed in the Ansible host inventory. If you deprecate the first master host, copy the /etc/origin/master/ca.serial.txt file to the rest of the master hosts before the process.
- The kubernetes service includes the master host IPs as endpoints. To verify that the master has been properly deprecated, review the kubernetes service output and confirm that the deprecated master has been removed.
After the master has been successfully deprecated, the host where the master was previously running can be safely deleted.
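As a rough illustration of the first step in this procedure, the sketch below removes one master's server line from a scratch copy of an haproxy configuration. The backend name, the IP addresses, and the remove_master helper are assumptions made for this example and do not come from the procedure itself; review the edited file before restarting haproxy.

```shell
# Hypothetical helper: delete the "server <host> ..." line for one master
# from an haproxy configuration file, keeping a .bak copy of the original.
remove_master() {
    cfg=$1
    host=$2
    # Escape dots so that sed matches the host name literally.
    pattern=$(printf '%s' "$host" | sed 's/\./\\./g')
    sed -i.bak "/^[[:space:]]*server[[:space:]][[:space:]]*${pattern}[[:space:]]/d" "$cfg"
}

# Sample backend section (assumed layout) written to a scratch file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
backend atomic-openshift-api
    balance source
    mode tcp
    server master-0.example.com 192.168.55.11:8443 check
    server master-1.example.com 192.168.55.12:8443 check
    server master-2.example.com 192.168.55.13:8443 check
EOF

remove_master "$cfg" master-0.example.com
cat "$cfg"
```

On a real load balancer the first argument would be /etc/haproxy/haproxy.cfg, and the restart command from the procedure would follow once the file looks correct.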
4.2. Adding hosts
You can add new hosts to your cluster by running the scaleup.yml playbook. This playbook queries the master, generates and distributes new certificates for the new hosts, and then runs the configuration playbooks on only the new hosts. Before running the scaleup.yml playbook, complete all prerequisite host preparation steps.
The scaleup.yml playbook configures only the new host. It does not update NO_PROXY in master services, and it does not restart master services.
You must have an existing inventory file, for example /etc/ansible/hosts, that is representative of your current cluster configuration in order to run the scaleup.yml playbook. If you previously used the atomic-openshift-installer command to run your installation, you can check ~/.config/openshift/hosts for the last inventory file that the installer generated and use that file as your inventory file. You can modify this file as required. You must then specify the file location with -i when you run the ansible-playbook command.
See the cluster limits section for the recommended maximum number of nodes.
Procedure
- Ensure you have the latest playbooks by updating the atomic-openshift-utils package:
  # yum update atomic-openshift-utils
- Edit your /etc/ansible/hosts file and add new_<host_type> to the [OSEv3:children] section. For example, to add a new node host, add new_nodes:
  [OSEv3:children]
  masters
  nodes
  new_nodes
  To add new master hosts, add new_masters.
- Create a [new_<host_type>] section to specify host information for the new hosts. Format this section like an existing section. See Configuring Host Variables for more options.
  When adding new masters, add hosts to both the [new_masters] section and the [new_nodes] section to ensure that the new master host is part of the OpenShift SDN.
  Important: If you label a master host with the region=infra label and have no other dedicated infrastructure nodes, you must also explicitly mark the host as schedulable by adding openshift_schedulable=true to the entry. Otherwise, the registry and router pods cannot be placed anywhere.
- Run the scaleup.yml playbook. If your inventory file is located somewhere other than the default of /etc/ansible/hosts, specify the location with the -i option.
  For additional nodes:
  # ansible-playbook [-i /path/to/file] \
      /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-node/scaleup.yml
  For additional masters:
  # ansible-playbook [-i /path/to/file] \
      /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-master/scaleup.yml
- After the playbook runs, verify the installation.
- Move any hosts that you defined in the [new_<host_type>] section to their appropriate section. By moving these hosts, subsequent playbook runs that use this inventory file treat the nodes correctly. You can keep the empty [new_<host_type>] section. For example, when adding new nodes, move the hosts from the [new_nodes] section to the [nodes] section.
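To make the inventory layout described in this procedure concrete, the sketch below writes a small sample inventory to a scratch file and lists the hosts in one group. All host names are invented for the example, and hosts_in_group is a hypothetical helper rather than part of Ansible or the OpenShift playbooks; it only handles simple INI-style group sections.

```shell
# Hypothetical helper: list the hosts in one group of an INI-style inventory.
# Usage: hosts_in_group <inventory-file> <group-name>
hosts_in_group() {
    awk -v group="[$2]" '
        /^\[/ { in_group = ($0 == group); next }    # section header: is it the wanted group?
        in_group && NF && $0 !~ /^#/ { print $1 }   # first field of each host line
    ' "$1"
}

# Sample inventory (assumed host names) before running the node scaleup playbook.
inventory=$(mktemp)
cat > "$inventory" <<'EOF'
[OSEv3:children]
masters
nodes
new_nodes

[masters]
master1.example.com

[nodes]
master1.example.com
node1.example.com

[new_nodes]
node2.example.com
EOF

hosts_in_group "$inventory" new_nodes
```

Running the same helper against [nodes] after the final step is one way to confirm that the new hosts were moved out of [new_nodes].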
4.3. Scaling etcd
You can scale the etcd cluster vertically by adding more resources to the etcd hosts or horizontally by adding more etcd hosts.
Due to the voting system etcd uses, the cluster must always contain an odd number of members.
An odd number of etcd hosts gives the best fault tolerance for the number of hosts used. Adding a host to an even-sized cluster to make it odd does not change the number needed for a quorum but increases the tolerance for failure. For example, with a cluster of three members, quorum is two, which leaves a failure tolerance of one. The cluster therefore continues to operate as long as two of the members are healthy.
Having an in-production cluster of three etcd hosts is recommended.
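The quorum arithmetic above can be checked with a few lines of shell. This is only an illustrative sketch: the function names are made up, and the formulas assume the usual majority rule, where quorum is more than half of the members.

```shell
# Quorum is a strict majority of members; tolerance is how many can fail.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for members in 1 2 3 4 5; do
    echo "members=$members quorum=$(quorum "$members") tolerance=$(tolerance "$members")"
done
```

With three members the quorum is two and the tolerance is one. A fourth member raises the quorum to three without raising the tolerance, which is why even-sized clusters are avoided.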
The new host requires a fresh Red Hat Enterprise Linux version 7 dedicated host. The etcd storage should be located on an SSD disk to achieve maximum performance and on a dedicated disk mounted in /var/lib/etcd.
Prerequisites
- Before you add a new etcd host, perform a backup of both etcd configuration and data to prevent data loss.
- Check the current etcd cluster status to avoid adding new hosts to an unhealthy cluster. The command to run depends on whether you use the v2 or the v3 etcd API.
- Before running the scaleup playbook, ensure the new host is registered to the proper Red Hat software channels. etcd is hosted in the rhel-7-server-extras-rpms software channel.
- Upgrade etcd and iptables on the current etcd nodes:
  # yum update etcd iptables-services
- Back up the /etc/etcd configuration for the etcd hosts.
- If the new etcd members will also be OpenShift Container Platform nodes, add the desired number of hosts to the cluster.
- The rest of this procedure assumes you added one host, but if you add multiple hosts, perform all steps on each host.
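The health check named in the prerequisites could look roughly like the sketch below, which wraps one etcdctl call per API version in a shell function. The endpoint list and the function names are assumptions for illustration; the certificate paths follow the ones used elsewhere in this chapter and may differ in your environment.

```shell
# Assumed values: adjust the endpoint list and certificate paths to your cluster.
ETCD_ENDPOINTS="https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379"

# v2 API: cluster-wide health summary.
etcd_health_v2() {
    etcdctl --cert-file=/etc/etcd/peer.crt \
            --key-file=/etc/etcd/peer.key \
            --ca-file=/etc/etcd/ca.crt \
            --peers="$ETCD_ENDPOINTS" \
            cluster-health
}

# v3 API: per-endpoint health; the v3 client uses different flag names.
etcd_health_v3() {
    ETCDCTL_API=3 etcdctl --cert=/etc/etcd/peer.crt \
            --key=/etc/etcd/peer.key \
            --cacert=/etc/etcd/ca.crt \
            --endpoints="$ETCD_ENDPOINTS" \
            endpoint health
}
```

Call etcd_health_v2 or etcd_health_v3 as root on an existing etcd host, and proceed only if every member reports healthy.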
4.3.1. Adding a new etcd host using Ansible
Procedure
- In the Ansible inventory file, create a new group named [new_etcd] and add the new host. Then, add the new_etcd group as a child of the [OSEv3] group.
- From the host that installed OpenShift Container Platform and hosts the Ansible inventory file, run the etcd scaleup playbook:
  $ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-etcd/scaleup.yml
- After the playbook runs, modify the inventory file to reflect the current status by moving the new etcd host from the [new_etcd] group to the [etcd] group.
- If you use Flannel, modify the flanneld service configuration on every OpenShift Container Platform host, located at /etc/sysconfig/flanneld, to include the new etcd host:
  FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379,https://etcd0.example.com:2379
- Restart the flanneld service:
  # systemctl restart flanneld.service
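For the Flannel step, a scripted edit may be less error-prone than changing the file by hand on every host. The sketch below appends one endpoint to FLANNEL_ETCD_ENDPOINTS in a scratch copy of the file. The add_flannel_endpoint helper is hypothetical, and it assumes the value is written unquoted on a single line, as in the example in this section.

```shell
# Hypothetical helper: append one endpoint to FLANNEL_ETCD_ENDPOINTS unless
# it is already listed. Edits the file in place and keeps a .bak copy.
add_flannel_endpoint() {
    file=$1
    endpoint=$2
    if grep '^FLANNEL_ETCD_ENDPOINTS=' "$file" | grep -qF -- "$endpoint"; then
        return 0    # already present, nothing to do
    fi
    sed -i.bak "s|^\(FLANNEL_ETCD_ENDPOINTS=.*\)\$|\1,${endpoint}|" "$file"
}

# Scratch copy standing in for /etc/sysconfig/flanneld.
conf=$(mktemp)
echo 'FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379' > "$conf"

add_flannel_endpoint "$conf" https://etcd0.example.com:2379
cat "$conf"
```

The helper is safe to run twice because it checks for the endpoint first. The flanneld restart from the procedure is still required afterwards.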
4.3.2. Manually adding a new etcd host
Procedure
Modify the current etcd cluster
- To create the etcd certificates, run the openssl command, replacing the values with those from your environment.
  - Create some environment variables.
    Note: The custom openssl extensions used as etcd_v3_ca_* include the $SAN environment variable as subjectAltName. See /etc/etcd/ca/openssl.cnf for more information.
  - Create the directory to store the configuration and certificates:
    # mkdir -p ${PREFIX}
  - Create the server certificate request and sign it (server.csr and server.crt).
  - Create the peer certificate request and sign it (peer.csr and peer.crt).
- Copy the current etcd configuration and ca.crt files from the current node as examples to modify later:
  # cp /etc/etcd/etcd.conf ${PREFIX}
  # cp /etc/etcd/ca.crt ${PREFIX}
- While still on the surviving etcd host, add the new host to the cluster. To add additional etcd members to the cluster, you must first adjust the default localhost peer in the peerURLs value for the first member:
  - Get the member ID for the first member using the member list command. Ensure that you specify the URLs of only active etcd members in the --peers parameter value.
    # etcdctl --cert-file=/etc/etcd/peer.crt \
        --key-file=/etc/etcd/peer.key \
        --ca-file=/etc/etcd/ca.crt \
        --peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
        member list
  - Obtain the IP address where etcd listens for cluster peers:
    $ ss -l4n | grep 2380
  - Update the value of peerURLs using the etcdctl member update command by passing the member ID and IP address obtained from the previous steps:
    # etcdctl --cert-file=/etc/etcd/peer.crt \
        --key-file=/etc/etcd/peer.key \
        --ca-file=/etc/etcd/ca.crt \
        --peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
        member update 511b7fb6cc0001 https://172.18.1.18:2380
  - Re-run the member list command and ensure the peer URLs no longer include localhost.
- Add the new host to the etcd cluster. Note that the new host is not yet configured, so the status stays as unstarted until you configure the new host.
  Warning: You must add each member and bring it online one at a time. When you add each additional member to the cluster, you must adjust the peerURLs list for the current peers. The peerURLs list grows by one for each member added. The etcdctl member add command outputs the values that you must set in the etcd.conf file as you add each member, as described in the following instructions.
  In the etcdctl member add command, the member name (for example, 10.3.9.222) is a label for the etcd member. You can specify the host name, IP address, or a simple name.
- Update the sample ${PREFIX}/etcd.conf file.
  - Replace the following values with the values generated in the previous step:
    ETCD_NAME
    ETCD_INITIAL_CLUSTER
    ETCD_INITIAL_CLUSTER_STATE
  - Modify the following variables with the new host IP from the output of the previous step. You can use ${NEW_ETCD_IP} as the value.
    ETCD_LISTEN_PEER_URLS
    ETCD_LISTEN_CLIENT_URLS
    ETCD_INITIAL_ADVERTISE_PEER_URLS
    ETCD_ADVERTISE_CLIENT_URLS
  - If you previously used the member system as an etcd node, you must overwrite the current values in the /etc/etcd/etcd.conf file.
  - Check the file for syntax errors or missing IP addresses, otherwise the etcd service might fail:
    # vi ${PREFIX}/etcd.conf
- On the node that hosts the installation files, update the [etcd] hosts group in the /etc/ansible/hosts inventory file. Remove the old etcd hosts and add the new ones.
- Create a tgz file that contains the certificates, the sample configuration file, and the ca and copy it to the new host:
  # tar -czvf /etc/etcd/generated_certs/${CN}.tgz -C ${PREFIX} .
  # scp /etc/etcd/generated_certs/${CN}.tgz ${CN}:/tmp/
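The steps above reference ${PREFIX}, ${CN}, ${NEW_ETCD_IP}, and $SAN without showing how they are set. The sketch below is one plausible set of definitions, not the original listing: the host name matches the etcd0.example.com archive name used later in this procedure, while the IP address and the exact directory layout are assumptions.

```shell
# All values below are assumptions for illustration; substitute your own.
export NEW_ETCD_HOSTNAME="etcd0.example.com"   # host name of the new member
export NEW_ETCD_IP="192.168.55.21"             # its IP address (made up here)

export CN=$NEW_ETCD_HOSTNAME                   # certificate common name
export SAN="IP:${NEW_ETCD_IP}"                 # subjectAltName value read by openssl.cnf
export PREFIX="/etc/etcd/generated_certs/etcd-$CN/"   # working directory for generated files
```

Exporting the variables matters because the openssl configuration reads SAN from the environment of the openssl process.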
Modify the new etcd host
- Install iptables-services to provide iptables utilities to open the required ports for etcd:
  # yum install -y iptables-services
- Create the OS_FIREWALL_ALLOW firewall rules to allow etcd to communicate:
  - Port 2379/tcp for clients
  - Port 2380/tcp for peer communication
  Note: This step creates a new chain named OS_FIREWALL_ALLOW, which is the standard naming the OpenShift Container Platform installer uses for firewall rules.
  Warning: If the environment is hosted in an IaaS environment, modify the security groups for the instance to allow incoming traffic to those ports as well.
- Install etcd:
  # yum install -y etcd
  Ensure version etcd-2.3.7-4.el7.x86_64 or greater is installed.
- Ensure the etcd service is not running:
  # systemctl disable etcd --now
- Remove any etcd configuration and data:
  # rm -Rf /etc/etcd/*
  # rm -Rf /var/lib/etcd/*
- Extract the certificates and configuration files:
  # tar xzvf /tmp/etcd0.example.com.tgz -C /etc/etcd/
- Modify the file ownership permissions:
  # chown -R etcd:etcd /etc/etcd/*
  # chown -R etcd:etcd /var/lib/etcd/
- Start etcd on the new host:
  # systemctl enable etcd --now
- Verify that the host is part of the cluster and check the current cluster health. The command to run depends on whether you use the v2 or the v3 etcd API.
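The firewall step in this procedure could be sketched as follows. The open_etcd_ports function name is hypothetical, and the rule layout is an assumption based on the chain name and ports listed above rather than the original listing; run it as root on the new etcd host only after reviewing it.

```shell
# Sketch only: open the etcd client and peer ports in a dedicated chain.
open_etcd_ports() {
    iptables -N OS_FIREWALL_ALLOW                      # create the chain
    iptables -t filter -I INPUT -j OS_FIREWALL_ALLOW   # send INPUT traffic through it
    for port in 2379 2380; do                          # 2379: clients, 2380: peers
        iptables -A OS_FIREWALL_ALLOW -p tcp -m state --state NEW \
            -m tcp --dport "$port" -j ACCEPT
    done
}
```

These rules last only until the next restart of the iptables service. With iptables-services, saving the output of iptables-save to /etc/sysconfig/iptables is the usual way to make them persistent.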
Modify each OpenShift Container Platform master
- Modify the master configuration in the etcdClientInfo section of the /etc/origin/master/master-config.yaml file on every master. Add the new etcd host to the list of the etcd servers OpenShift Container Platform uses to store the data, and remove any failed etcd hosts.
- Restart the master API service.
  On every master:
  # systemctl restart atomic-openshift-master-api
  Or, on a single master cluster installation:
  # systemctl restart atomic-openshift-master
  Warning: The number of etcd nodes must be odd, so you must add at least two hosts.
- If you use Flannel, modify the flanneld service configuration located at /etc/sysconfig/flanneld on every OpenShift Container Platform host to include the new etcd host:
  FLANNEL_ETCD_ENDPOINTS=https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379,https://etcd0.example.com:2379
- Restart the flanneld service:
  # systemctl restart flanneld.service