홈
제품
OpenShift Container Platform
4.11
백업 및 복원
5.2. 비정상적인 etcd 멤버 교체

5.2. 비정상적인 etcd 멤버 교체

이 문서는 비정상적인 단일 etcd 멤버를 교체하는 프로세스를 설명합니다.

이 프로세스는 시스템이 실행 중이 아니거나 노드가 준비되지 않았거나 etcd pod가 크래시 루프 상태에있는 등 etcd 멤버의 비정상적인 상태에 따라 다릅니다.

참고

대부분의 컨트롤 플레인 호스트가 손실된 경우 재해 복구 절차에 따라 이 프로세스 대신 이전 클러스터 상태로 복원 합니다.

교체된 멤버 에서 컨트롤 플레인 인증서가 유효하지 않은 경우 이 절차 대신 만료된 컨트롤 플레인 인증서 복구 절차를 따라야 합니다.

컨트롤 플레인 노드가 유실되고 새 노드가 생성되는 경우 etcd 클러스터 Operator는 새 TLS 인증서 생성 및 노드를 etcd 멤버로 추가하는 프로세스를 진행합니다.

5.2.1. 사전 요구 사항
링크 복사

비정상적인 etcd 멤버를 교체하기 전에 etcd 백업을 수행하십시오.

5.2.2. 비정상 etcd 멤버 식별
링크 복사

클러스터에 비정상적인 etcd 멤버가 있는지 여부를 확인할 수 있습니다.

사전 요구 사항

cluster-admin 역할의 사용자로 클러스터에 액세스할 수 있어야 합니다.

프로세스

다음 명령을 사용하여 EtcdMembersAvailable 상태의 상태 조건을 확인하십시오.

oc get etcd -o=jsonpath='{range .items[0].status.conditions[?(@.type=="EtcdMembersAvailable")]}{.message}{"\n"}'

$ oc get etcd -o=jsonpath='{range .items[0].status.conditions[?(@.type=="EtcdMembersAvailable")]}{.message}{"\n"}'

Copy to Clipboard

Toggle word wrap

출력을 확인합니다.
```
2 of 3 members are available, ip-10-0-131-183.ec2.internal is unhealthy
```
```
2 of 3 members are available, ip-10-0-131-183.ec2.internal is unhealthy
```
Copy to Clipboard Toggle word wrap
이 출력 예는 ip-10-0-131-183.ec2.internal etcd 멤버가 비정상임을 보여줍니다.

5.2.3. 비정상적인 etcd 멤버의 상태 확인
링크 복사

비정상적인 etcd 멤버를 교체하는 프로세스는 etcd가 다음의 어떤 상태에 있는지에 따라 달라집니다.

컴퓨터가 실행 중이 아니거나 노드가 준비되지 않았습니다.
etcd pod가 크래시 루프 상태에 있습니다.

다음 프로세스에서는 etcd 멤버가 어떤 상태에 있는지를 확인합니다. 이를 통해 비정상 etcd 멤버를 대체하기 위해 수행해야하는 단계를 확인할 수 있습니다.

참고

시스템이 실행되고 있지 않거나 노드가 준비되지 않았지만 곧 정상 상태로 돌아올 것으로 예상되는 경우 etcd 멤버를 교체하기 위한 절차를 수행할 필요가 없습니다. etcd 클러스터 Operator는 머신 또는 노드가 정상 상태로 돌아 오면 자동으로 동기화됩니다.

사전 요구 사항

cluster-admin 역할의 사용자로 클러스터에 액세스할 수 있어야 합니다.
비정상적인 etcd 멤버를 식별하고 있습니다.

프로세스

시스템이 실행되고 있지 않은지를 확인합니다.
```
oc get machines -A -ojsonpath='{range .items[*]}{@.status.nodeRef.name}{"\t"}{@.status.providerStatus.instanceState}{"\n"}' | grep -v running
```
```
$ oc get machines -A -ojsonpath='{range .items[*]}{@.status.nodeRef.name}{"\t"}{@.status.providerStatus.instanceState}{"\n"}' | grep -v running
```
Copy to Clipboard Toggle word wrap
출력 예
```
ip-10-0-131-183.ec2.internal  stopped 
```
```
ip-10-0-131-183.ec2.internal  stopped 
```
1
Copy to Clipboard Toggle word wrap
1
이 출력은 노드와 노드 시스템의 상태를 나열합니다. 상태가 running이 아닌 경우 시스템은 실행되지 않습니다.
시스템이 실행되고 있지 않은 경우, 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 etcd 멤버 교체 프로세스를 수행하십시오.
노드가 준비되지 않았는지 확인합니다.
다음 조건 중 하나에 해당하면 노드가 준비되지 않은 것입니다.
- 시스템이 실행중인 경우 노드에 액세스할 수 있는지 확인하십시오.
  $ oc get nodes -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{"\t"}{range .spec.taints[*]}{.key}{" "}' | grep unreachable
  Copy to Clipboard Toggle word wrap
  출력 예
  ip-10-0-131-183.ec2.internal node-role.kubernetes.io/master node.kubernetes.io/unreachable node.kubernetes.io/unreachable
  1
  
  Copy to Clipboard Toggle word wrap
  1
  unreachable 상태의 노드가 나열되면 노드가 준비되지 않은 것 입니다.
- 노드에 여전히 액세스할 수 있는 경우 노드가 NotReady로 나열되어 있는지 확인하십시오.
  $ oc get nodes -l node-role.kubernetes.io/master | grep "NotReady"
  Copy to Clipboard Toggle word wrap
  출력 예
  ip-10-0-131-183.ec2.internal NotReady master 122m v1.24.0
  1
  
  Copy to Clipboard Toggle word wrap
  1
  노드가 NotReady로 표시되면 노드가 준비되지 않은 것입니다.
노드가 준비되지 않은 경우 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 etcd 멤버 교체 프로세스를 수행하십시오.

etcd pod가 크래시 루프 상태인지 확인합니다.

시스템이 실행되고 있고 노드가 준비된 경우 etcd pod가 크래시 루프 상태인지 확인하십시오.

모든 컨트롤 플레인 노드가 Ready 로 나열되어 있는지 확인합니다.

oc get nodes -l node-role.kubernetes.io/master

$ oc get nodes -l node-role.kubernetes.io/master

Copy to Clipboard

Toggle word wrap

출력 예

NAME                           STATUS   ROLES    AGE     VERSION
ip-10-0-131-183.ec2.internal   Ready    master   6h13m   v1.24.0
ip-10-0-164-97.ec2.internal    Ready    master   6h13m   v1.24.0
ip-10-0-154-204.ec2.internal   Ready    master   6h13m   v1.24.0

NAME                           STATUS   ROLES    AGE     VERSION
ip-10-0-131-183.ec2.internal   Ready    master   6h13m   v1.24.0
ip-10-0-164-97.ec2.internal    Ready    master   6h13m   v1.24.0
ip-10-0-154-204.ec2.internal   Ready    master   6h13m   v1.24.0

Copy to Clipboard

Toggle word wrap

etcd pod의 상태가 Error 또는 CrashloopBackoff인지 확인하십시오.

oc -n openshift-etcd get pods -l k8s-app=etcd

$ oc -n openshift-etcd get pods -l k8s-app=etcd

Copy to Clipboard

Toggle word wrap

출력 예

etcd-ip-10-0-131-183.ec2.internal                2/3     Error       7          6h9m 
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          6h6m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          6h6m

etcd-ip-10-0-131-183.ec2.internal                2/3     Error       7          6h9m


etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          6h6m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          6h6m

Copy to Clipboard

Toggle word wrap

1: 이 pod의 상태는 Error이므로 etcd pod는 크래시 루프 상태입니다.

etcd pod가 크래시 루프 상태인 경우etcd pod가 크래시 루프 상태인 비정상적인 etcd 멤버 교체 프로세스를 수행하십시오.

5.2.4. 비정상적인 etcd 멤버 교체
링크 복사

비정상적인 etcd 멤버의 상태에 따라 다음 절차 중 하나를 사용합니다.

5.2.4.1. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 etcd 멤버 교체
링크 복사

다음에서는 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 경우의 비정상적인 etcd 멤버를 교체하는 프로세스에 대해 자세히 설명합니다.

사전 요구 사항

비정상적인 etcd 멤버를 식별했습니다.
시스템이 실행되고 있지 않거나 노드가 준비되지 않았음을 확인했습니다.
중요
다른 컨트롤 플레인 노드의 전원이 꺼진 경우 기다려야 합니다. 비정상 etcd 멤버 교체가 완료될 때까지 컨트롤 플레인 노드의 전원을 꺼야 합니다.
cluster-admin 역할의 사용자로 클러스터에 액세스할 수 있어야 합니다.
etcd 백업이 수행되었습니다.
중요
문제가 발생할 경우 클러스터를 복원할 수 있도록 이 프로세스를 수행하기 전에 etcd 백업을 수행해야합니다.

프로세스

비정상적인 멤버를 제거합니다.

영향을 받는 노드에 없는 pod를 선택합니다.

클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc -n openshift-etcd get pods -l k8s-app=etcd

$ oc -n openshift-etcd get pods -l k8s-app=etcd

Copy to Clipboard

Toggle word wrap

출력 예

etcd-ip-10-0-131-183.ec2.internal                3/3     Running     0          123m
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          123m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          124m

etcd-ip-10-0-131-183.ec2.internal                3/3     Running     0          123m
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          123m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          124m

Copy to Clipboard

Toggle word wrap

실행중인 etcd 컨테이너에 연결하고 영향을 받는 노드에 없는 pod 이름을 전달합니다.
클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
```
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
Copy to Clipboard Toggle word wrap

멤버 목록을 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 6fc1e7c9db35841d | started | ip-10-0-131-183.ec2.internal | https://10.0.131.183:2380 | https://10.0.131.183:2379 |
| 757b6793e2408b6c | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 6fc1e7c9db35841d | started | ip-10-0-131-183.ec2.internal | https://10.0.131.183:2380 | https://10.0.131.183:2379 |
| 757b6793e2408b6c | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

Copy to Clipboard

Toggle word wrap

이러한 값은 프로세스의 뒷부분에서 필요하므로 비정상 etcd 멤버의 ID와 이름을 기록해 두십시오. $ etcdctl endpoint health 명령은 교체 절차가 완료되고 새 멤버가 추가될 때까지 제거된 멤버를 나열합니다.

etcdctl member remove 명령에 ID를 지정하여 비정상적인 etcd 멤버를 제거합니다.
```
etcdctl member remove 6fc1e7c9db35841d
```
```
sh-4.2# etcdctl member remove 6fc1e7c9db35841d
```
Copy to Clipboard Toggle word wrap
출력 예
```
Member 6fc1e7c9db35841d removed from cluster ead669ce1fbfb346
```
```
Member 6fc1e7c9db35841d removed from cluster ead669ce1fbfb346
```
Copy to Clipboard Toggle word wrap

멤버 목록을 다시 표시하고 멤버가 제거되었는지 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 757b6793e2408b6c | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 757b6793e2408b6c | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

Copy to Clipboard

Toggle word wrap

이제 노드 쉘을 종료할 수 있습니다.

다음 명령을 입력하여 쿼럼 보호기를 끄십시오.
```
oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'
```
```
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'
```
Copy to Clipboard Toggle word wrap
이 명령을 사용하면 보안을 다시 생성하고 정적 Pod를 롤아웃할 수 있습니다.
중요
쿼럼 가드를 끄면 나머지 etcd 인스턴스가 재부팅되는 동안 구성 변경 사항을 반영하는 동안 클러스터에 연결할 수 없을 수 있습니다.
참고
두 멤버로 실행하는 경우 etcd는 추가 멤버 실패를 허용할 수 없습니다. 나머지 멤버 중 하나를 다시 시작하면 쿼럼이 중단되고 클러스터의 다운 타임이 발생합니다. 쿼럼 가드에서는 다운타임을 일으킬 수 있는 구성 변경으로 인해 etcd가 다시 시작되지 않도록 보호하므로 이 절차를 완료하려면 비활성화해야 합니다.

삭제된 비정상 etcd 멤버의 이전 암호를 제거합니다.

삭제된 비정상 etcd 멤버의 시크릿(secrets)을 나열합니다.

oc get secrets -n openshift-etcd | grep ip-10-0-131-183.ec2.internal

$ oc get secrets -n openshift-etcd | grep ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

1: 이 프로세스의 앞부분에서 기록한 비정상 etcd 멤버의 이름을 전달합니다.

다음 출력에 표시된대로 피어, 서빙 및 메트릭 시크릿이 있습니다.

출력 예

etcd-peer-ip-10-0-131-183.ec2.internal              kubernetes.io/tls                     2      47m
etcd-serving-ip-10-0-131-183.ec2.internal           kubernetes.io/tls                     2      47m
etcd-serving-metrics-ip-10-0-131-183.ec2.internal   kubernetes.io/tls                     2      47m

etcd-peer-ip-10-0-131-183.ec2.internal              kubernetes.io/tls                     2      47m
etcd-serving-ip-10-0-131-183.ec2.internal           kubernetes.io/tls                     2      47m
etcd-serving-metrics-ip-10-0-131-183.ec2.internal   kubernetes.io/tls                     2      47m

Copy to Clipboard

Toggle word wrap

제거된 비정상 etcd 멤버의 시크릿을 삭제합니다.

피어 시크릿을 삭제합니다.

oc delete secret -n openshift-etcd etcd-peer-ip-10-0-131-183.ec2.internal

$ oc delete secret -n openshift-etcd etcd-peer-ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

서빙 시크릿을 삭제합니다.

oc delete secret -n openshift-etcd etcd-serving-ip-10-0-131-183.ec2.internal

$ oc delete secret -n openshift-etcd etcd-serving-ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

메트릭 시크릿을 삭제합니다.

oc delete secret -n openshift-etcd etcd-serving-metrics-ip-10-0-131-183.ec2.internal

$ oc delete secret -n openshift-etcd etcd-serving-metrics-ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

컨트롤 플레인 시스템을 삭제하고 다시 생성합니다. 이 시스템을 다시 만든 후에는 새 버전이 강제 실행되고 etcd는 자동으로 확장됩니다.

설치 프로그램에서 제공한 인프라를 실행 중이거나 Machine API를 사용하여 컴퓨터를 만든 경우 다음 단계를 수행합니다. 그렇지 않으면 원래 마스터를 만들 때 사용한 방법과 동일한 방법을 사용하여 새 마스터를 작성해야합니다.

비정상 멤버의 컴퓨터를 가져옵니다.

클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc get machines -n openshift-machine-api -o wide

$ oc get machines -n openshift-machine-api -o wide

Copy to Clipboard

Toggle word wrap

출력 예

NAME                                        PHASE     TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-0                  Running   m4.xlarge   us-east-1   us-east-1a   3h37m   ip-10-0-131-183.ec2.internal   aws:///us-east-1a/i-0ec2782f8287dfb7e   stopped 
clustername-8qw5l-master-1                  Running   m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running   m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-worker-us-east-1a-wbtgd   Running   m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running   m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running   m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running

NAME                                        PHASE     TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-0                  Running   m4.xlarge   us-east-1   us-east-1a   3h37m   ip-10-0-131-183.ec2.internal   aws:///us-east-1a/i-0ec2782f8287dfb7e   stopped


clustername-8qw5l-master-1                  Running   m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running   m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-worker-us-east-1a-wbtgd   Running   m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running   m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running   m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running

Copy to Clipboard

Toggle word wrap

1: 이는 비정상 노드의 컨트롤 플레인 시스템 ip-10-0-131-183.ec2.internal입니다.

시스템 설정을 파일 시스템의 파일에 저장합니다.

oc get machine clustername-8qw5l-master-0 \
    -n openshift-machine-api \
    -o yaml \
    > new-master-machine.yaml

$ oc get machine clustername-8qw5l-master-0 \


    -n openshift-machine-api \
    -o yaml \
    > new-master-machine.yaml

Copy to Clipboard

Toggle word wrap

1: 비정상 노드의 컨트롤 플레인 시스템의 이름을 지정합니다.

이전 단계에서 만든 new-master-machine.yaml 파일을 편집하여 새 이름을 할당하고 불필요한 필드를 제거합니다.

전체 status 섹션을 삭제합니다.

status:
  addresses:
  - address: 10.0.131.183
    type: InternalIP
  - address: ip-10-0-131-183.ec2.internal
    type: InternalDNS
  - address: ip-10-0-131-183.ec2.internal
    type: Hostname
  lastUpdated: "2020-04-20T17:44:29Z"
  nodeRef:
    kind: Node
    name: ip-10-0-131-183.ec2.internal
    uid: acca4411-af0d-4387-b73e-52b2484295ad
  phase: Running
  providerStatus:
    apiVersion: awsproviderconfig.openshift.io/v1beta1
    conditions:
    - lastProbeTime: "2020-04-20T16:53:50Z"
      lastTransitionTime: "2020-04-20T16:53:50Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreation
    instanceId: i-0fdb85790d76d0c3f
    instanceState: stopped
    kind: AWSMachineProviderStatus

status:
  addresses:
  - address: 10.0.131.183
    type: InternalIP
  - address: ip-10-0-131-183.ec2.internal
    type: InternalDNS
  - address: ip-10-0-131-183.ec2.internal
    type: Hostname
  lastUpdated: "2020-04-20T17:44:29Z"
  nodeRef:
    kind: Node
    name: ip-10-0-131-183.ec2.internal
    uid: acca4411-af0d-4387-b73e-52b2484295ad
  phase: Running
  providerStatus:
    apiVersion: awsproviderconfig.openshift.io/v1beta1
    conditions:
    - lastProbeTime: "2020-04-20T16:53:50Z"
      lastTransitionTime: "2020-04-20T16:53:50Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreation
    instanceId: i-0fdb85790d76d0c3f
    instanceState: stopped
    kind: AWSMachineProviderStatus

Copy to Clipboard

Toggle word wrap

metadata.name 필드를 새 이름으로 변경합니다.
이전 시스템과 동일한 기본 이름을 유지하고 마지막 번호를 사용 가능한 다음 번호로 변경하는 것이 좋습니다. 이 예에서 clustername-8qw5l-master-0은 clustername-8qw5l-master-3으로 변경되어 있습니다.
예를 들어 다음과 같습니다.
```
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  ...
  name: clustername-8qw5l-master-3
  ...
```
```
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  ...
  name: clustername-8qw5l-master-3
  ...
```
Copy to Clipboard Toggle word wrap

spec.providerID 필드를 삭제합니다.

  providerID: aws:///us-east-1a/i-0fdb85790d76d0c3f

  providerID: aws:///us-east-1a/i-0fdb85790d76d0c3f

Copy to Clipboard

Toggle word wrap

비정상 멤버의 시스템을 삭제합니다.
```
oc delete machine -n openshift-machine-api clustername-8qw5l-master-0
```
```
$ oc delete machine -n openshift-machine-api clustername-8qw5l-master-0 
```
1
Copy to Clipboard Toggle word wrap
1
비정상 노드의 컨트롤 플레인 시스템의 이름을 지정합니다.

시스템이 삭제되었는지 확인합니다.

oc get machines -n openshift-machine-api -o wide

$ oc get machines -n openshift-machine-api -o wide

Copy to Clipboard

Toggle word wrap

출력 예

NAME                                        PHASE     TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-1                  Running   m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running   m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-worker-us-east-1a-wbtgd   Running   m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running   m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running   m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running

NAME                                        PHASE     TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-1                  Running   m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running   m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-worker-us-east-1a-wbtgd   Running   m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running   m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running   m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running

Copy to Clipboard

Toggle word wrap

new-master-machine.yaml 파일을 사용하여 새 시스템을 만듭니다.
```
oc apply -f new-master-machine.yaml
```
```
$ oc apply -f new-master-machine.yaml
```
Copy to Clipboard Toggle word wrap

새 시스템이 생성되었는지 확인합니다.

oc get machines -n openshift-machine-api -o wide

$ oc get machines -n openshift-machine-api -o wide

Copy to Clipboard

Toggle word wrap

출력 예

NAME                                        PHASE          TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-1                  Running        m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running        m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-master-3                  Provisioning   m4.xlarge   us-east-1   us-east-1a   85s     ip-10-0-133-53.ec2.internal    aws:///us-east-1a/i-015b0888fe17bc2c8   running 
clustername-8qw5l-worker-us-east-1a-wbtgd   Running        m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running        m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running        m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running

NAME                                        PHASE          TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-1                  Running        m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running        m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-master-3                  Provisioning   m4.xlarge   us-east-1   us-east-1a   85s     ip-10-0-133-53.ec2.internal    aws:///us-east-1a/i-015b0888fe17bc2c8   running


clustername-8qw5l-worker-us-east-1a-wbtgd   Running        m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running        m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running        m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running

Copy to Clipboard

Toggle word wrap

1: 새 시스템 clustername-8qw5l-master-3이 생성되고 단계가 Provisioning에서 Running으로 변경되면 시스템이 준비 상태가 됩니다.

새 시스템을 만드는 데 몇 분이 소요될 수 있습니다. etcd 클러스터 Operator는 머신 또는 노드가 정상 상태로 돌아 오면 자동으로 동기화됩니다.

다음 명령을 입력하여 쿼럼 보호기를 다시 켭니다.

oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'

$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'

Copy to Clipboard

Toggle word wrap

다음 명령을 입력하여 unsupportedConfigOverrides 섹션이 오브젝트에서 제거되었는지 확인할 수 있습니다.
```
oc get etcd/cluster -oyaml
```
```
$ oc get etcd/cluster -oyaml
```
Copy to Clipboard Toggle word wrap

단일 노드 OpenShift를 사용하는 경우 노드를 다시 시작합니다. 그렇지 않으면 etcd 클러스터 Operator에서 다음 오류가 발생할 수 있습니다.

출력 예

EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]

EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]

Copy to Clipboard

Toggle word wrap

검증

모든 etcd pod가 올바르게 실행되고 있는지 확인합니다.

클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc -n openshift-etcd get pods -l k8s-app=etcd

$ oc -n openshift-etcd get pods -l k8s-app=etcd

Copy to Clipboard

Toggle word wrap

출력 예

etcd-ip-10-0-133-53.ec2.internal                 3/3     Running     0          7m49s
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          123m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          124m

etcd-ip-10-0-133-53.ec2.internal                 3/3     Running     0          7m49s
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          123m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          124m

Copy to Clipboard

Toggle word wrap

이전 명령의 출력에 두 개의 pod만 나열되는 경우 수동으로 etcd 재배포를 강제 수행할 수 있습니다. 클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge

$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge

Copy to Clipboard

Toggle word wrap

1: forceRedeploymentReason 값은 고유해야하므로 타임 스탬프가 추가됩니다.

정확히 세 개의 etcd 멤버가 있는지 확인합니다.

실행중인 etcd 컨테이너에 연결하고 영향을 받는 노드에 없는 pod 이름을 전달합니다.
클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
```
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
Copy to Clipboard Toggle word wrap

멤버 목록을 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 5eb0d6b8ca24730c | started |  ip-10-0-133-53.ec2.internal |  https://10.0.133.53:2380 |  https://10.0.133.53:2379 |
| 757b6793e2408b6c | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 5eb0d6b8ca24730c | started |  ip-10-0-133-53.ec2.internal |  https://10.0.133.53:2380 |  https://10.0.133.53:2379 |
| 757b6793e2408b6c | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

Copy to Clipboard

Toggle word wrap

이전 명령의 출력에 세 개 이상의 etcd 멤버가 나열된 경우 원하지 않는 멤버를 신중하게 제거해야 합니다.

주의

올바른 etcd 멤버를 제거하십시오. etcd 멤버를 제거하면 쿼럼이 손실될 수 있습니다.

5.2.4.2. etcd pod가 크래시 루프 상태인 비정상적인 etcd 멤버 교체
링크 복사

이 단계에서는 etcd pod가 크래시 루프 상태에 있는 경우 비정상 etcd 멤버를 교체하는 방법을 설명합니다.

전제 조건

비정상적인 etcd 멤버를 식별했습니다.
etcd pod가 크래시 루프 상태에 있는것으로 확인되었습니다.
cluster-admin 역할의 사용자로 클러스터에 액세스할 수 있습니다.
etcd 백업이 수행되었습니다.
중요
문제가 발생할 경우 클러스터를 복원할 수 있도록 이 프로세스를 수행하기 전에 etcd 백업을 수행해야합니다.

프로세스

크래시 루프 상태에 있는 etcd pod를 중지합니다.
1. 크래시 루프 상태의 노드를 디버깅합니다.
  클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
  $ oc debug node/ip-10-0-131-183.ec2.internal
  1
  Copy to Clipboard Toggle word wrap
  1
  이를 비정상 노드의 이름으로 변경합니다.
2. 루트 디렉토리를 /host 로 변경합니다.
  sh-4.2# chroot /host
  Copy to Clipboard Toggle word wrap
3. kubelet 매니페스트 디렉토리에서 기존 etcd pod 파일을 이동합니다.
  sh-4.2# mkdir /var/lib/etcd-backup
  Copy to Clipboard Toggle word wrap
  sh-4.2# mv /etc/kubernetes/manifests/etcd-pod.yaml /var/lib/etcd-backup/
  Copy to Clipboard Toggle word wrap
4. etcd 데이터 디렉토리를 다른 위치로 이동합니다.
  sh-4.2# mv /var/lib/etcd/ /tmp
  Copy to Clipboard Toggle word wrap
  이제 노드 쉘을 종료할 수 있습니다.

비정상적인 멤버를 제거합니다.

영향을 받는 노드에 없는 pod를 선택합니다.

클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc -n openshift-etcd get pods -l k8s-app=etcd

$ oc -n openshift-etcd get pods -l k8s-app=etcd

Copy to Clipboard

Toggle word wrap

출력 예

etcd-ip-10-0-131-183.ec2.internal                2/3     Error       7          6h9m
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          6h6m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          6h6m

etcd-ip-10-0-131-183.ec2.internal                2/3     Error       7          6h9m
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running     0          6h6m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running     0          6h6m

Copy to Clipboard

Toggle word wrap

실행중인 etcd 컨테이너에 연결하고 영향을 받는 노드에 없는 pod 이름을 전달합니다.
클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
```
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
Copy to Clipboard Toggle word wrap

멤버 목록을 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 62bcf33650a7170a | started | ip-10-0-131-183.ec2.internal | https://10.0.131.183:2380 | https://10.0.131.183:2379 |
| b78e2856655bc2eb | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| d022e10b498760d5 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 62bcf33650a7170a | started | ip-10-0-131-183.ec2.internal | https://10.0.131.183:2380 | https://10.0.131.183:2379 |
| b78e2856655bc2eb | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| d022e10b498760d5 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

Copy to Clipboard

Toggle word wrap

이러한 값은 프로세스의 뒷부분에서 필요하므로 비정상 etcd 멤버의 ID와 이름을 기록해 두십시오.

etcdctl member remove 명령에 ID를 지정하여 비정상적인 etcd 멤버를 제거합니다.
```
etcdctl member remove 62bcf33650a7170a
```
```
sh-4.2# etcdctl member remove 62bcf33650a7170a
```
Copy to Clipboard Toggle word wrap
출력 예
```
Member 62bcf33650a7170a removed from cluster ead669ce1fbfb346
```
```
Member 62bcf33650a7170a removed from cluster ead669ce1fbfb346
```
Copy to Clipboard Toggle word wrap

멤버 목록을 다시 표시하고 멤버가 제거되었는지 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| b78e2856655bc2eb | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| d022e10b498760d5 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

+------------------+---------+------------------------------+---------------------------+---------------------------+
|        ID        | STATUS  |             NAME             |        PEER ADDRS         |       CLIENT ADDRS        |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| b78e2856655bc2eb | started |  ip-10-0-164-97.ec2.internal |  https://10.0.164.97:2380 |  https://10.0.164.97:2379 |
| d022e10b498760d5 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+

Copy to Clipboard

Toggle word wrap

이제 노드 쉘을 종료할 수 있습니다.

다음 명령을 입력하여 쿼럼 보호기를 끄십시오.

oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'

$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'

Copy to Clipboard

Toggle word wrap

이 명령을 사용하면 보안을 다시 생성하고 정적 Pod를 롤아웃할 수 있습니다.

삭제된 비정상 etcd 멤버의 이전 암호를 제거합니다.

삭제된 비정상 etcd 멤버의 시크릿(secrets)을 나열합니다.

oc get secrets -n openshift-etcd | grep ip-10-0-131-183.ec2.internal

$ oc get secrets -n openshift-etcd | grep ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

1: 이 프로세스의 앞부분에서 기록한 비정상 etcd 멤버의 이름을 전달합니다.

다음 출력에 표시된대로 피어, 서빙 및 메트릭 시크릿이 있습니다.

출력 예

etcd-peer-ip-10-0-131-183.ec2.internal              kubernetes.io/tls                     2      47m
etcd-serving-ip-10-0-131-183.ec2.internal           kubernetes.io/tls                     2      47m
etcd-serving-metrics-ip-10-0-131-183.ec2.internal   kubernetes.io/tls                     2      47m

etcd-peer-ip-10-0-131-183.ec2.internal              kubernetes.io/tls                     2      47m
etcd-serving-ip-10-0-131-183.ec2.internal           kubernetes.io/tls                     2      47m
etcd-serving-metrics-ip-10-0-131-183.ec2.internal   kubernetes.io/tls                     2      47m

Copy to Clipboard

Toggle word wrap

제거된 비정상 etcd 멤버의 시크릿을 삭제합니다.

피어 시크릿을 삭제합니다.

oc delete secret -n openshift-etcd etcd-peer-ip-10-0-131-183.ec2.internal

$ oc delete secret -n openshift-etcd etcd-peer-ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

서빙 시크릿을 삭제합니다.

oc delete secret -n openshift-etcd etcd-serving-ip-10-0-131-183.ec2.internal

$ oc delete secret -n openshift-etcd etcd-serving-ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

메트릭 시크릿을 삭제합니다.

oc delete secret -n openshift-etcd etcd-serving-metrics-ip-10-0-131-183.ec2.internal

$ oc delete secret -n openshift-etcd etcd-serving-metrics-ip-10-0-131-183.ec2.internal

Copy to Clipboard

Toggle word wrap

etcd를 강제로 재배포합니다.
클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
```
```
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 
```
1
Copy to Clipboard Toggle word wrap
1
forceRedeploymentReason 값은 고유해야하므로 타임 스탬프가 추가됩니다.
etcd 클러스터 Operator가 재배포를 수행하면 모든 컨트롤 플레인 노드가 etcd pod가 작동하는지 확인합니다.

다음 명령을 입력하여 쿼럼 보호기를 다시 켭니다.

oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'

$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'

Copy to Clipboard

Toggle word wrap

다음 명령을 입력하여 unsupportedConfigOverrides 섹션이 오브젝트에서 제거되었는지 확인할 수 있습니다.
```
oc get etcd/cluster -oyaml
```
```
$ oc get etcd/cluster -oyaml
```
Copy to Clipboard Toggle word wrap

단일 노드 OpenShift를 사용하는 경우 노드를 다시 시작합니다. 그렇지 않으면 etcd 클러스터 Operator에서 다음 오류가 발생할 수 있습니다.

출력 예

EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]

EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]

Copy to Clipboard

Toggle word wrap

검증

새 멤버가 사용 가능하고 정상적인 상태에 있는지 확인합니다.

실행중인 etcd 컨테이너에 다시 연결합니다.
cluster-admin 사용자로 클러스터에 액세스할 수 있는 터미널에서 다음 명령을 실행합니다.
```
oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
```
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
```
Copy to Clipboard Toggle word wrap

모든 멤버가 정상인지 확인합니다.

etcdctl endpoint health

sh-4.2# etcdctl endpoint health

Copy to Clipboard

Toggle word wrap

출력 예

https://10.0.131.183:2379 is healthy: successfully committed proposal: took = 16.671434ms
https://10.0.154.204:2379 is healthy: successfully committed proposal: took = 16.698331ms
https://10.0.164.97:2379 is healthy: successfully committed proposal: took = 16.621645ms

https://10.0.131.183:2379 is healthy: successfully committed proposal: took = 16.671434ms
https://10.0.154.204:2379 is healthy: successfully committed proposal: took = 16.698331ms
https://10.0.164.97:2379 is healthy: successfully committed proposal: took = 16.621645ms

Copy to Clipboard

Toggle word wrap

5.2.4.3. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 베어 메탈 etcd 멤버 교체
링크 복사

이 프로세스에서는 시스템이 실행되고 있지 않거나 노드가 준비되지 않았기 때문에 비정상 상태의 베어 메탈 etcd 멤버를 교체하는 단계를 자세히 설명합니다.

설치 관리자 프로비저닝 인프라를 실행 중이거나 Machine API를 사용하여 머신을 생성한 경우 다음 단계를 따르십시오. 그렇지 않으면 원래 생성하는 데 사용된 방법과 동일한 방법으로 새 컨트롤 플레인 노드를 생성해야 합니다.

사전 요구 사항

비정상적인 베어 메탈 etcd 멤버를 식별했습니다.
시스템이 실행되고 있지 않거나 노드가 준비되지 않았음을 확인했습니다.
cluster-admin 역할의 사용자로 클러스터에 액세스할 수 있습니다.
etcd 백업이 수행되었습니다.
중요
문제가 발생할 경우 클러스터를 복원할 수 있도록 이 단계를 수행하기 전에 etcd 백업을 수행해야합니다.

절차

비정상 멤버를 확인하고 제거합니다.

영향을 받는 노드에 없는 pod를 선택합니다.

클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc -n openshift-etcd get pods -l k8s-app=etcd -o wide

$ oc -n openshift-etcd get pods -l k8s-app=etcd -o wide

Copy to Clipboard

Toggle word wrap

출력 예

etcd-openshift-control-plane-0   5/5   Running   11   3h56m   192.168.10.9   openshift-control-plane-0  <none>           <none>
etcd-openshift-control-plane-1   5/5   Running   0    3h54m   192.168.10.10   openshift-control-plane-1   <none>           <none>
etcd-openshift-control-plane-2   5/5   Running   0    3h58m   192.168.10.11   openshift-control-plane-2   <none>           <none>

etcd-openshift-control-plane-0   5/5   Running   11   3h56m   192.168.10.9   openshift-control-plane-0  <none>           <none>
etcd-openshift-control-plane-1   5/5   Running   0    3h54m   192.168.10.10   openshift-control-plane-1   <none>           <none>
etcd-openshift-control-plane-2   5/5   Running   0    3h58m   192.168.10.11   openshift-control-plane-2   <none>           <none>

Copy to Clipboard

Toggle word wrap

실행중인 etcd 컨테이너에 연결하고 영향을 받는 노드에 없는 pod 이름을 전달합니다.
클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc rsh -n openshift-etcd etcd-openshift-control-plane-0
```
```
$ oc rsh -n openshift-etcd etcd-openshift-control-plane-0
```
Copy to Clipboard Toggle word wrap

멤버 목록을 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+--------------------+---------------------------+---------------------------+---------------------+
| ID               | STATUS  | NAME                      | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+--------------------+---------------------------+---------------------------+---------------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2 | https://192.168.10.11:2380/ | https://192.168.10.11:2379/ | false |
| 8d5abe9669a39192 | started | openshift-control-plane-1 | https://192.168.10.10:2380/ | https://192.168.10.10:2379/ | false |
| cc3830a72fc357f9 | started | openshift-control-plane-0 | https://192.168.10.9:2380/ | https://192.168.10.9:2379/   | false |
+------------------+---------+--------------------+---------------------------+---------------------------+---------------------+

+------------------+---------+--------------------+---------------------------+---------------------------+---------------------+
| ID               | STATUS  | NAME                      | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+--------------------+---------------------------+---------------------------+---------------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2 | https://192.168.10.11:2380/ | https://192.168.10.11:2379/ | false |
| 8d5abe9669a39192 | started | openshift-control-plane-1 | https://192.168.10.10:2380/ | https://192.168.10.10:2379/ | false |
| cc3830a72fc357f9 | started | openshift-control-plane-0 | https://192.168.10.9:2380/ | https://192.168.10.9:2379/   | false |
+------------------+---------+--------------------+---------------------------+---------------------------+---------------------+

Copy to Clipboard

Toggle word wrap

이러한 값은 프로세스의 뒷부분에서 필요하므로 비정상 etcd 멤버의 ID와 이름을 기록해 두십시오. etcdctl endpoint health 명령은 교체 절차가 완료되고 새 멤버가 추가될 때까지 제거된 멤버를 나열합니다.

etcdctl member remove 명령에 ID를 지정하여 비정상적인 etcd 멤버를 제거합니다.
주의
올바른 etcd 멤버를 제거하십시오. etcd 멤버를 제거하면 쿼럼이 손실될 수 있습니다.
```
etcdctl member remove 7a8197040a5126c8
```
```
sh-4.2# etcdctl member remove 7a8197040a5126c8
```
Copy to Clipboard Toggle word wrap
출력 예
```
Member 7a8197040a5126c8 removed from cluster b23536c33f2cdd1b
```
```
Member 7a8197040a5126c8 removed from cluster b23536c33f2cdd1b
```
Copy to Clipboard Toggle word wrap

멤버 목록을 다시 표시하고 멤버가 제거되었는지 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+--------------------+---------------------------+---------------------------+-------------------------+
| ID               | STATUS  | NAME                      | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+--------------------+---------------------------+---------------------------+-------------------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2 | https://192.168.10.11:2380/ | https://192.168.10.11:2379/ | false |
| 8d5abe9669a39192 | started | openshift-control-plane-1 | https://192.168.10.10:2380/ | https://192.168.10.10:2379/ | false |
+------------------+---------+--------------------+---------------------------+---------------------------+-------------------------+

+------------------+---------+--------------------+---------------------------+---------------------------+-------------------------+
| ID               | STATUS  | NAME                      | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+--------------------+---------------------------+---------------------------+-------------------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2 | https://192.168.10.11:2380/ | https://192.168.10.11:2379/ | false |
| 8d5abe9669a39192 | started | openshift-control-plane-1 | https://192.168.10.10:2380/ | https://192.168.10.10:2379/ | false |
+------------------+---------+--------------------+---------------------------+---------------------------+-------------------------+

Copy to Clipboard

Toggle word wrap

이제 노드 쉘을 종료할 수 있습니다.

중요

멤버를 제거한 후 나머지 etcd 인스턴스가 재부팅되는 동안 잠시 동안 클러스터에 연결할 수 없습니다.

다음 명령을 입력하여 쿼럼 보호기를 끄십시오.

oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'

$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'

Copy to Clipboard

Toggle word wrap

이 명령을 사용하면 보안을 다시 생성하고 정적 Pod를 롤아웃할 수 있습니다.

다음 명령을 실행하여 제거된 비정상 etcd 멤버의 이전 암호를 제거합니다.

삭제된 비정상 etcd 멤버의 시크릿(secrets)을 나열합니다.

oc get secrets -n openshift-etcd | grep openshift-control-plane-2

$ oc get secrets -n openshift-etcd | grep openshift-control-plane-2

Copy to Clipboard

Toggle word wrap

이 프로세스의 앞부분에서 기록한 비정상 etcd 멤버의 이름을 전달합니다.

다음 출력에 표시된대로 피어, 서빙 및 메트릭 시크릿이 있습니다.

etcd-peer-openshift-control-plane-2             kubernetes.io/tls   2   134m
etcd-serving-metrics-openshift-control-plane-2  kubernetes.io/tls   2   134m
etcd-serving-openshift-control-plane-2          kubernetes.io/tls   2   134m

etcd-peer-openshift-control-plane-2             kubernetes.io/tls   2   134m
etcd-serving-metrics-openshift-control-plane-2  kubernetes.io/tls   2   134m
etcd-serving-openshift-control-plane-2          kubernetes.io/tls   2   134m

Copy to Clipboard

Toggle word wrap

제거된 비정상 etcd 멤버의 시크릿을 삭제합니다.

피어 시크릿을 삭제합니다.

oc delete secret etcd-peer-openshift-control-plane-2 -n openshift-etcd

secret "etcd-peer-openshift-control-plane-2" deleted

$ oc delete secret etcd-peer-openshift-control-plane-2 -n openshift-etcd

secret "etcd-peer-openshift-control-plane-2" deleted

Copy to Clipboard

Toggle word wrap

서빙 시크릿을 삭제합니다.

oc delete secret etcd-serving-metrics-openshift-control-plane-2 -n openshift-etcd

secret "etcd-serving-metrics-openshift-control-plane-2" deleted

$ oc delete secret etcd-serving-metrics-openshift-control-plane-2 -n openshift-etcd

secret "etcd-serving-metrics-openshift-control-plane-2" deleted

Copy to Clipboard

Toggle word wrap

메트릭 시크릿을 삭제합니다.

oc delete secret etcd-serving-openshift-control-plane-2 -n openshift-etcd

secret "etcd-serving-openshift-control-plane-2" deleted

$ oc delete secret etcd-serving-openshift-control-plane-2 -n openshift-etcd

secret "etcd-serving-openshift-control-plane-2" deleted

Copy to Clipboard

Toggle word wrap

컨트롤 플레인 시스템을 삭제합니다.

설치 프로그램에서 제공한 인프라를 실행 중이거나 Machine API를 사용하여 컴퓨터를 만든 경우 다음 단계를 수행합니다. 그렇지 않으면 원래 생성하는 데 사용된 방법과 동일한 방법으로 새 컨트롤 플레인 노드를 생성해야 합니다.

비정상 멤버의 컴퓨터를 가져옵니다.

클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.

oc get machines -n openshift-machine-api -o wide

$ oc get machines -n openshift-machine-api -o wide

Copy to Clipboard

Toggle word wrap

출력 예

NAME                              PHASE     TYPE   REGION   ZONE   AGE     NODE                               PROVIDERID                                                                                              STATE
examplecluster-control-plane-0    Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned 
examplecluster-control-plane-1    Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-control-plane-2    Running                          3h11m   openshift-control-plane-2   baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135   externally provisioned
examplecluster-compute-0          Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1          Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned

NAME                              PHASE     TYPE   REGION   ZONE   AGE     NODE                               PROVIDERID                                                                                              STATE
examplecluster-control-plane-0    Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned


examplecluster-control-plane-1    Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-control-plane-2    Running                          3h11m   openshift-control-plane-2   baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135   externally provisioned
examplecluster-compute-0          Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1          Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned

Copy to Clipboard

Toggle word wrap

1: 비정상 노드의 컨트롤 플레인 시스템 예cluster-control-plane-2.

시스템 설정을 파일 시스템의 파일에 저장합니다.

oc get machine examplecluster-control-plane-2 \
    -n openshift-machine-api \
    -o yaml \
    > new-master-machine.yaml

$ oc get machine examplecluster-control-plane-2 \


    -n openshift-machine-api \
    -o yaml \
    > new-master-machine.yaml

Copy to Clipboard

Toggle word wrap

1: 비정상 노드의 컨트롤 플레인 시스템의 이름을 지정합니다.

이전 단계에서 만든 new-master-machine.yaml 파일을 편집하여 새 이름을 할당하고 불필요한 필드를 제거합니다.

전체 status 섹션을 삭제합니다.

status:
  addresses:
  - address: ""
    type: InternalIP
  - address: fe80::4adf:37ff:feb0:8aa1%ens1f1.373
    type: InternalDNS
  - address: fe80::4adf:37ff:feb0:8aa1%ens1f1.371
    type: Hostname
  lastUpdated: "2020-04-20T17:44:29Z"
  nodeRef:
    kind: Machine
    name: fe80::4adf:37ff:feb0:8aa1%ens1f1.372
    uid: acca4411-af0d-4387-b73e-52b2484295ad
  phase: Running
  providerStatus:
    apiVersion: machine.openshift.io/v1beta1
    conditions:
    - lastProbeTime: "2020-04-20T16:53:50Z"
      lastTransitionTime: "2020-04-20T16:53:50Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreation
    instanceId: i-0fdb85790d76d0c3f
    instanceState: stopped
    kind: Machine

status:
  addresses:
  - address: ""
    type: InternalIP
  - address: fe80::4adf:37ff:feb0:8aa1%ens1f1.373
    type: InternalDNS
  - address: fe80::4adf:37ff:feb0:8aa1%ens1f1.371
    type: Hostname
  lastUpdated: "2020-04-20T17:44:29Z"
  nodeRef:
    kind: Machine
    name: fe80::4adf:37ff:feb0:8aa1%ens1f1.372
    uid: acca4411-af0d-4387-b73e-52b2484295ad
  phase: Running
  providerStatus:
    apiVersion: machine.openshift.io/v1beta1
    conditions:
    - lastProbeTime: "2020-04-20T16:53:50Z"
      lastTransitionTime: "2020-04-20T16:53:50Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreation
    instanceId: i-0fdb85790d76d0c3f
    instanceState: stopped
    kind: Machine

Copy to Clipboard

Toggle word wrap

metadata.name 필드를 새 이름으로 변경합니다.

이전 시스템과 동일한 기본 이름을 유지하고 마지막 번호를 사용 가능한 다음 번호로 변경하는 것이 좋습니다. 이 예에서 examplecluster-control-plane-2 는 examplecluster-control-plane-3 으로 변경되었습니다.

예를 들어 다음과 같습니다.

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  ...
  name: examplecluster-control-plane-3
  ...

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  ...
  name: examplecluster-control-plane-3
  ...

Copy to Clipboard

Toggle word wrap

spec.providerID 필드를 삭제합니다.

  providerID: baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135

  providerID: baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135

Copy to Clipboard

Toggle word wrap

metadata.annotations 및 metadata.generation 필드를 제거합니다.

  annotations:
    machine.openshift.io/instance-state: externally provisioned
  ...
  generation: 2

  annotations:
    machine.openshift.io/instance-state: externally provisioned
  ...
  generation: 2

Copy to Clipboard

Toggle word wrap

spec.conditions,spec.lastUpdated,spec.nodeRef 및 spec.phase 필드를 제거합니다.

  lastTransitionTime: "2022-08-03T08:40:36Z"
message: 'Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]'
reason: HookPresent
severity: Warning
status: "False"

type: Drainable
lastTransitionTime: "2022-08-03T08:39:55Z"
status: "True"
type: InstanceExists

lastTransitionTime: "2022-08-03T08:36:37Z"
status: "True"
type: Terminable
lastUpdated: "2022-08-03T08:40:36Z"
nodeRef:
kind: Node
name: openshift-control-plane-2
uid: 788df282-6507-4ea2-9a43-24f237ccbc3c
phase: Running

  lastTransitionTime: "2022-08-03T08:40:36Z"
message: 'Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]'
reason: HookPresent
severity: Warning
status: "False"

type: Drainable
lastTransitionTime: "2022-08-03T08:39:55Z"
status: "True"
type: InstanceExists

lastTransitionTime: "2022-08-03T08:36:37Z"
status: "True"
type: Terminable
lastUpdated: "2022-08-03T08:40:36Z"
nodeRef:
kind: Node
name: openshift-control-plane-2
uid: 788df282-6507-4ea2-9a43-24f237ccbc3c
phase: Running

Copy to Clipboard

Toggle word wrap

다음 명령을 실행하여 Bare Metal Operator를 사용할 수 있는지 확인합니다.

oc get clusteroperator baremetal

$ oc get clusteroperator baremetal

Copy to Clipboard

Toggle word wrap

출력 예

NAME        VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
baremetal   4.11.3    True        False         False      3d15h

NAME        VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
baremetal   4.11.3    True        False         False      3d15h

Copy to Clipboard

Toggle word wrap

다음 명령을 실행하여 이전 BareMetalHost 오브젝트를 제거합니다.

oc delete bmh openshift-control-plane-2 -n openshift-machine-api

$ oc delete bmh openshift-control-plane-2 -n openshift-machine-api

Copy to Clipboard

Toggle word wrap

출력 예

baremetalhost.metal3.io "openshift-control-plane-2" deleted

baremetalhost.metal3.io "openshift-control-plane-2" deleted

Copy to Clipboard

Toggle word wrap

다음 명령을 실행하여 비정상 멤버의 시스템을 삭제합니다.
```
oc delete machine -n openshift-machine-api examplecluster-control-plane-2
```
```
$ oc delete machine -n openshift-machine-api examplecluster-control-plane-2
```
Copy to Clipboard Toggle word wrap
BareMetalHost 및 Machine 오브젝트를 제거한 후 머신 컨트롤러에서 Node 오브젝트를 자동으로 삭제합니다.
어떤 이유로든 머신 삭제가 지연되거나 명령이 차단되고 지연되면 machine object finalizer 필드를 제거하여 강제로 삭제할 수 있습니다.
중요
Ctrl+c 를 눌러 머신 삭제를 중단하지 마십시오. 명령이 완료될 수 있도록 허용해야 합니다. 새 터미널 창을 열어 편집하여 종료자 필드를 삭제합니다.
1. 다음 명령을 실행하여 머신 구성을 편집합니다.
  $ oc edit machine -n openshift-machine-api examplecluster-control-plane-2
  Copy to Clipboard Toggle word wrap
2. Machine 사용자 정의 리소스에서 다음 필드를 삭제한 다음 업데이트된 파일을 저장합니다.
  finalizers: - machine.machine.openshift.io
  Copy to Clipboard Toggle word wrap
  출력 예
  machine.machine.openshift.io/examplecluster-control-plane-2 edited
  
  Copy to Clipboard Toggle word wrap

다음 명령을 실행하여 시스템이 삭제되었는지 확인합니다.

oc get machines -n openshift-machine-api -o wide

$ oc get machines -n openshift-machine-api -o wide

Copy to Clipboard

Toggle word wrap

출력 예

NAME                              PHASE     TYPE   REGION   ZONE   AGE     NODE                                 PROVIDERID                                                                                       STATE
examplecluster-control-plane-0    Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned
examplecluster-control-plane-1    Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-compute-0          Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1          Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned

NAME                              PHASE     TYPE   REGION   ZONE   AGE     NODE                                 PROVIDERID                                                                                       STATE
examplecluster-control-plane-0    Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned
examplecluster-control-plane-1    Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-compute-0          Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1          Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned

Copy to Clipboard

Toggle word wrap

다음 명령을 실행하여 노드가 삭제되었는지 확인합니다.

oc get nodes

NAME                     STATUS ROLES   AGE   VERSION
openshift-control-plane-0 Ready master 3h24m v1.24.0+9546431
openshift-control-plane-1 Ready master 3h24m v1.24.0+9546431
openshift-compute-0       Ready worker 176m v1.24.0+9546431
openshift-compute-1       Ready worker 176m v1.24.0+9546431

$ oc get nodes

NAME                     STATUS ROLES   AGE   VERSION
openshift-control-plane-0 Ready master 3h24m v1.24.0+9546431
openshift-control-plane-1 Ready master 3h24m v1.24.0+9546431
openshift-compute-0       Ready worker 176m v1.24.0+9546431
openshift-compute-1       Ready worker 176m v1.24.0+9546431

Copy to Clipboard

Toggle word wrap

새 BareMetalHost 오브젝트와 시크릿을 생성하여 BMC 자격 증명을 저장합니다.

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: openshift-control-plane-2-bmc-secret
  namespace: openshift-machine-api
data:
  password: <password>
  username: <username>
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-control-plane-2
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: disabled
  bmc:
    address: redfish://10.46.61.18:443/redfish/v1/Systems/1
    credentialsName: openshift-control-plane-2-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 48:df:37:b0:8a:a0
  bootMode: UEFI
  externallyProvisioned: false
  online: true
  rootDeviceHints:
    deviceName: /dev/sda
  userData:
    name: master-user-data-managed
    namespace: openshift-machine-api
EOF

$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: openshift-control-plane-2-bmc-secret
  namespace: openshift-machine-api
data:
  password: <password>
  username: <username>
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-control-plane-2
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: disabled
  bmc:
    address: redfish://10.46.61.18:443/redfish/v1/Systems/1
    credentialsName: openshift-control-plane-2-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 48:df:37:b0:8a:a0
  bootMode: UEFI
  externallyProvisioned: false
  online: true
  rootDeviceHints:
    deviceName: /dev/sda
  userData:
    name: master-user-data-managed
    namespace: openshift-machine-api
EOF

Copy to Clipboard

Toggle word wrap

참고

사용자 이름과 암호는 다른 베어 메탈 호스트의 시크릿에서 찾을 수 있습니다. bmc:address 에 사용할 프로토콜은 다른 bmh 개체에서 가져올 수 있습니다.

중요

기존 컨트롤 플레인 호스트에서 BareMetalHost 오브젝트 정의를 재사용하는 경우 external Provisioned 필드를 true 로 설정하지 마십시오.

기존 컨트롤 플레인 BareMetalHost 오브젝트는 OpenShift Container Platform 설치 프로그램에서 프로비저닝한 경우 외부Provisioned 플래그를 true 로 설정할 수 있습니다.

검사가 완료되면 BareMetalHost 오브젝트가 생성되고 프로비저닝할 수 있습니다.

사용 가능한 BareMetalHost 오브젝트를 사용하여 생성 프로세스를 확인합니다.

oc get bmh -n openshift-machine-api

NAME                      STATE                  CONSUMER                      ONLINE ERROR   AGE
openshift-control-plane-0 externally provisioned examplecluster-control-plane-0 true         4h48m
openshift-control-plane-1 externally provisioned examplecluster-control-plane-1 true         4h48m
openshift-control-plane-2 available              examplecluster-control-plane-3 true         47m
openshift-compute-0       provisioned            examplecluster-compute-0       true         4h48m
openshift-compute-1       provisioned            examplecluster-compute-1       true         4h48m

$ oc get bmh -n openshift-machine-api

NAME                      STATE                  CONSUMER                      ONLINE ERROR   AGE
openshift-control-plane-0 externally provisioned examplecluster-control-plane-0 true         4h48m
openshift-control-plane-1 externally provisioned examplecluster-control-plane-1 true         4h48m
openshift-control-plane-2 available              examplecluster-control-plane-3 true         47m
openshift-compute-0       provisioned            examplecluster-compute-0       true         4h48m
openshift-compute-1       provisioned            examplecluster-compute-1       true         4h48m

Copy to Clipboard

Toggle word wrap

new-master-machine.yaml 파일을 사용하여 새 컨트롤 플레인 시스템을 생성합니다.
```
oc apply -f new-master-machine.yaml
```
```
$ oc apply -f new-master-machine.yaml
```
Copy to Clipboard Toggle word wrap

새 시스템이 생성되었는지 확인합니다.

oc get machines -n openshift-machine-api -o wide

$ oc get machines -n openshift-machine-api -o wide

Copy to Clipboard

Toggle word wrap

출력 예

NAME                                   PHASE     TYPE   REGION   ZONE   AGE     NODE                              PROVIDERID                                                                                            STATE
examplecluster-control-plane-0         Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned 
examplecluster-control-plane-1         Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-control-plane-2         Running                          3h11m   openshift-control-plane-2   baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135   externally provisioned
examplecluster-compute-0               Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1               Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned

NAME                                   PHASE     TYPE   REGION   ZONE   AGE     NODE                              PROVIDERID                                                                                            STATE
examplecluster-control-plane-0         Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned


examplecluster-control-plane-1         Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-control-plane-2         Running                          3h11m   openshift-control-plane-2   baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135   externally provisioned
examplecluster-compute-0               Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1               Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned

Copy to Clipboard

Toggle word wrap

1: 새 시스템 clustername-8qw5l-master-3 이 생성되고 단계가 Provisioning 에서 Running 으로 변경되면 시스템이 준비 상태가 됩니다.

새 시스템을 생성하는 데 몇 분이 걸릴 수 있습니다. etcd 클러스터 Operator는 머신 또는 노드가 정상 상태로 돌아 오면 자동으로 동기화됩니다.

베어 메탈 호스트가 프로비저닝되고 다음 명령을 실행하여 오류가 보고되지 않았는지 확인합니다.

oc get bmh -n openshift-machine-api

$ oc get bmh -n openshift-machine-api

Copy to Clipboard

Toggle word wrap

출력 예

oc get bmh -n openshift-machine-api
NAME                      STATE                  CONSUMER                       ONLINE ERROR AGE
openshift-control-plane-0 externally provisioned examplecluster-control-plane-0 true         4h48m
openshift-control-plane-1 externally provisioned examplecluster-control-plane-1 true         4h48m
openshift-control-plane-2 provisioned            examplecluster-control-plane-3 true          47m
openshift-compute-0       provisioned            examplecluster-compute-0       true         4h48m
openshift-compute-1       provisioned            examplecluster-compute-1       true         4h48m

$ oc get bmh -n openshift-machine-api
NAME                      STATE                  CONSUMER                       ONLINE ERROR AGE
openshift-control-plane-0 externally provisioned examplecluster-control-plane-0 true         4h48m
openshift-control-plane-1 externally provisioned examplecluster-control-plane-1 true         4h48m
openshift-control-plane-2 provisioned            examplecluster-control-plane-3 true          47m
openshift-compute-0       provisioned            examplecluster-compute-0       true         4h48m
openshift-compute-1       provisioned            examplecluster-compute-1       true         4h48m

Copy to Clipboard

Toggle word wrap

다음 명령을 실행하여 새 노드가 추가되고 준비 상태에 있는지 확인합니다.

oc get nodes

$ oc get nodes

Copy to Clipboard

Toggle word wrap

출력 예

oc get nodes
NAME                     STATUS ROLES   AGE   VERSION
openshift-control-plane-0 Ready master 4h26m v1.24.0+9546431
openshift-control-plane-1 Ready master 4h26m v1.24.0+9546431
openshift-control-plane-2 Ready master 12m   v1.24.0+9546431
openshift-compute-0       Ready worker 3h58m v1.24.0+9546431
openshift-compute-1       Ready worker 3h58m v1.24.0+9546431

$ oc get nodes
NAME                     STATUS ROLES   AGE   VERSION
openshift-control-plane-0 Ready master 4h26m v1.24.0+9546431
openshift-control-plane-1 Ready master 4h26m v1.24.0+9546431
openshift-control-plane-2 Ready master 12m   v1.24.0+9546431
openshift-compute-0       Ready worker 3h58m v1.24.0+9546431
openshift-compute-1       Ready worker 3h58m v1.24.0+9546431

Copy to Clipboard

Toggle word wrap

다음 명령을 입력하여 쿼럼 보호기를 다시 켭니다.

oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'

$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'

Copy to Clipboard

Toggle word wrap

다음 명령을 입력하여 unsupportedConfigOverrides 섹션이 오브젝트에서 제거되었는지 확인할 수 있습니다.
```
oc get etcd/cluster -oyaml
```
```
$ oc get etcd/cluster -oyaml
```
Copy to Clipboard Toggle word wrap

단일 노드 OpenShift를 사용하는 경우 노드를 다시 시작합니다. 그렇지 않으면 etcd 클러스터 Operator에서 다음 오류가 발생할 수 있습니다.

출력 예

EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]

EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]

Copy to Clipboard

Toggle word wrap

검증

모든 etcd pod가 올바르게 실행되고 있는지 확인합니다.
클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc -n openshift-etcd get pods -l k8s-app=etcd
```
```
$ oc -n openshift-etcd get pods -l k8s-app=etcd
```
Copy to Clipboard Toggle word wrap
출력 예
```
etcd-openshift-control-plane-0      5/5     Running     0     105m
etcd-openshift-control-plane-1      5/5     Running     0     107m
etcd-openshift-control-plane-2      5/5     Running     0     103m
```
```
etcd-openshift-control-plane-0      5/5     Running     0     105m
etcd-openshift-control-plane-1      5/5     Running     0     107m
etcd-openshift-control-plane-2      5/5     Running     0     103m
```
Copy to Clipboard Toggle word wrap
이전 명령의 출력에 두 개의 pod만 나열되는 경우 수동으로 etcd 재배포를 강제 수행할 수 있습니다. 클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
```
```
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 
```
1
Copy to Clipboard Toggle word wrap
1
forceRedeploymentReason 값은 고유해야하므로 타임 스탬프가 추가됩니다.
정확히 세 개의 etcd 멤버가 있는지 확인하려면 실행중인 etcd 컨테이너에 연결하고 영향을 받는 노드에 없는 pod 이름을 전달합니다. 클러스터에 액세스할 수 있는 터미널에서 cluster-admin 사용자로 다음 명령을 실행합니다.
```
oc rsh -n openshift-etcd etcd-openshift-control-plane-0
```
```
$ oc rsh -n openshift-etcd etcd-openshift-control-plane-0
```
Copy to Clipboard Toggle word wrap

멤버 목록을 확인합니다.

etcdctl member list -w table

sh-4.2# etcdctl member list -w table

Copy to Clipboard

Toggle word wrap

출력 예

+------------------+---------+--------------------+---------------------------+---------------------------+-----------------+
|        ID        | STATUS  |        NAME        |        PEER ADDRS         |       CLIENT ADDRS        |    IS LEARNER    |
+------------------+---------+--------------------+---------------------------+---------------------------+-----------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2 | https://192.168.10.11:2380 | https://192.168.10.11:2379 |   false |
| 8d5abe9669a39192 | started | openshift-control-plane-1 | https://192.168.10.10:2380 | https://192.168.10.10:2379 |   false |
| cc3830a72fc357f9 | started | openshift-control-plane-0 | https://192.168.10.9:2380 | https://192.168.10.9:2379 |     false |
+------------------+---------+--------------------+---------------------------+---------------------------+-----------------+

+------------------+---------+--------------------+---------------------------+---------------------------+-----------------+
|        ID        | STATUS  |        NAME        |        PEER ADDRS         |       CLIENT ADDRS        |    IS LEARNER    |
+------------------+---------+--------------------+---------------------------+---------------------------+-----------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2 | https://192.168.10.11:2380 | https://192.168.10.11:2379 |   false |
| 8d5abe9669a39192 | started | openshift-control-plane-1 | https://192.168.10.10:2380 | https://192.168.10.10:2379 |   false |
| cc3830a72fc357f9 | started | openshift-control-plane-0 | https://192.168.10.9:2380 | https://192.168.10.9:2379 |     false |
+------------------+---------+--------------------+---------------------------+---------------------------+-----------------+

Copy to Clipboard

Toggle word wrap

참고

이전 명령의 출력에 세 개 이상의 etcd 멤버가 나열된 경우 원하지 않는 멤버를 신중하게 제거해야 합니다.

다음 명령을 실행하여 모든 etcd 멤버가 정상인지 확인합니다.

etcdctl endpoint health --cluster

# etcdctl endpoint health --cluster

Copy to Clipboard

Toggle word wrap

출력 예

https://192.168.10.10:2379 is healthy: successfully committed proposal: took = 8.973065ms
https://192.168.10.9:2379 is healthy: successfully committed proposal: took = 11.559829ms
https://192.168.10.11:2379 is healthy: successfully committed proposal: took = 11.665203ms

https://192.168.10.10:2379 is healthy: successfully committed proposal: took = 8.973065ms
https://192.168.10.9:2379 is healthy: successfully committed proposal: took = 11.559829ms
https://192.168.10.11:2379 is healthy: successfully committed proposal: took = 11.665203ms

Copy to Clipboard

Toggle word wrap

다음 명령을 실행하여 모든 노드가 최신 버전인지 확인합니다.

oc get etcd -o=jsonpath='{range.items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'

$ oc get etcd -o=jsonpath='{range.items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'

Copy to Clipboard

Toggle word wrap

AllNodesAtLatestRevision

AllNodesAtLatestRevision

Copy to Clipboard

Toggle word wrap

5.2. 비정상적인 etcd 멤버 교체

5.2.1. 사전 요구 사항
링크 복사

5.2.2. 비정상 etcd 멤버 식별
링크 복사

5.2.3. 비정상적인 etcd 멤버의 상태 확인
링크 복사

5.2.4. 비정상적인 etcd 멤버 교체
링크 복사

5.2.4.1. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 etcd 멤버 교체
링크 복사

5.2.4.2. etcd pod가 크래시 루프 상태인 비정상적인 etcd 멤버 교체
링크 복사

5.2.4.3. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 베어 메탈 etcd 멤버 교체
링크 복사

자세한 정보

평가판, 구매 및 판매

커뮤니티

Red Hat 문서 정보

보다 포괄적 수용을 위한 오픈 소스 용어 교체

Red Hat 소개

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

5.2. 비정상적인 etcd 멤버 교체

5.2.1. 사전 요구 사항링크 복사링크가 클립보드에 복사되었습니다!

5.2.2. 비정상 etcd 멤버 식별링크 복사링크가 클립보드에 복사되었습니다!

5.2.3. 비정상적인 etcd 멤버의 상태 확인링크 복사링크가 클립보드에 복사되었습니다!

5.2.4. 비정상적인 etcd 멤버 교체링크 복사링크가 클립보드에 복사되었습니다!

5.2.4.1. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 etcd 멤버 교체링크 복사링크가 클립보드에 복사되었습니다!

5.2.4.2. etcd pod가 크래시 루프 상태인 비정상적인 etcd 멤버 교체링크 복사링크가 클립보드에 복사되었습니다!

5.2.4.3. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 베어 메탈 etcd 멤버 교체링크 복사링크가 클립보드에 복사되었습니다!

자세한 정보

평가판, 구매 및 판매

커뮤니티

Red Hat 문서 정보

보다 포괄적 수용을 위한 오픈 소스 용어 교체

Red Hat 소개

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

5.2.1. 사전 요구 사항
링크 복사

5.2.2. 비정상 etcd 멤버 식별
링크 복사

5.2.3. 비정상적인 etcd 멤버의 상태 확인
링크 복사

5.2.4. 비정상적인 etcd 멤버 교체
링크 복사

5.2.4.1. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 etcd 멤버 교체
링크 복사

5.2.4.2. etcd pod가 크래시 루프 상태인 비정상적인 etcd 멤버 교체
링크 복사

5.2.4.3. 시스템이 실행되고 있지 않거나 노드가 준비되지 않은 비정상적인 베어 메탈 etcd 멤버 교체
링크 복사