Backup and restore
Backing up and restoring your OpenShift Container Platform cluster
Abstract
Chapter 1. Backup and restore
1.1. Control plane backup and restore operations
As a cluster administrator, you might need to stop an OpenShift Container Platform cluster for a period and restart it later. Some reasons for restarting a cluster are that you need to perform maintenance on the cluster or want to reduce resource costs. In OpenShift Container Platform, you can perform a graceful shutdown of a cluster so that you can easily restart the cluster later.
You must back up etcd data before shutting down a cluster; etcd is the key-value store for OpenShift Container Platform, which persists the state of all resource objects. An etcd backup plays a crucial role in disaster recovery. In OpenShift Container Platform, you can also replace an unhealthy etcd member.
When you want to get your cluster running again, restart the cluster gracefully.
A cluster's certificates expire one year after the installation date. You can shut down a cluster and expect it to restart gracefully while the certificates are still valid. Although the cluster automatically retrieves the expired control plane certificates, you must still approve the certificate signing requests (CSRs).
You might run into several situations where OpenShift Container Platform does not work as expected, such as:
- You have a cluster that is not functional after the restart because of unexpected conditions, such as node failure or network connectivity issues.
- You have deleted something critical in the cluster by mistake.
- You have lost the majority of your control plane hosts, leading to etcd quorum loss.
You can always recover from a disaster situation by restoring your cluster to its previous state by using the saved etcd snapshots.
Additional resources
1.2. Application backup and restore operations
As a cluster administrator, you can back up and restore applications running on OpenShift Container Platform by using the OpenShift API for Data Protection (OADP).
OADP backs up and restores Kubernetes resources and internal images, at the granularity of a namespace, by using the version of Velero that is appropriate for the version of OADP you install, according to the table in Downloading the Velero CLI tool. OADP backs up and restores persistent volumes (PVs) by using snapshots or Restic. For details, see OADP features.
1.2.1. OADP requirements
OADP has the following requirements:
- You must be logged in as a user with a cluster-admin role.
- You must have object storage for storing backups, such as one of the following storage types:
- OpenShift Data Foundation
- Amazon Web Services
- Microsoft Azure
- Google Cloud Platform
- S3 호환 오브젝트 스토리지
- IBM Cloud® Object Storage S3
To use CSI backup on OCP 4.11 and later, install OADP 1.1.x.
OADP 1.0.x does not support CSI backup on OCP 4.11 and later. OADP 1.0.x includes Velero 1.7.x, which expects the API group snapshot.storage.k8s.io/v1beta1, which does not exist on OCP 4.11 and later.
The CloudStorage API for S3 storage is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
To back up PVs with snapshots, you must have cloud storage that has a native snapshot API or that supports Container Storage Interface (CSI) snapshots, such as the following providers:
- Amazon Web Services
- Microsoft Azure
- Google Cloud Platform
- Cloud storage that supports CSI snapshots, such as Ceph RBD or Ceph FS
If you do not want to back up PVs by using snapshots, you can use Restic, which is installed by the OADP Operator by default; a minimal sketch follows below.
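For illustration, the following DataProtectionApplication excerpt is a minimal sketch, assuming the OADP 1.2 and later nodeAgent field names that appear in the full examples later in this document; it enables the node agent with Restic as the uploader:
# Excerpt from a DataProtectionApplication (DPA) spec; adjust the plugins and
# backup locations for your provider before using it.
spec:
  configuration:
    nodeAgent:
      enable: true          # run the node agent daemon set
      uploaderType: restic  # use Restic file system backup instead of snapshots
    velero:
      defaultPlugins:
        - openshift
        - aws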
1.2.2. Backing up and restoring applications
You back up applications by creating a Backup custom resource (CR). See Creating a Backup CR. You can configure the following backup options:
- Creating backup hooks to run commands before or after the backup operation
- Scheduling backups
- Backing up applications with File System Backup: Kopia or Restic
You restore application backups by creating a Restore custom resource (CR). See Creating a Restore CR; a minimal sketch of paired Backup and Restore CRs follows below.
- You can configure restore hooks to run commands in init containers or in the application container during the restore operation.
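As a minimal sketch, assuming a hypothetical application namespace my-app and the default openshift-adp install namespace, a Backup CR and the Restore CR that consumes it pair up as follows (full examples appear later in this document):
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: my-app-backup
  namespace: openshift-adp
spec:
  includedNamespaces:
    - my-app                   # namespaces to include in the backup
---
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: my-app-restore
  namespace: openshift-adp
spec:
  backupName: my-app-backup    # the Backup CR to restore from
  restorePVs: true             # also restore persistent volumes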
Chapter 2. Shutting down the cluster gracefully
This document describes the process to gracefully shut down your cluster. You might need to temporarily shut down your cluster for maintenance reasons, or to save on resource costs.
2.1. Prerequisites
Take an etcd backup before shutting down the cluster; a brief sketch of this step follows at the end of this section.
Important: It is important to take an etcd backup before performing this procedure so that your cluster can be restored if you encounter any issues when restarting the cluster.
For example, the following conditions can cause the restarted cluster to malfunction:
- etcd data corruption during shutdown
- Node failure due to hardware
- Network connectivity issues
If your cluster fails to recover, follow the steps to restore to a previous cluster state.
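As a brief sketch of the backup step referenced above, assuming a control plane node name of <control_plane_node>, you can run the cluster backup script that OpenShift Container Platform provides on a control plane host; the script writes the snapshot files to the given directory:
$ oc debug node/<control_plane_node> -- chroot /host /usr/local/bin/cluster-backup.sh /home/core/assets/backup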
2.2. Shutting down the cluster
You can shut down your cluster in a graceful manner so that it can be restarted at a later date.
You can shut down a cluster and expect it to restart gracefully for up to one year from the installation date. One year after the installation date, the cluster certificates expire. However, you might need to manually approve the pending certificate signing requests (CSRs) to recover kubelet certificates when the cluster restarts.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have taken an etcd backup.
Procedure
If you are shutting the cluster down for an extended period, determine the date on which certificates expire, and run the following command:
$ oc -n openshift-kube-apiserver-operator get secret kube-apiserver-to-kubelet-signer -o jsonpath='{.metadata.annotations.auth\.openshift\.io/certificate-not-after}'
Example output
2022-08-05T14:37:50Z 1
- 1
- To ensure that the cluster can restart gracefully, plan to restart it on or before the specified date. As the cluster restarts, the process might require you to manually approve the pending certificate signing requests (CSRs) to recover kubelet certificates.
Mark all the nodes in the cluster as unschedulable. You can do this from your cloud provider's web console, or by running the following loop:
$ for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm cordon ${node} ; done
Example output
ci-ln-mgdnf4b-72292-n547t-master-0 node/ci-ln-mgdnf4b-72292-n547t-master-0 cordoned ci-ln-mgdnf4b-72292-n547t-master-1 node/ci-ln-mgdnf4b-72292-n547t-master-1 cordoned ci-ln-mgdnf4b-72292-n547t-master-2 node/ci-ln-mgdnf4b-72292-n547t-master-2 cordoned ci-ln-mgdnf4b-72292-n547t-worker-a-s7ntl node/ci-ln-mgdnf4b-72292-n547t-worker-a-s7ntl cordoned ci-ln-mgdnf4b-72292-n547t-worker-b-cmc9k node/ci-ln-mgdnf4b-72292-n547t-worker-b-cmc9k cordoned ci-ln-mgdnf4b-72292-n547t-worker-c-vcmtn node/ci-ln-mgdnf4b-72292-n547t-worker-c-vcmtn cordoned
Evacuate the pods by using the following method:
$ for node in $(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm drain ${node} --delete-emptydir-data --ignore-daemonsets=true --timeout=15s --force ; done
Shut down all of the nodes in the cluster. You can do this from your cloud provider's web console, or by running the following loop. Shutting down the nodes by using one of these methods allows pods to terminate gracefully, which reduces the chance for data corruption.
Note: Ensure that the control plane node with the API VIP assigned is the last node processed in the loop. Otherwise, the shutdown command fails.
$ for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do oc debug node/${node} -- chroot /host shutdown -h 1; done 1
- 1
- -h 1 indicates how long, in minutes, this process lasts before the control plane nodes are shut down. For large clusters with 10 nodes or more, set to -h 10 or longer to make sure all the compute nodes have time to shut down first.
Example output
Starting pod/ip-10-0-130-169us-east-2computeinternal-debug ... To use host binaries, run `chroot /host` Shutdown scheduled for Mon 2021-09-13 09:36:17 UTC, use 'shutdown -c' to cancel. Removing debug pod ... Starting pod/ip-10-0-150-116us-east-2computeinternal-debug ... To use host binaries, run `chroot /host` Shutdown scheduled for Mon 2021-09-13 09:36:29 UTC, use 'shutdown -c' to cancel.
Note: It is not necessary to drain control plane nodes of the standard pods that ship with OpenShift Container Platform prior to shutdown. Cluster administrators are responsible for ensuring a clean restart of their own workloads after the cluster is restarted. If you drained control plane nodes prior to shutdown because of custom workloads, you must mark the control plane nodes as schedulable before the cluster will be functional again after restart.
Shut off any cluster dependencies that are no longer needed, such as external storage or an LDAP server. Be sure to consult your vendor's documentation before doing so.
Important: If you deployed your cluster on a cloud provider platform, do not shut down, suspend, or delete the associated cloud resources. If you delete the cloud resources of a suspended virtual machine, OpenShift Container Platform might not restore successfully.
2.3. Additional resources
Chapter 3. Restarting the cluster gracefully
This document describes the process to restart your cluster after a graceful shutdown.
Even though the cluster is expected to be functional after the restart, the cluster might not recover due to unexpected conditions, for example:
- etcd data corruption during shutdown
- Node failure due to hardware
- Network connectivity issues
If your cluster fails to recover, follow the steps to restore to a previous cluster state.
3.1. Prerequisites
3.2. Restarting the cluster
You can restart your cluster after it has been shut down gracefully.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- This procedure assumes that you gracefully shut down the cluster.
Procedure
- Power on any cluster dependencies, such as external storage or an LDAP server.
Start all cluster machines.
Use the appropriate method for your cloud environment to start the machines, for example, from your cloud provider's web console.
Wait approximately 10 minutes before continuing to check the status of the control plane nodes.
Verify that all control plane nodes are ready.
$ oc get nodes -l node-role.kubernetes.io/master
The control plane nodes are ready if the status of the nodes is Ready, as shown in the following output:
NAME STATUS ROLES AGE VERSION ip-10-0-168-251.ec2.internal Ready control-plane,master 75m v1.29.4 ip-10-0-170-223.ec2.internal Ready control-plane,master 75m v1.29.4 ip-10-0-211-16.ec2.internal Ready control-plane,master 75m v1.29.4
If the control plane nodes are not ready, check whether there are any pending certificate signing requests (CSRs) that must be approved.
Get the list of current CSRs:
$ oc get csr
Review the details of a CSR to verify that it is valid:
$ oc describe csr <csr_name> 1
- 1
- <csr_name> is the name of a CSR from the list of current CSRs.
Approve each valid CSR:
$ oc adm certificate approve <csr_name>
After the control plane nodes are ready, verify that all worker nodes are ready.
$ oc get nodes -l node-role.kubernetes.io/worker
The worker nodes are ready if the status of the nodes is Ready, as shown in the following output:
NAME STATUS ROLES AGE VERSION ip-10-0-179-95.ec2.internal Ready worker 64m v1.29.4 ip-10-0-182-134.ec2.internal Ready worker 64m v1.29.4 ip-10-0-250-100.ec2.internal Ready worker 64m v1.29.4
If the worker nodes are not ready, check whether there are any pending certificate signing requests (CSRs) that must be approved.
Get the list of current CSRs:
$ oc get csr
Review the details of a CSR to verify that it is valid:
$ oc describe csr <csr_name> 1
- 1
- <csr_name> is the name of a CSR from the list of current CSRs.
Approve each valid CSR:
$ oc adm certificate approve <csr_name>
After the control plane and worker nodes are ready, mark all the nodes in the cluster as schedulable. Run the following command:
$ for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm uncordon ${node} ; done
Verify that the cluster started properly.
Check that there are no degraded cluster Operators:
$ oc get clusteroperators
Check that there are no cluster Operators with the DEGRADED condition set to True:
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE authentication 4.16.0 True False False 59m cloud-credential 4.16.0 True False False 85m cluster-autoscaler 4.16.0 True False False 73m config-operator 4.16.0 True False False 73m console 4.16.0 True False False 62m csi-snapshot-controller 4.16.0 True False False 66m dns 4.16.0 True False False 76m etcd 4.16.0 True False False 76m ...
Check that all nodes are in the Ready state:
$ oc get nodes
Check that the status of all nodes is Ready:
NAME STATUS ROLES AGE VERSION ip-10-0-168-251.ec2.internal Ready control-plane,master 82m v1.29.4 ip-10-0-170-223.ec2.internal Ready control-plane,master 82m v1.29.4 ip-10-0-179-95.ec2.internal Ready worker 70m v1.29.4 ip-10-0-182-134.ec2.internal Ready worker 70m v1.29.4 ip-10-0-211-16.ec2.internal Ready control-plane,master 82m v1.29.4 ip-10-0-250-100.ec2.internal Ready worker 69m v1.29.4
If the cluster did not start properly, you might need to restore your cluster by using an etcd backup.
Additional resources
Chapter 4. OADP Application backup and restore
4.1. Introduction to OpenShift API for Data Protection
The OpenShift API for Data Protection (OADP) product safeguards customer applications on OpenShift Container Platform. It offers comprehensive disaster recovery protection, covering OpenShift Container Platform applications, application-related cluster resources, persistent volumes, and internal images. OADP is also capable of backing up both containerized applications and virtual machines (VMs).
However, OADP does not serve as a disaster recovery solution for etcd or OpenShift Operators.
OADP support is provided to customer workload namespaces and cluster scope resources.
4.1.1. OpenShift API for Data Protection APIs
OpenShift API for Data Protection (OADP) provides APIs that enable multiple approaches to customizing backups and preventing the inclusion of unnecessary or inappropriate resources.
OADP provides the following APIs:
4.1.1.1. Support for OpenShift API for Data Protection
Version | OCP version | General availability (GA) | Full support ends | Maintenance ends | Extended Update Support (EUS) | Extended Update Support Term 2 (EUS Term 2)
1.4 |
| 10 Jul 2024 | Release of 1.5 | Release of 1.6 | 27 Jun 2026; EUS must be on OCP 4.16 | 27 Jun 2027; EUS Term 2 must be on OCP 4.16
1.3 |
| 29 Nov 2023 | 10 Jul 2024 | Release of 1.5 | 31 Oct 2025; EUS must be on OCP 4.14 | 31 Oct 2026; EUS Term 2 must be on OCP 4.14
4.1.1.1.1. Unsupported versions of the OADP Operator
Version | General availability (GA) | Full support ended | Maintenance ended
1.2 | 14 Jun 2023 | 29 Nov 2023 | 10 Jul 2024
1.1 | 01 Sep 2022 | 14 Jun 2023 | 29 Nov 2023
1.0 | 09 Feb 2022 | 01 Sep 2022 | 14 Jun 2023
For more details about EUS, see Extended Update Support.
For more details about EUS Term 2, see Extended Update Support Term 2.
Additional resources
4.2. OADP release notes
4.2.1. OADP 1.4 release notes
The release notes for OpenShift API for Data Protection (OADP) describe new features and enhancements, deprecated features, product recommendations, known issues, and resolved issues.
For additional information about OADP, see OpenShift API for Data Protection (OADP) FAQs.
4.2.1.1. OADP 1.4.2 release notes
The OpenShift API for Data Protection (OADP) 1.4.2 release notes list new features, resolved issues and bugs, and known issues.
4.2.1.1.1. New features
Backing up different volumes in the same namespace by using the VolumePolicy feature is possible
With this release, Velero provides resource policies to back up different volumes in the same namespace by using the VolumePolicy feature. The supported VolumePolicy feature to back up different volumes includes the skip, snapshot, and fs-backup actions; a sketch follows below. OADP-1071
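As a minimal sketch of such a resource policy, assuming a hypothetical storage class gp2-csi and a hypothetical ConfigMap named backup-volume-policy created from this data, the policy and its reference from a Backup CR could look like the following:
# Policy data, for example loaded with:
#   oc create cm backup-volume-policy -n openshift-adp --from-file=policy.yaml
version: v1
volumePolicies:
  - conditions:
      storageClass:
        - gp2-csi          # volumes using this storage class...
    action:
      type: fs-backup      # ...are backed up with file system backup
  - conditions:
      nfs: {}              # any NFS volume...
    action:
      type: skip           # ...is skipped entirely
The Backup CR then points at the ConfigMap:
spec:
  resourcePolicy:
    kind: configmap
    name: backup-volume-policy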
Short-term credentials can be used with File System Backup and Data Mover
File System Backup and Data Mover can now use short-term credentials such as AWS Security Token Service (STS) and GCP WIF. With this support, backups are completed successfully without the PartiallyFailed status. OADP-5095
4.2.1.1.2. Resolved issues
DPA now reports errors if a VSL contains an incorrect provider value
Previously, if the provider of a Volume Snapshot Location (VSL) spec was incorrect, the Data Protection Application (DPA) reconciled successfully. With this update, the DPA reports errors and requests a valid provider value. OADP-5044
Data Mover restores are successful irrespective of using different OADP namespaces for backup and restore
Previously, when a backup operation was executed with OADP installed in one namespace but was restored by using OADP installed in a different namespace, the Data Mover restore failed. With this update, the Data Mover restore is now successful. OADP-5460
SSE-C backups work with the computed MD5 of the secret key
Previously, backups failed with the following error:
Requests specifying Server Side Encryption with Customer provided keys must provide the client calculated MD5 of the secret key.
With this update, Server-Side Encryption with Customer-Provided Keys (SSE-C) is fixed with the base64 and MD5 hash. As a result, SSE-C backups work with the computed MD5 of the secret key. In addition, incorrect error handling for the customerKey size was also fixed. OADP-5388
For a complete list of all issues resolved in this release, see the list of OADP 1.4.2 resolved issues in Jira.
4.2.1.1.3. Known issues
The nodeSelector spec is not supported for the Data Mover restore action
When a Data Protection Application (DPA) is created with the nodeSelector field set in the nodeAgent parameter, the Data Mover restore partially fails instead of completing the restore operation. OADP-5260
The S3 storage does not use the proxy environment when TLS skip verify is specified
In the image registry backup, the S3 storage does not use the proxy environment when the insecureSkipTLSVerify parameter is set to true. OADP-3143
Kopia does not delete artifacts after backup expiration
Even after you delete a backup, Kopia does not delete the volume artifacts from ${bucket_name}/kopia/$openshift-adp after the backup expires. For more information, see "About Kopia repository maintenance". OADP-5131
Additional resources
4.2.1.2. OADP 1.4.1 release notes
The OpenShift API for Data Protection (OADP) 1.4.1 release notes list new features, resolved issues and bugs, and known issues.
4.2.1.2.1. New features
New DPA fields to update client qps and burst
You can now change the Velero server Kubernetes API queries per second and burst values by using the new Data Protection Application (DPA) fields. The new DPA fields are spec.configuration.velero.client-qps and spec.configuration.velero.client-burst. The default value for each is 100; a sketch follows below. OADP-4076
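A minimal sketch of the placement, with the surrounding DPA fields abbreviated (the values shown are illustrative):
# Excerpt from a DataProtectionApplication spec
spec:
  configuration:
    velero:
      client-qps: 300     # Kubernetes API queries per second from the Velero server
      client-burst: 500   # burst ceiling for those queries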
Enabling non-default algorithms with Kopia
With this update, you can now configure the hash, encryption, and splitter algorithms in Kopia to select non-default options, to optimize performance for different backup workloads.
To configure these algorithms, set the env variable of a velero pod in the podConfig section of the DataProtectionApplication (DPA) configuration; a sketch follows below. If this variable is not set, or an unsupported algorithm is chosen, Kopia defaults to its standard algorithms. OADP-4640
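A minimal sketch of the placement, assuming the environment variable names KOPIA_HASHING_ALGORITHM, KOPIA_ENCRYPTION_ALGORITHM, and KOPIA_SPLITTER_ALGORITHM, and Kopia's upstream algorithm identifiers; verify both against your OADP version before relying on them:
# Excerpt from a DataProtectionApplication spec
spec:
  configuration:
    velero:
      podConfig:
        env:
          - name: KOPIA_HASHING_ALGORITHM      # assumed variable name
            value: BLAKE3-256
          - name: KOPIA_ENCRYPTION_ALGORITHM   # assumed variable name
            value: AES256-GCM-HMAC-SHA256
          - name: KOPIA_SPLITTER_ALGORITHM     # assumed variable name
            value: DYNAMIC-8M-RABINKARP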
4.2.1.2.2. Resolved issues
Backup restores successfully without pods
Previously, restoring a backup without pods, when the StorageClass VolumeBindingMode was set to WaitForFirstConsumer, failed with the error: err: context deadline exceeded. With this update, patching of the dynamic PV is skipped, and the backup restores successfully without the PartiallyFailed status. OADP-4231
The PodVolumeBackup CR displays the correct message
Previously, the PodVolumeBackup custom resource (CR) generated an incorrect message, which was: get a podvolumebackup with status "InProgress" during the server starting, mark it as "Failed". With this update, the generated message is now:
found a podvolumebackup with status "InProgress" during the server starting, mark it as "Failed".
imagePullPolicy can be overridden with the DPA
Previously, OADP set the imagePullPolicy parameter to Always for all images. With this update, OADP checks whether each image contains a sha256 or sha512 digest; if so, it sets imagePullPolicy to IfNotPresent, and otherwise imagePullPolicy is set to Always. You can now override this policy by using the new spec.containerImagePullPolicy DPA field; a sketch follows below. OADP-4172
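A minimal sketch of the override, placing the new field at the top level of the DPA spec as the field path spec.containerImagePullPolicy suggests; verify the exact placement against the OADP 1.4.1 CRD:
# Excerpt from a DataProtectionApplication spec
spec:
  containerImagePullPolicy: IfNotPresent   # overrides the computed pull policy
  configuration:
    velero:
      defaultPlugins:
        - openshift
        - aws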
OADP Velero can now retry updating the restore status if the initial update fails
Previously, OADP Velero failed to update the restored CR status. This left the status at InProgress indefinitely, and components that rely on the restore CR status to verify completion failed. With this update, the restore CR status correctly progresses to the Completed or Failed status. OADP-3227
Restoring BuildConfig Builds from a different cluster is successful without errors
Previously, when performing a restore of BuildConfig Build resources from a different cluster, the application generated a TLS verification error for the internal image registry. This resulted in the error: failed to verify certificate: x509: certificate signed by unknown authority. With this update, restoring the BuildConfig Build resources to a different cluster succeeds without generating the failed to verify certificate error. OADP-4692
Restoring an empty PVC succeeds
Previously, data download failed while restoring an empty persistent volume claim (PVC). It failed with the following error:
data path restore failed: Failed to run kopia restore: Unable to load snapshot : snapshot not found
With this update, the data download proceeds correctly when restoring an empty PVC, and no error message is generated. OADP-3106
No Velero memory leak with the CSI and DataMover plugins
Previously, a Velero memory leak occurred when the CSI and DataMover plugins were used. When the backup ended, the Velero plugin instance was not deleted, and the leak consumed memory until an Out of Memory (OOM) condition occurred in the Velero pod. With this update, there is no Velero memory leak when using the CSI and DataMover plugins. OADP-4448
Post-hook operation does not start before the related PVs are released
Previously, due to the asynchronous nature of the Data Mover operation, a post-hook might have been attempted before the Data Mover persistent volume claim (PVC) released the persistent volumes (PVs) of the related pods. This issue caused the backup to fail with the PartiallyFailed status. With this update, the post-hook operation is not started until the related PVs are released by the Data Mover PVC, which eliminates the PartiallyFailed backup status. OADP-3140
Deploying a DPA works as expected in namespaces with more than 37 characters
Previously, when you installed the OADP Operator in a namespace with more than 37 characters to create a new DPA, labeling the "cloud-credentials" secret failed and the DPA reported the following error:
The generated label name is too long.
With this update, creating a DPA does not fail in namespaces with names longer than 37 characters. OADP-3960
Restore completes successfully by overriding the timeout error
Previously, in large-scale environments, the restore operation resulted in a Partiallyfailed status with the error: fail to patch dynamic PV, err: context deadline exceeded. With this update, the resourceTimeout Velero server argument overrides this timeout error, and the restore completes successfully. OADP-4344
For a complete list of all issues resolved in this release, see the list of OADP 1.4.1 resolved issues in Jira.
4.2.1.2.3. Known issues
Cryostat application pods enter the CrashLoopBackoff status after restoring OADP
After an OADP restore, the Cryostat application pods might enter the CrashLoopBackoff status. To work around this problem, delete the StatefulSet pods that return the CrashLoopBackoff error after restoring OADP. The StatefulSet controller then recreates these pods, and they run normally. OADP-4407
Deployments referencing an ImageStream are not restored properly, leading to corrupted pod and volume contents
During a File System Backup (FSB) restore operation, a Deployment resource referencing an ImageStream is not restored properly. The restored pod that runs the FSB, and the postHook, is terminated prematurely.
During the restore operation, the OpenShift Container Platform controller updates the spec.template.spec.containers[0].image field in the Deployment resource with an updated ImageStreamTag hash. The update triggers the rollout of a new pod, terminating the pod on which velero runs the FSB along with the post-hook. For more information about image stream triggers, see Triggering updates on image stream changes.
The workaround for this behavior is a two-step restore process:
Perform a restore that excludes the Deployment resources, for example:
$ velero restore create <RESTORE_NAME> \ --from-backup <BACKUP_NAME> \ --exclude-resources=deployment.apps
After the first restore is successful, perform a second restore that includes these resources, for example:
$ velero restore create <RESTORE_NAME> \ --from-backup <BACKUP_NAME> \ --include-resources=deployment.apps
4.2.1.3. OADP 1.4.0 release notes
The OpenShift API for Data Protection (OADP) 1.4.0 release notes list resolved issues and known issues.
4.2.1.3.1. Resolved issues
Restore works correctly in OpenShift Container Platform 4.16
Previously, while restoring a deleted application namespace, the restore operation partially failed with a resource name may not be empty error in OpenShift Container Platform 4.16. With this update, restore works as expected in OpenShift Container Platform 4.16. OADP-4075
Data Mover backups work properly in the OpenShift Container Platform 4.16 cluster
Previously, Velero used an earlier version of the SDK in which the Spec.SourceVolumeMode field did not exist. As a consequence, Data Mover backups failed in the OpenShift Container Platform 4.16 cluster on the external snapshotter with version 4.2. With this update, the external snapshotter is upgraded to version 7.0 or later. As a result, backups do not fail in the OpenShift Container Platform 4.16 cluster. OADP-3922
For a complete list of all issues resolved in this release, see the list of OADP 1.4.0 resolved issues in Jira.
4.2.1.3.2. Known issues
Backup fails when checksumAlgorithm is not set for MCG
While performing a backup of any application with Noobaa as the backup location, if the checksumAlgorithm configuration parameter is not set, the backup fails. To work around this problem, if you do not provide a value for checksumAlgorithm in the backup storage location (BSL) configuration, an empty value is added. The empty value is only added for BSLs that are created by using the Data Protection Application (DPA) custom resource (CR); this value is not added if the BSL is created by using any other method. OADP-4274
For a complete list of all known issues in this release, see the list of OADP 1.4.0 known issues in Jira.
4.2.1.3.3. Upgrade notes
Always upgrade to the next minor version. Do not skip versions. To update to a later version, upgrade only one channel at a time. For example, to upgrade from OpenShift API for Data Protection (OADP) 1.1 to 1.3, upgrade first to 1.2, and then to 1.3.
4.2.1.3.3.1. Changes from OADP 1.3 to 1.4
The Velero server has been updated from version 1.12 to 1.14. Note that there are no changes in the Data Protection Application (DPA).
This changes the following:
- The velero-plugin-for-csi code is now available in the Velero code, which means an init container is no longer needed for the plugin.
- Velero changed the client Burst and QPS defaults from 30 and 20, respectively, to 100.
The velero-plugin-for-aws plugin updated the default value of the spec.config.checksumAlgorithm field in BackupStorageLocation objects (BSLs) from "" (no checksum calculation) to the CRC32 algorithm. The checksum algorithm types are known to work only with AWS. Several S3 providers require the md5sum to be disabled by setting the checksum algorithm to "". Confirm md5sum algorithm support and configuration with your storage provider.
In OADP 1.4, the default value for BSLs created within the DPA for this configuration is "". This default value means that the md5sum is not checked, which is consistent with OADP 1.3. For BSLs created within the DPA, update it by using the spec.backupLocations[].velero.config.checksumAlgorithm field in the DPA. If your BSLs are created outside the DPA, you can update this configuration by using spec.config.checksumAlgorithm in the BSLs; a sketch of both placements follows below.
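A minimal sketch of each placement, abbreviating the surrounding fields; both excerpts assume an AWS-style S3 location as used elsewhere in this document:
# In a backup location managed by the DPA:
spec:
  backupLocations:
    - velero:
        config:
          checksumAlgorithm: ""   # disable the checksum for non-AWS S3 providers
# In a standalone BackupStorageLocation object:
spec:
  config:
    checksumAlgorithm: ""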
4.2.1.3.3.2. Backing up the DPA configuration
You must back up your current DataProtectionApplication (DPA) configuration.
Procedure
Save your current DPA configuration by running the following command:
Example command
$ oc get dpa -n openshift-adp -o yaml > dpa.orig.backup
4.2.1.3.3.3. Upgrading the OADP Operator
Use the following procedure when upgrading the OpenShift API for Data Protection (OADP) Operator.
Procedure
- Change your subscription channel for the OADP Operator from stable-1.3 to stable-1.4; a CLI sketch follows below.
- Wait for the Operator and containers to update and restart.
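As a minimal sketch of making the channel change from the CLI, assuming a subscription named redhat-oadp-operator; list the subscriptions in the openshift-adp namespace first to confirm the name in your cluster:
$ oc get subscriptions -n openshift-adp
$ oc patch subscription redhat-oadp-operator -n openshift-adp --type merge -p '{"spec":{"channel":"stable-1.4"}}'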
Additional resources
4.2.1.3.4. Converting DPA to the new version
To upgrade from OADP 1.3 to 1.4, no Data Protection Application (DPA) changes are required.
4.2.1.3.5. Verifying the upgrade
Use the following procedure to verify the upgrade.
Procedure
Verify the installation by viewing the OpenShift API for Data Protection (OADP) resources by running the following command:
$ oc get all -n openshift-adp
Example output
NAME READY STATUS RESTARTS AGE pod/oadp-operator-controller-manager-67d9494d47-6l8z8 2/2 Running 0 2m8s pod/restic-9cq4q 1/1 Running 0 94s pod/restic-m4lts 1/1 Running 0 94s pod/restic-pv4kr 1/1 Running 0 95s pod/velero-588db7f655-n842v 1/1 Running 0 95s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/oadp-operator-controller-manager-metrics-service ClusterIP 172.30.70.140 <none> 8443/TCP 2m8s NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/restic 3 3 3 3 3 <none> 96s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/oadp-operator-controller-manager 1/1 1 1 2m9s deployment.apps/velero 1/1 1 1 96s NAME DESIRED CURRENT READY AGE replicaset.apps/oadp-operator-controller-manager-67d9494d47 1 1 1 2m9s replicaset.apps/velero-588db7f655 1 1 1 96s
Verify that the DataProtectionApplication (DPA) is reconciled by running the following command:
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
Example output
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
- Verify the type is set to Reconciled.
Verify the backup storage location and confirm that the PHASE is Available by running the following command:
$ oc get backupstoragelocations.velero.io -n openshift-adp
Example output
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 1s 3d16h true
4.3. OADP performance
4.3.1. OADP recommended network settings
For a supported environment of OpenShift API for Data Protection (OADP), have a stable and resilient network in a supported cloud environment that meets OpenShift node, S3 storage, and OpenShift network requirement recommendations.
To ensure successful backup and restore operations for deployments with remote S3 buckets that have a less than optimal data path, it is recommended that your network settings meet the following minimum requirements in such conditions:
- Bandwidth (network upload speed to object storage): 2 Mbps or higher for small backups, and 10-100 Mbps depending on the data volume for larger backups.
- Packet loss: 1%
- Packet corruption: 1%
- Latency: 100 ms
Ensure that your OpenShift Container Platform network performs optimally and meets OpenShift Container Platform network requirements.
Red Hat provides support for standard backup and restore failures, but does not provide support for failures caused by network settings that do not meet the recommended thresholds.
4.4. OADP features and plugins
OpenShift API for Data Protection (OADP) features provide options for backing up and restoring applications.
The default plugins enable Velero to integrate with certain cloud providers and to back up and restore OpenShift Container Platform resources.
4.4.1. OADP features
OpenShift API for Data Protection (OADP) supports the following features:
- Backup
You can use OADP to back up all applications on the OpenShift Platform, or you can filter resources by type, namespace, or label.
OADP backs up Kubernetes objects and internal images by saving them as an archive file on object storage. OADP backs up persistent volumes (PVs) by creating snapshots with the native cloud snapshot API or with the Container Storage Interface (CSI). For cloud providers that do not support snapshots, OADP backs up resources and PV data with Restic.
Note: You must exclude Operators from the backup of an application for backup and restore to succeed.
- Restore
You can restore resources and PVs from a backup. You can restore all objects in a backup, or filter the objects by namespace, PV, or label.
Note: You must exclude Operators from the backup of an application for backup and restore to succeed.
- Schedule
- You can schedule backups at specified intervals.
- Hooks
- You can use hooks to run commands in a container on a pod, for example, fsfreeze to freeze a file system. You can configure a hook to run before or after a backup or restore. Restore hooks can run in an init container or in the application container. A sketch of a pre-backup hook follows this list.
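A minimal sketch of a pre-backup hook declared as pod annotations, using Velero's hook annotation keys; the pod name, container name, command, and mount path are hypothetical:
apiVersion: v1
kind: Pod
metadata:
  name: my-app                                    # hypothetical pod
  annotations:
    pre.hook.backup.velero.io/container: my-app   # container that runs the hook
    pre.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--freeze", "/var/lib/data"]'
    post.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--unfreeze", "/var/lib/data"]'
spec:
  containers:
    - name: my-app
      image: registry.example.com/my-app:latest   # hypothetical image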
4.4.2. OADP plugins
OpenShift API for Data Protection (OADP) provides default Velero plugins that are integrated with storage providers to support backup and snapshot operations. You can create custom plugins based on the Velero plugins.
OADP also provides plugins for OpenShift Container Platform resource backups, OpenShift Virtualization resource backups, and Container Storage Interface (CSI) snapshots.
OADP plugin | Function | Storage location
---|---|---|
| Backs up and restores Kubernetes objects. | AWS S3 |
Backs up and restores volumes with snapshots. | AWS EBS | |
| Backs up and restores Kubernetes objects. | Microsoft Azure Blob storage |
Backs up and restores volumes with snapshots. | Microsoft Azure Managed Disks | |
| Backs up and restores Kubernetes objects. | Google Cloud Storage |
Backs up and restores volumes with snapshots. | Google Compute Engine Disks | |
| Backs up and restores OpenShift Container Platform resources. [1] | Object store |
| Backs up and restores OpenShift Virtualization resources. [2] | Object store |
| Backs up and restores volumes with CSI snapshots. [3] | Cloud storage that supports CSI snapshots |
| The VolumeSnapshotMover relocates snapshots from the cluster into an object store to be used during a restore process to recover stateful applications, in situations such as cluster deletion. [4] | Object store |
- Mandatory.
- Virtual machine disks are backed up with CSI snapshots or Restic.
- The csi plugin uses the Kubernetes CSI snapshot API.
  - OADP 1.1 or later uses snapshot.storage.k8s.io/v1
  - OADP 1.0 uses snapshot.storage.k8s.io/v1beta1
- OADP 1.2 only.
4.4.3. About OADP Velero plugins
You can configure two types of plugins when you install Velero:
- Default cloud provider plugins
- Custom plugins
Both types of plugins are optional, but most users configure at least one cloud provider plugin.
4.4.3.1. Default Velero cloud provider plugins
You can install any of the following default Velero cloud provider plugins when you configure the oadp_v1alpha1_dpa.yaml file during deployment:
- aws (Amazon Web Services)
- gcp (Google Cloud Platform)
- azure (Microsoft Azure)
- openshift (OpenShift Velero plugin)
- csi (Container Storage Interface)
- kubevirt (KubeVirt)
You specify the desired default plugins in the oadp_v1alpha1_dpa.yaml file during deployment.
Example file
The following .yaml file installs the openshift, aws, azure, and gcp plugins:
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: dpa-sample spec: configuration: velero: defaultPlugins: - openshift - aws - azure - gcp
4.4.3.2. Custom Velero plugins
You can install a custom Velero plugin by specifying the plugin image and name when you configure the oadp_v1alpha1_dpa.yaml file during deployment.
You specify the desired custom plugins in the oadp_v1alpha1_dpa.yaml file during deployment.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: dpa-sample spec: configuration: velero: defaultPlugins: - openshift - azure - gcp customPlugins: - name: custom-plugin-example image: quay.io/example-repo/custom-velero-plugin
4.4.3.3.
4.4.4.
- s390x
OADP 1.2.0 and later versions support the ARM64 architecture.
4.4.5. OADP support for IBM Power and IBM Z
OpenShift API for Data Protection (OADP) is platform neutral. The information that follows relates only to IBM Power® and IBM Z®.
- OADP 1.1.7 was tested successfully against OpenShift Container Platform 4.11 for both IBM Power® and IBM Z®. The sections that follow give testing and support information for OADP 1.1.7 in terms of backup locations for these systems.
- OADP 1.2.3 was tested successfully against OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15 for both IBM Power® and IBM Z®. The sections that follow give testing and support information for OADP 1.2.3 in terms of backup locations for these systems.
- OADP 1.3.3 was tested successfully against OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15 for both IBM Power® and IBM Z®. The sections that follow give testing and support information for OADP 1.3.3 in terms of backup locations for these systems.
- OADP 1.4.2 was tested successfully against OpenShift Container Platform 4.14, 4.15, and 4.16 for both IBM Power® and IBM Z®. The sections that follow give testing and support information for OADP 1.4.2 in terms of backup locations for these systems.
4.4.5.1. OADP support for target backup locations using IBM Power
- IBM Power® running with OpenShift Container Platform 4.11 and 4.12, and OpenShift API for Data Protection (OADP) 1.1.7 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Power® with OpenShift Container Platform 4.11 and 4.12, and OADP 1.1.7 against all non-AWS S3 backup location targets as well.
- IBM Power® running with OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15, and OADP 1.2.3 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Power® with OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15, and OADP 1.2.3 against all non-AWS S3 backup location targets as well.
- IBM Power® running with OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15, and OADP 1.3.3 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Power® with OpenShift Container Platform 4.13, 4.14, and 4.15, and OADP 1.3.3 against all non-AWS S3 backup location targets as well.
- IBM Power® running with OpenShift Container Platform 4.14, 4.15, and 4.16, and OADP 1.4.2 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Power® with OpenShift Container Platform 4.14, 4.15, and 4.16, and OADP 1.4.2 against all non-AWS S3 backup location targets as well.
4.4.5.2. OADP testing and support for target backup locations using IBM Z
- IBM Z® running with OpenShift Container Platform 4.11 and 4.12, and OpenShift API for Data Protection (OADP) 1.1.7 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Z® with OpenShift Container Platform 4.11 and 4.12, and OADP 1.1.7 against all non-AWS S3 backup location targets as well.
- IBM Z® running with OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15, and OADP 1.2.3 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Z® with OpenShift Container Platform 4.12, 4.14, and 4.15, and OADP 1.2.3 against all non-AWS S3 backup location targets as well.
- IBM Z® running with OpenShift Container Platform 4.12, 4.13, 4.14, and 4.15, and 1.3.3 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Z® with OpenShift Container Platform 4.13, 4.14, and 4.15, and 1.3.3 against all non-AWS S3 backup location targets as well.
- IBM Z® running with OpenShift Container Platform 4.14, 4.15, and 4.16 was tested successfully against an AWS S3 backup location target. Although the test involved only an AWS S3 target, Red Hat supports running IBM Z® with OpenShift Container Platform 4.14, 4.15, and 4.16, and 1.4.2 against all non-AWS S3 backup location targets as well.
4.4.5.2.1. Known issue of OADP using IBM Power® and IBM Z® platforms
- Currently, there are backup method restrictions for Single-node OpenShift clusters deployed on IBM Power® and IBM Z® platforms. Only NFS storage is currently compatible with Single-node OpenShift clusters on these platforms. In addition, only the File System Backup (FSB) methods such as Kopia and Restic are supported for backup and restore operations. There is currently no workaround for this issue.
4.4.6. Known issues of OADP plugins
The following section describes known issues of OpenShift API for Data Protection (OADP) plugins.
4.4.6.1. Velero plugin panics during imagestream backups due to a missing secret
When the backup and the backup storage location (BSL) are managed outside the scope of the Data Protection Application (DPA), the OADP controller does not create the relevant oadp-<bsl_name>-<bsl_provider>-registry-secret in the DPA reconciliation.
When the backup is run, the OpenShift Velero plugin panics on the imagestream backup with the following panic error:
024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item" backup=openshift-adp/<backup name> error="error executing custom action (groupResource=imagestreams.image.openshift.io, namespace=<BSL Name>, name=postgres): rpc error: code = Aborted desc = plugin panicked: runtime error: index out of range with length 1, stack trace: goroutine 94…
4.4.6.1.1. Workaround to avoid the panic error
To avoid the Velero plugin panic error, perform the following steps:
Label the custom BSL with the relevant label:
$ oc label backupstoragelocations.velero.io <bsl_name> app.kubernetes.io/component=bsl
- Note: After the BSL is labeled and the DPA reconciles, you can check that the relevant registry secret exists by running the following command:
$ oc -n openshift-adp get secret/oadp-<bsl_name>-<bsl_provider>-registry-secret -o json | jq -r '.data'
4.4.6.2.
4.4.6.2.1.
4.5.
4.5.1.
4.5.1.1.
apiVersion: objectbucket.io/v1alpha1 kind: ObjectBucketClaim metadata: name: test-obc 1 namespace: openshift-adp spec: storageClassName: openshift-storage.noobaa.io generateBucketName: test-backup-bucket 2
$ oc create -f <obc_file_name> 1
$ oc extract --to=- cm/test-obc 1
# BUCKET_NAME backup-c20...41fd # BUCKET_PORT 443 # BUCKET_REGION # BUCKET_SUBREGION # BUCKET_HOST s3.openshift-storage.svc
$ oc extract --to=- secret/test-obc
# AWS_ACCESS_KEY_ID ebYR....xLNMc # AWS_SECRET_ACCESS_KEY YXf...+NaCkdyC3QPym
$ oc get route s3 -n openshift-storage
[default] aws_access_key_id=<AWS_ACCESS_KEY_ID> aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
$ oc create secret generic \ cloud-credentials \ -n openshift-adp \ --from-file cloud=cloud-credentials
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: oadp-backup namespace: openshift-adp spec: configuration: nodeAgent: enable: true uploaderType: kopia velero: defaultPlugins: - aws - openshift - csi defaultSnapshotMoveData: true 1 backupLocations: - velero: config: profile: "default" region: noobaa s3Url: https://s3.openshift-storage.svc 2 s3ForcePathStyle: "true" insecureSkipTLSVerify: "true" provider: aws default: true credential: key: cloud name: cloud-credentials objectStorage: bucket: <bucket_name> 3 prefix: oadp
$ oc apply -f <dpa_filename>
$ oc get dpa -o yaml
apiVersion: v1 items: - apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: namespace: openshift-adp #...# spec: backupLocations: - velero: config: #...# status: conditions: - lastTransitionTime: "20....9:54:02Z" message: Reconcile complete reason: Complete status: "True" type: Reconciled kind: List metadata: resourceVersion: ""
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 3s 15s true
apiVersion: velero.io/v1 kind: Backup metadata: name: test-backup namespace: openshift-adp spec: includedNamespaces: - <application_namespace> 1
$ oc apply -f <backup_cr_filename>
$ oc describe backup test-backup -n openshift-adp
Name: test-backup Namespace: openshift-adp # ....# Status: Backup Item Operations Attempted: 1 Backup Item Operations Completed: 1 Completion Timestamp: 2024-09-25T10:17:01Z Expiration: 2024-10-25T10:16:31Z Format Version: 1.1.0 Hook Status: Phase: Completed Progress: Items Backed Up: 34 Total Items: 34 Start Timestamp: 2024-09-25T10:16:31Z Version: 1 Events: <none>
4.5.2.
4.5.2.1.
apiVersion: velero.io/v1 kind: Restore metadata: name: test-restore 1 namespace: openshift-adp spec: backupName: <backup_name> 2 restorePVs: true namespaceMapping: <application_namespace>: test-restore-application 3
$ oc apply -f <restore_cr_filename>
$ oc describe restores.velero.io <restore_name> -n openshift-adp
$ oc project test-restore-application
$ oc get pvc,svc,deployment,secret,configmap
NAME STATUS VOLUME persistentvolumeclaim/mysql Bound pvc-9b3583db-...-14b86 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/mysql ClusterIP 172....157 <none> 3306/TCP 2m56s service/todolist ClusterIP 172.....15 <none> 8000/TCP 2m56s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/mysql 0/1 1 0 2m55s NAME TYPE DATA AGE secret/builder-dockercfg-6bfmd kubernetes.io/dockercfg 1 2m57s secret/default-dockercfg-hz9kz kubernetes.io/dockercfg 1 2m57s secret/deployer-dockercfg-86cvd kubernetes.io/dockercfg 1 2m57s secret/mysql-persistent-sa-dockercfg-rgp9b kubernetes.io/dockercfg 1 2m57s NAME DATA AGE configmap/kube-root-ca.crt 1 2m57s configmap/openshift-service-ca.crt 1 2m57s
4.5.3.
4.5.3.1.
apiVersion: objectbucket.io/v1alpha1 kind: ObjectBucketClaim metadata: name: test-obc 1 namespace: openshift-adp spec: storageClassName: openshift-storage.noobaa.io generateBucketName: test-backup-bucket 2
$ oc create -f <obc_file_name>
$ oc extract --to=- cm/test-obc 1
# BUCKET_NAME backup-c20...41fd # BUCKET_PORT 443 # BUCKET_REGION # BUCKET_SUBREGION # BUCKET_HOST s3.openshift-storage.svc
$ oc extract --to=- secret/test-obc
# AWS_ACCESS_KEY_ID ebYR....xLNMc # AWS_SECRET_ACCESS_KEY YXf...+NaCkdyC3QPym
[default] aws_access_key_id=<AWS_ACCESS_KEY_ID> aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
$ oc create secret generic \ cloud-credentials \ -n openshift-adp \ --from-file cloud=cloud-credentials
$ oc get cm/openshift-service-ca.crt \ -o jsonpath='{.data.service-ca\.crt}' | base64 -w0; echo
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0... ....gpwOHMwaG9CRmk5a3....FLS0tLS0K
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: oadp-backup namespace: openshift-adp spec: configuration: nodeAgent: enable: true uploaderType: kopia velero: defaultPlugins: - aws - openshift - csi defaultSnapshotMoveData: true backupLocations: - velero: config: profile: "default" region: noobaa s3Url: https://s3.openshift-storage.svc s3ForcePathStyle: "true" insecureSkipTLSVerify: "false" 1 provider: aws default: true credential: key: cloud name: cloud-credentials objectStorage: bucket: <bucket_name> 2 prefix: oadp caCert: <ca_cert> 3
$ oc apply -f <dpa_filename>
$ oc get dpa -o yaml
apiVersion: v1 items: - apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: namespace: openshift-adp #...# spec: backupLocations: - velero: config: #...# status: conditions: - lastTransitionTime: "20....9:54:02Z" message: Reconcile complete reason: Complete status: "True" type: Reconciled kind: List metadata: resourceVersion: ""
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 3s 15s true
apiVersion: velero.io/v1 kind: Backup metadata: name: test-backup namespace: openshift-adp spec: includedNamespaces: - <application_namespace> 1
$ oc apply -f <backup_cr_filename>
$ oc describe backup test-backup -n openshift-adp
Name: test-backup Namespace: openshift-adp # ....# Status: Backup Item Operations Attempted: 1 Backup Item Operations Completed: 1 Completion Timestamp: 2024-09-25T10:17:01Z Expiration: 2024-10-25T10:16:31Z Format Version: 1.1.0 Hook Status: Phase: Completed Progress: Items Backed Up: 34 Total Items: 34 Start Timestamp: 2024-09-25T10:16:31Z Version: 1 Events: <none>
4.5.4.
4.5.4.1.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: oadp-backup namespace: openshift-adp spec: configuration: nodeAgent: enable: true uploaderType: kopia velero: defaultPlugins: - legacy-aws 1 - openshift - csi defaultSnapshotMoveData: true backupLocations: - velero: config: profile: "default" region: noobaa s3Url: https://s3.openshift-storage.svc s3ForcePathStyle: "true" insecureSkipTLSVerify: "true" provider: aws default: true credential: key: cloud name: cloud-credentials objectStorage: bucket: <bucket_name> 2 prefix: oadp
$ oc apply -f <dpa_filename>
$ oc get dpa -o yaml
apiVersion: v1 items: - apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: namespace: openshift-adp #...# spec: backupLocations: - velero: config: #...# status: conditions: - lastTransitionTime: "20....9:54:02Z" message: Reconcile complete reason: Complete status: "True" type: Reconciled kind: List metadata: resourceVersion: ""
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 3s 15s true
apiVersion: velero.io/v1 kind: Backup metadata: name: test-backup namespace: openshift-adp spec: includedNamespaces: - <application_namespace> 1
$ oc apply -f <backup_cr_filename>
$ oc describe backups.velero.io test-backup -n openshift-adp
Name: test-backup Namespace: openshift-adp # ....# Status: Backup Item Operations Attempted: 1 Backup Item Operations Completed: 1 Completion Timestamp: 2024-09-25T10:17:01Z Expiration: 2024-10-25T10:16:31Z Format Version: 1.1.0 Hook Status: Phase: Completed Progress: Items Backed Up: 34 Total Items: 34 Start Timestamp: 2024-09-25T10:16:31Z Version: 1 Events: <none>
4.6.
4.6.1.
4.6.1.1.
4.6.1.1.1.
4.6.1.1.2.
4.6.1.1.3.
4.6.1.2.
4.6.1.3.
4.6.1.4.
4.6.1.5.
4.6.1.5.1.
4.6.1.5.2.
resources: mds: limits: cpu: "3" memory: 128Gi requests: cpu: "3" memory: 8Gi
4.6.2.
4.6.2.1.
4.6.3.
4.6.3.1.
4.6.3.2.
$ BUCKET=<your_bucket>
$ REGION=<your_region>
$ aws s3api create-bucket \ --bucket $BUCKET \ --region $REGION \ --create-bucket-configuration LocationConstraint=$REGION 1
$ aws iam create-user --user-name velero 1
$ cat > velero-policy.json <<EOF { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "ec2:DescribeVolumes", "ec2:DescribeSnapshots", "ec2:CreateTags", "ec2:CreateVolume", "ec2:CreateSnapshot", "ec2:DeleteSnapshot" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:DeleteObject", "s3:PutObject", "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts" ], "Resource": [ "arn:aws:s3:::${BUCKET}/*" ] }, { "Effect": "Allow", "Action": [ "s3:ListBucket", "s3:GetBucketLocation", "s3:ListBucketMultipartUploads" ], "Resource": [ "arn:aws:s3:::${BUCKET}" ] } ] } EOF
$ aws iam put-user-policy \ --user-name velero \ --policy-name velero \ --policy-document file://velero-policy.json
$ aws iam create-access-key --user-name velero
{ "AccessKey": { "UserName": "velero", "Status": "Active", "CreateDate": "2017-07-31T22:24:41.576Z", "SecretAccessKey": <AWS_SECRET_ACCESS_KEY>, "AccessKeyId": <AWS_ACCESS_KEY_ID> } }
$ cat << EOF > ./credentials-velero [default] aws_access_key_id=<AWS_ACCESS_KEY_ID> aws_secret_access_key=<AWS_SECRET_ACCESS_KEY> EOF
4.6.3.3.
4.6.3.3.1.
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
4.6.3.3.2.
[backupStorage] aws_access_key_id=<AWS_ACCESS_KEY_ID> aws_secret_access_key=<AWS_SECRET_ACCESS_KEY> [volumeSnapshot] aws_access_key_id=<AWS_ACCESS_KEY_ID> aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero 1
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> namespace: openshift-adp spec: ... backupLocations: - name: default velero: provider: aws default: true objectStorage: bucket: <bucket_name> prefix: <prefix> config: region: us-east-1 profile: "backupStorage" credential: key: cloud name: cloud-credentials snapshotLocations: - velero: provider: aws config: region: us-west-2 profile: "volumeSnapshot"
4.6.3.3.3.
apiVersion: oadp.openshift.io/v1alpha1 kind: BackupStorageLocation metadata: name: default namespace: openshift-adp spec: provider: aws 1 objectStorage: bucket: <bucket_name> 2 prefix: <bucket_prefix> 3 credential: 4 key: cloud 5 name: cloud-credentials 6 config: region: <bucket_region> 7 s3ForcePathStyle: "true" 8 s3Url: <s3_url> 9 publicUrl: <public_s3_url> 10 serverSideEncryption: AES256 11 kmsKeyId: "50..c-4da1-419f-a16e-ei...49f" 12 customerKeyEncryptionFile: "/credentials/customer-key" 13 signatureVersion: "1" 14 profile: "default" 15 insecureSkipTLSVerify: "true" 16 enableSharedConfig: "true" 17 tagging: "" 18 checksumAlgorithm: "CRC32" 19
4.6.3.3.4.
snapshotLocations: - velero: config: profile: default region: <region> provider: aws # ...
$ dd if=/dev/urandom bs=1 count=32 > sse.key
$ cat sse.key | base64 > sse_encoded.key
$ ln -s sse_encoded.key customer-key
$ oc create secret generic cloud-credentials --namespace openshift-adp --from-file cloud=<path>/openshift_aws_credentials,customer-key=<path>/sse_encoded.key
apiVersion: v1 data: cloud: W2Rfa2V5X2lkPSJBS0lBVkJRWUIyRkQ0TlFHRFFPQiIKYXdzX3NlY3JldF9hY2Nlc3Nfa2V5P<snip>rUE1mNWVSbTN5K2FpeWhUTUQyQk1WZHBOIgo= customer-key: v+<snip>TFIiq6aaXPbj8dhos= kind: Secret # ...
spec: backupLocations: - velero: config: customerKeyEncryptionFile: /credentials/customer-key profile: default # ...
Warning
$ echo "encrypt me please" > test.txt
$ aws s3api put-object \ --bucket <bucket> \ --key test.txt \ --body test.txt \ --sse-customer-key fileb://sse.key \ --sse-customer-algorithm AES256
$ s3cmd get s3://<bucket>/test.txt test.txt
$ aws s3api get-object \ --bucket <bucket> \ --key test.txt \ --sse-customer-key fileb://sse.key \ --sse-customer-algorithm AES256 \ downloaded.txt
$ cat downloaded.txt
encrypt me please
4.6.3.3.4.1.
$ aws s3api get-object \ --bucket <bucket> \ --key velero/backups/mysql-persistent-customerkeyencryptionfile4/mysql-persistent-customerkeyencryptionfile4.tar.gz \ --sse-customer-key fileb://sse.key \ --sse-customer-algorithm AES256 \ --debug \ velero_download.tar.gz
4.6.3.4.
4.6.3.4.1.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> spec: # ... configuration: velero: podConfig: nodeSelector: <node_selector> 1 resourceAllocations: 2 limits: cpu: "1" memory: 1024Mi requests: cpu: 200m memory: 256Mi
4.6.3.4.2.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> spec: # ... backupLocations: - name: default velero: provider: aws default: true objectStorage: bucket: <bucket> prefix: <prefix> caCert: <base64_encoded_cert_string> 1 config: insecureSkipTLSVerify: "false" 2 # ...
4.6.3.4.2.1.
$ alias velero='oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
$ velero version Client: Version: v1.12.1-OADP Git commit: - Server: Version: v1.12.1-OADP
$ CA_CERT=$(oc -n openshift-adp get dataprotectionapplications.oadp.openshift.io <dpa-name> -o jsonpath='{.spec.backupLocations[0].velero.objectStorage.caCert}') $ [[ -n $CA_CERT ]] && echo "$CA_CERT" | base64 -d | oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "cat > /tmp/your-cacert.txt" || echo "DPA BSL has no caCert"
$ velero describe backup <backup_name> --details --cacert /tmp/<your_cacert>.txt
$ velero backup logs <backup_name> --cacert /tmp/<your_cacert.txt>
$ oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "ls /tmp/your-cacert.txt" /tmp/your-cacert.txt
4.6.3.5.
- Note
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> namespace: openshift-adp 1 spec: configuration: velero: defaultPlugins: - openshift 2 - aws resourceTimeout: 10m 3 nodeAgent: 4 enable: true 5 uploaderType: kopia 6 podConfig: nodeSelector: <node_selector> 7 backupLocations: - name: default velero: provider: aws default: true objectStorage: bucket: <bucket_name> 8 prefix: <prefix> 9 config: region: <region> profile: "default" s3ForcePathStyle: "true" 10 s3Url: <s3_url> 11 credential: key: cloud name: cloud-credentials 12 snapshotLocations: 13 - name: default velero: provider: aws config: region: <region> 14 profile: "default" credential: key: cloud name: cloud-credentials 15
$ oc get all -n openshift-adp
NAME READY STATUS RESTARTS AGE pod/oadp-operator-controller-manager-67d9494d47-6l8z8 2/2 Running 0 2m8s pod/node-agent-9cq4q 1/1 Running 0 94s pod/node-agent-m4lts 1/1 Running 0 94s pod/node-agent-pv4kr 1/1 Running 0 95s pod/velero-588db7f655-n842v 1/1 Running 0 95s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/oadp-operator-controller-manager-metrics-service ClusterIP 172.30.70.140 <none> 8443/TCP 2m8s service/openshift-adp-velero-metrics-svc ClusterIP 172.30.10.0 <none> 8085/TCP 8h NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/node-agent 3 3 3 3 3 <none> 96s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/oadp-operator-controller-manager 1/1 1 1 2m9s deployment.apps/velero 1/1 1 1 96s NAME DESIRED CURRENT READY AGE replicaset.apps/oadp-operator-controller-manager-67d9494d47 1 1 1 2m9s replicaset.apps/velero-588db7f655 1 1 1 96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 1s 3d16h true
4.6.3.5.1.
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration: nodeAgent: enable: true podConfig: nodeSelector: node-role.kubernetes.io/nodeAgent: ""
configuration: nodeAgent: enable: true podConfig: nodeSelector: node-role.kubernetes.io/infra: "" node-role.kubernetes.io/worker: ""
4.6.3.6.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: test-dpa namespace: openshift-adp spec: backupLocations: - name: default velero: config: checksumAlgorithm: "" 1 insecureSkipTLSVerify: "true" profile: "default" region: <bucket_region> s3ForcePathStyle: "true" s3Url: <bucket_url> credential: key: cloud name: cloud-credentials default: true objectStorage: bucket: <bucket_name> prefix: velero provider: aws configuration: velero: defaultPlugins: - openshift - aws - csi
4.6.3.7.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: test-dpa namespace: openshift-adp spec: backupLocations: - name: default velero: config: insecureSkipTLSVerify: "true" profile: "default" region: <bucket_region> s3ForcePathStyle: "true" s3Url: <bucket_url> credential: key: cloud name: cloud-credentials default: true objectStorage: bucket: <bucket_name> prefix: velero provider: aws configuration: nodeAgent: enable: true uploaderType: restic velero: client-burst: 500 1 client-qps: 300 2 defaultPlugins: - openshift - aws - kubevirt
4.6.3.8.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication #... backupLocations: - name: aws 1 velero: provider: aws default: true 2 objectStorage: bucket: <bucket_name> 3 prefix: <prefix> 4 config: region: <region_name> 5 profile: "default" credential: key: cloud name: cloud-credentials 6 - name: odf 7 velero: provider: aws default: false objectStorage: bucket: <bucket_name> prefix: <prefix> config: profile: "default" region: <region_name> s3Url: <url> 8 insecureSkipTLSVerify: "true" s3ForcePathStyle: "true" credential: key: cloud name: <custom_secret_name_odf> 9 #...
apiVersion: velero.io/v1 kind: Backup # ... spec: includedNamespaces: - <namespace> 1 storageLocation: <backup_storage_location> 2 defaultVolumesToFsBackup: true
4.6.3.8.1.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication ... spec: configuration: velero: defaultPlugins: - openshift - csi 1
4.6.3.8.2.
# ... configuration: nodeAgent: enable: false 1 uploaderType: kopia # ...
# ... configuration: nodeAgent: enable: true 1 uploaderType: kopia # ...
4.6.4.
4.6.4.1.
$ ibmcloud plugin install cos -f
$ BUCKET=<bucket_name>
$ REGION=<bucket_region> 1
$ ibmcloud resource group-create <resource_group_name>
$ ibmcloud target -g <resource_group_name>
$ ibmcloud target
API endpoint: https://cloud.ibm.com Region: User: test-user Account: Test Account (fb6......e95) <-> 2...122 Resource group: Default
$ RESOURCE_GROUP=<resource_group> 1
$ ibmcloud resource service-instance-create \ <service_instance_name> \1 <service_name> \2 <service_plan> \3 <region_name> 4
$ ibmcloud resource service-instance-create test-service-instance cloud-object-storage \ 1 standard \ global \ -d premium-global-deployment 2
$ SERVICE_INSTANCE_ID=$(ibmcloud resource service-instance test-service-instance --output json | jq -r '.[0].id')
$ ibmcloud cos bucket-create \ --bucket $BUCKET \ --ibm-service-instance-id $SERVICE_INSTANCE_ID \ --region $REGION
$ ibmcloud resource service-key-create test-key Writer --instance-name test-service-instance --parameters {\"HMAC\":true}
$ cat > credentials-velero << __EOF__ [default] aws_access_key_id=$(ibmcloud resource service-key test-key -o json | jq -r '.[0].credentials.cos_hmac_keys.access_key_id') aws_secret_access_key=$(ibmcloud resource service-key test-key -o json | jq -r '.[0].credentials.cos_hmac_keys.secret_access_key') __EOF__
4.6.4.2.
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
4.6.4.3.
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
$ oc create secret generic <custom_secret> -n openshift-adp --from-file cloud=credentials-velero
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> namespace: openshift-adp spec: ... backupLocations: - velero: provider: <provider> default: true credential: key: cloud name: <custom_secret> 1 objectStorage: bucket: <bucket_name> prefix: <prefix>
4.6.4.4.
- Note
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: namespace: openshift-adp name: <dpa_name> spec: configuration: velero: defaultPlugins: - openshift - aws - csi backupLocations: - velero: provider: aws 1 default: true objectStorage: bucket: <bucket_name> 2 prefix: velero config: insecureSkipTLSVerify: 'true' profile: default region: <region_name> 3 s3ForcePathStyle: 'true' s3Url: <s3_url> 4 credential: key: cloud name: cloud-credentials 5
$ oc get all -n openshift-adp
NAME READY STATUS RESTARTS AGE pod/oadp-operator-controller-manager-67d9494d47-6l8z8 2/2 Running 0 2m8s pod/node-agent-9cq4q 1/1 Running 0 94s pod/node-agent-m4lts 1/1 Running 0 94s pod/node-agent-pv4kr 1/1 Running 0 95s pod/velero-588db7f655-n842v 1/1 Running 0 95s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/oadp-operator-controller-manager-metrics-service ClusterIP 172.30.70.140 <none> 8443/TCP 2m8s service/openshift-adp-velero-metrics-svc ClusterIP 172.30.10.0 <none> 8085/TCP 8h NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/node-agent 3 3 3 3 3 <none> 96s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/oadp-operator-controller-manager 1/1 1 1 2m9s deployment.apps/velero 1/1 1 1 96s NAME DESIRED CURRENT READY AGE replicaset.apps/oadp-operator-controller-manager-67d9494d47 1 1 1 2m9s replicaset.apps/velero-588db7f655 1 1 1 96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME PHASE LAST VALIDATED AGE DEFAULT dpa-sample-1 Available 1s 3d16h true
4.6.4.5.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> spec: # ... configuration: velero: podConfig: nodeSelector: <node_selector> 1 resourceAllocations: 2 limits: cpu: "1" memory: 1024Mi requests: cpu: 200m memory: 256Mi
4.6.4.6.
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration: nodeAgent: enable: true podConfig: nodeSelector: node-role.kubernetes.io/nodeAgent: ""
configuration: nodeAgent: enable: true podConfig: nodeSelector: node-role.kubernetes.io/infra: "" node-role.kubernetes.io/worker: ""
4.6.4.7.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: test-dpa namespace: openshift-adp spec: backupLocations: - name: default velero: config: insecureSkipTLSVerify: "true" profile: "default" region: <bucket_region> s3ForcePathStyle: "true" s3Url: <bucket_url> credential: key: cloud name: cloud-credentials default: true objectStorage: bucket: <bucket_name> prefix: velero provider: aws configuration: nodeAgent: enable: true uploaderType: restic velero: client-burst: 500 1 client-qps: 300 2 defaultPlugins: - openshift - aws - kubevirt
4.6.4.8.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication #... backupLocations: - name: aws 1 velero: provider: aws default: true 2 objectStorage: bucket: <bucket_name> 3 prefix: <prefix> 4 config: region: <region_name> 5 profile: "default" credential: key: cloud name: cloud-credentials 6 - name: odf 7 velero: provider: aws default: false objectStorage: bucket: <bucket_name> prefix: <prefix> config: profile: "default" region: <region_name> s3Url: <url> 8 insecureSkipTLSVerify: "true" s3ForcePathStyle: "true" credential: key: cloud name: <custom_secret_name_odf> 9 #...
apiVersion: velero.io/v1 kind: Backup # ... spec: includedNamespaces: - <namespace> 1 storageLocation: <backup_storage_location> 2 defaultVolumesToFsBackup: true
4.6.4.9.
# ... configuration: nodeAgent: enable: false 1 uploaderType: kopia # ...
# ... configuration: nodeAgent: enable: true 1 uploaderType: kopia # ...
4.6.5.
4.6.5.1.
4.6.5.2.
4.6.5.2.1.
$ oc create secret generic cloud-credentials-azure -n openshift-adp --from-file cloud=credentials-velero
4.6.5.2.2.
$ oc create secret generic cloud-credentials-azure -n openshift-adp --from-file cloud=credentials-velero
$ oc create secret generic <custom_secret> -n openshift-adp --from-file cloud=credentials-velero
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> namespace: openshift-adp spec: ... backupLocations: - velero: config: resourceGroup: <azure_resource_group> storageAccount: <azure_storage_account_id> subscriptionId: <azure_subscription_id> storageAccountKeyEnvVar: AZURE_STORAGE_ACCOUNT_ACCESS_KEY credential: key: cloud name: <custom_secret> 1 provider: azure default: true objectStorage: bucket: <bucket_name> prefix: <prefix> snapshotLocations: - velero: config: resourceGroup: <azure_resource_group> subscriptionId: <azure_subscription_id> incremental: "true" provider: azure
4.6.5.3.
4.6.5.3.1.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> spec: # ... configuration: velero: podConfig: nodeSelector: <node_selector> 1 resourceAllocations: 2 limits: cpu: "1" memory: 1024Mi requests: cpu: 200m memory: 256Mi
4.6.5.3.2.
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> spec: # ... backupLocations: - name: default velero: provider: aws default: true objectStorage: bucket: <bucket> prefix: <prefix> caCert: <base64_encoded_cert_string> 1 config: insecureSkipTLSVerify: "false" 2 # ...
4.6.5.3.2.1.
$ alias velero='oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
$ velero version Client: Version: v1.12.1-OADP Git commit: - Server: Version: v1.12.1-OADP
$ CA_CERT=$(oc -n openshift-adp get dataprotectionapplications.oadp.openshift.io <dpa-name> -o jsonpath='{.spec.backupLocations[0].velero.objectStorage.caCert}') $ [[ -n $CA_CERT ]] && echo "$CA_CERT" | base64 -d | oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "cat > /tmp/your-cacert.txt" || echo "DPA BSL has no caCert"
$ velero describe backup <backup_name> --details --cacert /tmp/<your_cacert>.txt
$ velero backup logs <backup_name> --cacert /tmp/<your_cacert.txt>
$ oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "ls /tmp/your-cacert.txt" /tmp/your-cacert.txt
4.6.5.4.
- Note
apiVersion: oadp.openshift.io/v1alpha1 kind: DataProtectionApplication metadata: name: <dpa_sample> namespace: openshift-adp 1 spec: configuration: velero: defaultPlugins: - azure - openshift 2 resourceTimeout: 10m 3 nodeAgent: 4 enable: true 5 uploaderType: kopia 6 podConfig: nodeSelector: <node_selector> 7 backupLocations: - velero: config: resourceGroup: <azure_resource_group> 8 storageAccount: <azure_storage_account_id> 9 subscriptionId: <azure_subscription_id> 10 storageAccountKeyEnvVar: AZURE_STORAGE_ACCOUNT_ACCESS_KEY credential: key: cloud name: cloud-credentials-azure 11 provider: azure default: true objectStorage: bucket: <bucket_name> 12 prefix: <prefix> 13 snapshotLocations: 14 - velero: config: resourceGroup: <azure_resource_group> subscriptionId: <azure_subscription_id> incremental: "true" name: default provider: azure credential: key: cloud name: cloud-credentials-azure 15
$ oc get all -n openshift-adp
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2   Running   0          2m8s
pod/node-agent-9cq4q                                     1/1   Running   0          94s
pod/node-agent-m4lts                                     1/1   Running   0          94s
pod/node-agent-pv4kr                                     1/1   Running   0          95s
pod/velero-588db7f655-n842v                              1/1   Running   0          95s

NAME                                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true
4.6.5.5. Configuring the DPA with client burst and QPS settings
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      config:
        insecureSkipTLSVerify: "true"
        profile: "default"
        region: <bucket_region>
        s3ForcePathStyle: "true"
        s3Url: <bucket_url>
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: <bucket_name>
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
    velero:
      client-burst: 500 1
      client-qps: 300 2
      defaultPlugins:
      - openshift
      - aws
      - kubevirt
4.6.5.5.1. Configuring node agents and node labels
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/nodeAgent: ""

configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
        node-role.kubernetes.io/worker: ""
4.6.5.5.2. Enabling CSI in the DataProtectionApplication
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - csi 1
4.6.5.5.3. Disabling the node agent in DataProtectionApplication
# ...
configuration:
  nodeAgent:
    enable: false 1
    uploaderType: kopia
# ...

# ...
configuration:
  nodeAgent:
    enable: true 1
    uploaderType: kopia
# ...
4.6.6. Configuring the OpenShift API for Data Protection with Google Cloud Platform
4.6.6.1. Configuring Google Cloud Platform
$ gcloud auth login
$ BUCKET=<bucket> 1
$ gsutil mb gs://$BUCKET/
$ PROJECT_ID=$(gcloud config get-value project)
$ gcloud iam service-accounts create velero \
    --display-name "Velero service account"
$ gcloud iam service-accounts list
$ SERVICE_ACCOUNT_EMAIL=$(gcloud iam service-accounts list \
    --filter="displayName:Velero service account" \
    --format 'value(email)')

$ ROLE_PERMISSIONS=(
    compute.disks.get
    compute.disks.create
    compute.disks.createSnapshot
    compute.snapshots.get
    compute.snapshots.create
    compute.snapshots.useReadOnly
    compute.snapshots.delete
    compute.zones.get
    storage.objects.create
    storage.objects.delete
    storage.objects.get
    storage.objects.list
    iam.serviceAccounts.signBlob
)

$ gcloud iam roles create velero.server \
    --project $PROJECT_ID \
    --title "Velero Server" \
    --permissions "$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")"

$ gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
    --role projects/$PROJECT_ID/roles/velero.server
$ gsutil iam ch serviceAccount:$SERVICE_ACCOUNT_EMAIL:objectAdmin gs://${BUCKET}
$ gcloud iam service-accounts keys create credentials-velero \
    --iam-account $SERVICE_ACCOUNT_EMAIL
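If you want to confirm that the key was created before wiring credentials-velero into a Secret, listing the service account keys is a reasonable check. This step is not part of the original procedure:

$ gcloud iam service-accounts keys list --iam-account $SERVICE_ACCOUNT_EMAIL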
4.6.6.2. About backup and snapshot locations, and their secrets
4.6.6.2.1. Creating a default Secret
$ oc create secret generic cloud-credentials-gcp -n openshift-adp --from-file cloud=credentials-velero
4.6.6.2.2. Creating secrets for different credentials
$ oc create secret generic cloud-credentials-gcp -n openshift-adp --from-file cloud=credentials-velero
$ oc create secret generic <custom_secret> -n openshift-adp --from-file cloud=credentials-velero
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp
spec:
...
  backupLocations:
  - velero:
      provider: gcp
      default: true
      credential:
        key: cloud
        name: <custom_secret> 1
      objectStorage:
        bucket: <bucket_name>
        prefix: <prefix>
  snapshotLocations:
  - velero:
      provider: gcp
      default: true
      config:
        project: <project>
        snapshotLocation: us-west1
4.6.6.3. Configuring the Data Protection Application
4.6.6.3.1. Setting Velero CPU and memory resource allocations
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  configuration:
    velero:
      podConfig:
        nodeSelector: <node_selector> 1
        resourceAllocations: 2
          limits:
            cpu: "1"
            memory: 1024Mi
          requests:
            cpu: 200m
            memory: 256Mi
4.6.6.3.2. Enabling self-signed CA certificates
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  backupLocations:
  - name: default
    velero:
      provider: aws
      default: true
      objectStorage:
        bucket: <bucket>
        prefix: <prefix>
        caCert: <base64_encoded_cert_string> 1
      config:
        insecureSkipTLSVerify: "false" 2
# ...
4.6.6.3.2.1. Using CA certificates with the velero command aliased for Velero deployment
$ alias velero='oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
$ velero version
Client:
	Version: v1.12.1-OADP
	Git commit: -
Server:
	Version: v1.12.1-OADP

$ CA_CERT=$(oc -n openshift-adp get dataprotectionapplications.oadp.openshift.io <dpa-name> -o jsonpath='{.spec.backupLocations[0].velero.objectStorage.caCert}')

$ [[ -n $CA_CERT ]] && echo "$CA_CERT" | base64 -d | oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "cat > /tmp/your-cacert.txt" || echo "DPA BSL has no caCert"
$ velero describe backup <backup_name> --details --cacert /tmp/<your_cacert>.txt
$ velero backup logs <backup_name> --cacert /tmp/<your_cacert.txt>
$ oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "ls /tmp/your-cacert.txt"

/tmp/your-cacert.txt
4.6.6.4. Configuring Google workload identity federation cloud authentication
$ mkdir -p oadp-credrequest
echo 'apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: oadp-operator-credentials
  namespace: openshift-cloud-credential-operator
spec:
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: GCPProviderSpec
    permissions:
    - compute.disks.get
    - compute.disks.create
    - compute.disks.createSnapshot
    - compute.snapshots.get
    - compute.snapshots.create
    - compute.snapshots.useReadOnly
    - compute.snapshots.delete
    - compute.zones.get
    - storage.objects.create
    - storage.objects.delete
    - storage.objects.get
    - storage.objects.list
    - iam.serviceAccounts.signBlob
    skipServiceCheck: true
  secretRef:
    name: cloud-credentials-gcp
    namespace: <OPERATOR_INSTALL_NS>
  serviceAccountNames:
  - velero
' > oadp-credrequest/credrequest.yaml

$ ccoctl gcp create-service-accounts \
    --name=<name> \
    --project=<gcp_project_id> \
    --credentials-requests-dir=oadp-credrequest \
    --workload-identity-pool=<pool_id> \
    --workload-identity-provider=<provider_id>
$ oc create namespace <OPERATOR_INSTALL_NS>
$ oc apply -f manifests/openshift-adp-cloud-credentials-gcp-credentials.yaml
4.6.6.4.1.
4.6.6.5. Installing the Data Protection Application
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: <OPERATOR_INSTALL_NS> 1
spec:
  configuration:
    velero:
      defaultPlugins:
      - gcp
      - openshift 2
      resourceTimeout: 10m 3
    nodeAgent: 4
      enable: true 5
      uploaderType: kopia 6
      podConfig:
        nodeSelector: <node_selector> 7
  backupLocations:
  - velero:
      provider: gcp
      default: true
      credential:
        key: cloud 8
        name: cloud-credentials-gcp 9
      objectStorage:
        bucket: <bucket_name> 10
        prefix: <prefix> 11
  snapshotLocations: 12
  - velero:
      provider: gcp
      default: true
      config:
        project: <project>
        snapshotLocation: us-west1 13
      credential:
        key: cloud
        name: cloud-credentials-gcp 14
  backupImages: true 15
$ oc get all -n openshift-adp
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2   Running   0          2m8s
pod/node-agent-9cq4q                                     1/1   Running   0          94s
pod/node-agent-m4lts                                     1/1   Running   0          94s
pod/node-agent-pv4kr                                     1/1   Running   0          95s
pod/velero-588db7f655-n842v                              1/1   Running   0          95s

NAME                                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true
4.6.6.6. Configuring the DPA with client burst and QPS settings
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      config:
        insecureSkipTLSVerify: "true"
        profile: "default"
        region: <bucket_region>
        s3ForcePathStyle: "true"
        s3Url: <bucket_url>
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: <bucket_name>
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
    velero:
      client-burst: 500 1
      client-qps: 300 2
      defaultPlugins:
      - openshift
      - aws
      - kubevirt
4.6.6.6.1. Configuring node agents and node labels
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/nodeAgent: ""

configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
        node-role.kubernetes.io/worker: ""
4.6.6.6.2. Enabling CSI in the DataProtectionApplication
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - csi 1
4.6.6.6.3. Disabling the node agent in DataProtectionApplication
# ...
configuration:
  nodeAgent:
    enable: false 1
    uploaderType: kopia
# ...

# ...
configuration:
  nodeAgent:
    enable: true 1
    uploaderType: kopia
# ...
4.6.7. Configuring the OpenShift API for Data Protection with Multicloud Object Gateway
4.6.7.1. Retrieving Multicloud Object Gateway credentials
$ cat << EOF > ./credentials-velero
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
EOF
4.6.7.2. About backup and snapshot locations, and their secrets
4.6.7.2.1. Creating a default Secret
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
4.6.7.2.2. Creating secrets for different credentials
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
$ oc create secret generic <custom_secret> -n openshift-adp --from-file cloud=credentials-velero
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp
spec:
...
  backupLocations:
  - velero:
      config:
        profile: "default"
        region: <region_name> 1
        s3Url: <url>
        insecureSkipTLSVerify: "true"
        s3ForcePathStyle: "true"
      provider: aws
      default: true
      credential:
        key: cloud
        name: <custom_secret> 2
      objectStorage:
        bucket: <bucket_name>
        prefix: <prefix>
4.6.7.3. Configuring the Data Protection Application
4.6.7.3.1. Setting Velero CPU and memory resource allocations
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  configuration:
    velero:
      podConfig:
        nodeSelector: <node_selector> 1
        resourceAllocations: 2
          limits:
            cpu: "1"
            memory: 1024Mi
          requests:
            cpu: 200m
            memory: 256Mi
4.6.7.3.2. Enabling self-signed CA certificates
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  backupLocations:
  - name: default
    velero:
      provider: aws
      default: true
      objectStorage:
        bucket: <bucket>
        prefix: <prefix>
        caCert: <base64_encoded_cert_string> 1
      config:
        insecureSkipTLSVerify: "false" 2
# ...
4.6.7.3.2.1. Using CA certificates with the velero command aliased for Velero deployment
$ alias velero='oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
$ velero version
Client:
	Version: v1.12.1-OADP
	Git commit: -
Server:
	Version: v1.12.1-OADP

$ CA_CERT=$(oc -n openshift-adp get dataprotectionapplications.oadp.openshift.io <dpa-name> -o jsonpath='{.spec.backupLocations[0].velero.objectStorage.caCert}')

$ [[ -n $CA_CERT ]] && echo "$CA_CERT" | base64 -d | oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "cat > /tmp/your-cacert.txt" || echo "DPA BSL has no caCert"
$ velero describe backup <backup_name> --details --cacert /tmp/<your_cacert>.txt
$ velero backup logs <backup_name> --cacert /tmp/<your_cacert.txt>
$ oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "ls /tmp/your-cacert.txt"

/tmp/your-cacert.txt
4.6.7.4. Installing the Data Protection Application
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp 1
spec:
  configuration:
    velero:
      defaultPlugins:
      - aws 2
      - openshift 3
      resourceTimeout: 10m 4
    nodeAgent: 5
      enable: true 6
      uploaderType: kopia 7
      podConfig:
        nodeSelector: <node_selector> 8
  backupLocations:
  - velero:
      config:
        profile: "default"
        region: <region_name> 9
        s3Url: <url> 10
        insecureSkipTLSVerify: "true"
        s3ForcePathStyle: "true"
      provider: aws
      default: true
      credential:
        key: cloud
        name: cloud-credentials 11
      objectStorage:
        bucket: <bucket_name> 12
        prefix: <prefix> 13
$ oc get all -n openshift-adp
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2   Running   0          2m8s
pod/node-agent-9cq4q                                     1/1   Running   0          94s
pod/node-agent-m4lts                                     1/1   Running   0          94s
pod/node-agent-pv4kr                                     1/1   Running   0          95s
pod/velero-588db7f655-n842v                              1/1   Running   0          95s

NAME                                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true
4.6.7.5. Configuring the DPA with client burst and QPS settings
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      config:
        insecureSkipTLSVerify: "true"
        profile: "default"
        region: <bucket_region>
        s3ForcePathStyle: "true"
        s3Url: <bucket_url>
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: <bucket_name>
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
    velero:
      client-burst: 500 1
      client-qps: 300 2
      defaultPlugins:
      - openshift
      - aws
      - kubevirt
4.6.7.5.1. Configuring node agents and node labels
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/nodeAgent: ""

configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
        node-role.kubernetes.io/worker: ""
4.6.7.5.2. Enabling CSI in the DataProtectionApplication
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - csi 1
4.6.7.5.3. Disabling the node agent in DataProtectionApplication
# ...
configuration:
  nodeAgent:
    enable: false 1
    uploaderType: kopia
# ...

# ...
configuration:
  nodeAgent:
    enable: true 1
    uploaderType: kopia
# ...
4.6.8. Configuring the OpenShift API for Data Protection with OpenShift Data Foundation
4.6.8.1. About backup and snapshot locations, and their secrets
4.6.8.1.1. Creating a default Secret
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
4.6.8.1.2. Creating secrets for different credentials
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
$ oc create secret generic <custom_secret> -n openshift-adp --from-file cloud=credentials-velero
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp
spec:
...
  backupLocations:
  - velero:
      provider: <provider>
      default: true
      credential:
        key: cloud
        name: <custom_secret> 1
      objectStorage:
        bucket: <bucket_name>
        prefix: <prefix>
4.6.8.2. Configuring the Data Protection Application
4.6.8.2.1. Setting Velero CPU and memory resource allocations
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  configuration:
    velero:
      podConfig:
        nodeSelector: <node_selector> 1
        resourceAllocations: 2
          limits:
            cpu: "1"
            memory: 1024Mi
          requests:
            cpu: 200m
            memory: 256Mi
4.6.8.2.1.1.
4.6.8.2.1.1.1.
4.6.8.2.2. Enabling self-signed CA certificates
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  backupLocations:
  - name: default
    velero:
      provider: aws
      default: true
      objectStorage:
        bucket: <bucket>
        prefix: <prefix>
        caCert: <base64_encoded_cert_string> 1
      config:
        insecureSkipTLSVerify: "false" 2
# ...
4.6.8.2.2.1. Using CA certificates with the velero command aliased for Velero deployment
$ alias velero='oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
$ velero version
Client:
	Version: v1.12.1-OADP
	Git commit: -
Server:
	Version: v1.12.1-OADP

$ CA_CERT=$(oc -n openshift-adp get dataprotectionapplications.oadp.openshift.io <dpa-name> -o jsonpath='{.spec.backupLocations[0].velero.objectStorage.caCert}')

$ [[ -n $CA_CERT ]] && echo "$CA_CERT" | base64 -d | oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "cat > /tmp/your-cacert.txt" || echo "DPA BSL has no caCert"
$ velero describe backup <backup_name> --details --cacert /tmp/<your_cacert>.txt
$ velero backup logs <backup_name> --cacert /tmp/<your_cacert.txt>
$ oc exec -n openshift-adp -i deploy/velero -c velero -- bash -c "ls /tmp/your-cacert.txt"

/tmp/your-cacert.txt
4.6.8.3. Installing the Data Protection Application
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp 1
spec:
  configuration:
    velero:
      defaultPlugins:
      - aws 2
      - kubevirt 3
      - csi 4
      - openshift 5
      resourceTimeout: 10m 6
    nodeAgent: 7
      enable: true 8
      uploaderType: kopia 9
      podConfig:
        nodeSelector: <node_selector> 10
  backupLocations:
  - velero:
      provider: gcp 11
      default: true
      credential:
        key: cloud
        name: <default_secret> 12
      objectStorage:
        bucket: <bucket_name> 13
        prefix: <prefix> 14
$ oc get all -n openshift-adp
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2   Running   0          2m8s
pod/node-agent-9cq4q                                     1/1   Running   0          94s
pod/node-agent-m4lts                                     1/1   Running   0          94s
pod/node-agent-pv4kr                                     1/1   Running   0          95s
pod/velero-588db7f655-n842v                              1/1   Running   0          95s

NAME                                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true
4.6.8.4. Configuring the DPA with client burst and QPS settings
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      config:
        insecureSkipTLSVerify: "true"
        profile: "default"
        region: <bucket_region>
        s3ForcePathStyle: "true"
        s3Url: <bucket_url>
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: <bucket_name>
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
    velero:
      client-burst: 500 1
      client-qps: 300 2
      defaultPlugins:
      - openshift
      - aws
      - kubevirt
4.6.8.4.1. Configuring node agents and node labels
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/nodeAgent: ""

configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
        node-role.kubernetes.io/worker: ""
4.6.8.4.2.
4.6.8.4.3. Enabling CSI in the DataProtectionApplication
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - csi 1
4.6.8.4.4. Disabling the node agent in DataProtectionApplication
# ...
configuration:
  nodeAgent:
    enable: false 1
    uploaderType: kopia
# ...

# ...
configuration:
  nodeAgent:
    enable: true 1
    uploaderType: kopia
# ...
4.6.9. Configuring the OpenShift API for Data Protection with OpenShift Virtualization
4.6.9.1.
4.6.9.2. Installing and configuring OADP with OpenShift Virtualization
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
  namespace: openshift-adp 1
spec:
  configuration:
    velero:
      defaultPlugins:
      - kubevirt 2
      - gcp 3
      - csi 4
      - openshift 5
      resourceTimeout: 10m 6
    nodeAgent: 7
      enable: true 8
      uploaderType: kopia 9
      podConfig:
        nodeSelector: <node_selector> 10
  backupLocations:
  - velero:
      provider: gcp 11
      default: true
      credential:
        key: cloud
        name: <default_secret> 12
      objectStorage:
        bucket: <bucket_name> 13
        prefix: <prefix> 14
$ oc get all -n openshift-adp
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8   2/2   Running   0          2m8s
pod/node-agent-9cq4q                                     1/1   Running   0          94s
pod/node-agent-m4lts                                     1/1   Running   0          94s
pod/node-agent-pv4kr                                     1/1   Running   0          95s
pod/velero-588db7f655-n842v                              1/1   Running   0          95s

NAME                                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140   <none>        8443/TCP   2m8s
service/openshift-adp-velero-metrics-svc                   ClusterIP   172.30.10.0     <none>        8085/TCP   8h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          96s

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager   1/1     1            1           2m9s
deployment.apps/velero                             1/1     1            1           96s

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47   1         1         1       2m9s
replicaset.apps/velero-588db7f655                             1         1         1       96s
$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true
4.6.9.3. Backing up a single VM
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: vmbackupsingle
  namespace: openshift-adp
spec:
  snapshotMoveData: true
  includedNamespaces:
  - <vm_namespace> 1
  labelSelector:
    matchLabels:
      app: <vm_app_name> 2
  storageLocation: <backup_storage_location_name> 3
$ oc apply -f <backup_cr_file_name> 1
4.6.9.4.
4.6.9.5. Restoring a single VM from a backup of multiple VMs
$ oc label vm <vm_name> app=<vm_name> -n openshift-adp
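To verify that the label was applied before creating the Restore CR, you can display the VM's labels. This is a quick check, not part of the documented steps, and assumes the VM lives in the openshift-adp namespace as in the command above:

$ oc get vm <vm_name> -n openshift-adp --show-labels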
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: singlevmrestore
  namespace: openshift-adp
spec:
  backupName: multiplevmbackup
  restorePVs: true
  LabelSelectors:
  - matchLabels:
      kubevirt.io/created-by: <datavolume_uid> 1
  - matchLabels:
      app: <vm_name> 2
$ oc apply -f <restore_cr_file_name> 1
4.6.9.6. Configuring the DPA with client burst and QPS settings
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      config:
        insecureSkipTLSVerify: "true"
        profile: "default"
        region: <bucket_region>
        s3ForcePathStyle: "true"
        s3Url: <bucket_url>
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: <bucket_name>
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
    velero:
      client-burst: 500 1
      client-qps: 300 2
      defaultPlugins:
      - openshift
      - aws
      - kubevirt
4.6.9.6.1. Configuring node agents and node labels
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/nodeAgent: ""

configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
        node-role.kubernetes.io/worker: ""
4.6.9.7.
4.6.10. Configuring OADP with multiple backup storage locations
4.6.10.1.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
#...
backupLocations:
- name: aws 1
  velero:
    provider: aws
    default: true 2
    objectStorage:
      bucket: <bucket_name> 3
      prefix: <prefix> 4
    config:
      region: <region_name> 5
      profile: "default"
    credential:
      key: cloud
      name: cloud-credentials 6
- name: odf 7
  velero:
    provider: aws
    default: false
    objectStorage:
      bucket: <bucket_name>
      prefix: <prefix>
    config:
      profile: "default"
      region: <region_name>
      s3Url: <url> 8
      insecureSkipTLSVerify: "true"
      s3ForcePathStyle: "true"
    credential:
      key: cloud
      name: <custom_secret_name_odf> 9
#...

apiVersion: velero.io/v1
kind: Backup
# ...
spec:
  includedNamespaces:
  - <namespace> 1
  storageLocation: <backup_storage_location> 2
  defaultVolumesToFsBackup: true
4.6.10.2.
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=<aws_credentials_file_name> 1
$ oc create secret generic mcg-secret -n openshift-adp --from-file cloud=<MCG_credentials_file_name> 1
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: two-bsl-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - name: aws
    velero:
      config:
        profile: default
        region: <region_name> 1
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: <bucket_name> 2
        prefix: velero
      provider: aws
  - name: mcg
    velero:
      config:
        insecureSkipTLSVerify: "true"
        profile: noobaa
        region: <region_name> 3
        s3ForcePathStyle: "true"
        s3Url: <s3_url> 4
      credential:
        key: cloud
        name: mcg-secret 5
      objectStorage:
        bucket: <bucket_name_mcg> 6
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
    velero:
      defaultPlugins:
      - openshift
      - aws
$ oc create -f <dpa_file_name> 1
$ oc get dpa -o yaml
$ oc get bsl
NAME   PHASE       LAST VALIDATED   AGE     DEFAULT
aws    Available   5s               3m28s   true
mcg    Available   5s               3m28s
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: test-backup1
  namespace: openshift-adp
spec:
  includedNamespaces:
  - <mysql_namespace> 1
  defaultVolumesToFsBackup: true
$ oc apply -f <backup_file_name> 1
$ oc get backups.velero.io <backup_name> -o yaml 1
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: test-backup1
  namespace: openshift-adp
spec:
  includedNamespaces:
  - <mysql_namespace> 1
  storageLocation: mcg 2
  defaultVolumesToFsBackup: true
$ oc apply -f <backup_file_name> 1
$ oc get backups.velero.io <backup_name> -o yaml 1
4.6.11. Configuring multiple volume snapshot locations
4.6.11.1.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
#...
snapshotLocations:
- velero:
    config:
      profile: default
      region: <region> 1
    credential:
      key: cloud
      name: cloud-credentials
    provider: aws
- velero:
    config:
      profile: default
      region: <region>
    credential:
      key: cloud
      name: <custom_credential> 2
    provider: aws
#...
4.7.
4.7.1.
4.8. Backing up applications
4.8.1.
4.8.1.1.
$ velero backup create <backup-name> --snapshot-volumes false 1
$ velero describe backup <backup_name> --details 1
$ velero restore create --from-backup <backup-name> 1
Important:

$ velero describe restore <restore_name> --details 1
4.8.1.2.
4.8.2. Creating a Backup CR
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAMESPACE       NAME              PHASE       LAST VALIDATED   AGE   DEFAULT
openshift-adp   velero-sample-1   Available   11s              31m
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup>
  labels:
    velero.io/storage-location: default
  namespace: openshift-adp
spec:
  hooks: {}
  includedNamespaces:
  - <namespace> 1
  includedResources: [] 2
  excludedResources: [] 3
  storageLocation: <velero-sample-1> 4
  ttl: 720h0m0s 5
  labelSelector: 6
    matchLabels:
      app: <label_1>
      app: <label_2>
      app: <label_3>
  orLabelSelectors: 7
  - matchLabels:
      app: <label_1>
      app: <label_2>
      app: <label_3>
$ oc get backups.velero.io -n openshift-adp <backup> -o jsonpath='{.status.phase}'
4.8.3. Backing up persistent volumes with CSI snapshots
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: <volume_snapshot_class_name>
  labels:
    velero.io/csi-volumesnapshot-class: "true" 1
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true" 2
driver: <csi_driver>
deletionPolicy: <deletion_policy_type> 3
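To check which snapshot classes exist and whether the velero.io/csi-volumesnapshot-class label is set, you can list them with their labels. This is a verification sketch, not part of the documented procedure:

$ oc get volumesnapshotclass --show-labels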
4.8.4. Backing up applications with File System Backup: Kopia or Restic
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup>
  labels:
    velero.io/storage-location: default
  namespace: openshift-adp
spec:
  defaultVolumesToFsBackup: true 1
...
4.8.5. Creating backup hooks
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup>
  namespace: openshift-adp
spec:
  hooks:
    resources:
    - name: <hook_name>
      includedNamespaces:
      - <namespace> 1
      excludedNamespaces: 2
      - <namespace>
      includedResources:
      - pods 3
      excludedResources: [] 4
      labelSelector: 5
        matchLabels:
          app: velero
          component: server
      pre: 6
      - exec:
          container: <container> 7
          command:
          - /bin/uname 8
          - -a
          onError: Fail 9
          timeout: 30s 10
      post: 11
...
4.8.6. Scheduling backups using Schedule CR
$ oc get backupStorageLocations -n openshift-adp
NAMESPACE       NAME              PHASE       LAST VALIDATED   AGE   DEFAULT
openshift-adp   velero-sample-1   Available   11s              31m
$ cat << EOF | oc apply -f -
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: <schedule>
  namespace: openshift-adp
spec:
  schedule: 0 7 * * * 1
  template:
    hooks: {}
    includedNamespaces:
    - <namespace> 2
    storageLocation: <velero-sample-1> 3
    defaultVolumesToFsBackup: true 4
    ttl: 720h0m0s 5
EOF

Note:

schedule: "*/10 * * * *"
$ oc get schedule -n openshift-adp <schedule> -o jsonpath='{.status.phase}'
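Backups created by a Schedule CR carry the velero.io/schedule-name label, so you can list the backups that a given schedule produced. This is a convenience sketch that assumes at least one scheduled run has completed:

$ oc get backups.velero.io -n openshift-adp -l velero.io/schedule-name=<schedule>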
4.8.7. Deleting backups
4.8.7.1. Deleting a backup by creating a DeleteBackupRequest CR
apiVersion: velero.io/v1
kind: DeleteBackupRequest
metadata:
  name: deletebackuprequest
  namespace: openshift-adp
spec:
  backupName: <backup_name> 1
$ oc apply -f <deletebackuprequest_cr_filename>
4.8.7.2. Deleting backups with the velero CLI
$ velero backup delete <backup_name> -n openshift-adp 1
4.8.7.3. About Kopia repository maintenance
4.8.7.3.1.
pod/repo-maintain-job-173...2527-2nbls    0/1    Completed    0    168m
pod/repo-maintain-job-173....536-fl9tm    0/1    Completed    0    108m
pod/repo-maintain-job-173...2545-55ggx    0/1    Completed    0    48m
not due for full maintenance cycle until 2024-00-00 18:29:4
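To read maintenance messages such as the one above, you can inspect the logs of one of the completed repo-maintain-job pods listed earlier; <repo_maintain_job_pod> is a placeholder for one of those pod names:

$ oc logs -n openshift-adp <repo_maintain_job_pod>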
4.8.7.4. Deleting a backup repository
$ oc get backuprepositories.velero.io -n openshift-adp
$ oc delete backuprepository <backup_repository_name> -n openshift-adp 1
4.8.8. About Kopia
4.8.8.1.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
# ...
4.9. Restoring applications
4.9.1.
4.9.1.1.
4.9.1.2. Creating a Restore CR
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore>
  namespace: openshift-adp
spec:
  backupName: <backup> 1
  includedResources: [] 2
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  restorePVs: true 3
$ oc get restores.velero.io -n openshift-adp <restore> -o jsonpath='{.status.phase}'
$ oc get all -n <namespace> 1
$ bash dc-post-restore.sh <restore-name>

Note: The script was previously named dc-restic-post-restore.sh.

Example 4.1. dc-post-restore.sh script

#!/bin/bash
set -e

# if sha256sum exists, use it to check the integrity of the file
if command -v sha256sum >/dev/null 2>&1; then
  CHECKSUM_CMD="sha256sum"
else
  CHECKSUM_CMD="shasum -a 256"
fi

label_name () {
    if [ "${#1}" -le "63" ]; then
	echo $1
	return
    fi
    sha=$(echo -n $1|$CHECKSUM_CMD)
    echo "${1:0:57}${sha:0:6}"
}

if [[ $# -ne 1 ]]; then
    echo "usage: ${BASH_SOURCE} restore-name"
    exit 1
fi

echo "restore: $1"
label=$(label_name $1)
echo "label: $label"

echo Deleting disconnected restore pods
oc delete pods --all-namespaces -l oadp.openshift.io/disconnected-from-dc=$label

for dc in $(oc get dc --all-namespaces -l oadp.openshift.io/replicas-modified=$label \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{","}{.metadata.name}{","}{.metadata.annotations.oadp\.openshift\.io/original-replicas}{","}{.metadata.annotations.oadp\.openshift\.io/original-paused}{"\n"}')
do
    IFS=',' read -ra dc_arr <<< "$dc"
    if [ ${#dc_arr[0]} -gt 0 ]; then
	echo Found deployment ${dc_arr[0]}/${dc_arr[1]}, setting replicas: ${dc_arr[2]}, paused: ${dc_arr[3]}
	cat <<EOF | oc patch dc -n ${dc_arr[0]} ${dc_arr[1]} --patch-file /dev/stdin
spec:
  replicas: ${dc_arr[2]}
  paused: ${dc_arr[3]}
EOF
    fi
done
4.9.1.3. Creating restore hooks
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore>
  namespace: openshift-adp
spec:
  hooks:
    resources:
    - name: <hook_name>
      includedNamespaces:
      - <namespace> 1
      excludedNamespaces:
      - <namespace>
      includedResources:
      - pods 2
      excludedResources: []
      labelSelector: 3
        matchLabels:
          app: velero
          component: server
      postHooks:
      - init:
          initContainers:
          - name: restore-hook-init
            image: alpine:latest
            volumeMounts:
            - mountPath: /restores/pvc1-vm
              name: pvc1-vm
            command:
            - /bin/ash
            - -c
          timeout: 4
      - exec:
          container: <container> 5
          command:
          - /bin/bash 6
          - -c
          - "psql < /backup/backup.sql"
          waitTimeout: 5m 7
          execTimeout: 1m 8
          onError: Continue 9
$ velero restore create <RESTORE_NAME> \
    --from-backup <BACKUP_NAME> \
    --exclude-resources=deployment.apps

$ velero restore create <RESTORE_NAME> \
    --from-backup <BACKUP_NAME> \
    --include-resources=deployment.apps
4.10. OADP and ROSA
4.10.1. Backing up applications on ROSA STS using OADP
4.10.1.1. Preparing AWS credentials for OADP
$ export CLUSTER_NAME=my-cluster 1
  export ROSA_CLUSTER_ID=$(rosa describe cluster -c ${CLUSTER_NAME} --output json | jq -r .id)
  export REGION=$(rosa describe cluster -c ${CLUSTER_NAME} --output json | jq -r .region.id)
  export OIDC_ENDPOINT=$(oc get authentication.config.openshift.io cluster -o jsonpath='{.spec.serviceAccountIssuer}' | sed 's|^https://||')
  export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
  export CLUSTER_VERSION=$(rosa describe cluster -c ${CLUSTER_NAME} -o json | jq -r .version.raw_id | cut -f -2 -d '.')
  export ROLE_NAME="${CLUSTER_NAME}-openshift-oadp-aws-cloud-credentials"
  export SCRATCH="/tmp/${CLUSTER_NAME}/oadp"
  mkdir -p ${SCRATCH}
  echo "Cluster ID: ${ROSA_CLUSTER_ID}, Region: ${REGION}, OIDC Endpoint: ${OIDC_ENDPOINT}, AWS Account ID: ${AWS_ACCOUNT_ID}"
$ POLICY_ARN=$(aws iam list-policies --query "Policies[?PolicyName=='RosaOadpVer1'].{ARN:Arn}" --output text) 1
$ if [[ -z "${POLICY_ARN}" ]]; then
cat << EOF > ${SCRATCH}/policy.json 1
{
"Version": "2012-10-17",
"Statement": [
  {
    "Effect": "Allow",
    "Action": [
      "s3:CreateBucket",
      "s3:DeleteBucket",
      "s3:PutBucketTagging",
      "s3:GetBucketTagging",
      "s3:PutEncryptionConfiguration",
      "s3:GetEncryptionConfiguration",
      "s3:PutLifecycleConfiguration",
      "s3:GetLifecycleConfiguration",
      "s3:GetBucketLocation",
      "s3:ListBucket",
      "s3:GetObject",
      "s3:PutObject",
      "s3:DeleteObject",
      "s3:ListBucketMultipartUploads",
      "s3:AbortMultipartUpload",
      "s3:ListMultipartUploadParts",
      "ec2:DescribeSnapshots",
      "ec2:DescribeVolumes",
      "ec2:DescribeVolumeAttribute",
      "ec2:DescribeVolumesModifications",
      "ec2:DescribeVolumeStatus",
      "ec2:CreateTags",
      "ec2:CreateVolume",
      "ec2:CreateSnapshot",
      "ec2:DeleteSnapshot"
    ],
    "Resource": "*"
  }
]}
EOF
POLICY_ARN=$(aws iam create-policy --policy-name "RosaOadpVer1" \
--policy-document file://${SCRATCH}/policy.json --query Policy.Arn \
--tags Key=rosa_openshift_version,Value=${CLUSTER_VERSION} Key=rosa_role_prefix,Value=ManagedOpenShift Key=operator_namespace,Value=openshift-oadp Key=operator_name,Value=openshift-oadp \
--output text)
fi
$ echo ${POLICY_ARN}
$ cat <<EOF > ${SCRATCH}/trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT}"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "${OIDC_ENDPOINT}:sub": [
          "system:serviceaccount:openshift-adp:openshift-adp-controller-manager",
          "system:serviceaccount:openshift-adp:velero"]
      }
    }
  }]
}
EOF
$ ROLE_ARN=$(aws iam create-role --role-name \
    "${ROLE_NAME}" \
    --assume-role-policy-document file://${SCRATCH}/trust-policy.json \
    --tags Key=rosa_cluster_id,Value=${ROSA_CLUSTER_ID} Key=rosa_openshift_version,Value=${CLUSTER_VERSION} Key=rosa_role_prefix,Value=ManagedOpenShift Key=operator_namespace,Value=openshift-adp Key=operator_name,Value=openshift-oadp \
    --query Role.Arn --output text)
$ echo ${ROLE_ARN}
$ aws iam attach-role-policy --role-name "${ROLE_NAME}" \
    --policy-arn ${POLICY_ARN}
4.10.1.2. Installing the OADP Operator and providing the IAM role
$ cat <<EOF > ${SCRATCH}/credentials
[default]
role_arn = ${ROLE_ARN}
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
region = <aws_region> 1
EOF
$ oc create namespace openshift-adp
$ oc -n openshift-adp create secret generic cloud-credentials \
    --from-file=${SCRATCH}/credentials
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: CloudStorage
metadata:
  name: ${CLUSTER_NAME}-oadp
  namespace: openshift-adp
spec:
  creationSecret:
    key: credentials
    name: cloud-credentials
  enableSharedConfig: true
  name: ${CLUSTER_NAME}-oadp
  provider: aws
  region: $REGION
EOF
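You can confirm that the CloudStorage CR was created and that the name resolved from ${CLUSTER_NAME}. This is a quick verification sketch, not part of the original procedure:

$ oc get cloudstorage ${CLUSTER_NAME}-oadp -n openshift-adp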
$ oc get pvc -n <namespace>
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
applog   Bound    pvc-351791ae-b6ab-4e8b-88a4-30f73caf5ef8   1Gi        RWO            gp3-csi        4d19h
mysql    Bound    pvc-16b8e009-a20a-4379-accc-bc81fedd0621   1Gi        RWO            gp3-csi        4d19h
$ oc get storageclass
NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2                 kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   4d21h
gp2-csi             ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
gp3                 ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
gp3-csi (default)   ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
Note:

$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ${CLUSTER_NAME}-dpa
  namespace: openshift-adp
spec:
  backupImages: true 1
  features:
    dataMover:
      enable: false
  backupLocations:
  - bucket:
      cloudStorageRef:
        name: ${CLUSTER_NAME}-oadp
      credential:
        key: credentials
        name: cloud-credentials
      prefix: velero
      default: true
      config:
        region: ${REGION}
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
    nodeAgent: 2
      enable: false
      uploaderType: kopia 3
EOF
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ${CLUSTER_NAME}-dpa
  namespace: openshift-adp
spec:
  backupImages: true 1
  features:
    dataMover:
      enable: false
  backupLocations:
  - bucket:
      cloudStorageRef:
        name: ${CLUSTER_NAME}-oadp
      credential:
        key: credentials
        name: cloud-credentials
      prefix: velero
      default: true
      config:
        region: ${REGION}
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
    nodeAgent: 2
      enable: false
      uploaderType: restic
  snapshotLocations:
  - velero:
      config:
        credentialsFile: /tmp/credentials/openshift-adp/cloud-credentials-credentials 3
        enableSharedConfig: "true" 4
        profile: default 5
        region: ${REGION} 6
      provider: aws
EOF
nodeAgent:
  enable: false
  uploaderType: restic

restic:
  enable: false
4.10.1.3. Updating the IAM role ARN in the OADP Operator subscription
$ oc get sub -o yaml redhat-oadp-operator
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2025-01-15T07:18:31Z"
  generation: 1
  labels:
    operators.coreos.com/redhat-oadp-operator.openshift-adp: ""
  name: redhat-oadp-operator
  namespace: openshift-adp
  resourceVersion: "77363"
  uid: 5ba00906-5ad2-4476-ae7b-ffa90986283d
spec:
  channel: stable-1.4
  config:
    env:
    - name: ROLEARN
      value: arn:aws:iam::11111111:role/wrong-role-arn 1
  installPlanApproval: Manual
  name: redhat-oadp-operator
  source: prestage-operators
  sourceNamespace: openshift-marketplace
  startingCSV: oadp-operator.v1.4.2
$ oc patch subscription redhat-oadp-operator -p '{"spec": {"config": {"env": [{"name": "ROLEARN", "value": "<role_arn>"}]}}}' --type='merge'
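To confirm that the patch landed, you can read the env entries back from the Subscription. This is a verification sketch, not part of the documented steps:

$ oc get subscription redhat-oadp-operator -n openshift-adp -o jsonpath='{.spec.config.env}'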
$ oc get secret cloud-credentials -o jsonpath='{.data.credentials}' | base64 -d
[default]
sts_regional_endpoints = regional
role_arn = arn:aws:iam::160.....6956:role/oadprosa.....8wlf
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: test-rosa-dpa
  namespace: openshift-adp
spec:
  backupLocations:
  - bucket:
      config:
        region: us-east-1
      cloudStorageRef:
        name: <cloud_storage> 1
      credential:
        name: cloud-credentials
        key: credentials
      prefix: velero
      default: true
  configuration:
    velero:
      defaultPlugins:
      - aws
      - openshift
$ oc create -f <dpa_manifest_file>
$ oc get dpa -n openshift-adp -o yaml
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
status:
  conditions:
  - lastTransitionTime: "2023-07-31T04:48:12Z"
    message: Reconcile complete
    reason: Complete
    status: "True"
    type: Reconciled
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME       PHASE       LAST VALIDATED   AGE   DEFAULT
ts-dpa-1   Available   3s               6s    true
4.10.1.4. Example: Backing up a workload on OADP with ROSA STS, with an optional cleanup
4.10.1.4.1. Performing a backup with OADP and ROSA STS
$ oc create namespace hello-world
$ oc new-app -n hello-world --image=docker.io/openshift/hello-openshift
$ oc expose service/hello-openshift -n hello-world
$ curl `oc get route/hello-openshift -n hello-world -o jsonpath='{.spec.host}'`
Hello OpenShift!
$ cat << EOF | oc create -f -
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: hello-world
  namespace: openshift-adp
spec:
  includedNamespaces:
  - hello-world
  storageLocation: ${CLUSTER_NAME}-dpa-1
  ttl: 720h0m0s
EOF
$ watch "oc -n openshift-adp get backup hello-world -o json | jq .status"
{
  "completionTimestamp": "2022-09-07T22:20:44Z",
  "expiration": "2022-10-07T22:20:22Z",
  "formatVersion": "1.1.0",
  "phase": "Completed",
  "progress": {
    "itemsBackedUp": 58,
    "totalItems": 58
  },
  "startTimestamp": "2022-09-07T22:20:22Z",
  "version": 1
}
$ oc delete ns hello-world
$ cat << EOF | oc create -f -
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hello-world
  namespace: openshift-adp
spec:
  backupName: hello-world
EOF
$ watch "oc -n openshift-adp get restore hello-world -o json | jq .status"
{
  "completionTimestamp": "2022-09-07T22:25:47Z",
  "phase": "Completed",
  "progress": {
    "itemsRestored": 38,
    "totalItems": 38
  },
  "startTimestamp": "2022-09-07T22:25:28Z",
  "warnings": 9
}
$ oc -n hello-world get pods
NAME                              READY   STATUS    RESTARTS   AGE
hello-openshift-9f885f7c6-kdjpj   1/1     Running   0          90s
$ curl `oc get route/hello-openshift -n hello-world -o jsonpath='{.spec.host}'`
Hello OpenShift!
4.10.1.4.2. Cleaning up a cluster after a backup with OADP and ROSA STS
$ oc delete ns hello-world
$ oc -n openshift-adp delete dpa ${CLUSTER_NAME}-dpa
$ oc -n openshift-adp delete cloudstorage ${CLUSTER_NAME}-oadp
Warning:

$ oc -n openshift-adp patch cloudstorage ${CLUSTER_NAME}-oadp -p '{"metadata":{"finalizers":null}}' --type=merge
$ oc -n openshift-adp delete subscription oadp-operator
$ oc delete ns openshift-adp
$ oc delete backups.velero.io hello-world
$ velero backup delete hello-world
$ for CRD in `oc get crds | grep velero | awk '{print $1}'`; do oc delete crd $CRD; done
$ aws s3 rm s3://${CLUSTER_NAME}-oadp --recursive
$ aws s3api delete-bucket --bucket ${CLUSTER_NAME}-oadp
$ aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
$ aws iam delete-role --role-name "${ROLE_NAME}"
4.11. OADP and AWS STS
4.11.1. Backing up applications on AWS STS using OADP
4.11.1.1. Preparing AWS credentials for OADP
$ export CLUSTER_NAME=<AWS_cluster_name> 1
$ export CLUSTER_VERSION=$(oc get clusterversion version -o jsonpath='{.status.desired.version}{"\n"}')
  export AWS_CLUSTER_ID=$(oc get clusterversion version -o jsonpath='{.spec.clusterID}{"\n"}')
  export OIDC_ENDPOINT=$(oc get authentication.config.openshift.io cluster -o jsonpath='{.spec.serviceAccountIssuer}' | sed 's|^https://||')
  export REGION=$(oc get infrastructures cluster -o jsonpath='{.status.platformStatus.aws.region}' --allow-missing-template-keys=false || echo us-east-2)
  export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
  export ROLE_NAME="${CLUSTER_NAME}-openshift-oadp-aws-cloud-credentials"

$ export SCRATCH="/tmp/${CLUSTER_NAME}/oadp"
  mkdir -p ${SCRATCH}
$ echo "Cluster ID: ${AWS_CLUSTER_ID}, Region: ${REGION}, OIDC Endpoint: ${OIDC_ENDPOINT}, AWS Account ID: ${AWS_ACCOUNT_ID}"
$ export POLICY_NAME="OadpVer1" 1
$ POLICY_ARN=$(aws iam list-policies --query "Policies[?PolicyName=='$POLICY_NAME'].{ARN:Arn}" --output text)
$ if [[ -z "${POLICY_ARN}" ]]; then
cat << EOF > ${SCRATCH}/policy.json
{
"Version": "2012-10-17",
"Statement": [
  {
    "Effect": "Allow",
    "Action": [
      "s3:CreateBucket",
      "s3:DeleteBucket",
      "s3:PutBucketTagging",
      "s3:GetBucketTagging",
      "s3:PutEncryptionConfiguration",
      "s3:GetEncryptionConfiguration",
      "s3:PutLifecycleConfiguration",
      "s3:GetLifecycleConfiguration",
      "s3:GetBucketLocation",
      "s3:ListBucket",
      "s3:GetObject",
      "s3:PutObject",
      "s3:DeleteObject",
      "s3:ListBucketMultipartUploads",
      "s3:AbortMultipartUpload",
      "s3:ListMultipartUploadParts",
      "ec2:DescribeSnapshots",
      "ec2:DescribeVolumes",
      "ec2:DescribeVolumeAttribute",
      "ec2:DescribeVolumesModifications",
      "ec2:DescribeVolumeStatus",
      "ec2:CreateTags",
      "ec2:CreateVolume",
      "ec2:CreateSnapshot",
      "ec2:DeleteSnapshot"
    ],
    "Resource": "*"
  }
]}
EOF
POLICY_ARN=$(aws iam create-policy --policy-name $POLICY_NAME \
--policy-document file://${SCRATCH}/policy.json --query Policy.Arn \
--tags Key=openshift_version,Value=${CLUSTER_VERSION} Key=operator_namespace,Value=openshift-adp Key=operator_name,Value=oadp \
--output text) 1
fi
$ echo ${POLICY_ARN}
$ cat <<EOF > ${SCRATCH}/trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT}"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "${OIDC_ENDPOINT}:sub": [
          "system:serviceaccount:openshift-adp:openshift-adp-controller-manager",
          "system:serviceaccount:openshift-adp:velero"]
      }
    }
  }]
}
EOF
$ ROLE_ARN=$(aws iam create-role --role-name \
    "${ROLE_NAME}" \
    --assume-role-policy-document file://${SCRATCH}/trust-policy.json \
    --tags Key=cluster_id,Value=${AWS_CLUSTER_ID} Key=openshift_version,Value=${CLUSTER_VERSION} Key=operator_namespace,Value=openshift-adp Key=operator_name,Value=oadp \
    --query Role.Arn --output text)
$ echo ${ROLE_ARN}
$ aws iam attach-role-policy --role-name "${ROLE_NAME}" --policy-arn ${POLICY_ARN}
4.11.1.1.1. Setting Velero CPU and memory resource allocations
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_sample>
spec:
# ...
  configuration:
    velero:
      podConfig:
        nodeSelector: <node_selector> 1
        resourceAllocations: 2
          limits:
            cpu: "1"
            memory: 1024Mi
          requests:
            cpu: 200m
            memory: 256Mi
4.11.1.2. Installing the OADP Operator and providing the IAM role
$ cat <<EOF > ${SCRATCH}/credentials
[default]
role_arn = ${ROLE_ARN}
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
EOF
$ oc create namespace openshift-adp
$ oc -n openshift-adp create secret generic cloud-credentials \
    --from-file=${SCRATCH}/credentials
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: CloudStorage
metadata:
  name: ${CLUSTER_NAME}-oadp
  namespace: openshift-adp
spec:
  creationSecret:
    key: credentials
    name: cloud-credentials
  enableSharedConfig: true
  name: ${CLUSTER_NAME}-oadp
  provider: aws
  region: $REGION
EOF
$ oc get pvc -n <namespace>
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
applog   Bound    pvc-351791ae-b6ab-4e8b-88a4-30f73caf5ef8   1Gi        RWO            gp3-csi        4d19h
mysql    Bound    pvc-16b8e009-a20a-4379-accc-bc81fedd0621   1Gi        RWO            gp3-csi        4d19h
$ oc get storageclass
NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2                 kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   4d21h
gp2-csi             ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
gp3                 ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
gp3-csi (default)   ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   4d21h
Note:

$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ${CLUSTER_NAME}-dpa
  namespace: openshift-adp
spec:
  backupImages: true 1
  features:
    dataMover:
      enable: false
  backupLocations:
  - bucket:
      cloudStorageRef:
        name: ${CLUSTER_NAME}-oadp
      credential:
        key: credentials
        name: cloud-credentials
      prefix: velero
      default: true
      config:
        region: ${REGION}
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
    restic:
      enable: false
EOF
$ cat << EOF | oc create -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ${CLUSTER_NAME}-dpa
  namespace: openshift-adp
spec:
  backupImages: true 1
  features:
    dataMover:
      enable: false
  backupLocations:
  - bucket:
      cloudStorageRef:
        name: ${CLUSTER_NAME}-oadp
      credential:
        key: credentials
        name: cloud-credentials
      prefix: velero
      default: true
      config:
        region: ${REGION}
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
    nodeAgent: 2
      enable: false
      uploaderType: restic
  snapshotLocations:
  - velero:
      config:
        credentialsFile: /tmp/credentials/openshift-adp/cloud-credentials-credentials 3
        enableSharedConfig: "true" 4
        profile: default 5
        region: ${REGION} 6
      provider: aws
EOF
nodeAgent:
  enable: false
  uploaderType: restic

restic:
  enable: false
4.11.1.3. Example: Backing up a workload on OADP with AWS STS, with an optional cleanup
4.11.1.3.1. Performing a backup with OADP and AWS STS
$ oc create namespace hello-world
$ oc new-app -n hello-world --image=docker.io/openshift/hello-openshift
$ oc expose service/hello-openshift -n hello-world
$ curl `oc get route/hello-openshift -n hello-world -o jsonpath='{.spec.host}'`
Hello OpenShift!
$ cat << EOF | oc create -f -
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: hello-world
  namespace: openshift-adp
spec:
  includedNamespaces:
  - hello-world
  storageLocation: ${CLUSTER_NAME}-dpa-1
  ttl: 720h0m0s
EOF
$ watch "oc -n openshift-adp get backup hello-world -o json | jq .status"
{
  "completionTimestamp": "2022-09-07T22:20:44Z",
  "expiration": "2022-10-07T22:20:22Z",
  "formatVersion": "1.1.0",
  "phase": "Completed",
  "progress": {
    "itemsBackedUp": 58,
    "totalItems": 58
  },
  "startTimestamp": "2022-09-07T22:20:22Z",
  "version": 1
}
$ oc delete ns hello-world
$ cat << EOF | oc create -f -
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hello-world
  namespace: openshift-adp
spec:
  backupName: hello-world
EOF
$ watch "oc -n openshift-adp get restore hello-world -o json | jq .status"
{
  "completionTimestamp": "2022-09-07T22:25:47Z",
  "phase": "Completed",
  "progress": {
    "itemsRestored": 38,
    "totalItems": 38
  },
  "startTimestamp": "2022-09-07T22:25:28Z",
  "warnings": 9
}
$ oc -n hello-world get pods
NAME                              READY   STATUS    RESTARTS   AGE
hello-openshift-9f885f7c6-kdjpj   1/1     Running   0          90s
$ curl `oc get route/hello-openshift -n hello-world -o jsonpath='{.spec.host}'`
Hello OpenShift!
4.11.1.3.2. Cleaning up a cluster after a backup with OADP and AWS STS
$ oc delete ns hello-world
$ oc -n openshift-adp delete dpa ${CLUSTER_NAME}-dpa
$ oc -n openshift-adp delete cloudstorage ${CLUSTER_NAME}-oadp
Important:

$ oc -n openshift-adp patch cloudstorage ${CLUSTER_NAME}-oadp -p '{"metadata":{"finalizers":null}}' --type=merge
$ oc -n openshift-adp delete subscription oadp-operator
$ oc delete ns openshift-adp
$ oc delete backups.velero.io hello-world
$ velero backup delete hello-world
$ for CRD in `oc get crds | grep velero | awk '{print $1}'`; do oc delete crd $CRD; done
$ aws s3 rm s3://${CLUSTER_NAME}-oadp --recursive
$ aws s3api delete-bucket --bucket ${CLUSTER_NAME}-oadp
$ aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
$ aws iam delete-role --role-name "${ROLE_NAME}"
4.12. Backing up and restoring 3scale by using OADP
4.12.1.
4.12.1.1. Creating the Data Protection Application
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa_sample
  namespace: openshift-adp
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
      resourceTimeout: 10m
    nodeAgent:
      enable: true
      uploaderType: kopia
  backupLocations:
  - name: default
    velero:
      provider: aws
      default: true
      objectStorage:
        bucket: <bucket_name> 1
        prefix: <prefix> 2
      config:
        region: <region> 3
        profile: "default"
        s3ForcePathStyle: "true"
        s3Url: <s3_url> 4
      credential:
        key: cloud
        name: cloud-credentials
$ oc create -f dpa.yaml
4.12.1.2. Backing up the 3scale Operator
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: operator-install-backup
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - threescale 1
  includedResources:
  - operatorgroups
  - subscriptions
  - namespaces
  itemOperationTimeout: 1h0m0s
  snapshotMoveData: false
  ttl: 720h0m0s

Note:

$ oc create -f backup.yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: operator-resources-secrets
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - threescale
  includedResources:
  - secrets
  itemOperationTimeout: 1h0m0s
  labelSelector:
    matchLabels:
      app: 3scale-api-management
  snapshotMoveData: false
  snapshotVolumes: false
  ttl: 720h0m0s
$ oc create -f backup-secret.yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: operator-resources-apim
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - threescale
  includedResources:
  - apimanagers
  itemOperationTimeout: 1h0m0s
  snapshotMoveData: false
  snapshotVolumes: false
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
  volumeSnapshotLocations:
  - ts-dpa-1
$ oc create -f backup-apimanager.yaml
4.12.1.3. Backing up the MySQL database
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example-claim
  namespace: threescale
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: gp3-csi
  volumeMode: Filesystem
$ oc create -f ts_pvc.yml
$ oc edit deployment system-mysql -n threescale
volumeMounts:
- name: example-claim
  mountPath: /var/lib/mysqldump/data
- name: mysql-storage
  mountPath: /var/lib/mysql/data
- name: mysql-extra-conf
  mountPath: /etc/my-extra.d
- name: mysql-main-conf
  mountPath: /etc/my-extra
...
serviceAccount: amp
volumes:
- name: example-claim
  persistentVolumeClaim:
    claimName: example-claim 1
...
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: mysql-backup
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: true
  hooks:
    resources:
    - name: dumpdb
      pre:
      - exec:
          command:
          - /bin/sh
          - -c
          - mysqldump -u $MYSQL_USER --password=$MYSQL_PASSWORD system --no-tablespaces > /var/lib/mysqldump/data/dump.sql 1
          container: system-mysql
          onError: Fail
          timeout: 5m
  includedNamespaces: 2
  - threescale
  includedResources:
  - deployment
  - pods
  - replicationControllers
  - persistentvolumeclaims
  - persistentvolumes
  itemOperationTimeout: 1h0m0s
  labelSelector:
    matchLabels:
      app: 3scale-api-management
      threescale_component_element: mysql
  snapshotMoveData: false
  ttl: 720h0m0s
$ oc create -f mysql.yaml
$ oc get backups.velero.io mysql-backup
NAME                 STATUS      CREATED   NAMESPACE    POD                    VOLUME          UPLOADER TYPE   STORAGE LOCATION   AGE
mysql-backup-4g7qn   Completed   30s       threescale   system-mysql-2-9pr44   example-claim   kopia           ts-dpa-1           30s
mysql-backup-smh85   Completed   23s       threescale   system-mysql-2-9pr44   mysql-storage   kopia           ts-dpa-1           30s
4.12.1.4. Backing up the back-end Redis database
$ oc edit deployment backend-redis -n threescale
annotations:
  post.hook.backup.velero.io/command: >-
    ["/bin/bash", "-c", "redis-cli CONFIG SET auto-aof-rewrite-percentage 100"]
  pre.hook.backup.velero.io/command: >-
    ["/bin/bash", "-c", "redis-cli CONFIG SET auto-aof-rewrite-percentage 0"]
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: redis-backup
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: true
  includedNamespaces:
  - threescale
  includedResources:
  - deployment
  - pods
  - replicationcontrollers
  - persistentvolumes
  - persistentvolumeclaims
  itemOperationTimeout: 1h0m0s
  labelSelector:
    matchLabels:
      app: 3scale-api-management
      threescale_component: backend
      threescale_component_element: redis
  snapshotMoveData: false
  snapshotVolumes: false
  ttl: 720h0m0s
$ oc get backups.velero.io redis-backup -o yaml
$ oc get backups.velero.io
4.12.1.5. Restoring the 3scale Operator
$ oc delete project threescale
"threescale" project deleted successfully
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: operator-installation-restore
  namespace: openshift-adp
spec:
  backupName: operator-install-backup
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  - backuprepositories.velero.io
  itemOperationTimeout: 4h0m0s
$ oc create -f restore.yaml
$ oc apply -f - <<EOF
---
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
  namespace: threescale
stringData:
  AWS_ACCESS_KEY_ID: <ID_123456> 1
  AWS_SECRET_ACCESS_KEY: <ID_98765544> 2
  AWS_BUCKET: <mybucket.example.com> 3
  AWS_REGION: <us-east-1> 4
type: Opaque
EOF
$ oc scale deployment threescale-operator-controller-manager-v2 --replicas=0 -n threescale
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: operator-resources-secrets
  namespace: openshift-adp
spec:
  backupName: operator-resources-secrets
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  - backuprepositories.velero.io
  itemOperationTimeout: 4h0m0s
$ oc create -f restore-secrets.yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: operator-resources-apim
  namespace: openshift-adp
spec:
  backupName: operator-resources-apim
  excludedResources: 1
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  - backuprepositories.velero.io
  itemOperationTimeout: 4h0m0s
$ oc create -f restore-apimanager.yaml
$ oc scale deployment threescale-operator-controller-manager-v2 --replicas=1 -n threescale
4.12.1.6. Restoring the MySQL database
$ oc scale deployment threescale-operator-controller-manager-v2 --replicas=0 -n threescale
deployment.apps/threescale-operator-controller-manager-v2 scaled
$ vi ./scaledowndeployment.sh
for deployment in apicast-production apicast-staging backend-cron backend-listener backend-redis backend-worker system-app system-memcache system-mysql system-redis system-searchd system-sidekiq zync zync-database zync-que; do
    oc scale deployment/$deployment --replicas=0 -n threescale
done
$ ./scaledowndeployment.sh
deployment.apps.openshift.io/apicast-production scaled
deployment.apps.openshift.io/apicast-staging scaled
deployment.apps.openshift.io/backend-cron scaled
deployment.apps.openshift.io/backend-listener scaled
deployment.apps.openshift.io/backend-redis scaled
deployment.apps.openshift.io/backend-worker scaled
deployment.apps.openshift.io/system-app scaled
deployment.apps.openshift.io/system-memcache scaled
deployment.apps.openshift.io/system-mysql scaled
deployment.apps.openshift.io/system-redis scaled
deployment.apps.openshift.io/system-searchd scaled
deployment.apps.openshift.io/system-sidekiq scaled
deployment.apps.openshift.io/zync scaled
deployment.apps.openshift.io/zync-database scaled
deployment.apps.openshift.io/zync-que scaled
$ oc delete deployment system-mysql -n threescale
Warning: apps.openshift.io/v1 deployment is deprecated in v4.14+, unavailable in v4.10000+ deployment.apps.openshift.io "system-mysql" deleted
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-mysql
  namespace: openshift-adp
spec:
  backupName: mysql-backup
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  - backuprepositories.velero.io
  - resticrepositories.velero.io
  hooks:
    resources:
    - name: restoreDB
      postHooks:
      - exec:
          command:
          - /bin/sh
          - '-c'
          - >
            sleep 30

            mysql -h 127.0.0.1 -D system -u root
            --password=$MYSQL_ROOT_PASSWORD < /var/lib/mysqldump/data/dump.sql 1
          container: system-mysql
          execTimeout: 80s
          onError: Fail
          waitTimeout: 5m
  itemOperationTimeout: 1h0m0s
  restorePVs: true
$ oc create -f restore-mysql.yaml
$ oc get podvolumerestores.velero.io -n openshift-adp
NAME                  NAMESPACE    POD                    UPLOADER TYPE   VOLUME          STATUS      TOTALBYTES   BYTESDONE   AGE
restore-mysql-rbzvm   threescale   system-mysql-2-kjkhl   kopia           mysql-storage   Completed   771879108    771879108   40m
restore-mysql-z7x7l   threescale   system-mysql-2-kjkhl   kopia           example-claim   Completed   380415       380415      40m
$ oc get pvc -n threescale
NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
backend-redis-storage   Bound    pvc-3dca410d-3b9f-49d4-aebf-75f47152e09d   1Gi        RWO            gp3-csi        <unset>                 68m
example-claim           Bound    pvc-cbaa49b0-06cd-4b1a-9e90-0ef755c67a54   1Gi        RWO            gp3-csi        <unset>                 57m
mysql-storage           Bound    pvc-4549649f-b9ad-44f7-8f67-dd6b9dbb3896   1Gi        RWO            gp3-csi        <unset>                 68m
system-redis-storage    Bound    pvc-04dadafd-8a3e-4d00-8381-6041800a24fc   1Gi        RWO            gp3-csi        <unset>                 68m
system-searchd          Bound    pvc-afbf606c-d4a8-4041-8ec6-54c5baf1a3b9   1Gi        RWO            gp3-csi        <unset>                 68m
4.12.1.7.
$ oc delete deployment backend-redis -n threescale
Warning: apps.openshift.io/v1 deployment is deprecated in v4.14+, unavailable in v4.10000+ deployment.apps.openshift.io "backend-redis" deleted
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-backend
  namespace: openshift-adp
spec:
  backupName: redis-backup
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  - backuprepositories.velero.io
  itemOperationTimeout: 1h0m0s
  restorePVs: true
$ oc create -f restore-backend.yaml
$ oc get podvolumerestores.velero.io -n openshift-adp
NAME                    NAMESPACE    POD                     UPLOADER TYPE   VOLUME                  STATUS      TOTALBYTES   BYTESDONE   AGE
restore-backend-jmrwx   threescale   backend-redis-1-bsfmv   kopia           backend-redis-storage   Completed   76123        76123       21m
4.12.1.8.
$ oc scale deployment threescale-operator-controller-manager-v2 --replicas=1 -n threescale
$ oc get deployment -n threescale
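The scaledeployment.sh script invoked in the next step is the counterpart of the earlier scaledowndeployment.sh script. Its contents are not shown in this procedure; a minimal sketch, assuming the same deployment list and that each deployment should return to one replica, might look like this:

for deployment in apicast-production apicast-staging backend-cron backend-listener backend-redis backend-worker system-app system-memcache system-mysql system-redis system-searchd system-sidekiq zync zync-database zync-que; do
    # Scale each 3scale deployment back up to one replica
    oc scale deployment/$deployment --replicas=1 -n threescale
done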
$ ./scaledeployment.sh
$ oc get routes -n threescale
NAME                         HOST/PORT                                                               PATH   SERVICES             PORT      TERMINATION     WILDCARD
backend                      backend-3scale.apps.custom-cluster-name.openshift.com                          backend-listener     http      edge/Allow      None
zync-3scale-api-b4l4d        api-3scale-apicast-production.apps.custom-cluster-name.openshift.com          apicast-production   gateway   edge/Redirect   None
zync-3scale-api-b6sns        api-3scale-apicast-staging.apps.custom-cluster-name.openshift.com             apicast-staging      gateway   edge/Redirect   None
zync-3scale-master-7sc4j     master.apps.custom-cluster-name.openshift.com                                 system-master        http      edge/Redirect   None
zync-3scale-provider-7r2nm   3scale-admin.apps.custom-cluster-name.openshift.com                           system-provider      http      edge/Redirect   None
zync-3scale-provider-mjxlb   3scale.apps.custom-cluster-name.openshift.com                                 system-developer     http      edge/Redirect   None
4.13.
4.13.1.
4.13.1.1.
4.13.1.2.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
spec:
  configuration:
    nodeAgent:
      enable: true 1
      uploaderType: kopia 2
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi 3
      defaultSnapshotMoveData: true
      defaultVolumesToFSBackup: false 4
      featureFlags:
      - EnableCSI
# ...
4.13.1.3.
4.13.1.4.
4.13.2.
4.13.2.1.
kind: Backup
apiVersion: velero.io/v1
metadata:
  name: backup
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false 1
  includedNamespaces:
  - mysql-persistent
  itemOperationTimeout: 4h0m0s
  snapshotMoveData: true 2
  storageLocation: default
  ttl: 720h0m0s 3
  volumeSnapshotLocations:
  - dpa-sample-1
# ...
Note

Error: relabel failed /var/lib/kubelet/pods/3ac..34/volumes/ \
kubernetes.io~csi/pvc-684..12c/mount: lsetxattr /var/lib/kubelet/ \
pods/3ac..34/volumes/kubernetes.io~csi/pvc-68..2c/mount/data-xfs-103: \
no space left on device
$ oc create -f backup.yaml
$ oc get datauploads -A
NAMESPACE       NAME                  STATUS      STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE     NODE
openshift-adp   backup-test-1-sw76b   Completed   9m47s     108104082    108104082     dpa-sample-1       9m47s   ip-10-0-150-57.us-west-2.compute.internal
openshift-adp   mongo-block-7dtpf     Completed   14m       1073741824   1073741824    dpa-sample-1       14m     ip-10-0-150-57.us-west-2.compute.internal
$ oc get datauploads <dataupload_name> -o yaml
apiVersion: velero.io/v2alpha1
kind: DataUpload
metadata:
  name: backup-test-1-sw76b
  namespace: openshift-adp
spec:
  backupStorageLocation: dpa-sample-1
  csiSnapshot:
    snapshotClass: ""
    storageClass: gp3-csi
    volumeSnapshot: velero-mysql-fq8sl
  operationTimeout: 10m0s
  snapshotType: CSI
  sourceNamespace: mysql-persistent
  sourcePVC: mysql
status:
  completionTimestamp: "2023-11-02T16:57:02Z"
  node: ip-10-0-150-57.us-west-2.compute.internal
  path: /host_pods/15116bac-cc01-4d9b-8ee7-609c3bef6bde/volumes/kubernetes.io~csi/pvc-eead8167-556b-461a-b3ec-441749e291c4/mount
  phase: Completed 1
  progress:
    bytesDone: 108104082
    totalBytes: 108104082
  snapshotID: 8da1c5febf25225f4577ada2aeb9f899
  startTimestamp: "2023-11-02T16:56:22Z"
4.13.2.2.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore
  namespace: openshift-adp
spec:
  backupName: <backup>
# ...
$ oc create -f restore.yaml
$ oc get datadownloads -A
NAMESPACE       NAME                   STATUS      STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE     NODE
openshift-adp   restore-test-1-sk7lg   Completed   7m11s     108104082    108104082     dpa-sample-1       7m11s   ip-10-0-150-57.us-west-2.compute.internal
$ oc get datadownloads <datadownload_name> -o yaml
apiVersion: velero.io/v2alpha1
kind: DataDownload
metadata:
  name: restore-test-1-sk7lg
  namespace: openshift-adp
spec:
  backupStorageLocation: dpa-sample-1
  operationTimeout: 10m0s
  snapshotID: 8da1c5febf25225f4577ada2aeb9f899
  sourceNamespace: mysql-persistent
  targetVolume:
    namespace: mysql-persistent
    pv: ""
    pvc: mysql
status:
  completionTimestamp: "2023-11-02T17:01:24Z"
  node: ip-10-0-150-57.us-west-2.compute.internal
  phase: Completed 1
  progress:
    bytesDone: 108104082
    totalBytes: 108104082
  startTimestamp: "2023-11-02T17:00:52Z"
4.13.2.3.
4.13.2.3.1.
4.13.3.
4.13.3.1.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
# ...
configuration:
  nodeAgent:
    enable: true 1
    uploaderType: kopia 2
  velero:
    defaultPlugins:
    - openshift
    - aws
    - csi 3
    defaultSnapshotMoveData: true
    podConfig:
      env:
      - name: KOPIA_HASHING_ALGORITHM
        value: <hashing_algorithm_name> 4
      - name: KOPIA_ENCRYPTION_ALGORITHM
        value: <encryption_algorithm_name> 5
      - name: KOPIA_SPLITTER_ALGORITHM
        value: <splitter_algorithm_name> 6
4.13.3.2.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name> 1
  namespace: openshift-adp
spec:
  backupLocations:
  - name: aws
    velero:
      config:
        profile: default
        region: <region_name> 2
      credential:
        key: cloud
        name: cloud-credentials 3
      default: true
      objectStorage:
        bucket: <bucket_name> 4
        prefix: velero
      provider: aws
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi 5
      defaultSnapshotMoveData: true
      podConfig:
        env:
        - name: KOPIA_HASHING_ALGORITHM
          value: BLAKE3-256 6
        - name: KOPIA_ENCRYPTION_ALGORITHM
          value: CHACHA20-POLY1305-HMAC-SHA256 7
        - name: KOPIA_SPLITTER_ALGORITHM
          value: DYNAMIC-8M-RABINKARP 8
$ oc create -f <dpa_file_name> 1
$ oc get dpa -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: test-backup
  namespace: openshift-adp
spec:
  includedNamespaces:
  - <application_namespace> 1
  defaultVolumesToFsBackup: true
$ oc apply -f <backup_file_name> 1
$ oc get backups.velero.io <backup_name> -o yaml 1
$ kopia repository connect s3 \
  --bucket=<bucket_name> \ 1
  --prefix=velero/kopia/<application_namespace> \ 2
  --password=static-passw0rd \ 3
  --access-key="<aws_s3_access_key>" \ 4
  --secret-access-key="<aws_s3_secret_access_key>" 5
Note

$ kopia repository status
Config file:         /../.config/kopia/repository.config

Description:         Repository in S3: s3.amazonaws.com <bucket_name>
# ...

Storage type:        s3
Storage capacity:    unbounded
Storage config:      {
                       "bucket": <bucket_name>,
                       "prefix": "velero/kopia/<application_namespace>/",
                       "endpoint": "s3.amazonaws.com",
                       "accessKeyID": <access_key>,
                       "secretAccessKey": "****************************************",
                       "sessionToken": ""
                     }

Unique ID:           58....aeb0
Hash:                BLAKE3-256
Encryption:          CHACHA20-POLY1305-HMAC-SHA256
Splitter:            DYNAMIC-8M-RABINKARP
Format version:      3
# ...
4.13.3.3.
apiVersion: v1
kind: Pod
metadata:
  name: oadp-mustgather-pod
  labels:
    purpose: user-interaction
spec:
  containers:
  - name: oadp-mustgather-container
    image: registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.3
    command: ["sleep"]
    args: ["infinity"]
Note

$ oc apply -f <pod_config_file_name> 1
$ oc describe pod/oadp-mustgather-pod | grep scc
openshift.io/scc: anyuid
$ oc -n openshift-adp rsh pod/oadp-mustgather-pod
sh-5.1# kopia repository connect s3 \
  --bucket=<bucket_name> \ 1
  --prefix=velero/kopia/<application_namespace> \ 2
  --password=static-passw0rd \ 3
  --access-key="<access_key>" \ 4
  --secret-access-key="<secret_access_key>" \ 5
  --endpoint=<bucket_endpoint> 6
Note

sh-5.1# kopia benchmark hashing
Benchmarking hash 'BLAKE2B-256' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'BLAKE2B-256-128' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'BLAKE2S-128' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'BLAKE2S-256' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'BLAKE3-256' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'BLAKE3-256-128' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'HMAC-SHA224' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'HMAC-SHA256' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'HMAC-SHA256-128' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'HMAC-SHA3-224' (100 x 1048576 bytes, parallelism 1)
Benchmarking hash 'HMAC-SHA3-256' (100 x 1048576 bytes, parallelism 1)
     Hash                Throughput
-----------------------------------------------------------------
  0. BLAKE3-256          15.3 GB / second
  1. BLAKE3-256-128      15.2 GB / second
  2. HMAC-SHA256-128     6.4 GB / second
  3. HMAC-SHA256         6.4 GB / second
  4. HMAC-SHA224         6.4 GB / second
  5. BLAKE2B-256-128     4.2 GB / second
  6. BLAKE2B-256         4.1 GB / second
  7. BLAKE2S-256         2.9 GB / second
  8. BLAKE2S-128         2.9 GB / second
  9. HMAC-SHA3-224       1.6 GB / second
 10. HMAC-SHA3-256       1.5 GB / second
-----------------------------------------------------------------
Fastest option for this machine is: --block-hash=BLAKE3-256
sh-5.1# kopia benchmark encryption
Benchmarking encryption 'AES256-GCM-HMAC-SHA256'... (1000 x 1048576 bytes, parallelism 1)
Benchmarking encryption 'CHACHA20-POLY1305-HMAC-SHA256'... (1000 x 1048576 bytes, parallelism 1)
     Encryption                      Throughput
-----------------------------------------------------------------
  0. AES256-GCM-HMAC-SHA256          2.2 GB / second
  1. CHACHA20-POLY1305-HMAC-SHA256   1.8 GB / second
-----------------------------------------------------------------
Fastest option for this machine is: --encryption=AES256-GCM-HMAC-SHA256
sh-5.1# kopia benchmark splitter
splitting 16 blocks of 32MiB each, parallelism 1
DYNAMIC                  747.6 MB/s count:107 min:9467 10th:2277562 25th:2971794 50th:4747177 75th:7603998 90th:8388608 max:8388608
DYNAMIC-128K-BUZHASH     718.5 MB/s count:3183 min:3076 10th:80896 25th:104312 50th:157621 75th:249115 90th:262144 max:262144
DYNAMIC-128K-RABINKARP   164.4 MB/s count:3160 min:9667 10th:80098 25th:106626 50th:162269 75th:250655 90th:262144 max:262144
# ...
FIXED-512K               102.9 TB/s count:1024 min:524288 10th:524288 25th:524288 50th:524288 75th:524288 90th:524288 max:524288
FIXED-8M                 566.3 TB/s count:64 min:8388608 10th:8388608 25th:8388608 50th:8388608 75th:8388608 90th:8388608 max:8388608
-----------------------------------------------------------------
  0. FIXED-8M                 566.3 TB/s count:64 min:8388608 10th:8388608 25th:8388608 50th:8388608 75th:8388608 90th:8388608 max:8388608
  1. FIXED-4M                 425.8 TB/s count:128 min:4194304 10th:4194304 25th:4194304 50th:4194304 75th:4194304 90th:4194304 max:4194304
# ...
 22. DYNAMIC-128K-RABINKARP   164.4 MB/s count:3160 min:9667 10th:80098 25th:106626 50th:162269 75th:250655 90th:262144 max:262144
4.14.
4.14.1.
4.14.1.1.
4.14.2.
$ alias velero='oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
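With this alias defined, Velero CLI subcommands can be run directly against the Velero deployment. For example, listing existing backups (a minimal usage sketch; any Velero subcommand works the same way):

$ velero backup get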
4.14.3.
$ oc describe <velero_cr> <cr_name>
$ oc logs pod/<velero>
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: velero-sample
spec:
  configuration:
    velero:
      logLevel: warning
4.14.4.
$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  <backup_restore_cr> <command> <cr_name>

$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  backup describe 0e44ae00-5dc3-11eb-9ca8-df7e5254778b-2d8ql

$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  --help

$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  <backup_restore_cr> describe <cr_name>

$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  backup describe 0e44ae00-5dc3-11eb-9ca8-df7e5254778b-2d8ql

$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  <backup_restore_cr> logs <cr_name>

$ oc -n openshift-adp exec deployment/velero -c velero -- ./velero \
  restore logs ccc7c2d0-6017-11eb-afab-85d0007f5a19-x4lbf
4.14.5.
4.14.5.1.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
configuration:
  velero:
    podConfig:
      resourceAllocations: 1
        requests:
          cpu: 200m
          memory: 256Mi
4.14.5.2.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
configuration:
  restic:
    podConfig:
      resourceAllocations: 1
        requests:
          cpu: 1000m
          memory: 16Gi
requests:
  cpu: 500m
  memory: 128Mi
4.14.6.
Velero: pod volume restore failed: data path restore failed: \
Failed to run kopia restore: Failed to copy snapshot data to the target: \
restore error: copy file: error creating file: \
open /host_pods/b4d...6/volumes/kubernetes.io~nfs/pvc-53...4e5/userdata/base/13493/2681: \
no such file or directory
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nfs-client
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
pathPattern: "${.PVC.namespace}/${.PVC.annotations.nfs.io/storage-path}" 1
onDelete: delete
4.14.7.
4.14.7.1.
4.14.7.1.1.
$ velero restore <restore_name> \
  --from-backup=<backup_name> --include-resources \
  service.serving.knative.dev
4.14.7.1.2.
$ oc get mutatingwebhookconfigurations
4.14.7.2.
4.14.7.2.1.
2024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item" backup=openshift-adp/<backup name> error="error executing custom action (groupResource=imagestreams.image.openshift.io, namespace=<BSL Name>, name=postgres): rpc error: code = Aborted desc = plugin panicked: runtime error: index out of range with length 1, stack trace: goroutine 94…
4.14.7.2.1.1.
$ oc label backupstoragelocations.velero.io <bsl_name> app.kubernetes.io/component=bsl
- Note
$ oc -n openshift-adp get secret/oadp-<bsl_name>-<bsl_provider>-registry-secret -o json | jq -r '.data'
4.14.7.2.2.
4.14.7.2.2.1.
4.14.7.3.
4.14.8.
4.14.8.1.
4.14.8.2.
[default] 1
aws_access_key_id=AKIAIOSFODNN7EXAMPLE 2
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
4.14.9.
4.14.9.1.
$ oc get backupstoragelocations.velero.io -A
$ velero backup-location get -n <OADP_Operator_namespace>
$ oc get backupstoragelocations.velero.io -n <namespace> -o yaml
apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: BackupStorageLocation
  metadata:
    creationTimestamp: "2023-11-03T19:49:04Z"
    generation: 9703
    name: example-dpa-1
    namespace: openshift-adp-operator
    ownerReferences:
    - apiVersion: oadp.openshift.io/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: DataProtectionApplication
      name: example-dpa
      uid: 0beeeaff-0287-4f32-bcb1-2e3c921b6e82
    resourceVersion: "24273698"
    uid: ba37cd15-cf17-4f7d-bf03-8af8655cea83
  spec:
    config:
      enableSharedConfig: "true"
      region: us-west-2
    credential:
      key: credentials
      name: cloud-credentials
    default: true
    objectStorage:
      bucket: example-oadp-operator
      prefix: example
    provider: aws
  status:
    lastValidationTime: "2023-11-10T22:06:46Z"
    message: "BackupStorageLocation \"example-dpa-1\" is unavailable: rpc error: code = Unknown desc = WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: d3f2e099-70a0-467b-997e-ff62345e3b54"
    phase: Unavailable
kind: List
metadata:
  resourceVersion: ""
4.14.10.
4.14.10.1.
level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete"
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
      timeout: 1h
# ...
4.14.10.2.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  configuration:
    velero:
      resourceTimeout: 10m
# ...
4.14.10.3.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  features:
    dataMover:
      timeout: 10m
# ...
4.14.10.4.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
spec:
  csiSnapshotTimeout: 10m
# ...
4.14.10.5.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  configuration:
    velero:
      defaultItemOperationTimeout: 1h
# ...
4.14.10.6.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore_name>
spec:
  itemOperationTimeout: 1h
# ...
4.14.10.7.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
spec:
  itemOperationTimeout: 1h
# ...
4.14.11.
4.14.11.1.
4.14.11.2.
$ oc -n {namespace} exec deployment/velero -c velero -- ./velero \
  backup describe <backup>
$ oc delete backups.velero.io <backup> -n openshift-adp
$ velero backup describe <backup-name> --details
4.14.11.3.
time="2023-02-17T16:33:13Z" level=error msg="Error backing up item" backup=openshift-adp/user1-backup-check5 error="error executing custom action (groupResource=persistentvolumeclaims, namespace=busy1, name=pvc1-user1): rpc error: code = Unknown desc = failed to get volumesnapshotclass for storageclass ocs-storagecluster-ceph-rbd: failed to get volumesnapshotclass for provisioner openshift-storage.rbd.csi.ceph.com, ensure that the desired volumesnapshot class has the velero.io/csi-volumesnapshot-class label" logSource="/remote-source/velero/app/pkg/backup/backup.go:417" name=busybox-79799557b5-vprq
$ oc delete backups.velero.io <backup> -n openshift-adp
$ oc label volumesnapshotclass/<snapclass_name> velero.io/csi-volumesnapshot-class=true
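To confirm that the label was applied, you can list the labels on the VolumeSnapshotClass. This is a verification sketch using the standard --show-labels flag; <snapclass_name> is the same placeholder as above:

$ oc get volumesnapshotclass <snapclass_name> --show-labels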
4.14.12.
4.14.12.1.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
# ...
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: restic
      supplementalGroups:
      - <group_id> 1
# ...
4.14.12.2.
$ oc delete resticrepository openshift-adp <name_of_the_restic_repository>
time="2021-12-29T18:29:14Z" level=info msg="1 errors encountered backup up item" backup=velero/backup65 logSource="pkg/backup/backup.go:431" name=mysql-7d99fc949-qbkds time="2021-12-29T18:29:14Z" level=error msg="Error backing up item" backup=velero/backup65 error="pod volume backup failed: error running restic backup, stderr=Fatal: unable to open config file: Stat: The specified key does not exist.\nIs there a repository at the following location?\ns3:http://minio-minio.apps.mayap-oadp- veleo-1234.qe.devcluster.openshift.com/mayapvelerooadp2/velero1/ restic/mysql-persistent\n: exit status 1" error.file="/remote-source/ src/github.com/vmware-tanzu/velero/pkg/restic/backupper.go:184" error.function="github.com/vmware-tanzu/velero/ pkg/restic.(*backupper).BackupPodVolumes" logSource="pkg/backup/backup.go:435" name=mysql-7d99fc949-qbkds
4.14.12.3.
\"level=error\" in line#2273: time=\"2023-06-12T06:50:04Z\" level=error msg=\"error restoring mysql-869f9f44f6-tp5lv: pods\\\ "mysql-869f9f44f6-tp5lv\\\" is forbidden: violates PodSecurity\\\ "restricted:v1.24\\\": privil eged (container \\\"mysql\\\ " must not set securityContext.privileged=true), allowPrivilegeEscalation != false (containers \\\ "restic-wait\\\", \\\"mysql\\\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers \\\ "restic-wait\\\", \\\"mysql\\\" must set securityContext.capabilities.drop=[\\\"ALL\\\"]), seccompProfile (pod or containers \\\ "restic-wait\\\", \\\"mysql\\\" must set securityContext.seccompProfile.type to \\\ "RuntimeDefault\\\" or \\\"Localhost\\\")\" logSource=\"/remote-source/velero/app/pkg/restore/restore.go:1388\" restore=openshift-adp/todolist-backup-0780518c-08ed-11ee-805c-0a580a80e92c\n velero container contains \"level=error\" in line#2447: time=\"2023-06-12T06:50:05Z\" level=error msg=\"Namespace todolist-mariadb, resource restore error: error restoring pods/todolist-mariadb/mysql-869f9f44f6-tp5lv: pods \\\ "mysql-869f9f44f6-tp5lv\\\" is forbidden: violates PodSecurity \\\"restricted:v1.24\\\": privileged (container \\\ "mysql\\\" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (containers \\\ "restic-wait\\\",\\\"mysql\\\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers \\\ "restic-wait\\\", \\\"mysql\\\" must set securityContext.capabilities.drop=[\\\"ALL\\\"]), seccompProfile (pod or containers \\\ "restic-wait\\\", \\\"mysql\\\" must set securityContext.seccompProfile.type to \\\ "RuntimeDefault\\\" or \\\"Localhost\\\")\" logSource=\"/remote-source/velero/app/pkg/controller/restore_controller.go:510\" restore=openshift-adp/todolist-backup-0780518c-08ed-11ee-805c-0a580a80e92c\n]",
$ oc get dpa -o yaml
# ...
configuration:
  restic:
    enable: true
  velero:
    args:
      restore-resource-priorities: 'securitycontextconstraints,customresourcedefinitions,namespaces,storageclasses,volumesnapshotclass.snapshot.storage.k8s.io,volumesnapshotcontents.snapshot.storage.k8s.io,volumesnapshots.snapshot.storage.k8s.io,datauploads.velero.io,persistentvolumes,persistentvolumeclaims,serviceaccounts,secrets,configmaps,limitranges,pods,replicasets.apps,clusterclasses.cluster.x-k8s.io,endpoints,services,-,clusterbootstraps.run.tanzu.vmware.com,clusters.cluster.x-k8s.io,clusterresourcesets.addons.cluster.x-k8s.io' 1
    defaultPlugins:
    - gcp
    - openshift
4.14.13.
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4 \
  -- /usr/bin/gather_<time>_essential 1
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4 \
  -- /usr/bin/gather_with_timeout <timeout> 1
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4 -- /usr/bin/gather_metrics_dump
4.14.13.1.
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4 -- /usr/bin/gather_without_tls <true/false>
4.14.13.2.
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4 -- skip_tls=true /usr/bin/gather_with_timeout <timeout_value_in_seconds>
$ oc adm must-gather --image=registry.redhat.io/oadp/oadp-mustgather-rhel9:v1.4 -- /usr/bin/gather_without_tls true
4.14.14.
4.14.14.1.
$ oc edit configmap cluster-monitoring-config -n openshift-monitoring
apiVersion: v1
data:
  config.yaml: |
    enableUserWorkload: true 1
kind: ConfigMap
metadata:
# ...
$ oc get pods -n openshift-user-workload-monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-6844b4b99c-b57j9   2/2     Running   0          43s
prometheus-user-workload-0             5/5     Running   0          32s
prometheus-user-workload-1             5/5     Running   0          32s
thanos-ruler-user-workload-0           3/3     Running   0          32s
thanos-ruler-user-workload-1           3/3     Running   0          32s
$ oc get configmap user-workload-monitoring-config -n openshift-user-workload-monitoring
Error from server (NotFound): configmaps "user-workload-monitoring-config" not found
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
$ oc apply -f 2_configure_user_workload_monitoring.yaml

configmap/user-workload-monitoring-config created
4.14.14.2.
$ oc get svc -n openshift-adp -l app.kubernetes.io/name=velero
NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
openshift-adp-velero-metrics-svc   ClusterIP   172.30.38.244   <none>        8085/TCP   1h
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: oadp-service-monitor
  name: oadp-service-monitor
  namespace: openshift-adp
spec:
  endpoints:
  - interval: 30s
    path: /metrics
    targetPort: 8085
    scheme: http
  selector:
    matchLabels:
      app.kubernetes.io/name: "velero"
$ oc apply -f 3_create_oadp_service_monitor.yaml
servicemonitor.monitoring.coreos.com/oadp-service-monitor created
Figure 4.1.
4.14.14.3.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: sample-oadp-alert
  namespace: openshift-adp
spec:
  groups:
  - name: sample-oadp-backup-alert
    rules:
    - alert: OADPBackupFailing
      annotations:
        description: 'OADP had {{$value | humanize}} backup failures over the last 2 hours.'
        summary: OADP has issues creating backups
      expr: |
        increase(velero_backup_failure_total{job="openshift-adp-velero-metrics-svc"}[2h]) > 0
      for: 5m
      labels:
        severity: warning
$ oc apply -f 4_create_oadp_alert_rule.yaml
prometheusrule.monitoring.coreos.com/sample-oadp-alert created
Figure 4.2.
4.14.14.4.
4.14.14.5.
Figure 4.3.
4.15.
4.15.1.
4.15.2.
4.15.2.1.
$ oc label node/<node_name> node-role.kubernetes.io/nodeAgent=""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/nodeAgent: ""
configuration:
  nodeAgent:
    enable: true
    podConfig:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
        node-role.kubernetes.io/worker: ""
4.16.
4.16.1.
4.16.1.1.
$ oc api-resources
4.16.1.2.
4.16.1.3.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
...
spec:
  configuration:
    velero:
      featureFlags:
      - EnableAPIGroupVersions
4.16.2.
4.16.2.1.
4.16.2.1.1.
4.16.2.1.2.
4.16.2.2.
4.16.2.2.1.
4.16.2.2.2.
$ oc -n <your_pod_namespace> annotate pod/<your_pod_name> \
  backup.velero.io/backup-volumes=<your_volume_name_1>,<your_volume_name_2>,...,<your_volume_name_n>
4.16.2.2.3.
$ oc -n <your_pod_namespace> annotate pod/<your_pod_name> \
  backup.velero.io/backup-volumes-excludes=<your_volume_name_1>,<your_volume_name_2>,...,<your_volume_name_n>
4.16.2.3.
4.16.2.4.
$ velero backup create <backup_name> --default-volumes-to-fs-backup <any_other_options>
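As a concrete illustration, the following invocation backs up a single namespace with file system backup enabled. The backup name and namespace are placeholders, and --include-namespaces is the standard Velero flag for restricting the backup scope:

$ velero backup create <backup_name> --default-volumes-to-fs-backup --include-namespaces <application_namespace>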
Note
4.16.3.
4.16.3.1.
4.16.3.1.1.
4.16.3.1.2.
$ cat change-storageclass.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: change-storage-class-config
  namespace: openshift-adp
  labels:
    velero.io/plugin-config: ""
    velero.io/change-storage-class: RestoreItemAction
data:
  standard-csi: ssd-csi
$ oc create -f change-storageclass.yaml
4.16.4.
5장.
5.1.
5.1.1.
- Tip
$ oc debug --as-root node/<node_name>
sh-4.4# chroot /host
$ export HTTP_PROXY=http://<your_proxy.example.com>:8080
$ export HTTPS_PROXY=https://<your_proxy.example.com>:8080
$ export NO_PROXY=<example.com>
- Tip
sh-4.4# /usr/local/bin/cluster-backup.sh /home/core/assets/backup
found latest kube-apiserver: /etc/kubernetes/static-pod-resources/kube-apiserver-pod-6
found latest kube-controller-manager: /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-7
found latest kube-scheduler: /etc/kubernetes/static-pod-resources/kube-scheduler-pod-6
found latest etcd: /etc/kubernetes/static-pod-resources/etcd-pod-3
ede95fe6b88b87ba86a03c15e669fb4aa5bf0991c180d3c6895ce72eaade54a1
etcdctl version: 3.4.14
API version: 3.4
{"level":"info","ts":1624647639.0188997,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/home/core/assets/backup/snapshot_2021-06-25_190035.db.part"}
{"level":"info","ts":"2021-06-25T19:00:39.030Z","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1624647639.0301006,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://10.0.0.5:2379"}
{"level":"info","ts":"2021-06-25T19:00:40.215Z","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1624647640.6032252,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://10.0.0.5:2379","size":"114 MB","took":1.584090459}
{"level":"info","ts":1624647640.6047094,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/home/core/assets/backup/snapshot_2021-06-25_190035.db"}
Snapshot saved at /home/core/assets/backup/snapshot_2021-06-25_190035.db
{"hash":3866667823,"revision":31407,"totalKey":12828,"totalSize":114446336}
snapshot db and kube resources are successfully saved to /home/core/assets/backup
- Note
5.1.2.
5.1.3.
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
$ oc apply -f enable-tech-preview-no-upgrade.yaml
$ oc get crd | grep backup
backups.config.openshift.io                 2023-10-25T13:32:43Z
etcdbackups.operator.openshift.io           2023-10-25T13:32:04Z
5.1.3.1.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: etcd-backup-pvc
  namespace: openshift-etcd
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi 1
  volumeMode: Filesystem
$ oc apply -f etcd-backup-pvc.yaml
$ oc get pvc
NAME              STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
etcd-backup-pvc   Bound                                                      51s
Note

apiVersion: operator.openshift.io/v1alpha1
kind: EtcdBackup
metadata:
  name: etcd-single-backup
  namespace: openshift-etcd
spec:
  pvcName: etcd-backup-pvc 1
$ oc apply -f etcd-single-backup.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: etcd-backup-local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
$ oc apply -f etcd-backup-local-storage.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: etcd-backup-pv-fs
spec:
  capacity:
    storage: 100Gi 1
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: etcd-backup-local-storage
  local:
    path: /mnt
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - <example_master_node> 2
$ oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS                REASON   AGE
etcd-backup-pv-fs   100Gi      RWO            Retain           Available           etcd-backup-local-storage            10s
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: etcd-backup-pvc
  namespace: openshift-etcd
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi 1
$ oc apply -f etcd-backup-pvc.yaml
apiVersion: operator.openshift.io/v1alpha1
kind: EtcdBackup
metadata:
  name: etcd-single-backup
  namespace: openshift-etcd
spec:
  pvcName: etcd-backup-pvc 1
$ oc apply -f etcd-single-backup.yaml
5.1.3.2.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: etcd-backup-pvc
  namespace: openshift-etcd
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi 1
  volumeMode: Filesystem
  storageClassName: etcd-backup-local-storage
Note

$ oc apply -f etcd-backup-pvc.yaml
$ oc get pvc
NAME              STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
etcd-backup-pvc   Bound                                                      51s
Note
- Warning
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: etcd-backup-local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
$ oc apply -f etcd-backup-local-storage.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: etcd-backup-pv-fs
spec:
  capacity:
    storage: 100Gi 1
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Delete
  storageClassName: etcd-backup-local-storage
  local:
    path: /mnt/
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - <example_master_node> 2
Tip

$ oc get nodes
$ oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS                REASON   AGE
etcd-backup-pv-fs   100Gi      RWX            Delete           Available           etcd-backup-local-storage            10s
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: etcd-backup-pvc
spec:
  accessModes:
  - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi 1
  storageClassName: etcd-backup-local-storage
$ oc apply -f etcd-backup-pvc.yaml
apiVersion: config.openshift.io/v1alpha1
kind: Backup
metadata:
  name: etcd-recurring-backup
spec:
  etcd:
    schedule: "20 4 * * *" 1
    timeZone: "UTC"
    pvcName: etcd-backup-pvc
spec:
  etcd:
    retentionPolicy:
      retentionType: RetentionNumber 1
      retentionNumber:
        maxNumberOfBackups: 5 2
Warning

spec:
  etcd:
    retentionPolicy:
      retentionType: RetentionSize
      retentionSize:
        maxSizeOfBackupsGb: 20 1
Warning

$ oc create -f etcd-recurring-backup.yaml
$ oc get cronjob -n openshift-etcd
5.2.
5.2.1.
5.2.2.
$ oc get etcd -o=jsonpath='{range .items[0].status.conditions[?(@.type=="EtcdMembersAvailable")]}{.message}{"\n"}'
2 of 3 members are available, ip-10-0-131-183.ec2.internal is unhealthy
5.2.3.
$ oc get machines -A -ojsonpath='{range .items[*]}{@.status.nodeRef.name}{"\t"}{@.status.providerStatus.instanceState}{"\n"}' | grep -v running
ip-10-0-131-183.ec2.internal stopped 1
$ oc get nodes -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{"\t"}{range .spec.taints[*]}{.key}{" "}' | grep unreachable
ip-10-0-131-183.ec2.internal node-role.kubernetes.io/master node.kubernetes.io/unreachable node.kubernetes.io/unreachable 1
$ oc get nodes -l node-role.kubernetes.io/master | grep "NotReady"
ip-10-0-131-183.ec2.internal NotReady master 122m v1.29.4 1
$ oc get nodes -l node-role.kubernetes.io/master
NAME                           STATUS   ROLES    AGE     VERSION
ip-10-0-131-183.ec2.internal   Ready    master   6h13m   v1.29.4
ip-10-0-164-97.ec2.internal    Ready    master   6h13m   v1.29.4
ip-10-0-154-204.ec2.internal   Ready    master   6h13m   v1.29.4
$ oc -n openshift-etcd get pods -l k8s-app=etcd
etcd-ip-10-0-131-183.ec2.internal                2/3     Error     7          6h9m 1
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running   0          6h6m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running   0          6h6m
5.2.4.
5.2.4.1.
- Important
- Important
$ oc -n openshift-etcd get pods -l k8s-app=etcd
etcd-ip-10-0-131-183.ec2.internal                3/3     Running   0          123m
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running   0          123m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running   0          124m
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
sh-4.2# etcdctl member list -w table
+------------------+---------+------------------------------+---------------------------+---------------------------+
| ID               | STATUS  | NAME                         | PEER ADDRS                | CLIENT ADDRS              |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 6fc1e7c9db35841d | started | ip-10-0-131-183.ec2.internal | https://10.0.131.183:2380 | https://10.0.131.183:2379 |
| 757b6793e2408b6c | started | ip-10-0-164-97.ec2.internal  | https://10.0.164.97:2380  | https://10.0.164.97:2379  |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+
sh-4.2# etcdctl member remove 6fc1e7c9db35841d
Member 6fc1e7c9db35841d removed from cluster ead669ce1fbfb346
sh-4.2# etcdctl member list -w table
+------------------+---------+------------------------------+---------------------------+---------------------------+
| ID               | STATUS  | NAME                         | PEER ADDRS                | CLIENT ADDRS              |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 757b6793e2408b6c | started | ip-10-0-164-97.ec2.internal  | https://10.0.164.97:2380  | https://10.0.164.97:2379  |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'
Important

Note

$ oc delete node <node_name>
$ oc delete node ip-10-0-131-183.ec2.internal
$ oc get secrets -n openshift-etcd | grep ip-10-0-131-183.ec2.internal 1
etcd-peer-ip-10-0-131-183.ec2.internal              kubernetes.io/tls   2   47m
etcd-serving-ip-10-0-131-183.ec2.internal           kubernetes.io/tls   2   47m
etcd-serving-metrics-ip-10-0-131-183.ec2.internal   kubernetes.io/tls   2   47m
$ oc delete secret -n openshift-etcd etcd-peer-ip-10-0-131-183.ec2.internal
$ oc delete secret -n openshift-etcd etcd-serving-ip-10-0-131-183.ec2.internal
$ oc delete secret -n openshift-etcd etcd-serving-metrics-ip-10-0-131-183.ec2.internal
$ oc get machines -n openshift-machine-api -o wide
NAME                                        PHASE     TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-0                  Running   m4.xlarge   us-east-1   us-east-1a   3h37m   ip-10-0-131-183.ec2.internal   aws:///us-east-1a/i-0ec2782f8287dfb7e   stopped 1
clustername-8qw5l-master-1                  Running   m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running   m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-worker-us-east-1a-wbtgd   Running   m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running   m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running   m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running
$ oc delete machine -n openshift-machine-api clustername-8qw5l-master-0 1
$ oc get machines -n openshift-machine-api -o wide
NAME                                        PHASE          TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-1                  Running        m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-154-204.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running        m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-164-97.ec2.internal    aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-master-3                  Provisioning   m4.xlarge   us-east-1   us-east-1a   85s     ip-10-0-133-53.ec2.internal    aws:///us-east-1a/i-015b0888fe17bc2c8   running 1
clustername-8qw5l-worker-us-east-1a-wbtgd   Running        m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running        m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running        m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'
$ oc get etcd/cluster -oyaml
EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]
$ oc -n openshift-etcd get pods -l k8s-app=etcd
etcd-ip-10-0-133-53.ec2.internal                 3/3     Running   0          7m49s
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running   0          123m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running   0          124m
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 1
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
sh-4.2# etcdctl member list -w table
+------------------+---------+------------------------------+---------------------------+---------------------------+
| ID               | STATUS  | NAME                         | PEER ADDRS                | CLIENT ADDRS              |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 5eb0d6b8ca24730c | started | ip-10-0-133-53.ec2.internal  | https://10.0.133.53:2380  | https://10.0.133.53:2379  |
| 757b6793e2408b6c | started | ip-10-0-164-97.ec2.internal  | https://10.0.164.97:2380  | https://10.0.164.97:2379  |
| ca8c2990a0aa29d1 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+
Warning
5.2.4.2.
- Important
$ oc debug node/ip-10-0-131-183.ec2.internal 1
sh-4.2# chroot /host
sh-4.2# mkdir /var/lib/etcd-backup
sh-4.2# mv /etc/kubernetes/manifests/etcd-pod.yaml /var/lib/etcd-backup/
sh-4.2# mv /var/lib/etcd/ /tmp
$ oc -n openshift-etcd get pods -l k8s-app=etcd
etcd-ip-10-0-131-183.ec2.internal                2/3     Error     7          6h9m
etcd-ip-10-0-164-97.ec2.internal                 3/3     Running   0          6h6m
etcd-ip-10-0-154-204.ec2.internal                3/3     Running   0          6h6m
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
sh-4.2# etcdctl member list -w table
+------------------+---------+------------------------------+---------------------------+---------------------------+
| ID               | STATUS  | NAME                         | PEER ADDRS                | CLIENT ADDRS              |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| 62bcf33650a7170a | started | ip-10-0-131-183.ec2.internal | https://10.0.131.183:2380 | https://10.0.131.183:2379 |
| b78e2856655bc2eb | started | ip-10-0-164-97.ec2.internal  | https://10.0.164.97:2380  | https://10.0.164.97:2379  |
| d022e10b498760d5 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+
sh-4.2# etcdctl member remove 62bcf33650a7170a
Member 62bcf33650a7170a removed from cluster ead669ce1fbfb346
sh-4.2# etcdctl member list -w table
+------------------+---------+------------------------------+---------------------------+---------------------------+
| ID               | STATUS  | NAME                         | PEER ADDRS                | CLIENT ADDRS              |
+------------------+---------+------------------------------+---------------------------+---------------------------+
| b78e2856655bc2eb | started | ip-10-0-164-97.ec2.internal  | https://10.0.164.97:2380  | https://10.0.164.97:2379  |
| d022e10b498760d5 | started | ip-10-0-154-204.ec2.internal | https://10.0.154.204:2380 | https://10.0.154.204:2379 |
+------------------+---------+------------------------------+---------------------------+---------------------------+
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'
$ oc get secrets -n openshift-etcd | grep ip-10-0-131-183.ec2.internal 1
etcd-peer-ip-10-0-131-183.ec2.internal              kubernetes.io/tls   2   47m
etcd-serving-ip-10-0-131-183.ec2.internal           kubernetes.io/tls   2   47m
etcd-serving-metrics-ip-10-0-131-183.ec2.internal   kubernetes.io/tls   2   47m
$ oc delete secret -n openshift-etcd etcd-peer-ip-10-0-131-183.ec2.internal
$ oc delete secret -n openshift-etcd etcd-serving-ip-10-0-131-183.ec2.internal
$ oc delete secret -n openshift-etcd etcd-serving-metrics-ip-10-0-131-183.ec2.internal
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 1
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'
$ oc get etcd/cluster -oyaml
EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]
$ oc rsh -n openshift-etcd etcd-ip-10-0-154-204.ec2.internal
sh-4.2# etcdctl endpoint health
https://10.0.131.183:2379 is healthy: successfully committed proposal: took = 16.671434ms
https://10.0.154.204:2379 is healthy: successfully committed proposal: took = 16.698331ms
https://10.0.164.97:2379 is healthy: successfully committed proposal: took = 16.621645ms
5.2.4.3.
- Important
$ oc -n openshift-etcd get pods -l k8s-app=etcd -o wide
etcd-openshift-control-plane-0   5/5   Running   11   3h56m   192.168.10.9    openshift-control-plane-0   <none>   <none>
etcd-openshift-control-plane-1   5/5   Running   0    3h54m   192.168.10.10   openshift-control-plane-1   <none>   <none>
etcd-openshift-control-plane-2   5/5   Running   0    3h58m   192.168.10.11   openshift-control-plane-2   <none>   <none>
$ oc rsh -n openshift-etcd etcd-openshift-control-plane-0
sh-4.2# etcdctl member list -w table
+------------------+---------+----------------------------+-----------------------------+-----------------------------+------------+
| ID               | STATUS  | NAME                       | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+----------------------------+-----------------------------+-----------------------------+------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2  | https://192.168.10.11:2380/ | https://192.168.10.11:2379/ | false      |
| 8d5abe9669a39192 | started | openshift-control-plane-1  | https://192.168.10.10:2380/ | https://192.168.10.10:2379/ | false      |
| cc3830a72fc357f9 | started | openshift-control-plane-0  | https://192.168.10.9:2380/  | https://192.168.10.9:2379/  | false      |
+------------------+---------+----------------------------+-----------------------------+-----------------------------+------------+
- Warning
sh-4.2# etcdctl member remove 7a8197040a5126c8
Member 7a8197040a5126c8 removed from cluster b23536c33f2cdd1b
sh-4.2# etcdctl member list -w table
+------------------+---------+----------------------------+-----------------------------+-----------------------------+------------+
| ID               | STATUS  | NAME                       | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+----------------------------+-----------------------------+-----------------------------+------------+
| cc3830a72fc357f9 | started | openshift-control-plane-2  | https://192.168.10.11:2380/ | https://192.168.10.11:2379/ | false      |
| 8d5abe9669a39192 | started | openshift-control-plane-1  | https://192.168.10.10:2380/ | https://192.168.10.10:2379/ | false      |
+------------------+---------+----------------------------+-----------------------------+-----------------------------+------------+
Important
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'
$ oc get secrets -n openshift-etcd | grep openshift-control-plane-2
etcd-peer-openshift-control-plane-2              kubernetes.io/tls   2   134m
etcd-serving-metrics-openshift-control-plane-2   kubernetes.io/tls   2   134m
etcd-serving-openshift-control-plane-2           kubernetes.io/tls   2   134m
$ oc delete secret etcd-peer-openshift-control-plane-2 -n openshift-etcd

secret "etcd-peer-openshift-control-plane-2" deleted
$ oc delete secret etcd-serving-metrics-openshift-control-plane-2 -n openshift-etcd

secret "etcd-serving-metrics-openshift-control-plane-2" deleted
$ oc delete secret etcd-serving-openshift-control-plane-2 -n openshift-etcd

secret "etcd-serving-openshift-control-plane-2" deleted
$ oc get machines -n openshift-machine-api -o wide
NAME                             PHASE     TYPE   REGION   ZONE   AGE     NODE                        PROVIDERID                                                                                               STATE
examplecluster-control-plane-0   Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned 1
examplecluster-control-plane-1   Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-control-plane-2   Running                          3h11m   openshift-control-plane-2   baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135   externally provisioned
examplecluster-compute-0         Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1         Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned
$ oc get clusteroperator baremetal
NAME        VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
baremetal   4.16.0    True        False         False      3d15h
$ oc delete bmh openshift-control-plane-2 -n openshift-machine-api
baremetalhost.metal3.io "openshift-control-plane-2" deleted
$ oc delete machine -n openshift-machine-api examplecluster-control-plane-2
Important

$ oc edit machine -n openshift-machine-api examplecluster-control-plane-2
finalizers:
- machine.machine.openshift.io
machine.machine.openshift.io/examplecluster-control-plane-2 edited
$ oc get machines -n openshift-machine-api -o wide
NAME                             PHASE     TYPE   REGION   ZONE   AGE     NODE                        PROVIDERID                                                                                               STATE
examplecluster-control-plane-0   Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned
examplecluster-control-plane-1   Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-compute-0         Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1         Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned
$ oc get nodes

NAME                        STATUS   ROLES    AGE     VERSION
openshift-control-plane-0   Ready    master   3h24m   v1.29.4
openshift-control-plane-1   Ready    master   3h24m   v1.29.4
openshift-compute-0         Ready    worker   176m    v1.29.4
openshift-compute-1         Ready    worker   176m    v1.29.4
$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: openshift-control-plane-2-bmc-secret
  namespace: openshift-machine-api
data:
  password: <password>
  username: <username>
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-control-plane-2
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: disabled
  bmc:
    address: redfish://10.46.61.18:443/redfish/v1/Systems/1
    credentialsName: openshift-control-plane-2-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 48:df:37:b0:8a:a0
  bootMode: UEFI
  externallyProvisioned: false
  online: true
  rootDeviceHints:
    deviceName: /dev/disk/by-id/scsi-<serial_number>
  userData:
    name: master-user-data-managed
    namespace: openshift-machine-api
EOF
Note

Important

$ oc get bmh -n openshift-machine-api

NAME                        STATE                    CONSUMER                         ONLINE   ERROR   AGE
openshift-control-plane-0   externally provisioned   examplecluster-control-plane-0   true             4h48m
openshift-control-plane-1   externally provisioned   examplecluster-control-plane-1   true             4h48m
openshift-control-plane-2   available                examplecluster-control-plane-3   true             47m
openshift-compute-0         provisioned              examplecluster-compute-0         true             4h48m
openshift-compute-1         provisioned              examplecluster-compute-1         true             4h48m
$ oc get machines -n openshift-machine-api -o wide
NAME                             PHASE     TYPE   REGION   ZONE   AGE     NODE                        PROVIDERID                                                                                               STATE
examplecluster-control-plane-0   Running                          3h11m   openshift-control-plane-0   baremetalhost:///openshift-machine-api/openshift-control-plane-0/da1ebe11-3ff2-41c5-b099-0aa41222964e   externally provisioned 1
examplecluster-control-plane-1   Running                          3h11m   openshift-control-plane-1   baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1   externally provisioned
examplecluster-control-plane-2   Running                          3h11m   openshift-control-plane-2   baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135   externally provisioned
examplecluster-compute-0         Running                          165m    openshift-compute-0         baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f         provisioned
examplecluster-compute-1         Running                          165m    openshift-compute-1         baremetalhost:///openshift-machine-api/openshift-compute-1/0fdae6eb-2066-4241-91dc-e7ea72ab13b9         provisioned
$ oc get bmh -n openshift-machine-api

NAME                        STATE                    CONSUMER                         ONLINE   ERROR   AGE
openshift-control-plane-0   externally provisioned   examplecluster-control-plane-0   true             4h48m
openshift-control-plane-1   externally provisioned   examplecluster-control-plane-1   true             4h48m
openshift-control-plane-2   provisioned              examplecluster-control-plane-3   true             47m
openshift-compute-0         provisioned              examplecluster-compute-0         true             4h48m
openshift-compute-1         provisioned              examplecluster-compute-1         true             4h48m
$ oc get nodes

NAME                        STATUS   ROLES    AGE     VERSION
openshift-control-plane-0   Ready    master   4h26m   v1.29.4
openshift-control-plane-1   Ready    master   4h26m   v1.29.4
openshift-control-plane-2   Ready    master   12m     v1.29.4
openshift-compute-0         Ready    worker   3h58m   v1.29.4
openshift-compute-1         Ready    worker   3h58m   v1.29.4
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'
$ oc get etcd/cluster -oyaml
EtcdCertSignerControllerDegraded: [Operation cannot be fulfilled on secrets "etcd-peer-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-sno-0": the object has been modified; please apply your changes to the latest version and try again, Operation cannot be fulfilled on secrets "etcd-serving-metrics-sno-0": the object has been modified; please apply your changes to the latest version and try again]
$ oc -n openshift-etcd get pods -l k8s-app=etcd
etcd-openshift-control-plane-0   5/5   Running   0   105m
etcd-openshift-control-plane-1   5/5   Running   0   107m
etcd-openshift-control-plane-2   5/5   Running   0   103m
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 1
$ oc rsh -n openshift-etcd etcd-openshift-control-plane-0
sh-4.2# etcdctl member list -w table
+------------------+---------+----------------------------+----------------------------+----------------------------+------------+
| ID               | STATUS  | NAME                       | PEER ADDRS                 | CLIENT ADDRS               | IS LEARNER |
+------------------+---------+----------------------------+----------------------------+----------------------------+------------+
| 7a8197040a5126c8 | started | openshift-control-plane-2  | https://192.168.10.11:2380 | https://192.168.10.11:2379 | false      |
| 8d5abe9669a39192 | started | openshift-control-plane-1  | https://192.168.10.10:2380 | https://192.168.10.10:2379 | false      |
| cc3830a72fc357f9 | started | openshift-control-plane-0  | https://192.168.10.9:2380  | https://192.168.10.9:2379  | false      |
+------------------+---------+----------------------------+----------------------------+----------------------------+------------+
Note

# etcdctl endpoint health --cluster
https://192.168.10.10:2379 is healthy: successfully committed proposal: took = 8.973065ms
https://192.168.10.9:2379 is healthy: successfully committed proposal: took = 11.559829ms
https://192.168.10.11:2379 is healthy: successfully committed proposal: took = 11.665203ms
$ oc get etcd -o=jsonpath='{range.items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'
AllNodesAtLatestRevision
5.2.5.
5.3.
5.3.1.
- Warning
- Note
5.3.2.
5.3.2.1.
5.3.2.2.
- Important
- Note
$ sudo mv -v /etc/kubernetes/manifests/etcd-pod.yaml /tmp
$ sudo crictl ps | grep etcd | egrep -v "operator|etcd-guard"
$ sudo mv -v /etc/kubernetes/manifests/kube-apiserver-pod.yaml /tmp
$ sudo crictl ps | grep kube-apiserver | egrep -v "operator|guard"
$ sudo mv -v /etc/kubernetes/manifests/kube-controller-manager-pod.yaml /tmp
$ sudo crictl ps | grep kube-controller-manager | egrep -v "operator|guard"
$ sudo mv -v /etc/kubernetes/manifests/kube-scheduler-pod.yaml /tmp
$ sudo crictl ps | grep kube-scheduler | egrep -v "operator|guard"
$ sudo mv -v /var/lib/etcd/ /tmp
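The four manifest moves above follow an identical pattern, so if you have to prepare more than one recovery host they can be scripted. A minimal sketch of the same steps, assuming the default manifest directory; still verify with crictl, as shown above, that each container has stopped before moving on:

$ for manifest in etcd-pod kube-apiserver-pod kube-controller-manager-pod kube-scheduler-pod; do sudo mv -v /etc/kubernetes/manifests/${manifest}.yaml /tmp; done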
$ sudo mv -v /etc/kubernetes/manifests/keepalived.yaml /tmp
$ sudo crictl ps --name keepalived
$ ip -o address | egrep '<api_vip>|<ingress_vip>'
$ sudo ip address del <reported_vip> dev <reported_vip_device>
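For example, with a hypothetical reported VIP of 192.168.10.5 on a hypothetical device ens3, the command becomes:

$ sudo ip address del 192.168.10.5/32 dev ens3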
$ ip -o address | grep <api_vip>
$ sudo -E /usr/local/bin/cluster-restore.sh /home/core/assets/backup
...stopping kube-scheduler-pod.yaml
...stopping kube-controller-manager-pod.yaml
...stopping etcd-pod.yaml
...stopping kube-apiserver-pod.yaml
Waiting for container etcd to stop
.complete
Waiting for container etcdctl to stop
.............................complete
Waiting for container etcd-metrics to stop
complete
Waiting for container kube-controller-manager to stop
complete
Waiting for container kube-apiserver to stop
..........................................................................................complete
Waiting for container kube-scheduler to stop
complete
Moving etcd data-dir /var/lib/etcd/member to /var/lib/etcd-backup
starting restore-etcd static pod
starting kube-apiserver-pod.yaml
static-pod-resources/kube-apiserver-pod-7/kube-apiserver-pod.yaml
starting kube-controller-manager-pod.yaml
static-pod-resources/kube-controller-manager-pod-7/kube-controller-manager-pod.yaml
starting kube-scheduler-pod.yaml
static-pod-resources/kube-scheduler-pod-8/kube-scheduler-pod.yaml
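Before continuing, you can confirm that no control plane containers are still running on this host. One possible check, reusing the filters from the earlier steps; it should print no output:

$ sudo crictl ps | egrep 'etcd|kube-apiserver|kube-controller-manager|kube-scheduler' | egrep -v 'operator|guard'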
Note: It can take several minutes for all nodes to report their status after the restore. Watch them until every node is Ready:

$ oc get nodes -w
NAME                STATUS   ROLES          AGE     VERSION
host-172-25-75-28   Ready    master         3d20h   v1.29.4
host-172-25-75-38   Ready    infra,worker   3d20h   v1.29.4
host-172-25-75-40   Ready    master         3d20h   v1.29.4
host-172-25-75-65   Ready    master         3d20h   v1.29.4
host-172-25-75-74   Ready    infra,worker   3d20h   v1.29.4
host-172-25-75-79   Ready    worker         3d20h   v1.29.4
host-172-25-75-86   Ready    worker         3d20h   v1.29.4
host-172-25-75-98   Ready    infra,worker   3d20h   v1.29.4
$ ssh -i <ssh-key-path> core@<master-hostname>
sh-4.4# pwd
/var/lib/kubelet/pki
sh-4.4# ls
kubelet-client-2022-04-28-11-24-09.pem  kubelet-server-2022-04-28-11-24-15.pem
kubelet-client-current.pem              kubelet-server-current.pem
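If you are unsure whether the current kubelet client certificate is still valid, openssl can print its expiry date. A quick check that is not part of the official procedure:

sh-4.4# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate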
$ sudo systemctl restart kubelet.service
$ oc get csr
NAME        AGE    SIGNERNAME                                    REQUESTOR                                                                   CONDITION
csr-2s94x   8m3s   kubernetes.io/kubelet-serving                 system:node:<node_name>                                                     Pending 1
csr-4bd6t   8m3s   kubernetes.io/kubelet-serving                 system:node:<node_name>                                                     Pending 2
csr-4hl85   13m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending 3
csr-zhhhp   3m8s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending 4
...
$ oc describe csr <csr_name> 1
$ oc adm certificate approve <csr_name>
$ oc adm certificate approve <csr_name>
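If many CSRs are pending, approving them one at a time is tedious. The following one-liner approves every CSR that has not yet been acted on; use it with care, because it approves requests without inspecting them:

$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve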
$ sudo crictl ps | grep etcd | egrep -v "operator|etcd-guard"
3ad41b7908e32 36f86e2eeaaffe662df0d21041eb22b8198e0e58abeeae8c743c3e6e977e8009 About a minute ago Running etcd 0 7c05f8af362f0
$ oc -n openshift-etcd get pods -l k8s-app=etcd
NAME                                READY   STATUS    RESTARTS   AGE
etcd-ip-10-0-143-125.ec2.internal   1/1     Running   1          2m47s
$ oc -n openshift-ovn-kubernetes delete pod -l app=ovnkube-control-plane
$ oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-control-plane
$ sudo rm -f /var/lib/ovn-ic/etc/*.db
$ sudo systemctl restart ovs-vswitchd ovsdb-server
$ oc -n openshift-ovn-kubernetes delete pod -l app=ovnkube-node --field-selector=spec.nodeName==<node>
$ oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector=spec.nodeName==<node>
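The previous two commands must be repeated for every node in the cluster. A minimal loop that deletes the ovnkube-node pod on each node in turn, run as a user with the cluster-admin role, might look like this:

$ for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do oc -n openshift-ovn-kubernetes delete pod -l app=ovnkube-node --field-selector=spec.nodeName==${node}; done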
$ oc get machines -n openshift-machine-api -o wide
NAME                                        PHASE     TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-0                  Running   m4.xlarge   us-east-1   us-east-1a   3h37m   ip-10-0-131-183.ec2.internal   aws:///us-east-1a/i-0ec2782f8287dfb7e   stopped 1
clustername-8qw5l-master-1                  Running   m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-143-125.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running   m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-154-194.ec2.internal   aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-worker-us-east-1a-wbtgd   Running   m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running   m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running   m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running
$ oc delete machine -n openshift-machine-api clustername-8qw5l-master-0 1
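On platforms where control plane machines are managed by the machine API, such as the installer-provisioned AWS cluster in this output, a replacement machine is provisioned automatically after the deletion; on other platforms you must recreate it yourself. You can watch for the new machine to appear:

$ oc get machines -n openshift-machine-api -w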
$ oc get machines -n openshift-machine-api -o wide
NAME                                        PHASE          TYPE        REGION      ZONE         AGE     NODE                           PROVIDERID                              STATE
clustername-8qw5l-master-1                  Running        m4.xlarge   us-east-1   us-east-1b   3h37m   ip-10-0-143-125.ec2.internal   aws:///us-east-1b/i-096c349b700a19631   running
clustername-8qw5l-master-2                  Running        m4.xlarge   us-east-1   us-east-1c   3h37m   ip-10-0-154-194.ec2.internal   aws:///us-east-1c/i-02626f1dba9ed5bba   running
clustername-8qw5l-master-3                  Provisioning   m4.xlarge   us-east-1   us-east-1a   85s     ip-10-0-173-171.ec2.internal   aws:///us-east-1a/i-015b0888fe17bc2c8   running 1
clustername-8qw5l-worker-us-east-1a-wbtgd   Running        m4.large    us-east-1   us-east-1a   3h28m   ip-10-0-129-226.ec2.internal   aws:///us-east-1a/i-010ef6279b4662ced   running
clustername-8qw5l-worker-us-east-1b-lrdxb   Running        m4.large    us-east-1   us-east-1b   3h28m   ip-10-0-144-248.ec2.internal   aws:///us-east-1b/i-0cb45ac45a166173b   running
clustername-8qw5l-worker-us-east-1c-pkg26   Running        m4.large    us-east-1   us-east-1c   3h28m   ip-10-0-170-181.ec2.internal   aws:///us-east-1c/i-06861c00007751b0a   running
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}'
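To confirm that the override was applied before forcing a redeployment, you can read it back; one possible check:

$ oc get etcd/cluster -o jsonpath='{.spec.unsupportedConfigOverrides}'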
$ export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost-recovery.kubeconfig
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 1
$ oc patch etcd/cluster --type=merge -p '{"spec": {"unsupportedConfigOverrides": null}}'
$ oc get etcd/cluster -oyaml
$ oc get etcd -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}{end}'
AllNodesAtLatestRevision
3 nodes are at revision 7 1
$ oc patch kubeapiserver cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
$ oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}{end}'
AllNodesAtLatestRevision
3 nodes are at revision 7 1
$ oc patch kubecontrollermanager cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
$ oc get kubecontrollermanager -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}{end}'
AllNodesAtLatestRevision
3 nodes are at revision 7 1
$ oc patch kubescheduler cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
$ oc get kubescheduler -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}{end}'
AllNodesAtLatestRevision
3 nodes are at revision 7 1
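Because all four forced redeployments use the same patch, they can also be issued from a loop. A sketch follows; you should still confirm each rollout with the corresponding jsonpath check above before relying on the cluster:

$ for kind in etcd kubeapiserver kubecontrollermanager kubescheduler; do oc patch ${kind} cluster --type=merge -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}'; done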
$ oc adm wait-for-stable-cluster
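If your oc release supports it, you can also pass a minimum stable period so the command only returns after the cluster has stayed stable for that long; treat the flag as version-dependent:

$ oc adm wait-for-stable-cluster --minimum-stable-period=5s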
$ oc -n openshift-etcd get pods -l k8s-app=etcd
etcd-ip-10-0-143-125.ec2.internal   2/2   Running   0   9h
etcd-ip-10-0-154-194.ec2.internal   2/2   Running   0   9h
etcd-ip-10-0-173-171.ec2.internal   2/2   Running   0   9h
$ export KUBECONFIG=<installation_directory>/auth/kubeconfig
$ oc whoami
5.3.2.3. Additional resources
5.3.2.4. Issues and workarounds for restoring a persistent storage state
5.3.3. Recovering from expired control plane certificates
5.3.3.1. Recovering from expired control plane certificates
$ oc get csr
NAME        AGE    SIGNERNAME                                    REQUESTOR                                                                   CONDITION
csr-2s94x   8m3s   kubernetes.io/kubelet-serving                 system:node:<node_name>                                                     Pending 1
csr-4bd6t   8m3s   kubernetes.io/kubelet-serving                 system:node:<node_name>                                                     Pending
csr-4hl85   13m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending 2
csr-zhhhp   3m8s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
...
$ oc describe csr <csr_name> 1
$ oc adm certificate approve <csr_name>
$ oc adm certificate approve <csr_name>
Legal Notice
Copyright © 2024 Red Hat, Inc.
OpenShift documentation is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0).
Modified versions must remove all Red Hat trademarks.
Portions adapted from https://github.com/kubernetes-incubator/service-catalog/ with modifications by Red Hat.
Red Hat, Red Hat Enterprise Linux, the Red Hat logo, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.