Chapter 34. Cluster recovery from persistent volumes
You can recover a Kafka cluster from persistent volumes (PVs) if they are still present.
34.1. Cluster recovery scenarios
Recovering from PVs is possible in the following scenarios:
- Unintentional deletion of a namespace
- Loss of an entire OpenShift cluster while PVs remain in the infrastructure
The recovery procedure for both scenarios is to recreate the original PersistentVolumeClaim (PVC) resources.
34.1.1. Recovering from namespace deletion
When you delete a namespace, all resources within that namespace—including PVCs, pods, and services—are deleted. If the reclaimPolicy for the PV resource specification is set to Retain, the PV retains its data and is not deleted. This configuration allows you to recover from namespace deletion.
PV configuration to retain data
apiVersion: v1
kind: PersistentVolume
# ...
spec:
  # ...
  persistentVolumeReclaimPolicy: Retain
Alternatively, PVs can inherit the reclaim policy from an associated storage class. Storage classes are used for dynamic volume allocation.
By configuring the reclaimPolicy property for the storage class, PVs created with this class use the specified reclaim policy. The storage class is assigned to the PV using the storageClassName property.
Storage class configuration to retain data
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-retain
parameters:
  # ...
# ...
reclaimPolicy: Retain
Storage class specified for PV
apiVersion: v1
kind: PersistentVolume
# ...
spec:
  # ...
  storageClassName: gp2-retain
When using Retain as the reclaim policy, you must manually delete PVs if you intend to delete the entire cluster.
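If existing PVs were dynamically provisioned with a Delete reclaim policy, the policy can be switched to Retain before making any disruptive change. A minimal sketch, assuming cluster access; the PV name is an example, and the patch is also shown applied to a local manifest copy so the change itself is visible:

```shell
# Switch a PV from Delete to Retain. On a live cluster this is a one-liner
# (the PV name is an example):
#   oc patch pv pvc-5e9c5c7f-3317-11ea-a650-06e1eadd9a4c \
#     -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
# Below, the same change is made on a trimmed local manifest copy:
cat > pv-sample.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-5e9c5c7f-3317-11ea-a650-06e1eadd9a4c
spec:
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gp2
EOF
sed -i 's/persistentVolumeReclaimPolicy: Delete/persistentVolumeReclaimPolicy: Retain/' pv-sample.yaml
grep persistentVolumeReclaimPolicy pv-sample.yaml
```

The patch only changes the reclaim policy of the PV object; it does not move or copy any data on the backing volume.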
34.1.2. Recovering from cluster loss
If you lose the entire OpenShift cluster, all resources—including PVs, PVCs, and namespaces—are lost. However, it’s possible to recover if the physical storage backing the PVs remains intact.
To recover, you need to set up a new OpenShift cluster and manually reconfigure the PVs to use the existing storage.
34.2. Recovering a deleted Kafka cluster
This procedure describes how to recover a deleted Kafka cluster from persistent volumes (PVs) by recreating the original PersistentVolumeClaim (PVC) resources.
If the Topic Operator and User Operator are deployed, you can recover KafkaTopic and KafkaUser resources by recreating them. It is important that you recreate the KafkaTopic resources with the same configurations, or the Topic Operator will try to update them in Kafka. This procedure shows how to recreate both resources.
If the User Operator is enabled and Kafka users are not recreated, users are deleted from the Kafka cluster immediately after recovery.
Before you begin
In this procedure, it is essential that PVs are mounted into the correct PVC to avoid data corruption. A volumeName is specified for the PVC and this must match the name of the PV.
For more information, see Section 10.4, “Configuring Kafka storage”.
Procedure
Check information on the PVs in the cluster:
oc get pv

Information is presented for PVs with data.
Example PV output
NAME                                       ...   RECLAIMPOLICY   ...   CLAIM
pvc-5e9c5c7f-3317-11ea-a650-06e1eadd9a4c   ...   Retain          ...   myproject/data-0-my-cluster-broker-0
pvc-5e9cc72d-3317-11ea-97b0-0aef8816c7ea   ...   Retain          ...   myproject/data-0-my-cluster-broker-1
pvc-5ead43d1-3317-11ea-97b0-0aef8816c7ea   ...   Retain          ...   myproject/data-0-my-cluster-broker-2
pvc-7e1f67f9-3317-11ea-a650-06e1eadd9a4c   ...   Retain          ...   myproject/data-0-my-cluster-controller-3
pvc-7e21042e-3317-11ea-9786-02deaf9aa87e   ...   Retain          ...   myproject/data-0-my-cluster-controller-4
pvc-7e226978-3317-11ea-97b0-0aef8816c7ea   ...   Retain          ...   myproject/data-0-my-cluster-controller-5
- NAME is the name of each PV.
- RECLAIMPOLICY shows that PVs are retained, meaning that the PV is not automatically deleted when the PVC is deleted.
- CLAIM shows the link to the original PVCs.
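When many volumes are involved, it helps to derive the PVC-to-PV mapping from saved oc get pv output rather than copying names by hand, since each recreated PVC must set volumeName to the matching PV. A sketch under stated assumptions: the listing was saved to a file, and the CLAIM value is the last column shown:

```shell
# Derive the volumeName to set in each recreated PVC from saved `oc get pv`
# output (a sketch; assumes output was saved with `oc get pv > pv-list.txt`
# and that CLAIM is the last column). Two sample rows from the listing above:
cat > pv-list.txt <<'EOF'
pvc-5e9c5c7f-3317-11ea-a650-06e1eadd9a4c Retain myproject/data-0-my-cluster-broker-0
pvc-5e9cc72d-3317-11ea-97b0-0aef8816c7ea Retain myproject/data-0-my-cluster-broker-1
EOF
# Print "<pvc-name> <pv-name>" pairs for claims in the myproject namespace
awk '$NF ~ /^myproject\// { split($NF, c, "/"); print c[2], $1 }' pv-list.txt
```

Each output line pairs a PVC name with the PV name to use as its volumeName in the recreated PVC specification.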
Recreate the original namespace:

oc create namespace myproject

Here, we recreate the myproject namespace.

Recreate the original PVC resource specifications, linking the PVCs to the appropriate PV:
Example PVC resource specification
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-0-my-cluster-broker-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gp2-retain
  volumeMode: Filesystem
  volumeName: pvc-5e9c5c7f-3317-11ea-a650-06e1eadd9a4c

Edit the PV specifications to delete the claimRef properties that bound the original PVC.

Example PV specification
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: aws-ebs-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
  creationTimestamp: "<date>"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    failure-domain.beta.kubernetes.io/region: eu-west-1
    failure-domain.beta.kubernetes.io/zone: eu-west-1c
  name: pvc-5ead43d1-3317-11ea-97b0-0aef8816c7ea
  resourceVersion: "39431"
  selfLink: /api/v1/persistentvolumes/pvc-5ead43d1-3317-11ea-97b0-0aef8816c7ea
  uid: 7efe6b0d-3317-11ea-a650-06e1eadd9a4c
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: xfs
    volumeID: aws://eu-west-1c/vol-09db3141656d1c258
  capacity:
    storage: 100Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: data-0-my-cluster-broker-2
    namespace: myproject
    resourceVersion: "39113"
    uid: 54be1c60-3319-11ea-97b0-0aef8816c7ea
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - eu-west-1c
        - key: failure-domain.beta.kubernetes.io/region
          operator: In
          values:
          - eu-west-1
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-retain
  volumeMode: Filesystem

In the example, the following properties are deleted:
claimRef:
  apiVersion: v1
  kind: PersistentVolumeClaim
  name: data-0-my-cluster-broker-2
  namespace: myproject
  resourceVersion: "39113"
  uid: 54be1c60-3319-11ea-97b0-0aef8816c7ea

Deploy the Cluster Operator:
oc create -f install/cluster-operator -n myproject

Recreate all KafkaTopic resources by applying the KafkaTopic resource configuration:

oc apply -f <topic_configuration_file> -n myproject

Recreate all KafkaUser resources:

If user passwords and certificates need to be retained, recreate the user secrets before recreating the KafkaUser resources.

If the secrets are not recreated, the User Operator will generate new credentials automatically. Ensure that the recreated secrets have exactly the same name, labels, and fields as the original secrets.
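If a copy of an original user secret is available, server-populated metadata should be stripped before re-applying it so the recreated secret keeps the original name, labels, and data fields. A minimal sketch using a trimmed sample secret; the user name, labels, and data values below are illustrative examples, not output from a real cluster:

```shell
# Strip server-populated metadata from a backed-up user secret before
# recreating it (a sketch; the secret content below is a trimmed example).
# A backup could have been taken earlier with:
#   oc get secret my-user-1 -n myproject -o yaml > my-user-1-secret.yaml
cat > my-user-1-secret.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: my-user-1
  labels:
    strimzi.io/cluster: my-cluster
    strimzi.io/kind: KafkaUser
  resourceVersion: "39113"
  uid: 54be1c60-3319-11ea-97b0-0aef8816c7ea
  creationTimestamp: "<date>"
type: Opaque
data:
  password: cGFzc3dvcmQ=
EOF
# Drop the fields the API server will repopulate; keep name, labels, and data
grep -vE '^  (resourceVersion|uid|creationTimestamp):' my-user-1-secret.yaml > my-user-1-clean.yaml
# Recreate with: oc apply -f my-user-1-clean.yaml -n myproject
```

Recreating the cleaned secret before the KafkaUser resource lets the User Operator adopt the existing credentials instead of generating new ones.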
Apply the KafkaUser resource configuration:

oc apply -f <user_configuration_file> -n myproject
Deploy the Kafka cluster using the original configuration for the Kafka resource. Add the annotation strimzi.io/pause-reconciliation="true" to the original configuration for the Kafka resource, and then deploy the Kafka cluster using the updated configuration.

oc apply -f <kafka_resource_configuration>.yaml -n myproject

Recover the original clusterId from logs or copies of the Kafka custom resource. Otherwise, you can retrieve it from one of the volumes by spinning up a temporary pod.

PVC_NAME="data-0-my-cluster-broker-0"
COMMAND="grep cluster.id /disk/kafka-log*/meta.properties | awk -F'=' '{print \$2}'"
oc run tmp -itq --rm --restart "Never" --image "foo" --overrides \
  "{\"spec\": {\"containers\":[{\"name\":\"busybox\",\"image\":\"busybox\",\"command\":[\"/bin/sh\", \"-c\",\"$COMMAND\"],\"volumeMounts\":[{\"name\":\"disk\",\"mountPath\":\"/disk\"}]}], \"volumes\":[{\"name\":\"disk\",\"persistentVolumeClaim\":{\"claimName\": \"$PVC_NAME\"}}]}}" \
  -n myproject

Edit the Kafka resource to set the .status.clusterId with the recovered value:

oc edit kafka <cluster-name> --subresource status -n myproject

Unpause the Kafka resource reconciliation:

oc annotate kafka my-cluster strimzi.io/pause-reconciliation=false \
  --overwrite -n myproject

Verify the recovery of the KafkaTopic resources:

oc get kafkatopics -o wide -w -n myproject

Kafka topic status

NAME         CLUSTER      PARTITIONS   REPLICATION FACTOR   READY
my-topic-1   my-cluster   10           3                    True
my-topic-2   my-cluster   10           3                    True
my-topic-3   my-cluster   10           3                    True

KafkaTopic custom resource creation is successful when the READY output shows True.

Verify the recovery of the KafkaUser resources:

oc get kafkausers -o wide -w -n myproject

Kafka user status

NAME        CLUSTER      AUTHENTICATION   AUTHORIZATION   READY
my-user-1   my-cluster   tls              simple          True
my-user-2   my-cluster   tls              simple          True
my-user-3   my-cluster   tls              simple          True

KafkaUser custom resource creation is successful when the READY output shows True.
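The grep/awk pipeline passed to the temporary pod in the cluster ID step can be tried locally against a sample meta.properties file to confirm it prints only the ID. The directory layout and ID value below are made-up examples:

```shell
# Check locally that the cluster.id extraction used in the temporary pod
# returns only the ID (the directory and ID value are made-up examples)
mkdir -p disk/kafka-log0
cat > disk/kafka-log0/meta.properties <<'EOF'
version=1
cluster.id=QzVDdDlMVFF5TkV3b1RBeA
node.id=0
EOF
grep cluster.id disk/kafka-log*/meta.properties | awk -F'=' '{print $2}'
# prints: QzVDdDlMVFF5TkV3b1RBeA
```

Splitting on "=" keeps only the value, which is the string to set in .status.clusterId when editing the Kafka resource.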