Chapter 23. Evicting pods with the Streams for Apache Kafka Drain Cleaner
Kafka and ZooKeeper pods might be evicted during OpenShift upgrades, maintenance, or pod rescheduling. If your Kafka and ZooKeeper pods were deployed by Streams for Apache Kafka, you can use the Streams for Apache Kafka Drain Cleaner tool to handle the pod evictions. The Streams for Apache Kafka Drain Cleaner handles the eviction instead of OpenShift.
By deploying the Streams for Apache Kafka Drain Cleaner, you can use the Cluster Operator to move Kafka pods instead of OpenShift. The Cluster Operator ensures that topics are never under-replicated and Kafka can remain operational during the eviction process. The Cluster Operator waits for topics to synchronize, as the OpenShift worker nodes drain consecutively.
An admission webhook notifies the Streams for Apache Kafka Drain Cleaner of pod eviction requests to the Kubernetes API. The Streams for Apache Kafka Drain Cleaner then adds a rolling update annotation to the pods to be drained. This informs the Cluster Operator to perform a rolling update of an evicted pod.
If you are not using the Streams for Apache Kafka Drain Cleaner, you can add pod annotations to perform rolling updates manually.
Webhook configuration
The Streams for Apache Kafka Drain Cleaner deployment files include a ValidatingWebhookConfiguration
resource file. The resource provides the configuration for registering the webhook with the Kubernetes API.
The configuration defines the rules
for the Kubernetes API to follow in the event of a pod eviction request. The rules specify that only CREATE
operations related to pods/eviction
sub-resources are intercepted. If these rules are met, the API forwards the notification.
The clientConfig
points to the Streams for Apache Kafka Drain Cleaner service and /drainer
endpoint that exposes the webhook. The webhook uses a secure TLS connection, which requires authentication. The caBundle
property specifies the certificate chain to validate HTTPS communication. Certificates are encoded in Base64.
Webhook configuration for pod eviction notifications
apiVersion: admissionregistration.k8s.io/v1 kind: ValidatingWebhookConfiguration # ... webhooks: - name: strimzi-drain-cleaner.strimzi.io rules: - apiGroups: [""] apiVersions: ["v1"] operations: ["CREATE"] resources: ["pods/eviction"] scope: "Namespaced" clientConfig: service: namespace: "strimzi-drain-cleaner" name: "strimzi-drain-cleaner" path: /drainer port: 443 caBundle: Cg== # ...
23.1. Downloading the Streams for Apache Kafka Drain Cleaner deployment files
To deploy and use the Streams for Apache Kafka Drain Cleaner, you need to download the deployment files.
The Streams for Apache Kafka Drain Cleaner deployment files are available from the Streams for Apache Kafka software downloads page.
23.2. Deploying the Streams for Apache Kafka Drain Cleaner using installation files
Deploy the Streams for Apache Kafka Drain Cleaner to the OpenShift cluster where the Cluster Operator and Kafka cluster are running.
Streams for Apache Kafka Drain Cleaner can run in two different modes. By default, the Drain Cleaner denies (blocks) the OpenShift eviction request to prevent OpenShift from evicting the pods and instead uses the Cluster Operator to move the pod. This mode has better compatibility with various cluster autoscaling tools and does not require any specific PodDisuptionBudget
configuration. Alternatively, you can enable the legacy mode where it allows the eviction request while also instructing the Cluster Operator to move the pod. For the legacy mode to work, you have to configure the PodDisruptionBudget
to not allow any pod evictions by setting the maxUnavailable
option to 0
.
Prerequisites
- You have downloaded the Streams for Apache Kafka Drain Cleaner deployment files.
- You have a highly available Kafka cluster deployment running with OpenShift worker nodes that you would like to update.
Topics are replicated for high availability.
Topic configuration specifies a replication factor of at least 3 and a minimum number of in-sync replicas to 1 less than the replication factor.
Kafka topic replicated for high availability
apiVersion: kafka.strimzi.io/v1beta2 kind: KafkaTopic metadata: name: my-topic labels: strimzi.io/cluster: my-cluster spec: partitions: 1 replicas: 3 config: # ... min.insync.replicas: 2 # ...
Excluding Kafka or ZooKeeper
If you don’t want to include Kafka or ZooKeeper pods in Drain Cleaner operations, or if you prefer to use the Drain Cleaner in legacy mode, change the default environment variables in the Drain Cleaner Deployment
configuration file:
-
Set
STRIMZI_DENY_EVICTION
tofalse
to use the legacy mode relying on thePodDisruptionBudget
configuration -
Set
STRIMZI_DRAIN_KAFKA
tofalse
to exclude Kafka pods -
Set
STRIMZI_DRAIN_ZOOKEEPER
tofalse
to exclude ZooKeeper pods
Example configuration to exclude ZooKeeper pods
apiVersion: apps/v1 kind: Deployment spec: # ... template: spec: serviceAccountName: strimzi-drain-cleaner containers: - name: strimzi-drain-cleaner # ... env: - name: STRIMZI_DENY_EVICTION value: "true" - name: STRIMZI_DRAIN_KAFKA value: "true" - name: STRIMZI_DRAIN_ZOOKEEPER value: "false" # ...
Procedure
If you are using the legacy mode activated by setting the
STRIMZI_DENY_EVICTION
environment variable tofalse
, you must also configure thePodDisruptionBudget
resource. SetmaxUnavailable
to0
(zero) in the Kafka and ZooKeeper sections of theKafka
resource usingtemplate
settings.Specifying a pod disruption budget
apiVersion: kafka.strimzi.io/v1beta2 kind: Kafka metadata: name: my-cluster namespace: myproject spec: kafka: template: podDisruptionBudget: maxUnavailable: 0 # ... zookeeper: template: podDisruptionBudget: maxUnavailable: 0 # ...
This setting prevents the automatic eviction of pods in case of planned disruptions, leaving the Streams for Apache Kafka Drain Cleaner and Cluster Operator to roll the pods on different worker nodes.
Add the same configuration for ZooKeeper if you want to use Streams for Apache Kafka Drain Cleaner to drain ZooKeeper nodes.
Update the
Kafka
resource:oc apply -f <kafka_configuration_file>
Deploy the Streams for Apache Kafka Drain Cleaner.
To run the Drain Cleaner on OpenShift, apply the resources in the
/install/drain-cleaner/openshift
directory.oc apply -f ./install/drain-cleaner/openshift
23.3. Using the Streams for Apache Kafka Drain Cleaner
Use the Streams for Apache Kafka Drain Cleaner in combination with the Cluster Operator to move Kafka broker or ZooKeeper pods from nodes that are being drained. When you run the Streams for Apache Kafka Drain Cleaner, it annotates pods with a rolling update pod annotation. The Cluster Operator performs rolling updates based on the annotation.
Prerequisites
Considerations when using anti-affinity configuration
When using anti-affinity with your Kafka or ZooKeeper pods, consider adding a spare worker node to your cluster. Including spare nodes ensures that your cluster has the capacity to reschedule pods during node draining or temporary unavailability of other nodes. When a worker node is drained, and anti-affinity rules restrict pod rescheduling on alternative nodes, spare nodes help prevent restarted pods from becoming unschedulable. This mitigates the risk of the draining operation failing.
Procedure
Drain a specified OpenShift node hosting the Kafka broker or ZooKeeper pods.
oc get nodes oc drain <name-of-node> --delete-emptydir-data --ignore-daemonsets --timeout=6000s --force
Check the eviction events in the Streams for Apache Kafka Drain Cleaner log to verify that the pods have been annotated for restart.
Streams for Apache Kafka Drain Cleaner log show annotations of pods
INFO ... Received eviction webhook for Pod my-cluster-zookeeper-2 in namespace my-project INFO ... Pod my-cluster-zookeeper-2 in namespace my-project will be annotated for restart INFO ... Pod my-cluster-zookeeper-2 in namespace my-project found and annotated for restart INFO ... Received eviction webhook for Pod my-cluster-kafka-0 in namespace my-project INFO ... Pod my-cluster-kafka-0 in namespace my-project will be annotated for restart INFO ... Pod my-cluster-kafka-0 in namespace my-project found and annotated for restart
Check the reconciliation events in the Cluster Operator log to verify the rolling updates.
Cluster Operator log shows rolling updates
INFO PodOperator:68 - Reconciliation #13(timer) Kafka(my-project/my-cluster): Rolling Pod my-cluster-zookeeper-2 INFO PodOperator:68 - Reconciliation #13(timer) Kafka(my-project/my-cluster): Rolling Pod my-cluster-kafka-0 INFO AbstractOperator:500 - Reconciliation #13(timer) Kafka(my-project/my-cluster): reconciled
23.4. Watching the TLS certificates used by the Streams for Apache Kafka Drain Cleaner
By default, the Drain Cleaner deployment watches the secret containing the TLS certificates its uses for authentication. The Drain Cleaner watches for changes, such as certificate renewals. If it detects a change, it restarts to reload the TLS certificates. The Drain Cleaner installation files enable this behavior by default. But you can disable the watching of certificates by setting the STRIMZI_CERTIFICATE_WATCH_ENABLED
environment variable to false
in the Deployment
configuration (060-Deployment.yaml
) of the Drain Cleaner installation files.
With STRIMZI_CERTIFICATE_WATCH_ENABLED
enabled, you can also use the following environment variables for watching TLS certificates.
Environment Variable | Description | Default |
---|---|---|
| Enables or disables the certificate watch |
|
| The namespace where the Drain Cleaner is deployed and where the certificate secret exists |
|
| The Drain Cleaner pod name | - |
| The name of the secret containing TLS certificates |
|
| The list of fields inside the secret that contain the TLS certificates |
|
Example environment variable configuration to control watch operations
apiVersion: apps/v1 kind: Deployment metadata: name: strimzi-drain-cleaner labels: app: strimzi-drain-cleaner namespace: strimzi-drain-cleaner spec: # ... spec: serviceAccountName: strimzi-drain-cleaner containers: - name: strimzi-drain-cleaner # ... env: - name: STRIMZI_DRAIN_KAFKA value: "true" - name: STRIMZI_DRAIN_ZOOKEEPER value: "true" - name: STRIMZI_CERTIFICATE_WATCH_ENABLED value: "true" - name: STRIMZI_CERTIFICATE_WATCH_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace - name: STRIMZI_CERTIFICATE_WATCH_POD_NAME valueFrom: fieldRef: fieldPath: metadata.name # ...
Use the Downward API mechanism to configure STRIMZI_CERTIFICATE_WATCH_NAMESPACE
and STRIMZI_CERTIFICATE_WATCH_POD_NAME
.