Chapter 27. Managing AMQ Streams
Managing AMQ Streams requires performing various tasks to keep the Kafka clusters and associated resources running smoothly. Use oc commands to check the status of resources, configure maintenance windows for rolling updates, and leverage tools such as the AMQ Streams Drain Cleaner and Kafka Static Quota plugin to manage your deployment effectively.
27.1. Working with custom resources
You can use oc commands to retrieve information and perform other operations on AMQ Streams custom resources.
Using oc with the status subresource of a custom resource allows you to get information about the resource.
27.1.1. Performing oc operations on custom resources
Use oc commands, such as get, describe, edit, or delete, to perform operations on resource types. For example, oc get kafkatopics retrieves a list of all Kafka topics and oc get kafkas retrieves all deployed Kafka clusters.
When referencing resource types, you can use both singular and plural names: oc get kafkas gets the same results as oc get kafka.
You can also use the short name of the resource. Learning short names can save you time when managing AMQ Streams. The short name for Kafka is k, so you can also run oc get k to list all Kafka clusters.
oc get k
NAME DESIRED KAFKA REPLICAS DESIRED ZK REPLICAS
my-cluster 3 3
| AMQ Streams resource | Long name | Short name |
|---|---|---|
| Kafka | kafka | k |
| Kafka Topic | kafkatopic | kt |
| Kafka User | kafkauser | ku |
| Kafka Connect | kafkaconnect | kc |
| Kafka Connector | kafkaconnector | kctr |
| Kafka Mirror Maker | kafkamirrormaker | kmm |
| Kafka Mirror Maker 2 | kafkamirrormaker2 | kmm2 |
| Kafka Bridge | kafkabridge | kb |
| Kafka Rebalance | kafkarebalance | kr |
27.1.1.1. Resource categories
Categories of custom resources can also be used in oc commands.
All AMQ Streams custom resources belong to the category strimzi, so you can use strimzi to get all the AMQ Streams resources with one command.
For example, running oc get strimzi lists all AMQ Streams custom resources in a given namespace.
The oc get strimzi -o name command returns all resource types and resource names. The -o name option fetches the output in the type/name format.
oc get strimzi -o name
kafka.kafka.strimzi.io/my-cluster
kafkatopic.kafka.strimzi.io/kafka-apps
kafkauser.kafka.strimzi.io/my-user
You can combine this strimzi command with other commands. For example, you can pass it into an oc delete command to delete all resources in a single command.
oc delete $(oc get strimzi -o name)
kafka.kafka.strimzi.io "my-cluster" deleted
kafkatopic.kafka.strimzi.io "kafka-apps" deleted
kafkauser.kafka.strimzi.io "my-user" deleted
Deleting all resources in a single operation might be useful, for example, when you are testing new AMQ Streams features.
27.1.1.2. Querying the status of sub-resources
There are other values you can pass to the -o option. For example, by using -o yaml you get the output in YAML format. Using -o json will return it as JSON.
You can see all the options in oc get --help.
One of the most useful options is the JSONPath support, which allows you to pass JSONPath expressions to query the Kubernetes API. A JSONPath expression can extract or navigate specific parts of any resource.
For example, you can use the JSONPath expression {.status.listeners[?(@.name=="tls")].bootstrapServers} to get the bootstrap address from the status of the Kafka custom resource and use it in your Kafka clients.
Here, the command finds the bootstrapServers value of the listener named tls:
oc get kafka my-cluster -o=jsonpath='{.status.listeners[?(@.name=="tls")].bootstrapServers}{"\n"}'
my-cluster-kafka-bootstrap.myproject.svc:9093
By changing the name condition you can also get the address of the other Kafka listeners.
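For example, assuming the cluster also defines a listener named plain (the listener name here is an assumption; use the names configured in your Kafka resource), the following returns its bootstrap address:
oc get kafka my-cluster -o=jsonpath='{.status.listeners[?(@.name=="plain")].bootstrapServers}{"\n"}'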
You can use jsonpath to extract any other property or group of properties from any custom resource.
27.1.2. AMQ Streams custom resource status information
Status properties provide status information for certain custom resources.
The following table lists the custom resources that provide status information (when deployed) and the schemas that define the status properties.
For more information on the schemas, see the AMQ Streams Custom Resource API Reference.
| AMQ Streams resource | Schema reference | Publishes status information on… |
|---|---|---|
| Kafka | KafkaStatus | The Kafka cluster |
| KafkaTopic | KafkaTopicStatus | Kafka topics in the Kafka cluster |
| KafkaUser | KafkaUserStatus | Kafka users in the Kafka cluster |
| KafkaConnect | KafkaConnectStatus | The Kafka Connect cluster |
| KafkaConnector | KafkaConnectorStatus | KafkaConnector resources |
| KafkaMirrorMaker2 | KafkaMirrorMaker2Status | The Kafka MirrorMaker 2 cluster |
| KafkaMirrorMaker | KafkaMirrorMakerStatus | The Kafka MirrorMaker cluster |
| KafkaBridge | KafkaBridgeStatus | The AMQ Streams Kafka Bridge |
| KafkaRebalance | KafkaRebalanceStatus | The status and results of a rebalance |
The status property of a resource provides information on the state of the resource. The status.conditions and status.observedGeneration properties are common to all resources.
status.conditions
Status conditions describe the current state of a resource. Status condition properties are useful for tracking progress related to the resource achieving its desired state, as defined by the configuration specified in its spec. Status condition properties provide the time and reason the state of the resource changed, and details of events preventing or delaying the operator from realizing the desired state.
status.observedGeneration
Last observed generation denotes the latest reconciliation of the resource by the Cluster Operator. If the value of observedGeneration is different from the value of metadata.generation (the current version of the deployment), the operator has not yet processed the latest update to the resource. If these values are the same, the status information reflects the most recent changes to the resource.
The status properties also provide resource-specific information. For example, KafkaStatus provides information on listener addresses, and the ID of the Kafka cluster.
AMQ Streams creates and maintains the status of custom resources, periodically evaluating the current state of the custom resource and updating its status accordingly. When performing an update on a custom resource using oc edit, for example, its status is not editable. Moreover, changing the status would not affect the configuration of the Kafka cluster.
Here we see the status properties for a Kafka custom resource.
Kafka custom resource status
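A minimal sketch of the general shape of the status section follows; the values are illustrative and the exact fields depend on your configuration:
status:
  clusterId: CLUSTER-ID # (1)
  conditions: # (2)
    - lastTransitionTime: '2023-01-20T17:56:29.396588Z'
      status: 'True'
      type: Ready # (3)
  listeners: # (4)
    - addresses:
        - host: my-cluster-kafka-bootstrap.myproject.svc
          port: 9093
      bootstrapServers: 'my-cluster-kafka-bootstrap.myproject.svc:9093'
      name: tls
  observedGeneration: 3 # (5)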
1. The Kafka cluster ID.
2. Status conditions describe the current state of the Kafka cluster.
3. The Ready condition indicates that the Cluster Operator considers the Kafka cluster able to handle traffic.
4. The listeners describe Kafka bootstrap addresses by type.
5. The observedGeneration value indicates the last reconciliation of the Kafka custom resource by the Cluster Operator.
The Kafka bootstrap addresses listed in the status do not signify that those endpoints or the Kafka cluster is in a Ready state.
Accessing status information
You can access status information for a resource from the command line. For more information, see Section 27.1.3, “Finding the status of a custom resource”.
27.1.3. Finding the status of a custom resource
This procedure describes how to find the status of a custom resource.
Prerequisites
- An OpenShift cluster.
- The Cluster Operator is running.
Procedure
Specify the custom resource and use the -o jsonpath option to apply a standard JSONPath expression to select the status property:
oc get kafka <kafka_resource_name> -o jsonpath='{.status}'
This expression returns all the status information for the specified custom resource. You can use dot notation, such as status.listeners or status.observedGeneration, to fine-tune the status information you wish to see.
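For example, the following returns only the last observed generation for a cluster named my-cluster (the cluster name is an assumption):
oc get kafka my-cluster -o jsonpath='{.status.observedGeneration}{"\n"}'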
27.2. Discovering services using labels and annotations
Service discovery makes it easier for client applications running in the same OpenShift cluster as AMQ Streams to interact with a Kafka cluster.
A service discovery label and annotation are generated for services used to access the Kafka cluster:
- Internal Kafka bootstrap service
- HTTP Bridge service
The label helps to make the service discoverable, and the annotation provides connection details that a client application can use to make the connection.
The service discovery label, strimzi.io/discovery, is set as true for the Service resources. The service discovery annotation has the same key, providing connection details in JSON format for each service.
Example internal Kafka bootstrap service
Example HTTP Bridge service
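A sketch of what the internal bootstrap Service metadata might look like follows; ports, authentication types, and names are illustrative, and the HTTP Bridge service is analogous, with an http protocol entry in the annotation:
apiVersion: v1
kind: Service
metadata:
  name: my-cluster-kafka-bootstrap
  labels:
    strimzi.io/cluster: my-cluster
    strimzi.io/discovery: "true"
    strimzi.io/kind: Kafka
  annotations:
    strimzi.io/discovery: |-
      [ {
        "port" : 9092,
        "tls" : false,
        "protocol" : "kafka",
        "auth" : "scram-sha-512"
      }, {
        "port" : 9093,
        "tls" : true,
        "protocol" : "kafka",
        "auth" : "tls"
      } ]
spec:
  # ...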
27.2.1. Returning connection details on services
You can find the services by specifying the discovery label when fetching services from the command line or a corresponding API call.
oc get service -l strimzi.io/discovery=true
The connection details are returned when retrieving the service discovery label.
27.3. Connecting to ZooKeeper from a terminal
ZooKeeper services are secured with encryption and authentication and are not intended to be used by external applications that are not part of AMQ Streams.
However, if you want to use CLI tools that require a connection to ZooKeeper, you can use a terminal inside a ZooKeeper pod and connect to localhost:12181 as the ZooKeeper address.
Prerequisites
- An OpenShift cluster is available.
- A Kafka cluster is running.
- The Cluster Operator is running.
Procedure
Open the terminal using the OpenShift console or run the exec command from your CLI.
For example:
oc exec -ti my-cluster-zookeeper-0 -- bin/zookeeper-shell.sh localhost:12181 ls /
Be sure to use localhost:12181.
27.4. Pausing reconciliation of custom resources
Sometimes it is useful to pause the reconciliation of custom resources managed by AMQ Streams Operators, so that you can perform fixes or make updates. If reconciliations are paused, any changes made to custom resources are ignored by the Operators until the pause ends.
If you want to pause reconciliation of a custom resource, set the strimzi.io/pause-reconciliation annotation to true in its configuration. This instructs the appropriate Operator to pause reconciliation of the custom resource. For example, you can apply the annotation to the KafkaConnect resource so that reconciliation by the Cluster Operator is paused.
You can also create a custom resource with the pause annotation enabled. The custom resource is created, but it is ignored.
Prerequisites
- The AMQ Streams Operator that manages the custom resource is running.
Procedure
Annotate the custom resource in OpenShift, setting strimzi.io/pause-reconciliation to true:
oc annotate <kind_of_custom_resource> <name_of_custom_resource> strimzi.io/pause-reconciliation="true"
For example, for the KafkaConnect custom resource:
oc annotate KafkaConnect my-connect strimzi.io/pause-reconciliation="true"
Check that the status conditions of the custom resource show a change to ReconciliationPaused:
oc describe <kind_of_custom_resource> <name_of_custom_resource>
The type condition changes to ReconciliationPaused at the lastTransitionTime.
Example custom resource with a paused reconciliation condition type
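A sketch of how the paused state might appear on the KafkaConnect resource from the previous step (timestamps and the generation value are illustrative):
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect
  annotations:
    strimzi.io/pause-reconciliation: "true"
spec:
  # ...
status:
  conditions:
    - lastTransitionTime: "2023-06-12T10:47:54.146Z"
      status: "True"
      type: ReconciliationPaused
  observedGeneration: 3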
Resuming from pause
- To resume reconciliation, you can set the annotation to false, or remove the annotation.
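For example, either of the following resumes reconciliation of the KafkaConnect resource used earlier (note that --overwrite is required to change an existing annotation value, and a trailing dash removes the annotation):
oc annotate KafkaConnect my-connect strimzi.io/pause-reconciliation="false" --overwrite
oc annotate KafkaConnect my-connect strimzi.io/pause-reconciliation-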
27.5. Maintenance time windows for rolling updates
Maintenance time windows allow you to schedule certain rolling updates of your Kafka and ZooKeeper clusters to start at a convenient time.
27.5.1. Maintenance time windows overview
In most cases, the Cluster Operator only updates your Kafka or ZooKeeper clusters in response to changes to the corresponding Kafka resource. This enables you to plan when to apply changes to a Kafka resource to minimize the impact on Kafka client applications.
However, some updates to your Kafka and ZooKeeper clusters can happen without any corresponding change to the Kafka resource. For example, the Cluster Operator will need to perform a rolling restart if a CA (certificate authority) certificate that it manages is close to expiry.
While a rolling restart of the pods should not affect availability of the service (assuming correct broker and topic configurations), it could affect performance of the Kafka client applications. Maintenance time windows allow you to schedule such spontaneous rolling updates of your Kafka and ZooKeeper clusters to start at a convenient time. If maintenance time windows are not configured for a cluster then it is possible that such spontaneous rolling updates will happen at an inconvenient time, such as during a predictable period of high load.
27.5.2. Maintenance time window definition
You configure maintenance time windows by entering an array of strings in the Kafka.spec.maintenanceTimeWindows property. Each string is a cron expression interpreted as being in UTC (Coordinated Universal Time, which for practical purposes is the same as Greenwich Mean Time).
The following example configures a single maintenance time window that starts at midnight and ends at 01:59am (UTC), on Sundays, Mondays, Tuesdays, Wednesdays, and Thursdays:
# ...
maintenanceTimeWindows:
- "* * 0-1 ? * SUN,MON,TUE,WED,THU *"
# ...
In practice, maintenance windows should be set in conjunction with the Kafka.spec.clusterCa.renewalDays and Kafka.spec.clientsCa.renewalDays properties of the Kafka resource, to ensure that the necessary CA certificate renewal can be completed in the configured maintenance time windows.
AMQ Streams does not schedule maintenance operations exactly according to the given windows. Instead, for each reconciliation, it checks whether a maintenance window is currently "open". This means that the start of maintenance operations within a given time window can be delayed by up to the Cluster Operator reconciliation interval. Maintenance time windows must therefore be at least this long.
27.5.3. Configuring a maintenance time window
You can configure a maintenance time window for rolling updates triggered by supported processes.
Prerequisites
- An OpenShift cluster.
- The Cluster Operator is running.
Procedure
Add or edit the maintenanceTimeWindows property in the Kafka resource. For example, to allow maintenance between 0800 and 1059 and between 1400 and 1559, you would set the maintenanceTimeWindows as shown in the sketch at the end of this procedure.
Create or update the resource:
oc apply -f <kafka_configuration_file>
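A sketch of the maintenanceTimeWindows configuration described in step 1 (the cluster name and surrounding spec fields are illustrative; the cron expressions correspond to the 0800-1059 and 1400-1559 UTC windows):
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
  zookeeper:
    # ...
  maintenanceTimeWindows:
    - "* * 8-10 * * ?"
    - "* * 14-15 * * ?"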
27.6. Evicting pods with the AMQ Streams Drain Cleaner
Kafka and ZooKeeper pods might be evicted during OpenShift upgrades, maintenance, or pod rescheduling. If your Kafka broker and ZooKeeper pods were deployed by AMQ Streams, you can use the AMQ Streams Drain Cleaner tool to handle the pod evictions. The AMQ Streams Drain Cleaner handles the eviction instead of OpenShift. You must set maxUnavailable in the podDisruptionBudget for your Kafka deployment to 0 (zero). OpenShift will then no longer be allowed to evict pods automatically.
By deploying the AMQ Streams Drain Cleaner, you can use the Cluster Operator to move Kafka pods instead of OpenShift. The Cluster Operator ensures that topics are never under-replicated. Kafka can remain operational during the eviction process. The Cluster Operator waits for topics to synchronize, as the OpenShift worker nodes drain consecutively.
An admission webhook notifies the AMQ Streams Drain Cleaner of pod eviction requests to the Kubernetes API. The AMQ Streams Drain Cleaner then adds a rolling update annotation to the pods to be drained. This informs the Cluster Operator to perform a rolling update of an evicted pod.
If you are not using the AMQ Streams Drain Cleaner, you can add pod annotations to perform rolling updates manually.
Webhook configuration
The AMQ Streams Drain Cleaner deployment files include a ValidatingWebhookConfiguration resource file. The resource provides the configuration for registering the webhook with the Kubernetes API.
The configuration defines the rules for the Kubernetes API to follow in the event of a pod eviction request. The rules specify that only CREATE operations related to pods/eviction sub-resources are intercepted. If these rules are met, the API forwards the notification.
The clientConfig points to the AMQ Streams Drain Cleaner service and /drainer endpoint that exposes the webhook. The webhook uses a secure TLS connection, which requires authentication. The caBundle property specifies the certificate chain to validate HTTPS communication. Certificates are encoded in Base64.
Webhook configuration for pod eviction notifications
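A rough sketch of the shape of this configuration (the actual file ships with the Drain Cleaner deployment files; the names, namespace, and certificate value shown here are placeholders):
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: strimzi-drain-cleaner
webhooks:
  - name: strimzi-drain-cleaner.strimzi.io
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods/eviction"]
        scope: Namespaced
    clientConfig:
      service:
        namespace: strimzi-drain-cleaner
        name: strimzi-drain-cleaner
        path: /drainer
        port: 443
      caBundle: Cg== # placeholder for the Base64-encoded certificate chain
    sideEffects: None
    admissionReviewVersions: ["v1"]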
27.6.1. Downloading the AMQ Streams Drain Cleaner deployment files
To deploy and use the AMQ Streams Drain Cleaner, you need to download the deployment files.
The AMQ Streams Drain Cleaner deployment files are available from the AMQ Streams software downloads page.
27.6.2. Deploying the AMQ Streams Drain Cleaner using installation files
Deploy the AMQ Streams Drain Cleaner to the OpenShift cluster where the Cluster Operator and Kafka cluster are running.
AMQ Streams sets a default PodDisruptionBudget (PDB) that allows only one Kafka or ZooKeeper pod to be unavailable at any given time. To use the Drain Cleaner for planned maintenance or upgrades, you must set a PDB of zero. This is to prevent voluntary evictions of pods, and ensure that the Kafka or ZooKeeper cluster remains available. You do this by setting the maxUnavailable value to zero in the Kafka or ZooKeeper template. StrimziPodSet custom resources manage Kafka and ZooKeeper pods using a custom controller that cannot use the maxUnavailable value directly. Instead, the maxUnavailable value is converted to a minAvailable value. For example, if there are three broker pods and the maxUnavailable property is set to 0 (zero), the minAvailable setting is 3, requiring all three broker pods to be available and allowing zero pods to be unavailable.
Prerequisites
- You have downloaded the AMQ Streams Drain Cleaner deployment files.
- You have a highly available Kafka cluster deployment running with OpenShift worker nodes that you would like to update.
Topics are replicated for high availability.
Topic configuration specifies a replication factor of at least 3 and a minimum number of in-sync replicas set to 1 less than the replication factor.
Kafka topic replicated for high availability
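A sketch of a KafkaTopic configured this way (the topic name, partition count, and cluster label are illustrative):
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 1
  replicas: 3
  config:
    # ...
    min.insync.replicas: 2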
Excluding Kafka or ZooKeeper
If you don’t want to include Kafka or ZooKeeper pods in Drain Cleaner operations, change the default environment variables in the Drain Cleaner Deployment configuration file.
- Set STRIMZI_DRAIN_KAFKA to false to exclude Kafka pods
- Set STRIMZI_DRAIN_ZOOKEEPER to false to exclude ZooKeeper pods
Example configuration to exclude ZooKeeper pods
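A sketch of the relevant environment variables in the Drain Cleaner Deployment (only the env entries are shown; the surrounding Deployment fields are omitted):
# ...
env:
  - name: STRIMZI_DRAIN_KAFKA
    value: "true"
  - name: STRIMZI_DRAIN_ZOOKEEPER
    value: "false"
# ...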
Procedure
Set maxUnavailable to 0 (zero) in the Kafka and ZooKeeper sections of the Kafka resource using template settings, as in the sketch below.
Specifying a pod disruption budget
This setting prevents the automatic eviction of pods in case of planned disruptions, leaving the AMQ Streams Drain Cleaner and Cluster Operator to roll the pods on different worker nodes.
Add the same configuration for ZooKeeper if you want to use AMQ Streams Drain Cleaner to drain ZooKeeper nodes.
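A sketch of the template setting (the cluster name and other spec fields are illustrative; include the zookeeper section only if you also drain ZooKeeper nodes):
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
    template:
      podDisruptionBudget:
        maxUnavailable: 0
  zookeeper:
    # ...
    template:
      podDisruptionBudget:
        maxUnavailable: 0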
Update the Kafka resource:
oc apply -f <kafka_configuration_file>
Deploy the AMQ Streams Drain Cleaner.
To run the Drain Cleaner on OpenShift, apply the resources in the /install/drain-cleaner/openshift directory:
oc apply -f ./install/drain-cleaner/openshift
27.6.3. Using the AMQ Streams Drain Cleaner
Use the AMQ Streams Drain Cleaner in combination with the Cluster Operator to move Kafka broker or ZooKeeper pods from nodes that are being drained. When you run the AMQ Streams Drain Cleaner, it annotates pods with a rolling update pod annotation. The Cluster Operator performs rolling updates based on the annotation.
Prerequisites
- You have deployed the AMQ Streams Drain Cleaner.
Procedure
Drain a specified OpenShift node hosting the Kafka broker or ZooKeeper pods.
oc get nodes
oc drain <name-of-node> --delete-emptydir-data --ignore-daemonsets --timeout=6000s --force
Check the eviction events in the AMQ Streams Drain Cleaner log to verify that the pods have been annotated for restart.
AMQ Streams Drain Cleaner log shows annotations of pods
Check the reconciliation events in the Cluster Operator log to verify the rolling updates.
Cluster Operator log shows rolling updates
INFO PodOperator:68 - Reconciliation #13(timer) Kafka(my-project/my-cluster): Rolling Pod my-cluster-zookeeper-2
INFO PodOperator:68 - Reconciliation #13(timer) Kafka(my-project/my-cluster): Rolling Pod my-cluster-kafka-0
INFO AbstractOperator:500 - Reconciliation #13(timer) Kafka(my-project/my-cluster): reconciled
27.6.4. Watching the TLS certificates used by the AMQ Streams Drain Cleaner
By default, the Drain Cleaner deployment watches the secret containing the TLS certificates it uses for authentication. The Drain Cleaner watches for changes, such as certificate renewals. If it detects a change, it restarts to reload the TLS certificates. The Drain Cleaner installation files enable this behavior by default. But you can disable the watching of certificates by setting the STRIMZI_CERTIFICATE_WATCH_ENABLED environment variable to false in the Deployment configuration (060-Deployment.yaml) of the Drain Cleaner installation files.
With STRIMZI_CERTIFICATE_WATCH_ENABLED enabled, you can also use the following environment variables for watching TLS certificates.
| Environment Variable | Description | Default |
|---|---|---|
| STRIMZI_CERTIFICATE_WATCH_ENABLED | Enables or disables the certificate watch | false |
| STRIMZI_CERTIFICATE_WATCH_NAMESPACE | The namespace where the Drain Cleaner is deployed and where the certificate secret exists | strimzi-drain-cleaner |
| STRIMZI_CERTIFICATE_WATCH_POD_NAME | The Drain Cleaner pod name | - |
| STRIMZI_CERTIFICATE_WATCH_SECRET_NAME | The name of the secret containing TLS certificates | strimzi-drain-cleaner |
| STRIMZI_CERTIFICATE_WATCH_SECRET_KEYS | The list of fields inside the secret that contain the TLS certificates | tls.crt, tls.key |
Example environment variable configuration to control watch operations
Use the Downward API mechanism to configure STRIMZI_CERTIFICATE_WATCH_NAMESPACE and STRIMZI_CERTIFICATE_WATCH_POD_NAME.
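For example, the watch-related variables might be set in the Drain Cleaner Deployment like this (a sketch; the secret name shown is the assumed default, and the namespace and pod name are resolved through the Downward API):
env:
  - name: STRIMZI_CERTIFICATE_WATCH_ENABLED
    value: "true"
  - name: STRIMZI_CERTIFICATE_WATCH_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: STRIMZI_CERTIFICATE_WATCH_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: STRIMZI_CERTIFICATE_WATCH_SECRET_NAME
    value: strimzi-drain-cleaner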
27.7. Deleting Kafka nodes using annotations
This procedure describes how to delete an existing Kafka node by using an OpenShift annotation. Deleting a Kafka node consists of deleting both the Pod on which the Kafka broker is running and the related PersistentVolumeClaim (if the cluster was deployed with persistent storage). After deletion, the Pod and its related PersistentVolumeClaim are recreated automatically.
Deleting a PersistentVolumeClaim can cause permanent data loss and the availability of your cluster cannot be guaranteed. The following procedure should only be performed if you have encountered storage issues.
Prerequisites
- A running Cluster Operator
Procedure
Find the name of the Pod that you want to delete.
Kafka broker pods are named <cluster-name>-kafka-<index>, where <index> starts at zero and ends at the total number of replicas minus one. For example, my-cluster-kafka-0.
Annotate the Pod resource in OpenShift.
Use oc annotate:
oc annotate pod cluster-name-kafka-index strimzi.io/delete-pod-and-pvc=true
- Wait for the next reconciliation, when the annotated pod with the underlying persistent volume claim will be deleted and then recreated.
27.8. Deleting ZooKeeper nodes using annotations
This procedure describes how to delete an existing ZooKeeper node by using an OpenShift annotation. Deleting a ZooKeeper node consists of deleting both the Pod on which ZooKeeper is running and the related PersistentVolumeClaim (if the cluster was deployed with persistent storage). After deletion, the Pod and its related PersistentVolumeClaim are recreated automatically.
Deleting a PersistentVolumeClaim can cause permanent data loss and the availability of your cluster cannot be guaranteed. The following procedure should only be performed if you have encountered storage issues.
Prerequisites
- A running Cluster Operator
Procedure
Find the name of the Pod that you want to delete.
ZooKeeper pods are named <cluster-name>-zookeeper-<index>, where <index> starts at zero and ends at the total number of replicas minus one. For example, my-cluster-zookeeper-0.
Annotate the Pod resource in OpenShift.
Use oc annotate:
oc annotate pod cluster-name-zookeeper-index strimzi.io/delete-pod-and-pvc=true
- Wait for the next reconciliation, when the annotated pod with the underlying persistent volume claim will be deleted and then recreated.
27.9. Starting rolling updates of Kafka and ZooKeeper clusters using annotations
AMQ Streams supports the use of annotations on resources to manually trigger a rolling update of Kafka and ZooKeeper clusters through the Cluster Operator. Rolling updates restart the pods of the resource with new ones.
Manually performing a rolling update on a specific pod or set of pods is usually only required in exceptional circumstances. However, rather than deleting the pods directly, if you perform the rolling update through the Cluster Operator you ensure the following:
- The manual deletion of the pod does not conflict with simultaneous Cluster Operator operations, such as deleting other pods in parallel.
- The Cluster Operator logic handles the Kafka configuration specifications, such as the number of in-sync replicas.
27.9.1. Performing a rolling update using a pod management annotation
This procedure describes how to trigger a rolling update of a Kafka cluster or ZooKeeper cluster. To trigger the update, you add an annotation to the StrimziPodSet that manages the pods running on the cluster.
Prerequisites
To perform a manual rolling update, you need a running Cluster Operator and Kafka cluster.
Procedure
Find the name of the resource that controls the Kafka or ZooKeeper pods you want to manually update.
For example, if your Kafka cluster is named my-cluster, the corresponding names are my-cluster-kafka and my-cluster-zookeeper.
Use oc annotate to annotate the appropriate resource in OpenShift.
Annotating a StrimziPodSet
oc annotate strimzipodset <cluster_name>-kafka strimzi.io/manual-rolling-update=true
oc annotate strimzipodset <cluster_name>-zookeeper strimzi.io/manual-rolling-update=true
- Wait for the next reconciliation to occur (every two minutes by default). A rolling update of all pods within the annotated resource is triggered, as long as the annotation was detected by the reconciliation process. When the rolling update of all the pods is complete, the annotation is removed from the resource.
27.9.2. Performing a rolling update using a pod annotation
This procedure describes how to manually trigger a rolling update of an existing Kafka cluster or ZooKeeper cluster using an OpenShift Pod annotation. When multiple pods are annotated, consecutive rolling updates are performed within the same reconciliation run.
Prerequisites
To perform a manual rolling update, you need a running Cluster Operator and Kafka cluster.
You can perform a rolling update on a Kafka cluster regardless of the topic replication factor used. But for Kafka to stay operational during the update, you’ll need the following:
- A highly available Kafka cluster deployment running with nodes that you wish to update.
Topics replicated for high availability.
Topic configuration specifies a replication factor of at least 3 and a minimum number of in-sync replicas set to 1 less than the replication factor, as in the example KafkaTopic sketch in Section 27.6.2 (Kafka topic replicated for high availability).
Procedure
Find the name of the Kafka or ZooKeeper Pod you want to manually update.
For example, if your Kafka cluster is named my-cluster, the corresponding Pod names are my-cluster-kafka-index and my-cluster-zookeeper-index. The index starts at zero and ends at the total number of replicas minus one.
Annotate the Pod resource in OpenShift.
Use oc annotate:
oc annotate pod cluster-name-kafka-index strimzi.io/manual-rolling-update=true
oc annotate pod cluster-name-zookeeper-index strimzi.io/manual-rolling-update=true
- Wait for the next reconciliation to occur (every two minutes by default). A rolling update of the annotated Pod is triggered, as long as the annotation was detected by the reconciliation process. When the rolling update of a pod is complete, the annotation is removed from the Pod.
27.10. Performing restarts of MirrorMaker 2 connectors using annotations
This procedure describes how to manually trigger a restart of a Kafka MirrorMaker 2 connector by using an OpenShift annotation.
Prerequisites
- The Cluster Operator is running.
Procedure
Find the name of the KafkaMirrorMaker2 custom resource that controls the Kafka MirrorMaker 2 connector you want to restart:
oc get KafkaMirrorMaker2
Find the name of the Kafka MirrorMaker 2 connector to be restarted from the KafkaMirrorMaker2 custom resource:
oc describe KafkaMirrorMaker2 KAFKAMIRRORMAKER-2-NAME
To restart the connector, annotate the KafkaMirrorMaker2 resource in OpenShift. In this example, oc annotate restarts a connector named my-source->my-target.MirrorSourceConnector:
oc annotate KafkaMirrorMaker2 KAFKAMIRRORMAKER-2-NAME "strimzi.io/restart-connector=my-source->my-target.MirrorSourceConnector"
Wait for the next reconciliation to occur (every two minutes by default).
The Kafka MirrorMaker 2 connector is restarted, as long as the annotation was detected by the reconciliation process. When the restart request is accepted, the annotation is removed from the KafkaMirrorMaker2 custom resource.
27.11. Performing restarts of MirrorMaker 2 connector tasks using annotations
This procedure describes how to manually trigger a restart of a Kafka MirrorMaker 2 connector task by using an OpenShift annotation.
Prerequisites
- The Cluster Operator is running.
Procedure
Find the name of the KafkaMirrorMaker2 custom resource that controls the Kafka MirrorMaker 2 connector you want to restart:
oc get KafkaMirrorMaker2
Find the name of the Kafka MirrorMaker 2 connector and the ID of the task to be restarted from the KafkaMirrorMaker2 custom resource. Task IDs are non-negative integers, starting from 0:
oc describe KafkaMirrorMaker2 KAFKAMIRRORMAKER-2-NAME
To restart the connector task, annotate the KafkaMirrorMaker2 resource in OpenShift. In this example, oc annotate restarts task 0 of a connector named my-source->my-target.MirrorSourceConnector:
oc annotate KafkaMirrorMaker2 KAFKAMIRRORMAKER-2-NAME "strimzi.io/restart-connector-task=my-source->my-target.MirrorSourceConnector:0"
Wait for the next reconciliation to occur (every two minutes by default).
The Kafka MirrorMaker 2 connector task is restarted, as long as the annotation was detected by the reconciliation process. When the restart task request is accepted, the annotation is removed from the KafkaMirrorMaker2 custom resource.
27.12. Recovering a cluster from persistent volumes
You can recover a Kafka cluster from persistent volumes (PVs) if they are still present.
You might want to do this, for example, after:
- A namespace was deleted unintentionally
- A whole OpenShift cluster is lost, but the PVs remain in the infrastructure
27.12.1. Recovery from namespace deletion
Recovery from namespace deletion is possible because of the relationship between persistent volumes and namespaces. A PersistentVolume (PV) is a storage resource that lives outside of a namespace. A PV is mounted into a Kafka pod using a PersistentVolumeClaim (PVC), which lives inside a namespace.
The reclaim policy for a PV tells a cluster how to act when a namespace is deleted. If the reclaim policy is set as:
- Delete (default), PVs are deleted when PVCs are deleted within a namespace
- Retain, PVs are not deleted when a namespace is deleted
To ensure that you can recover from a PV if a namespace is deleted unintentionally, the policy must be reset from Delete to Retain in the PV specification using the persistentVolumeReclaimPolicy property:
Alternatively, PVs can inherit the reclaim policy of an associated storage class. Storage classes are used for dynamic volume allocation.
By configuring the reclaimPolicy property for the storage class, PVs that use the storage class are created with the appropriate reclaim policy. The storage class is configured for the PV using the storageClassName property.
If you are using Retain as the reclaim policy, but you want to delete an entire cluster, you need to delete the PVs manually. Otherwise they will not be deleted, and may cause unnecessary expenditure on resources.
27.12.2. Recovery from loss of an OpenShift cluster
When a cluster is lost, you can use the data from disks/volumes to recover the cluster if they were preserved within the infrastructure. The recovery procedure is the same as with namespace deletion, assuming PVs can be recovered and they were created manually.
27.12.3. Recovering a deleted cluster from persistent volumes
This procedure describes how to recover a deleted cluster from persistent volumes (PVs).
In this situation, the Topic Operator identifies that topics exist in Kafka, but the KafkaTopic resources do not exist.
When you get to the step to recreate your cluster, you have two options:
Use Option 1 when you can recover all
KafkaTopic resources. The
KafkaTopic resources must therefore be recovered before the cluster is started so that the corresponding topics are not deleted by the Topic Operator.
KafkaTopicresources.In this case, you deploy your cluster without the Topic Operator, delete the Topic Operator topic store metadata, and then redeploy the Kafka cluster with the Topic Operator so it can recreate the
KafkaTopicresources from the corresponding topics.
If the Topic Operator is not deployed, you only need to recover the PersistentVolumeClaim (PVC) resources.
Before you begin
In this procedure, it is essential that PVs are mounted into the correct PVC to avoid data corruption. A volumeName is specified for the PVC and this must match the name of the PV.
For more information, see Persistent storage.
The procedure does not include recovery of KafkaUser resources, which must be recreated manually. If passwords and certificates need to be retained, secrets must be recreated before creating the KafkaUser resources.
Procedure
Check information on the PVs in the cluster:
oc get pv
Information is presented for PVs with data.
Example output showing columns important to this procedure:
- NAME shows the name of each PV.
- RECLAIM POLICY shows that PVs are retained.
- CLAIM shows the link to the original PVCs.
Recreate the original namespace:
oc create namespace myproject
Recreate the original PVC resource specifications, linking the PVCs to the appropriate PV:
For example:
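A sketch of a recreated PVC bound to a specific PV (the PVC name, storage class, and size are illustrative; volumeName must match the PV name shown by oc get pv):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-0-my-cluster-kafka-0
  namespace: myproject
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gp2-retain
  volumeName: <pv_name>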
Edit the PV specifications to delete the
claimRef properties that bound the original PVC.
For example, the following claimRef properties are deleted from each PV specification:
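A sketch of the claimRef block to remove (the names, resourceVersion, and uid values are placeholders):
claimRef:
  apiVersion: v1
  kind: PersistentVolumeClaim
  name: data-0-my-cluster-kafka-0
  namespace: myproject
  resourceVersion: "39431"
  uid: <uid_of_original_pvc>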
Deploy the Cluster Operator.
oc create -f install/cluster-operator -n my-project
Recreate your cluster.
Follow the steps depending on whether or not you have all the
KafkaTopic resources needed to recreate your cluster.
Option 1: If you have all the
KafkaTopic resources that existed before you lost your cluster, including internal topics such as committed offsets from __consumer_offsets:
Recreate all
KafkaTopic resources. It is essential that you recreate the resources before deploying the cluster, or the Topic Operator will delete the topics.
Deploy the Kafka cluster.
For example:
oc apply -f kafka.yaml
Option 2: If you do not have all the
KafkaTopic resources that existed before you lost your cluster:
topicOperator property from the Kafka resource before deploying. If you include the Topic Operator in the deployment, the Topic Operator will delete all the topics.
Delete the internal topic store topics from the Kafka cluster:
oc run kafka-admin -ti --image=registry.redhat.io/amq-streams/kafka-35-rhel8:2.5.2 --rm=true --restart=Never -- ./bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic __strimzi-topic-operator-kstreams-topic-store-changelog --delete && ./bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic __strimzi_store_topic --delete
The command must correspond to the type of listener and authentication used to access the Kafka cluster.
Enable the Topic Operator by redeploying the Kafka cluster with the
topicOperator property to recreate the KafkaTopic resources. For example:
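A sketch of the Kafka resource with the Topic Operator re-enabled (the cluster name and other spec fields are illustrative):
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ...
  entityOperator:
    topicOperator: {} # (1)
    userOperator: {}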
1. Here we show the default configuration, which has no additional properties. You specify the required configuration using the properties described in the
EntityTopicOperatorSpec schema reference.
Verify the recovery by listing the
KafkaTopic resources:
oc get KafkaTopic
27.13. Uninstalling AMQ Streams
You can uninstall AMQ Streams on OpenShift 4.12 and later from the OperatorHub using the OpenShift Container Platform web console or CLI.
Use the same approach you used to install AMQ Streams.
When you uninstall AMQ Streams, you will need to identify resources created specifically for a deployment and referenced from the AMQ Streams resource.
Such resources include:
- Secrets (Custom CAs and certificates, Kafka Connect secrets, and other Kafka secrets)
- Logging ConfigMaps (of type external)
These are resources referenced by Kafka, KafkaConnect, KafkaMirrorMaker, or KafkaBridge configuration.
Deleting CustomResourceDefinitions results in the garbage collection of the corresponding custom resources (Kafka, KafkaConnect, KafkaMirrorMaker, or KafkaBridge) and the resources dependent on them (Deployments, StatefulSets, and other dependent resources).
27.13.1. Uninstalling AMQ Streams from the OperatorHub using the web console
This procedure describes how to uninstall AMQ Streams from the OperatorHub and remove resources related to the deployment.
You can perform the steps from the console or use alternative CLI commands.
Prerequisites
- Access to an OpenShift Container Platform web console using an account with cluster-admin or strimzi-admin permissions.
- You have identified the resources to be deleted.
You can use the following oc CLI command to find resources and also verify that they have been removed when you have uninstalled AMQ Streams.
Command to find resources related to an AMQ Streams deployment
oc get <resource_type> --all-namespaces | grep <kafka_cluster_name>
Replace <resource_type> with the type of the resource you are checking, such as secret or configmap.
Procedure
- Navigate in the OpenShift web console to Operators > Installed Operators.
For the installed AMQ Streams operator, select the options icon (three vertical dots) and click Uninstall Operator.
The operator is removed from Installed Operators.
- Navigate to Home > Projects and select the project where you installed AMQ Streams and the Kafka components.
Click the options under Inventory to delete related resources.
Resources include the following:
- Deployments
- StatefulSets
- Pods
- Services
- ConfigMaps
- Secrets
Tip: Use the search to find related resources that begin with the name of the Kafka cluster. You can also find the resources under Workloads.
Alternative CLI commands
You can use CLI commands to uninstall AMQ Streams from the OperatorHub.
Delete the AMQ Streams subscription.
oc delete subscription amq-streams -n openshift-operators
Delete the cluster service version (CSV).
oc delete csv amqstreams.<version> -n openshift-operators
Remove related CRDs.
oc get crd -l app=strimzi -o name | xargs oc delete
27.13.2. Uninstalling AMQ Streams using the CLI
This procedure describes how to use the oc command-line tool to uninstall AMQ Streams and remove resources related to the deployment.
Prerequisites
- Access to an OpenShift cluster using an account with cluster-admin or strimzi-admin permissions.
- You have identified the resources to be deleted.
You can use the following oc CLI command to find resources and also verify that they have been removed when you have uninstalled AMQ Streams.
Command to find resources related to an AMQ Streams deployment
oc get <resource_type> --all-namespaces | grep <kafka_cluster_name>
Replace <resource_type> with the type of the resource you are checking, such as secret or configmap.
Procedure
Delete the Cluster Operator Deployment, related CustomResourceDefinitions, and RBAC resources.
Specify the installation files used to deploy the Cluster Operator.
oc delete -f install/cluster-operator
Delete the resources you identified in the prerequisites.
oc delete <resource_type> <resource_name> -n <namespace>
Replace <resource_type> with the type of resource you are deleting and <resource_name> with the name of the resource.
Example to delete a secret
oc delete secret my-cluster-clients-ca-cert -n my-project