Chapter 5. Using Kafka in KRaft mode
KRaft (Kafka Raft metadata) mode replaces Kafka’s dependency on ZooKeeper for cluster management. KRaft mode simplifies the deployment and management of Kafka clusters by bringing metadata management and coordination of clusters into Kafka.
Kafka in KRaft mode is designed to offer enhanced reliability, scalability, and throughput. Metadata operations become more efficient as they are directly integrated. And by removing the need to maintain a ZooKeeper cluster, there’s also a reduction in the operational and security overhead.
To deploy a Kafka cluster in KRaft mode, you must use Kafka and KafkaNodePool custom resources. The Kafka resource using KRaft mode must also have the annotations strimzi.io/kraft: enabled and strimzi.io/node-pools: enabled. For more details and examples, see Section 7.3.1, “Deploying a Kafka cluster in KRaft mode”.
Through node pool configuration using KafkaNodePool resources, nodes are assigned the role of broker, controller, or both:
- Controller nodes operate in the control plane to manage cluster metadata and the state of the cluster using a Raft-based consensus protocol.
- Broker nodes operate in the data plane to manage the streaming of messages, receiving and storing data in topic partitions.
- Dual-role nodes fulfill the responsibilities of controllers and brokers.
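The role assignment described above can be sketched as a KafkaNodePool resource. This is a minimal illustration only: the pool name dual-role, the cluster name my-cluster, the replica count, and the storage settings are assumptions chosen for the example, not values prescribed by this chapter. The snippet only writes the definition to a local file.

```shell
# Write a hypothetical dual-role node pool definition to a local file.
# Pool name, cluster name, replica count, and storage size are assumptions.
cat > dual-role-pool.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role
  labels:
    strimzi.io/cluster: my-cluster   # ties the pool to the Kafka resource
spec:
  replicas: 3
  roles:   # one entry for a dedicated pool, both entries for dual-role nodes
    - controller
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
EOF
```

Listing only controller or only broker under roles would give a dedicated pool instead.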
Controllers use a metadata log, stored as a single-partition topic (__cluster_metadata) on every node, which records the state of the cluster. When requests are made to change the cluster configuration, an active (lead) controller manages updates to the metadata log, and follower controllers replicate these updates. The metadata log stores information on brokers, replicas, topics, and partitions, including the state of in-sync replicas and partition leadership. Kafka uses this metadata to coordinate changes and manage the cluster effectively.
Broker nodes act as observers, storing the metadata log passively to stay up-to-date with the cluster’s state. Each node fetches updates to the log independently.
The KRaft metadata version used in the Kafka cluster must be supported by the Kafka version in use. Both versions are managed through the Kafka resource configuration. For more information, see Section 9.2, “Configuring Kafka in KRaft mode”.
In the following example, a Kafka cluster comprises a quorum of controller and broker nodes for fault tolerance and high availability.
Figure 5.1. Example cluster with separate broker and controller nodes
In a typical production environment, use dedicated broker and controller nodes. However, you might want to use nodes in a dual-role configuration for development or testing.
You can use a combination of nodes that combine roles with nodes that perform a single role. In the following example, three nodes perform a dual role and two nodes act only as brokers.
Figure 5.2. Example cluster with dual-role nodes and dedicated broker nodes
5.1. KRaft limitations
Currently, the KRaft mode in Streams for Apache Kafka has the following major limitations:
- Scaling of KRaft controller nodes up or down is not supported.
If you are using JBOD storage, you can change the volume that stores the metadata log.
5.2. Migrating to KRaft mode
If you are using ZooKeeper for metadata management in your Kafka cluster, you can migrate to using Kafka in KRaft mode.
During the migration, you install a quorum of controller nodes as a node pool, which replaces ZooKeeper for management of your cluster. You enable KRaft migration in the cluster configuration by applying the strimzi.io/kraft="migration" annotation. After the migration is complete, you switch the brokers to using KRaft and the controllers out of migration mode using the strimzi.io/kraft="enabled" annotation.
Before starting the migration, verify that your environment can support Kafka in KRaft mode, as there are a number of limitations. Note also the following:
- Migration is only supported on dedicated controller nodes, not on nodes with dual roles as brokers and controllers.
- Throughout the migration process, ZooKeeper and controller nodes operate in parallel for a period, requiring sufficient compute resources in the cluster.
- Once KRaft mode is enabled, rollback to ZooKeeper is not possible. Consider this carefully before proceeding with the migration.
Prerequisites
- You must be using Streams for Apache Kafka 2.7 or newer with Kafka 3.7.0 or newer. If you are using an earlier version of Streams for Apache Kafka or Apache Kafka, upgrade before migrating to KRaft mode.
- Verify that the ZooKeeper-based deployment is operating without the following, as they are not supported in KRaft mode:
  - JBOD storage. While the jbod storage type can be used, the JBOD array must contain only one disk.
- The Cluster Operator that manages the Kafka cluster is running.
- The Kafka cluster deployment uses Kafka node pools.
  If your ZooKeeper-based cluster is already using node pools, it is ready to migrate. If not, you can migrate the cluster to use node pools. To migrate when the cluster is not using node pools, brokers must be contained in a KafkaNodePool resource configuration that is assigned a broker role and has the name kafka. Support for node pools is enabled in the Kafka resource configuration using the strimzi.io/node-pools: enabled annotation.
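One way to confirm the node pool prerequisite before starting is to read the annotation back from the Kafka resource. The helper below is a sketch under assumptions: the function name node_pools_enabled is hypothetical, and it relies on the JSONPath output option of oc with the dots in the annotation key escaped.

```shell
# Hypothetical helper: succeeds when the Kafka resource carries the
# strimzi.io/node-pools: enabled annotation.
# Usage: node_pools_enabled <cluster-name> <namespace>
node_pools_enabled() {
  local cluster="$1" namespace="$2" value
  # Dots inside the annotation key are escaped for the JSONPath expression.
  value="$(oc get kafka "$cluster" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.strimzi\.io/node-pools}')"
  [ "$value" = "enabled" ]
}
```

For example, node_pools_enabled my-cluster my-project returns a zero exit status only when the annotation is set to enabled.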
Migrating to KRaft using a single controller with ephemeral storage does not work. During the migration, a controller restart causes the loss of metadata synced from ZooKeeper (such as topics and ACLs). In general, migrating an ephemeral-based ZooKeeper cluster to KRaft is not recommended.
In this procedure, the Kafka cluster name is my-cluster, which is located in the my-project namespace. The name of the controller node pool created is controller. The node pool for the brokers is called kafka.
Procedure
For the Kafka cluster, create a node pool with a controller role.
The node pool adds a quorum of controller nodes to the cluster.
Note: For the migration, you cannot use a node pool of nodes that share the broker and controller roles.
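A controller node pool for this procedure might look like the following sketch. The pool name controller, the cluster name my-cluster, and the my-project namespace come from this procedure; the replica count and the storage type and size are illustrative assumptions. Persistent storage is used because ephemeral storage is not suitable for the migration. The snippet only writes the definition to a local file.

```shell
# Write a hypothetical controller-only node pool definition to a local file.
# Replica count and storage size are assumptions, not recommended values.
cat > controller-pool.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controller
  namespace: my-project
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller   # dedicated controllers only; no shared broker role
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 20Gi
        deleteClaim: false
EOF
```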
Apply the new KafkaNodePool resource to create the controllers.
Errors related to using controllers in a ZooKeeper-based environment are expected in the Cluster Operator logs. The errors can block reconciliation. To prevent this, perform the next step immediately.
Enable KRaft migration in the Kafka resource by setting the strimzi.io/kraft annotation to migration:
    oc annotate kafka my-cluster strimzi.io/kraft="migration" --overwrite
Applying the annotation to the Kafka resource configuration starts the migration.
Check that the controllers have started and the brokers have rolled:
    oc get pods -n my-project
The output shows the nodes in the broker and controller node pools.
Check the status of the migration:
    oc get kafka my-cluster -n my-project -w
Updates to the metadata state
    NAME        ...  METADATA STATE
    my-cluster  ...  Zookeeper
    my-cluster  ...  KRaftMigration
    my-cluster  ...  KRaftDualWriting
    my-cluster  ...  KRaftPostMigration
METADATA STATE shows the mechanism used to manage Kafka metadata and coordinate operations. At the start of the migration this is ZooKeeper.
- ZooKeeper is the initial state when metadata is only stored in ZooKeeper.
- KRaftMigration is the state when the migration is in progress. The flag to enable ZooKeeper to KRaft migration (zookeeper.metadata.migration.enable) is added to the brokers and they are rolled to register with the controllers. The migration can take some time at this point depending on the number of topics and partitions in the cluster.
- KRaftDualWriting is the state when the Kafka cluster is working as a KRaft cluster, but metadata is being stored in both Kafka and ZooKeeper. Brokers are rolled a second time to remove the flag to enable migration.
- KRaftPostMigration is the state when KRaft mode is enabled for brokers. Metadata is still being stored in both Kafka and ZooKeeper.
The migration status is also represented in the status.kafkaMetadataState property of the Kafka resource.
Warning: You can roll back to using ZooKeeper from this point. The next step is to enable KRaft. Rollback cannot be performed after enabling KRaft.
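Because the state is exposed in status.kafkaMetadataState, a script can poll that property instead of watching the output by hand. The following is a minimal sketch: the function name wait_for_metadata_state and the default of 60 attempts at 10-second intervals are assumptions, not part of the product.

```shell
# Hypothetical helper: polls status.kafkaMetadataState until it matches the
# wanted state or the attempts run out.
# Usage: wait_for_metadata_state <cluster> <namespace> <state> [attempts] [delay-seconds]
wait_for_metadata_state() {
  local cluster="$1" namespace="$2" wanted="$3"
  local attempts="${4:-60}" delay="${5:-10}" current="" i=0
  while [ "$i" -lt "$attempts" ]; do
    current="$(oc get kafka "$cluster" -n "$namespace" \
      -o jsonpath='{.status.kafkaMetadataState}')"
    if [ "$current" = "$wanted" ]; then
      echo "Metadata state is $current"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Timed out; last metadata state: ${current:-unknown}" >&2
  return 1
}
```

For example, wait_for_metadata_state my-cluster my-project KRaftPostMigration blocks until the cluster reaches the point where KRaft can be enabled or a rollback can still be performed.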
When the metadata state has reached KRaftPostMigration, enable KRaft in the Kafka resource configuration by setting the strimzi.io/kraft annotation to enabled:
    oc annotate kafka my-cluster strimzi.io/kraft="enabled" --overwrite
Check the status of the move to full KRaft mode:
    oc get kafka my-cluster -n my-project -w
Updates to the metadata state:
- PreKRaft is the state when all ZooKeeper-related resources have been automatically deleted.
- KRaft is the final state (after the controllers have rolled) when the KRaft migration is finalized.
Note: Depending on how deleteClaim is configured for ZooKeeper, its Persistent Volume Claims (PVCs) and persistent volumes (PVs) may not be deleted. deleteClaim specifies whether the PVC is deleted when the cluster is uninstalled. The default is false.
Remove any ZooKeeper-related configuration from the Kafka resource.
Remove the following section:
- spec.zookeeper
If present, you can also remove the following options from the .spec.kafka.config section:
- log.message.format.version
- inter.broker.protocol.version
Removing log.message.format.version and inter.broker.protocol.version causes the brokers and controllers to roll again. Removing ZooKeeper properties removes any warning messages related to ZooKeeper configuration being present in a KRaft-operated cluster.
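Instead of editing the resource by hand, the removal could be scripted with JSON patch operations. This is a sketch under assumptions: the function names are hypothetical, and a JSON patch remove operation fails if the target path does not exist, so each option is removed by a separate call and only when it is actually present.

```shell
# Hypothetical helpers that remove ZooKeeper-related configuration from the
# Kafka resource using JSON patch "remove" operations.

# Usage: remove_zookeeper_section <cluster> <namespace>
remove_zookeeper_section() {
  oc patch kafka "$1" -n "$2" --type=json \
    -p='[{"op": "remove", "path": "/spec/zookeeper"}]'
}

# Usage: remove_kafka_config_option <cluster> <namespace> <option-name>
remove_kafka_config_option() {
  oc patch kafka "$1" -n "$2" --type=json \
    -p="[{\"op\": \"remove\", \"path\": \"/spec/kafka/config/$3\"}]"
}
```

For example, remove_kafka_config_option my-cluster my-project inter.broker.protocol.version removes that single option from .spec.kafka.config.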
5.2.1. Performing a rollback on the migration
You can roll back the migration at any point before it is finalized, that is, before you enable KRaft in the Kafka resource and the metadata state moves to KRaft. To perform a rollback, do the following:
Apply the strimzi.io/kraft="rollback" annotation to the Kafka resource to roll back the brokers.
    oc annotate kafka my-cluster strimzi.io/kraft="rollback" --overwrite
The migration process must be in the KRaftPostMigration state to do this. The brokers are rolled back so that they can be connected to ZooKeeper again and the state returns to KRaftDualWriting.
Delete the controllers node pool:
    oc delete KafkaNodePool controller -n my-project
Apply the strimzi.io/kraft="disabled" annotation to the Kafka resource to return the metadata state to ZooKeeper.
    oc annotate kafka my-cluster strimzi.io/kraft="disabled" --overwrite
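The rollback steps above can be strung together in a script that refuses to run from the wrong state. This is a sketch, not a supported tool: the function names are hypothetical, the polling defaults are arbitrary, and waiting for KRaftDualWriting before deleting the controller node pool is an assumption based on the state change described above.

```shell
# Hypothetical helper: prints the current metadata state of a Kafka resource.
metadata_state() {
  oc get kafka "$1" -n "$2" -o jsonpath='{.status.kafkaMetadataState}'
}

# Usage: rollback_kraft_migration <cluster> <namespace> <controller-pool> [attempts] [delay-seconds]
rollback_kraft_migration() {
  local cluster="$1" namespace="$2" pool="$3"
  local attempts="${4:-60}" delay="${5:-10}" state i=0

  # Rollback is only possible from the KRaftPostMigration state.
  state="$(metadata_state "$cluster" "$namespace")"
  if [ "$state" != "KRaftPostMigration" ]; then
    echo "Cannot roll back from state: ${state:-unknown}" >&2
    return 1
  fi

  oc annotate kafka "$cluster" -n "$namespace" strimzi.io/kraft="rollback" --overwrite || return 1

  # Assumption: wait for the state to return to KRaftDualWriting before
  # removing the controllers.
  until [ "$(metadata_state "$cluster" "$namespace")" = "KRaftDualWriting" ]; do
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then
      echo "Timed out waiting for KRaftDualWriting" >&2
      return 1
    fi
    sleep "$delay"
  done

  oc delete KafkaNodePool "$pool" -n "$namespace" || return 1
  oc annotate kafka "$cluster" -n "$namespace" strimzi.io/kraft="disabled" --overwrite
}
```

With the names used in this procedure, the call would be rollback_kraft_migration my-cluster my-project controller.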