Chapter 4. Using Kafka in KRaft mode
KRaft (Kafka Raft metadata) mode replaces Kafka’s dependency on ZooKeeper for cluster management. KRaft mode simplifies the deployment and management of Kafka clusters by bringing metadata management and coordination of clusters into Kafka.
Kafka in KRaft mode is designed to offer enhanced reliability, scalability, and throughput. Metadata operations become more efficient because metadata is managed directly within Kafka. Removing the need to maintain a separate ZooKeeper cluster also reduces operational and security overhead.
Through Kafka configuration, nodes are assigned the role of broker, controller, or both:
- Controller nodes operate in the control plane to manage cluster metadata and the state of the cluster using a Raft-based consensus protocol.
- Broker nodes operate in the data plane to manage the streaming of messages, receiving and storing data in topic partitions.
- Dual-role nodes fulfill the responsibilities of controllers and brokers.
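For example, a node's role is set with the process.roles property in its configuration, alongside a unique node.id. The following sketch uses illustrative node IDs only:
# Dedicated controller node (controller.properties)
process.roles=controller
node.id=1
# Dedicated broker node (server.properties)
process.roles=broker
node.id=4
# Dual-role node
process.roles=broker,controller
node.id=2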
You can use a dynamic or static quorum of controllers. A dynamic quorum is recommended because controllers can be added to and removed from the quorum dynamically.
Controllers use a metadata log, stored as a single-partition topic (__cluster_metadata) on every node, which records the state of the cluster. When requests are made to change the cluster configuration, an active (lead) controller manages updates to the metadata log, and follower controllers replicate these updates. The metadata log stores information on brokers, replicas, topics, and partitions, including the state of in-sync replicas and partition leadership. Kafka uses this metadata to coordinate changes and manage the cluster effectively.
Broker nodes act as observers, storing the metadata log passively to stay up-to-date with the cluster’s state. Each node fetches updates to the log independently. If you are using JBOD storage, you can change the directory that stores the metadata log.
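For example, the metadata log location can be moved out of the default log directories by setting the metadata.log.dir property; the path shown is illustrative only:
metadata.log.dir=/var/lib/kafka/kraft-metadata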
The KRaft metadata version used in the Kafka cluster must be supported by the Kafka version in use.
In the following example, a Kafka cluster comprises a quorum of controller and broker nodes for fault tolerance and high availability.
Figure 4.1. Example cluster with separate broker and controller nodes
In a typical production environment, use dedicated broker and controller nodes. However, you might want to use nodes in a dual-role configuration for development or testing.
You can use a combination of nodes that combine roles with nodes that perform a single role. In the following example, three nodes perform a dual role and two nodes act only as brokers.
Figure 4.2. Example cluster with dual-role nodes and dedicated broker nodes
4.1. Migrating to KRaft mode
If you are using ZooKeeper for metadata management of your Kafka cluster, you can migrate to using Kafka in KRaft mode. KRaft mode replaces ZooKeeper for distributed coordination, offering enhanced reliability, scalability, and throughput.
To migrate your cluster, complete the following steps:
- Install a quorum of controller nodes to replace ZooKeeper for cluster management.
- Enable KRaft migration in the controller configuration by setting the zookeeper.metadata.migration.enable property to true.
- Start the controllers and enable KRaft migration on the current cluster brokers using the same configuration property.
- Perform a rolling restart of the brokers to apply the configuration changes.
- When migration is complete, switch the brokers to KRaft mode and disable migration on the controllers.
Once KRaft mode has been finalized, rollback to ZooKeeper is not possible. Carefully consider this before proceeding with the migration.
Before starting the migration, verify that your environment can support Kafka in KRaft mode:
- Migration is only supported on dedicated controller nodes, not on nodes with dual roles as brokers and controllers.
- Throughout the migration process, ZooKeeper and KRaft controller nodes operate in parallel, requiring sufficient compute resources in your cluster.
If you previously rolled back a KRaft migration, ensure that all required cleanup steps were completed before attempting to migrate again.
- The dedicated controller nodes must have been deleted.
- The /migration znode in ZooKeeper must have been removed manually using the zookeeper-shell.sh command: delete /migration.
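For example, the znode can be removed in a single zookeeper-shell.sh call; the localhost:2181 address assumes a local ZooKeeper node:
./bin/zookeeper-shell.sh localhost:2181 delete /migration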
Failing to perform this cleanup before reattempting the migration can result in metadata loss and migration failure.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- You are using Streams for Apache Kafka 2.7 or newer with Kafka 3.7.0 or newer. If you are using an earlier version of Streams for Apache Kafka, upgrade before migrating to KRaft mode.
- Logging is enabled to check the migration process.
Set DEBUG level in log4j.properties for the root logger on the controllers and brokers in the cluster. For detailed migration-specific logs, set TRACE for the migration logger:
Controller logging configuration
log4j.rootLogger=DEBUG
log4j.logger.org.apache.kafka.metadata.migration=TRACE
Procedure
Retrieve the cluster ID of your Kafka cluster.
Use the zookeeper-shell tool:
./bin/zookeeper-shell.sh localhost:2181 get /cluster/id
The command returns the cluster ID.
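The znode data on the last line of the output contains the ID to reuse, typically in the following form (a placeholder is shown rather than a real ID):
{"version":"1","id":"<cluster_id>"}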
Install a KRaft controller quorum to the cluster.
Configure a controller node on each host using the controller.properties file.
At a minimum, each controller requires the following configuration:
- A unique node ID
- The migration enabled flag set to true
- ZooKeeper connection details
- Listener name used by the controller quorum
- A quorum of controllers (dynamic is recommended)
- Listener name for inter-broker communication
Example controller configuration
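A minimal sketch of such a configuration follows; the node ID, hostnames, ports, and listener names are assumptions rather than required values, and the static controller.quorum.voters form is used to match the format described after the example (a dynamic quorum uses controller.quorum.bootstrap.servers instead):
process.roles=controller
node.id=1
zookeeper.metadata.migration.enable=true
zookeeper.connect=zoo1.my-cluster:2181
listeners=CONTROLLER://0.0.0.0:9090
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
controller.quorum.voters=1@controller1.my-cluster:9090,2@controller2.my-cluster:9090,3@controller3.my-cluster:9090
inter.broker.listener.name=PLAINTEXT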
The format for the controller quorum is <node_id>@<hostname>:<port> in a comma-separated list. The inter-broker listener name is required for the KRaft controller to initiate the migration.
Set up log directories for each controller node:
./bin/kafka-storage.sh format -t <uuid> -c ./config/kraft/controller.properties
Returns:
Formatting /tmp/kraft-controller-logs
Replace <uuid> with the cluster ID you retrieved. Use the same cluster ID for each controller node in your cluster.
By default, the log directory (log.dirs) specified in the controller.properties configuration file is set to /tmp/kraft-controller-logs. The /tmp directory is typically cleared on each system reboot, making it suitable for development environments only. Set multiple log directories using a comma-separated list, if needed.
Start each controller.
./bin/kafka-server-start.sh -daemon ./config/kraft/controller.properties
Check that Kafka is running:
jcmd | grep kafka
Returns:
process ID kafka.Kafka ./config/kraft/controller.properties
Check the logs of each controller to ensure that they have successfully joined the KRaft cluster:
tail -f ./logs/controller.log
Enable migration on each broker.
Stop the Kafka broker if it is running on the host.
./bin/kafka-server-stop.sh
jcmd | grep kafka
If using a multi-node cluster, refer to Section 3.7, “Performing a graceful rolling restart of Kafka brokers”.
Enable migration using the server.properties file.
At a minimum, each broker requires the following additional configuration:
- Inter-broker protocol version set to version 3.9
- The migration enabled flag
- Controller configuration that matches the controller nodes
- A quorum of controllers
Example broker configuration
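A minimal sketch of the additional broker properties follows, assuming the same quorum, hostnames, and listener names as the controller example above:
inter.broker.protocol.version=3.9
zookeeper.metadata.migration.enable=true
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
controller.quorum.voters=1@controller1.my-cluster:9090,2@controller2.my-cluster:9090,3@controller3.my-cluster:9090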
The ZooKeeper connection details should already be present.
Restart the updated broker:
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
The migration starts automatically and can take some time depending on the number of topics and partitions in the cluster.
Check that Kafka is running:
jcmd | grep kafka
Returns:
process ID kafka.Kafka ./config/kraft/server.properties
Check the log on the active controller to confirm that the migration is complete:
./bin/zookeeper-shell.sh localhost:2181 get /controller
Look for an INFO log entry that says the following: Completed migration of metadata from ZooKeeper to KRaft.
Switch each broker to KRaft mode.
- Stop the broker, as before.
Update the broker configuration in the server.properties file:
- Replace the broker.id with a node.id using the same ID
- Add a broker KRaft role for the broker
- Remove the inter-broker protocol version (inter.broker.protocol.version)
- Remove the migration enabled flag (zookeeper.metadata.migration.enable)
- Remove ZooKeeper configuration
- Remove the listener for controller and broker communication (control.plane.listener.name)
Example broker configuration for KRaft
If you are using ACLs in your broker configuration, update the authorizer using the authorizer.class.name property to the KRaft-based standard authorizer. ZooKeeper-based brokers use authorizer.class.name=kafka.security.authorizer.AclAuthorizer. When migrating to KRaft-based brokers, specify authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer.
- Restart the broker, as before.
Switch each controller out of migration mode.
- Stop the controller in the same way as the broker, as described previously.
Update the controller configuration in the controller.properties file:
- Remove the ZooKeeper connection details
- Remove the zookeeper.metadata.migration.enable property
- Remove inter.broker.listener.name
Example controller configuration following migration
- Restart the controller in the same way as the broker, as described previously.
4.1.1. Performing a rollback on the migration
If you have finalized the migration, rollback is not possible.
Before finalizing the migration, while the cluster is still in migration mode, you can roll back to ZooKeeper mode. The steps depend on how far the migration has progressed.
- If you’ve only completed the preparation and controller quorum setup, stop and remove the dedicated KRaft controller nodes from the cluster. No other changes are required, and the cluster continues to operate in ZooKeeper mode.
If you’ve enabled metadata migration on the brokers, follow these steps:
- Stop and remove the dedicated KRaft controller nodes from the cluster.
- Use zookeeper-shell.sh to do the following:
  - Run delete /controller to allow a ZooKeeper-based broker to become the active controller.
  - Run get /migration, then delete /migration to inspect and clear the migration metadata (stored in the znode). This restores a clean state in ZooKeeper for retrying the migration.
  Warning: Run delete /controller promptly to minimize controller downtime. Temporary errors in broker logs are expected until rollback completes.
- On each broker:
  - Remove the following KRaft-related configurations:
    - zookeeper.metadata.migration.enable
    - controller.listener.names
    - controller.quorum.bootstrap.servers
  - Replace node.id with broker.id
- Perform a rolling restart of all brokers.
If you’ve begun migrating brokers to KRaft, follow these steps:
- On each broker:
  - Remove process.roles.
  - Replace node.id with broker.id.
  - Restore zookeeper.connect and any required ZooKeeper settings.
- Perform a first rolling restart of the brokers.
  Important: Retain zookeeper.metadata.migration.enable=true for the first restart.
- Stop and remove the dedicated KRaft controller nodes from the cluster.
- Use zookeeper-shell.sh to do the following:
  - Run delete /controller to allow a ZooKeeper-based broker to become the active controller.
  - Run get /migration, then delete /migration to inspect and clear the migration metadata (stored in the znode). This restores a clean state in ZooKeeper for retrying the migration.
- On each broker, remove KRaft-related configuration:
  - zookeeper.metadata.migration.enable
  - controller.listener.names
  - controller.quorum.bootstrap.servers
- Perform a second rolling restart of the brokers.
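As an illustration of the second scenario, after the first rolling restart a broker's server.properties might contain the following; the hostnames and IDs are assumptions, and the last three properties are the ones removed again before the second rolling restart:
broker.id=0
zookeeper.connect=zoo1.my-cluster:2181
zookeeper.metadata.migration.enable=true
controller.listener.names=CONTROLLER
controller.quorum.bootstrap.servers=controller1.my-cluster:9090,controller2.my-cluster:9090,controller3.my-cluster:9090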