Using Streams for Apache Kafka on RHEL in KRaft mode
Configure and manage a deployment of Streams for Apache Kafka 2.9 on Red Hat Enterprise Linux
Abstract
Preface
Providing feedback on Red Hat documentation
We appreciate your feedback on our documentation.
To propose improvements, open a Jira issue and describe your suggested changes. Provide as much detail as possible to enable us to address your request quickly.
Prerequisite
- You have a Red Hat Customer Portal account. This account enables you to log in to the Red Hat Jira Software instance.
If you do not have an account, you will be prompted to create one.
Procedure
- Click the following: Create issue.
- In the Summary text box, enter a brief description of the issue.
- In the Description text box, provide the following information:
- The URL of the page where you found the issue.
- A detailed description of the issue.
You can leave the information in any other fields at their default values.
- Add a reporter name.
- Click Create to submit the Jira issue to the documentation team.
Thank you for taking the time to provide feedback.
Chapter 1. Overview of Streams for Apache Kafka
Streams for Apache Kafka supports highly scalable, distributed, and high-performance data streaming based on the Apache Kafka project.
The main components comprise:
- Kafka Broker
- Messaging broker responsible for delivering records from producing clients to consuming clients.
- Kafka Streams API
- API for writing stream processor applications.
- Producer and Consumer APIs
- Java-based APIs for producing and consuming messages to and from Kafka brokers.
- Kafka Bridge
- Streams for Apache Kafka Bridge provides a RESTful interface that allows HTTP-based clients to interact with a Kafka cluster.
- Kafka Connect
- A toolkit for streaming data between Kafka brokers and other systems using Connector plugins.
- Kafka MirrorMaker
- Replicates data between two Kafka clusters, within or across data centers.
- Kafka Exporter
- An exporter used in the extraction of Kafka metrics data for monitoring.
A cluster of Kafka brokers is the hub connecting all these components.
Figure 1.1. Streams for Apache Kafka architecture
You can use the Kafka Bridge API to create and manage consumers and send and receive records over HTTP rather than the native Kafka protocol.
When you set up the Kafka Bridge, you configure HTTP access to the Kafka cluster. You can then use the Kafka Bridge to produce and consume messages from the cluster, as well as perform other operations through its REST interface.
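For example, an HTTP client might produce a message to a topic through the bridge as follows. This is a minimal sketch; the bridge host, port (8080 is a typical bridge listener port), topic name, and record contents are placeholders for your own values.
curl -X POST http://localhost:8080/topics/my-topic \
  -H 'Content-Type: application/vnd.kafka.json.v2+json' \
  -d '{"records":[{"key":"key-1","value":"hello"}]}'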
1.2. Document conventions
User-replaced values
User-replaced values, also known as replaceables, are shown with angle brackets (< >). Underscores ( _ ) are used for multi-word values. If the value refers to code or commands, monospace is also used.
For example, the following command shows that <broker_host>:<port> and <topic_name> must be replaced with your own broker address and topic name:
bin/kafka-console-consumer.sh --bootstrap-server <broker_host>:<port> --topic <topic_name> --from-beginning
Chapter 2. FIPS support
Federal Information Processing Standards (FIPS) are standards for computer security and interoperability. To use FIPS with Streams for Apache Kafka, you must have a FIPS-compliant OpenJDK (Open Java Development Kit) installed on your system. If your RHEL system is FIPS-enabled, OpenJDK automatically switches to FIPS mode when running Streams for Apache Kafka. This ensures that Streams for Apache Kafka uses the FIPS-compliant security libraries provided by OpenJDK.
Minimum password length
When running in the FIPS mode, SCRAM-SHA-512 passwords need to be at least 32 characters long. If you have a Kafka cluster with custom configuration that uses a password length that is less than 32 characters, you need to update your configuration. If you have any users with passwords shorter than 32 characters, you need to regenerate a password with the required length.
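For example, you can regenerate a SCRAM-SHA-512 credential with a longer password using the kafka-configs.sh tool. This is a sketch; the broker address, username, and password are placeholders for your own values.
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
  --add-config 'SCRAM-SHA-512=[password=<new_password_of_at_least_32_characters>]' \
  --entity-type users --entity-name <username>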
Enable FIPS mode before you install Streams for Apache Kafka on RHEL. Red Hat recommends installing RHEL with FIPS mode enabled, as opposed to enabling FIPS mode later. Enabling FIPS mode during the installation ensures that the system generates all keys with FIPS-approved algorithms and that continuous monitoring tests are in place.
With RHEL running in FIPS mode, you must ensure that the Streams for Apache Kafka configuration is FIPS-compliant. Additionally, your Java implementation must also be FIPS-compliant.
Running Streams for Apache Kafka on RHEL in FIPS mode requires a FIPS-compliant JDK.
Procedure
- Install RHEL in FIPS mode.
For further information, see the security hardening information in the RHEL documentation.
- Proceed with the installation of Streams for Apache Kafka.
- Configure Streams for Apache Kafka to use FIPS-compliant algorithms and protocols.
If used, ensure that the following configuration is compliant:
- SSL cipher suites and TLS versions must be supported by the JDK framework.
- SCRAM-SHA-512 passwords must be at least 32 characters long.
- Make sure that your installation environment and Streams for Apache Kafka configuration remain compliant as FIPS requirements change.
Chapter 3. Getting started
Streams for Apache Kafka is distributed in a ZIP file that contains installation artifacts for the Kafka components.
The Kafka Bridge has separate installation files. For information on installing and using the Kafka Bridge, see Using the Streams for Apache Kafka Bridge.
3.1. Installation environment
Streams for Apache Kafka runs on Red Hat Enterprise Linux. The host (node) can be a physical or virtual machine (VM). Use the installation files provided with Streams for Apache Kafka to install Kafka components. You can install Kafka in a single-node or multi-node environment.
- Single-node environment
- A single-node Kafka cluster runs instances of Kafka components on a single host. This configuration is not suitable for a production environment.
- Multi-node environment
- A multi-node Kafka cluster runs instances of Kafka components on multiple hosts.
We recommend that you run Kafka and other Kafka components, such as Kafka Connect, on separate hosts. By running the components in this way, it’s easier to maintain and upgrade each component.
Kafka clients establish a connection to the Kafka cluster using the bootstrap.servers configuration property. If you are using Kafka Connect, for example, the Kafka Connect configuration properties must include a bootstrap.servers value that specifies the hostname and port of the hosts where the Kafka brokers are running. If the Kafka cluster is running on more than one host with multiple Kafka brokers, you specify a hostname and port for each broker. Each Kafka broker is identified by a node.id.
3.1.1. Data storage considerations
Apache Kafka stores records in data logs, which are configured using the log.dirs property. For more information, see Section 5.3.2, “Data logs”.
Efficient data storage is essential for Streams for Apache Kafka to operate effectively, and block storage is strongly recommended. Streams for Apache Kafka has been tested only with block storage, and file storage solutions like NFS are not guaranteed to work.
Common block storage options include:
Cloud-based block storage solutions:
- Amazon EBS (for AWS)
- Azure Disk Storage (for Microsoft Azure)
- Persistent Disk (for Google Cloud)
Other options:
- Persistent storage (for bare metal deployments) using local persistent volumes
- Storage Area Network (SAN) volumes accessed by protocols like Fibre Channel or iSCSI
3.1.2. File systems
Kafka uses a file system for storing messages. Streams for Apache Kafka is compatible with the XFS and ext4 file systems, which are commonly used with Kafka. Consider the underlying architecture and requirements of your deployment when choosing and setting up your file system.
For more information, refer to Filesystem Selection in the Kafka documentation.
3.1.3. Disk usage
Solid-state drives (SSDs), though not essential, can improve the performance of Kafka in large clusters where data is sent to and received from multiple topics asynchronously.
Replicated storage is not required, as Kafka provides built-in data replication.
3.2. Downloading Streams for Apache Kafka
A ZIP file distribution of Streams for Apache Kafka is available for download from the Red Hat website. You can download the latest version of Red Hat Streams for Apache Kafka from the Streams for Apache Kafka software downloads page.
- For Kafka and other Kafka components, download the amq-streams-<version>-kafka-bin.zip file.
- For Kafka Bridge, download the amq-streams-<version>-bridge-bin.zip file. For installation instructions, see Using the Streams for Apache Kafka Bridge.
After downloading Streams for Apache Kafka, set up the directory structure and configure a dedicated user to manage the Kafka installation. This provides proper permissions and a consistent environment for running Kafka.
The following setup is assumed throughout the rest of this guide, but you can adapt it to suit your specific requirements.
- Installation directory
- Install Kafka in the /opt/kafka/ directory to maintain a standard location. This guide assumes all Kafka commands are executed from this directory.
- Dedicated Kafka user
- Create a system user, such as kafka, to manage the installation. Change ownership of the /opt/kafka/ directory to this user to ensure secure access. You can use any username that fits your environment.
3.4. Installing Kafka
Use the Streams for Apache Kafka ZIP files to install Kafka on Red Hat Enterprise Linux. You can install Kafka in either a single-node or multi-node environment. This procedure focuses on installing a single Kafka instance on a single host (node). For this setup, Kafka is installed in the /opt/kafka/ directory, and a dedicated Kafka user (kafka) is used to manage the installation.
The Streams for Apache Kafka installation files include the binaries for running other Kafka components, like Kafka Connect, Kafka MirrorMaker 2, and Kafka Bridge. In a single-node environment, you can run these components from the same host where you installed Kafka. However, we recommend that you add the installation files and run other Kafka components on separate hosts.
Prerequisites
- You have downloaded the installation files.
- You have reviewed the supported configurations in the Streams for Apache Kafka 2.9 on Red Hat Enterprise Linux Release Notes.
- You are logged in to Red Hat Enterprise Linux as the admin (root) user.
Procedure
Install Kafka on your host.
Add a new Kafka user and group:
groupadd kafka
useradd -g kafka kafka
passwd kafka
This step creates the necessary user and group for managing Kafka.
Extract and move the contents of the amq-streams-<version>-kafka-bin.zip file and place it in the /opt/kafka directory:
unzip amq-streams-<version>-kafka-bin.zip -d /opt
mv /opt/kafka*redhat* /opt/kafka
Change the ownership of the /opt/kafka directory to the Kafka user:
chown -R kafka:kafka /opt/kafka
Create the /var/lib/kafka directory for storing Kafka data and change its ownership to the Kafka user:
mkdir /var/lib/kafka
chown -R kafka:kafka /var/lib/kafka
You can now run a default configuration of Kafka as a single-node cluster.
(Optional) Run other Kafka components, like Kafka Connect, on the same host.
To run other components, specify the hostname and port to connect to the Kafka broker using the bootstrap.servers property in the component configuration.
Example bootstrap servers configuration pointing to a single Kafka broker on the same host
bootstrap.servers=localhost:9092
However, we recommend installing and running Kafka components on separate hosts.
(Optional) Install Kafka components on separate hosts.
- Extract the Kafka installation files into the /opt/kafka/ directory on each host.
- Change the ownership of the /opt/kafka/ directory to the Kafka user on each host.
- Update the bootstrap.servers property to connect the components to the Kafka brokers running on different hosts.
Example bootstrap servers configuration pointing to Kafka brokers on different hosts
bootstrap.servers=kafka0.<host_ip_address>:9092,kafka1.<host_ip_address>:9092,kafka2.<host_ip_address>:9092
You can use this configuration for Kafka Connect, MirrorMaker 2, and the Kafka Bridge.
3.5. Running a Kafka cluster in KRaft mode
Configure and run a Kafka cluster in KRaft mode. Nodes in the cluster perform the role of broker, controller, or both.
- Broker role
- Manages the storage and passing of messages.
- Controller role
- Coordinates the cluster and manages metadata.
- Combined role
- A single node acts as both broker and controller.
The minimum recommended setup is three brokers and three controllers with topic replication for stability and availability. The internal __cluster_metadata topic stores cluster-wide information.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- The procedure uses the kafka-storage.sh, kafka-server-start.sh, and kafka-metadata-quorum.sh tools.
Procedure
Generate a unique Kafka cluster ID using the kafka-storage tool:
./bin/kafka-storage.sh random-uuid
Save this ID as it is reused for all nodes.
Create a configuration file for each node.
Base the configuration file on Kafka’s provided examples:
- controller.properties for controller-only nodes
- broker.properties for broker-only nodes
- server.properties for nodes with both roles
Note: The example server.properties file uses controller.quorum.voters by default, which is intended for static quorum setup. To use dynamic quorum (recommended), replace this with controller.quorum.bootstrap.servers, as shown in this procedure. The controller.properties file uses controller.quorum.bootstrap.servers by default.
Adjust the following properties for each node:
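The exact values depend on your environment; a minimal sketch for a combined-role node, with placeholder node IDs, hostnames, ports, and listener names, might look like this:
# Placeholder values for a combined-role node; adjust for your environment
process.roles=broker, controller
node.id=1
controller.listener.names=CONTROLLER
controller.quorum.bootstrap.servers=localhost:9090
listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9090
advertised.listeners=PLAINTEXT://localhost:9092
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/lib/kafka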
Important considerations:
- Each node must have a unique node.id.
- The listeners hostname (0.0.0.0) binds to all interfaces but can be changed to a specific hostname or IP address if needed for each node.
- advertised.listeners must reflect the actual address that clients use to connect to the Kafka node.
- The log.dirs path in the configuration file defines where metadata is stored. If not set, it defaults to a temporary location, which is cleared on reboot and suitable only for development.
Bootstrap the controller quorum (one controller only):
./bin/kafka-storage.sh format --cluster-id <uuid> --standalone --config ./config/controller.properties
This initializes the metadata log and bootstraps the quorum. Use it on one node only. Replace <uuid> with the generated cluster ID.
Note: To bootstrap with multiple controllers, use --initial-controllers instead.
Start the controller:
./bin/kafka-server-start.sh ./config/controller.properties
Add additional controllers to the quorum.
Make sure controller.quorum.bootstrap.servers is correctly set across all nodes and the controllers are started.
Format the remaining nodes (brokers):
./bin/kafka-storage.sh format --cluster-id <uuid> --config ./config/server.properties --no-initial-controllers
Use the appropriate configuration file based on the role of each node. Use the same cluster ID for all nodes. This step prepares each node’s storage without modifying the controller quorum.
Start each broker node:
./bin/kafka-server-start.sh ./config/server.properties
Check that Kafka is running:
jcmd | grep kafka
Returns:
process ID kafka.Kafka ./config/server.properties
Check the logs of each node to ensure that they have successfully joined the KRaft cluster:
tail -f ./logs/server.log
Next steps
You can now create topics, and send and receive messages from the brokers.
For brokers passing messages, you can use topic replication across the brokers in a cluster for data durability. Configure topics to have a replication factor of at least three and a minimum number of in-sync replicas set to 1 less than the replication factor (replication.factor=3, with min.insync.replicas=2). For more information, see Section 9.7, “Creating a topic”.
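For example, a topic with these settings might be created as follows; the topic name and partition count are placeholders for your own values:
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --create \
  --topic my-topic \
  --partitions 3 \
  --replication-factor 3 \
  --config min.insync.replicas=2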
3.6. Stopping the Streams for Apache Kafka services
You can stop Kafka services by running a script. After running the script, all connections to the Kafka services are terminated.
Procedure
Stop the Kafka node.
./bin/kafka-server-stop.sh
Confirm that the Kafka node is stopped.
jcmd | grep kafka
3.7. Performing a graceful rolling restart of Kafka brokers
This procedure shows how to do a graceful rolling restart of brokers in a multi-node cluster. A rolling restart is usually required following an upgrade or change to the Kafka cluster configuration properties.
Some broker configurations do not need a restart of the broker. For more information, see Updating Broker Configs in the Apache Kafka documentation.
After you perform a restart of a broker, check for under-replicated topic partitions to make sure that replica partitions have caught up.
To achieve a graceful restart with no loss of availability, ensure that you are replicating topics and that at least the minimum number of in-sync replicas (min.insync.replicas) are in sync. The min.insync.replicas configuration determines the minimum number of replicas that must acknowledge a write for the write to be considered successful.
For a multi-node cluster, the standard approach is to have a topic replication factor of at least 3 and a minimum number of in-sync replicas set to 1 less than the replication factor. If you are using acks=all in your producer configuration for data durability, check that the broker you restarted is in sync with all the partitions it’s replicating before restarting the next broker.
Single-node clusters are unavailable during a restart, since all partitions are on the same broker.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- The Kafka cluster is operating as expected.
Check for under-replicated partitions or any other issues affecting broker operation. The steps in this procedure describe how to check for under-replicated partitions.
Procedure
Perform the following steps on each Kafka broker. Complete the steps on the first broker before moving on to the next. Perform the steps on the brokers that also act as controllers last. Otherwise, the controllers need to change on more than one restart.
Stop the Kafka broker:
./bin/kafka-server-stop.sh
./bin/kafka-server-stop.shCopy to Clipboard Copied! Toggle word wrap Toggle overflow Make any changes to the broker configuration that require a restart after completion.
For further information, see the following:
Restart the Kafka broker:
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
Check that Kafka is running:
jcmd | grep kafka
Returns:
process ID kafka.Kafka ./config/kraft/server.properties
Check the logs of each node to ensure that they have successfully joined the KRaft cluster:
tail -f ./logs/server.log
Wait until the broker has zero under-replicated partitions. You can check from the command line or use metrics.
Use the kafka-topics.sh command with the --under-replicated-partitions parameter:
./bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --describe --under-replicated-partitions
For example:
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions
The command provides a list of topics with under-replicated partitions in a cluster.
Topics with under-replicated partitions
Topic: topic3 Partition: 4 Leader: 2 Replicas: 2,3 Isr: 2
Topic: topic3 Partition: 5 Leader: 3 Replicas: 1,2 Isr: 1
Topic: topic1 Partition: 1 Leader: 3 Replicas: 1,3 Isr: 3
# …
Under-replicated partitions are listed if the ISR (in-sync replica) count is less than the number of replicas. If a list is not returned, there are no under-replicated partitions.
Use the UnderReplicatedPartitions metric:
kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
The metric provides a count of partitions where replicas have not caught up. Wait until the count is zero.
Tip: Use the Kafka Exporter to create an alert when there are one or more under-replicated partitions for a topic.
Checking logs when restarting
If a broker fails to start, check the application logs for information. You can also check the status of a broker shutdown and restart in the ./logs/server.log application log.
Chapter 4. Using Kafka in KRaft mode
KRaft (Kafka Raft metadata) mode replaces Kafka’s dependency on ZooKeeper for cluster management. KRaft mode simplifies the deployment and management of Kafka clusters by bringing metadata management and coordination of clusters into Kafka.
Kafka in KRaft mode is designed to offer enhanced reliability, scalability, and throughput. Metadata operations become more efficient as they are directly integrated. And by removing the need to maintain a ZooKeeper cluster, there’s also a reduction in the operational and security overhead.
Through Kafka configuration, nodes are assigned the role of broker, controller, or both:
- Controller nodes operate in the control plane to manage cluster metadata and the state of the cluster using a Raft-based consensus protocol.
- Broker nodes operate in the data plane to manage the streaming of messages, receiving and storing data in topic partitions.
- Dual-role nodes fulfill the responsibilities of controllers and brokers.
You can use a dynamic or static quorum of controllers. Dynamic is recommended as it supports dynamic scaling.
Controllers use a metadata log, stored as a single-partition topic (__cluster_metadata) on every node, which records the state of the cluster. When requests are made to change the cluster configuration, an active (lead) controller manages updates to the metadata log, and follower controllers replicate these updates. The metadata log stores information on brokers, replicas, topics, and partitions, including the state of in-sync replicas and partition leadership. Kafka uses this metadata to coordinate changes and manage the cluster effectively.
Broker nodes act as observers, storing the metadata log passively to stay up-to-date with the cluster’s state. Each node fetches updates to the log independently. If you are using JBOD storage, you can change the directory that stores the metadata log.
The KRaft metadata version used in the Kafka cluster must be supported by the Kafka version in use.
In the following example, a Kafka cluster comprises a quorum of controller and broker nodes for fault tolerance and high availability.
Figure 4.1. Example cluster with separate broker and controller nodes
In a typical production environment, use dedicated broker and controller nodes. However, you might want to use nodes in a dual-role configuration for development or testing.
You can use a combination of nodes that combine roles with nodes that perform a single role. In the following example, three nodes perform a dual role and two nodes act only as brokers.
Figure 4.2. Example cluster with dual-role nodes and dedicated broker nodes
4.1. Migrating to KRaft mode
If you are using ZooKeeper for metadata management of your Kafka cluster, you can migrate to using Kafka in KRaft mode. KRaft mode replaces ZooKeeper for distributed coordination, offering enhanced reliability, scalability, and throughput.
To migrate your cluster, follow these steps:
- Install a quorum of controller nodes to replace ZooKeeper for cluster management.
- Enable KRaft migration in the controller configuration by setting the zookeeper.metadata.migration.enable property to true.
- Start the controllers and enable KRaft migration on the current cluster brokers using the same configuration property.
- Perform a rolling restart of the brokers to apply the configuration changes.
- When migration is complete, switch the brokers to KRaft mode and disable migration on the controllers.
Once KRaft mode has been finalized, rollback to ZooKeeper is not possible. Carefully consider this before proceeding with the migration.
Before starting the migration, verify that your environment can support Kafka in KRaft mode:
- Migration is only supported on dedicated controller nodes, not on nodes with dual roles as brokers and controllers.
- Throughout the migration process, ZooKeeper and KRaft controller nodes operate in parallel, requiring sufficient compute resources in your cluster.
If you previously rolled back a KRaft migration, ensure that all required cleanup steps were completed before attempting to migrate again.
- The dedicated controller nodes must have been deleted.
- The /migration znode in ZooKeeper must have been removed manually using the zookeeper-shell.sh command: delete /migration.
Failing to perform this cleanup before reattempting the migration can result in metadata loss and migration failure.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- You are using Streams for Apache Kafka 2.7 or newer with Kafka 3.7.0 or newer. If you are using an earlier version of Streams for Apache Kafka, upgrade before migrating to KRaft mode.
- Logging is enabled to check the migration process.
Set DEBUG level in log4j.properties for the root logger on the controllers and brokers in the cluster. For detailed migration-specific logs, set TRACE for the migration logger:
Controller logging configuration
log4j.rootLogger=DEBUG
log4j.logger.org.apache.kafka.metadata.migration=TRACE
Procedure
Retrieve the cluster ID of your Kafka cluster.
Use the zookeeper-shell tool:
./bin/zookeeper-shell.sh localhost:2181 get /cluster/id
The command returns the cluster ID.
Install a KRaft controller quorum to the cluster.
Configure a controller node on each host using the controller.properties file.
At a minimum, each controller requires the following configuration:
- A unique node ID
- The migration enabled flag set to true
- ZooKeeper connection details
- Listener name used by the controller quorum
- A quorum of controllers (dynamic is recommended)
- Listener name for inter-broker communication
Example controller configuration
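The exact values depend on your environment; a minimal sketch using the static controller.quorum.voters form, with placeholder node IDs, hostnames, and listener names, might look like this:
# Placeholder values for a migration-enabled controller; adjust for your environment
process.roles=controller
node.id=3000
zookeeper.metadata.migration.enable=true
zookeeper.connect=localhost:2181
listeners=CONTROLLER://0.0.0.0:9090
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT
controller.quorum.voters=3000@localhost:9090
inter.broker.listener.name=PLAINTEXT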
The format for the controller quorum is <node_id>@<hostname>:<port> in a comma-separated list. The inter-broker listener name is required for the KRaft controller to initiate the migration.
Set up log directories for each controller node:
./bin/kafka-storage.sh format -t <uuid> -c ./config/kraft/controller.properties
Returns:
Formatting /tmp/kraft-controller-logs
Replace <uuid> with the cluster ID you retrieved. Use the same cluster ID for each controller node in your cluster.
By default, the log directory (log.dirs) specified in the controller.properties configuration file is set to /tmp/kraft-controller-logs. The /tmp directory is typically cleared on each system reboot, making it suitable for development environments only. Set multiple log directories using a comma-separated list, if needed.
Start each controller.
./bin/kafka-server-start.sh -daemon ./config/kraft/controller.properties
Check that Kafka is running:
jcmd | grep kafka
Returns:
process ID kafka.Kafka ./config/kraft/controller.properties
Check the logs of each controller to ensure that they have successfully joined the KRaft cluster:
tail -f ./logs/controller.log
Enable migration on each broker.
If it is running, stop the Kafka broker on the host.
./bin/kafka-server-stop.sh
jcmd | grep kafka
If using a multi-node cluster, refer to Section 3.7, “Performing a graceful rolling restart of Kafka brokers”.
Enable migration using the server.properties file.
At a minimum, each broker requires the following additional configuration:
- Inter-broker protocol version set to version 3.9
- The migration enabled flag
- Controller configuration that matches the controller nodes
- A quorum of controllers
Example broker configuration
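A minimal sketch of the additional broker properties, assuming a single controller with node ID 3000 and placeholder hostnames and listener names; the broker's existing listeners and ZooKeeper settings are not repeated here:
# Placeholder values added to the existing broker configuration
inter.broker.protocol.version=3.9
zookeeper.metadata.migration.enable=true
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
controller.quorum.voters=3000@localhost:9090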
The ZooKeeper connection details should already be present.
Restart the updated broker:
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
The migration starts automatically and can take some time depending on the number of topics and partitions in the cluster.
Check that Kafka is running:
jcmd | grep kafka
Returns:
process ID kafka.Kafka ./config/kraft/server.properties
Check the log on the active controller to confirm that the migration is complete:
./bin/zookeeper-shell.sh localhost:2181 get /controller
Look for an INFO log entry that says the following: Completed migration of metadata from ZooKeeper to KRaft.
Switch each broker to KRaft mode.
- Stop the broker, as before.
Update the broker configuration in the server.properties file:
- Replace the broker.id with a node.id using the same ID
- Add a broker KRaft role for the broker
- Remove the inter-broker protocol version (inter.broker.protocol.version)
- Remove the migration enabled flag (zookeeper.metadata.migration.enable)
- Remove ZooKeeper configuration
- Remove the listener for controller and broker communication (control.plane.listener.name)
Example broker configuration for KRaft
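A minimal sketch of a broker configuration after the switch, using placeholder IDs, hostnames, and listener names; adjust for your environment:
# Placeholder values for a KRaft-mode broker
process.roles=broker
node.id=0
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
controller.quorum.voters=3000@localhost:9090
listeners=PLAINTEXT://0.0.0.0:9092
log.dirs=/var/lib/kafka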
If you are using ACLs in your broker configuration, update the authorizer using the authorizer.class.name property to the KRaft-based standard authorizer.
ZooKeeper-based brokers use authorizer.class.name=kafka.security.authorizer.AclAuthorizer.
When migrating to KRaft-based brokers, specify authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer.
- Restart the broker, as before.
Switch each controller out of migration mode.
- Stop the controller in the same way as the broker, as described previously.
Update the controller configuration in the controller.properties file:
- Remove the ZooKeeper connection details
- Remove the zookeeper.metadata.migration.enable property
- Remove inter.broker.listener.name
Example controller configuration following migration
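A minimal sketch of the controller configuration after the migration, with placeholder IDs, hostnames, and listener names:
# Placeholder values for a post-migration controller
process.roles=controller
node.id=3000
listeners=CONTROLLER://0.0.0.0:9090
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT
controller.quorum.voters=3000@localhost:9090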
- Restart the controller in the same way as the broker, as described previously.
4.1.1. Performing a rollback on the migration
If you have finalized the migration, rollback is not possible.
Before finalizing the migration, while the cluster is still in migration mode, you can roll back to ZooKeeper mode. The steps depend on how far the migration has progressed.
- If you’ve only completed the preparation and controller quorum setup, stop and remove the dedicated KRaft controller nodes from the cluster. No other changes are required, and the cluster continues to operate in ZooKeeper mode.
If you’ve enabled metadata migration on the brokers, follow these steps:
- Stop and remove the dedicated KRaft controller nodes from the cluster.
Use zookeeper-shell.sh to do the following:
- Run delete /controller to allow a ZooKeeper-based broker to become the active controller.
- Run get /migration, then delete /migration to inspect and clear the migration metadata (stored in the znode). This restores a clean state in ZooKeeper for retrying the migration.
Warning: Run delete /controller promptly to minimize controller downtime. Temporary errors in broker logs are expected until rollback completes.
On each broker:
Remove the following KRaft-related configurations:
- zookeeper.metadata.migration.enable
- controller.listener.names
- controller.quorum.bootstrap.servers
- Replace node.id with broker.id
- Perform a rolling restart of all brokers.
If you’ve begun migrating brokers to KRaft, follow these steps:
On each broker:
- Remove process.roles.
- Replace node.id with broker.id.
- Restore zookeeper.connect and any required ZooKeeper settings.
Perform a first rolling restart of the brokers.
Important: Retain zookeeper.metadata.migration.enable=true for the first restart.
- Stop and remove the dedicated KRaft controller nodes from the cluster.
Use zookeeper-shell.sh to do the following:
- Run delete /controller to allow a ZooKeeper-based broker to become the active controller.
- Run get /migration, then delete /migration to inspect and clear the migration metadata (stored in the znode). This restores a clean state in ZooKeeper for retrying the migration.
On each broker, remove KRaft-related configuration:
- zookeeper.metadata.migration.enable
- controller.listener.names
- controller.quorum.bootstrap.servers
- Perform a second rolling restart of the brokers.
Chapter 5. Configuring Streams for Apache Kafka
Use the Kafka configuration properties files to configure Streams for Apache Kafka.
The properties file is in Java format, with each property on a separate line in the following format:
<option> = <value>
<option> = <value>
Lines starting with # or ! are treated as comments and are ignored by Streams for Apache Kafka components.
# This is a comment
Values can be split into multiple lines by using \ directly before the newline/carriage return.
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="bob" \
password="bobs-password";
After saving the changes in the properties file, you need to restart the Kafka node. In a multi-node environment, repeat the process on each node in the cluster.
5.1. Using standard Kafka configuration properties
Use standard Kafka configuration properties to configure Kafka components.
The properties provide options to control and tune the configuration of the following Kafka components:
- Brokers
- Topics
- Producer, consumer, and management clients
- Kafka Connect
- Kafka Streams
Broker and client parameters include options to configure authorization, authentication and encryption.
For further information on Kafka configuration properties and how to use the properties to tune your deployment, see the following guides:
Use the Environment Variables Configuration Provider plugin to load configuration data from environment variables. You can use the Environment Variables Configuration Provider, for example, to load certificates or JAAS configuration from environment variables.
You can use the provider to load configuration data for all Kafka components, including producers and consumers. Use the provider, for example, to provide the credentials for Kafka Connect connector configuration.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- The Environment Variables Configuration Provider JAR file.
The JAR file is available from the Streams for Apache Kafka archive.
Procedure
- Add the Environment Variables Configuration Provider JAR file to the Kafka libs directory.
- Initialize the Environment Variables Configuration Provider in the configuration properties file of the Kafka component. For example, to initialize the provider for Kafka, add the configuration to the server.properties file.
Configuration to enable the Environment Variables Configuration Provider
config.providers.env.class=org.apache.kafka.common.config.provider.EnvVarConfigProvider
Add configuration to the properties file to load data from environment variables.
Configuration to load data from an environment variable
option=${env:<MY_ENV_VAR_NAME>}
Use capitalized or upper-case environment variable naming conventions, such as MY_ENV_VAR_NAME.
- Save the changes.
- Restart the Kafka component.
For information on restarting brokers in a multi-node cluster, see Section 3.7, “Performing a graceful rolling restart of Kafka brokers”.
5.3. Configuring Kafka
Kafka uses properties files to store static configuration. The recommended location for the configuration files is ./config/kraft/. The configuration files have to be readable by the Kafka user.
Streams for Apache Kafka ships example configuration files that highlight various basic and advanced features of the product. They can be found under config/kraft/ in the Streams for Apache Kafka installation directory as follows:
- (default) config/kraft/server.properties for nodes running in combined mode
- config/kraft/broker.properties for nodes running as brokers
- config/kraft/controller.properties for nodes running as controllers
This chapter explains the most important configuration options.
5.3.1. Listeners
Listeners are used to connect to Kafka brokers. Each Kafka broker can be configured to use multiple listeners. Each listener requires a different configuration so it can listen on a different port or network interface.
To configure listeners, edit the listeners property in the Kafka configuration properties file. Add listeners to the listeners property as a comma-separated list. Configure each property as follows:
<listener_name>://<hostname>:<port>
<listener_name>://<hostname>:<port>
If <hostname> is empty, Kafka uses the value of java.net.InetAddress.getCanonicalHostName() as the hostname.
Example configuration for multiple listeners
listeners=internal-1://:9092,internal-2://:9093,replication://:9094
When a Kafka client wants to connect to a Kafka cluster, it first connects to the bootstrap server, which is one of the cluster nodes. The bootstrap server provides the client with a list of all the brokers in the cluster, and the client connects to each one individually. The list of brokers is based on the configured listeners.
Advertised listeners
Optionally, you can use the advertised.listeners property to provide the client with a different set of listener addresses than those given in the listeners property. This is useful if additional network infrastructure, such as a proxy, is between the client and the broker, or an external DNS name is being used instead of an IP address.
The advertised.listeners property is formatted in the same way as the listeners property.
Example configuration for advertised listeners
listeners=internal-1://:9092,internal-2://:9093
advertised.listeners=internal-1://my-broker-1.my-domain.com:1234,internal-2://my-broker-1.my-domain.com:1235
The names of the advertised listeners must match those listed in the listeners property.
Inter-broker listeners
Inter-broker listeners are used for communication between Kafka brokers. Inter-broker communication is required for:
- Coordinating workloads between different brokers
- Replicating messages between partitions stored on different brokers
The inter-broker listener can be assigned to a port of your choice. When multiple listeners are configured, you can define the name of the inter-broker listener in the inter.broker.listener.name property of your broker configuration.
Here, the inter-broker listener is named as REPLICATION:
listeners=REPLICATION://0.0.0.0:9091
inter.broker.listener.name=REPLICATION
Controller listeners
Controller configuration is used to connect and communicate with the controller that coordinates the cluster and manages the metadata used to track the status of brokers and partitions.
By default, communication between the controllers and brokers uses a dedicated controller listener. Controllers are responsible for coordinating administrative tasks, such as partition leadership changes, so one or more of these listeners is required.
Specify listeners to use for controllers using the controller.listener.names property. You can specify a dynamic quorum of controllers using the controller.quorum.bootstrap.servers property. The quorum enables a leader-follower structure for administrative tasks, with the leader actively managing operations and followers as hot standbys, ensuring metadata consistency in memory and facilitating failover.
listeners=CONTROLLER://0.0.0.0:9090
controller.listener.names=CONTROLLER
controller.quorum.bootstrap.servers=localhost:9090
The format for the controller quorum is <hostname>:<port>.
5.3.2. Data logs
Apache Kafka stores all records it receives from producers in logs. The logs contain the actual data, in the form of records, that Kafka needs to deliver. Note that these records differ from application log files, which detail the broker’s activities.
Log directories
You can configure log directories using the log.dirs property in the server configuration properties file to store logs in one or multiple log directories. Set it to the /var/lib/kafka directory created during installation:
Data log configuration
log.dirs=/var/lib/kafka
For performance reasons, you can configure log.dirs to multiple directories and place each of them on a different physical device to improve disk I/O performance. For example:
Configuration for multiple directories
log.dirs=/var/lib/kafka1,/var/lib/kafka2,/var/lib/kafka3
5.3.3. Metadata log
Controllers use a metadata log stored as a single-partition topic (__cluster_metadata) on every node. It records the state of the cluster, storing information on brokers, replicas, topics, and partitions, including the state of in-sync replicas and partition leadership.
Metadata log directory
You can configure the directory for storing the metadata log using the metadata.log.dir property. By default, if this property is not set, Kafka uses the log.dirs property to determine the storage directory for both data logs and metadata logs. The metadata log is placed in the first directory specified for log.dirs.
Isolating metadata operations from data operations can improve manageability and potentially lead to performance gains. To set a specific directory for the metadata log, include the metadata.log.dir property in the server configuration properties file.
For example:
Metadata log configuration
log.dirs=/var/lib/kafka
metadata.log.dir=/var/lib/kafka-metadata
Kafka tools are available for inspecting and debugging the metadata log. For more information, see the Apache Kafka documentation.
5.3.4. Node ID
Node ID is a unique identifier for each node (broker or controller) in the cluster. You can assign an integer greater than or equal to 0 as node ID. The node ID is used to identify the nodes after restarts or crashes and it is therefore important that the ID is stable and does not change over time.
The node ID is configured in the Kafka configuration properties file:
node.id=1
5.4. Transitioning to nodes with separate roles
If your Kafka cluster is using nodes with dual controller and broker roles, you can transition to using nodes with separate roles. This procedure describes how to make that transition.
To do this, add new controllers, scale down the controllers on the dual-role nodes, and then switch the dual-role nodes to broker-only.
In this example, we update three dual-role nodes.
Prerequisites
- Streams for Apache Kafka (minimum 2.9) is installed on each host, and the configuration files are available.
- The controller quorum is configured for dynamic scaling using the controller.quorum.bootstrap.servers property.
- Cruise Control is installed.
- A backup of the cluster is recommended.
Procedure
Add a quorum of three new controller-only nodes.
Integrate the controllers into the controller quorum by updating the controller.quorum.bootstrap.servers property.
For more information, see Section 14.2, “Adding new controllers”.
Using the kafka-metadata-quorum.sh tool, remove the dual-role controllers from the controller quorum.
For more information, see Section 14.3, “Removing controllers”.
For each dual-role node, and one at a time:
Stop the dual-role node:
./bin/kafka-server-stop.sh
Configure the dual-role node to serve as a broker-only node by switching process.roles=broker, controller in the node configuration to process.roles=broker.
Example broker configuration
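A minimal sketch of the broker-only configuration, with placeholder IDs, hostnames, and listener names; adjust for your environment:
# Placeholder values for a node switched to the broker role only
process.roles=broker
node.id=1
controller.quorum.bootstrap.servers=localhost:9090
controller.listener.names=CONTROLLER
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://localhost:9092
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/lib/kafka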
Restart the broker node that was previously serving a dual role:
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
5.5. Transitioning to dual-role nodes
This procedure describes how to transition from using separate nodes with broker-only and controller-only roles to using dual-role nodes. If your Kafka cluster is using nodes with dedicated controller and broker nodes, you can transition to using single nodes with both roles.
To do this, add dual-role configuration to the nodes, then rebalance the cluster to move partition replicas to the nodes that previously served as controllers only.
A dual-role configuration is suitable for development or testing. In a typical production environment, use dedicated broker and controller nodes.
Prerequisites
- Streams for Apache Kafka (minimum 2.9) is installed on each host, and the configuration files are available.
- The controller quorum is configured for dynamic scaling using the controller.quorum.bootstrap.servers property.
- Cruise Control is installed.
- A backup of the cluster is recommended.
Procedure
For each controller node, and one at a time:
Stop the controller node:
./bin/kafka-server-stop.sh
Configure the controller-only node to serve as a dual-role node by adding broker-specific configuration.
At a minimum, do the following:
- Switch process.roles=controller to process.roles=broker, controller.
- Add or update the broker log directory using log.dirs.
- Add a listener for the broker to handle client requests. In this example, PLAINTEXT://:9092 is added.
- Update mappings between listener names and security protocols using listener.security.protocol.map.
- Configure a listener for inter-broker communication using inter.broker.listener.name.
Example dual-role configuration
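A minimal sketch of a dual-role configuration, with placeholder IDs, hostnames, and listener names; adjust for your environment:
# Placeholder values for a dual-role node
process.roles=broker, controller
node.id=1
controller.quorum.bootstrap.servers=localhost:9090
controller.listener.names=CONTROLLER
listeners=CONTROLLER://0.0.0.0:9090,PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://localhost:9092
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/lib/kafka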
Restart the node that is now operating in a dual role:
./bin/kafka-server-start.sh -daemon ./config/kraft/dual-role.properties
Use the Cruise Control remove_broker endpoint to reassign partition replicas from broker-only nodes to the nodes that now serve as dual-role nodes.
The reassignment can take some time depending on the number of topics and partitions in the cluster.
For more information, see Section 15.7, “Generating optimization proposals”.
Unregister the broker nodes:
./bin/kafka-cluster.sh unregister \
  --bootstrap-server <broker_host>:<port> \
  --id <node_id_number>
For more information, see Section 14.4, “Unregistering nodes after scale-down operations”.
Stop the broker nodes:
./bin/kafka-server-stop.sh
Chapter 6. Securing access to Kafka
Secure your Kafka cluster by managing the access a client has to Kafka brokers. Specify configuration options to secure Kafka brokers and clients.
A secure connection between Kafka brokers and clients can encompass the following:
- Encryption for data exchange
- Authentication to prove identity
- Authorization to allow or decline actions executed by users
The authentication and authorization mechanisms specified for a client must match those specified for the Kafka brokers.
6.1. Listener configuration
Encryption and authentication in Kafka brokers is configured per listener. For more information about Kafka listener configuration, see Section 5.3.1, “Listeners”.
Each listener in the Kafka broker is configured with its own security protocol. The configuration property listener.security.protocol.map defines which listener uses which security protocol. It maps each listener name to its security protocol. Supported security protocols are:
- PLAINTEXT ― Listener without any encryption or authentication.
- SSL ― Listener using TLS encryption and, optionally, authentication using TLS client certificates.
- SASL_PLAINTEXT ― Listener without encryption but with SASL-based authentication.
- SASL_SSL ― Listener with TLS-based encryption and SASL-based authentication.
Given the following listeners configuration:
listeners=INT1://:9092,INT2://:9093,REPLICATION://:9094
the listener.security.protocol.map might look like this:
listener.security.protocol.map=INT1:SASL_PLAINTEXT,INT2:SASL_SSL,REPLICATION:SSL
This would configure the listener INT1 to use unencrypted connections with SASL authentication, the listener INT2 to use encrypted connections with SASL authentication and the REPLICATION interface to use TLS encryption (possibly with TLS client authentication). The same security protocol can be used multiple times. The following example is also a valid configuration:
listener.security.protocol.map=INT1:SSL,INT2:SSL,REPLICATION:SSL
Such a configuration would use TLS encryption and TLS authentication (optional) for all interfaces.
6.2. TLS Encryption
Kafka supports TLS for encrypting communication with Kafka clients.
In order to use TLS encryption and server authentication, a keystore containing private and public keys has to be provided. This is usually done using a file in the Java Keystore (JKS) format. A path to this file is set in the ssl.keystore.location property. The ssl.keystore.password property should be used to set the password protecting the keystore. For example:
ssl.keystore.location=/path/to/keystore/server-1.jks
ssl.keystore.password=123456
In some cases, an additional password is used to protect the private key. Any such password can be set using the ssl.key.password property.
Kafka is able to use keys signed by certification authorities as well as self-signed keys. Using keys signed by certification authorities should always be the preferred method. In order to allow clients to verify the identity of the Kafka broker they are connecting to, the certificate should always contain the advertised hostname(s) as its Common Name (CN) or in the Subject Alternative Names (SAN).
It is possible to use different SSL configurations for different listeners. All options starting with ssl. can be prefixed with listener.name.<NameOfTheListener>., where the name of the listener has to be always in lowercase. This will override the default SSL configuration for that specific listener. The following example shows how to use different SSL configurations for different listeners:
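A sketch, assuming two TLS listeners named EXTERNAL1 and EXTERNAL2 with separate keystores (the listener name in the prefix must be lowercase):
listener.name.external1.ssl.keystore.location=/path/to/keystore/server-1.jks
listener.name.external1.ssl.keystore.password=123456
listener.name.external2.ssl.keystore.location=/path/to/keystore/server-2.jks
listener.name.external2.ssl.keystore.password=123456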
Additional TLS configuration options
In addition to the main TLS configuration options described above, Kafka supports many options for fine-tuning the TLS configuration. For example, to enable or disable TLS / SSL protocols or cipher suites:
- ssl.cipher.suites ― List of enabled cipher suites. Each cipher suite is a combination of authentication, encryption, MAC and key exchange algorithms used for the TLS connection. By default, all available cipher suites are enabled.
- ssl.enabled.protocols ― List of enabled TLS / SSL protocols. Defaults to TLSv1.2,TLSv1.1,TLSv1.
6.2.1. Enabling TLS encryption
This procedure describes how to enable encryption in Kafka brokers.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Procedure
- Generate TLS certificates for all Kafka brokers in your cluster. The certificates should have their advertised and bootstrap addresses in their Common Name or Subject Alternative Name.
Edit the Kafka configuration properties file on all cluster nodes for the following:
- Change the listener.security.protocol.map field to specify the SSL protocol for the listener where you want to use TLS encryption.
- Set the ssl.keystore.location option to the path to the JKS keystore with the broker certificate.
- Set the ssl.keystore.password option to the password you used to protect the keystore.

For example:
listeners=UNENCRYPTED://:9092,ENCRYPTED://:9093,REPLICATION://:9094
listener.security.protocol.map=UNENCRYPTED:PLAINTEXT,ENCRYPTED:SSL,REPLICATION:PLAINTEXT
ssl.keystore.location=/path/to/keystore/server-1.jks
ssl.keystore.password=123456
- (Re)start the Kafka brokers.
6.3. Authentication
To authenticate client connections to your Kafka cluster, the following options are available:
- TLS client authentication
- TLS (Transport Layer Security) using X.509 certificates on encrypted connections
- Kafka SASL
- Kafka SASL (Simple Authentication and Security Layer) using supported authentication mechanisms
- OAuth 2.0
- OAuth 2.0 token-based authentication
SASL authentication supports various mechanisms for both plain unencrypted connections and TLS connections:
- PLAIN ― Authentication based on usernames and passwords.
- SCRAM-SHA-256 and SCRAM-SHA-512 ― Authentication using Salted Challenge Response Authentication Mechanism (SCRAM).
- GSSAPI ― Authentication against a Kerberos server.
The PLAIN mechanism sends usernames and passwords over the network in an unencrypted format. It should only be used in combination with TLS encryption.
6.3.1. Enabling TLS client authentication
Enable TLS client authentication in Kafka brokers to enhance security for connections to Kafka nodes already using TLS encryption.
Use the ssl.client.auth property to set TLS authentication with one of these values:
- none ― TLS client authentication is off (default)
- requested ― Optional TLS client authentication
- required ― Clients must authenticate using a TLS client certificate
When a client authenticates using TLS client authentication, the authenticated principal name is derived from the distinguished name in the client certificate. For instance, a user with a certificate having a distinguished name CN=someuser will be authenticated with the principal CN=someuser,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown. This principal name provides a unique identifier for the authenticated user or entity. When TLS client authentication is not used, and SASL is disabled, the principal name defaults to ANONYMOUS.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- TLS encryption is enabled.
Procedure
- Prepare a JKS (Java Keystore) truststore containing the public key of the CA (Certification Authority) used to sign the user certificates.
Edit the Kafka configuration properties file on all cluster nodes as follows:
- Specify the path to the JKS truststore using the ssl.truststore.location property.
- If the truststore is password-protected, set the password using the ssl.truststore.password property.
- Set the ssl.client.auth property to required.

TLS client authentication configuration
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=123456
ssl.client.auth=required
- (Re)start the Kafka brokers.
6.3.2. Enabling SASL PLAIN client authentication
Enable SASL PLAIN authentication in Kafka to enhance security for connections to Kafka nodes.
SASL authentication is enabled through the Java Authentication and Authorization Service (JAAS) using the KafkaServer JAAS context. You can define the JAAS configuration in a dedicated file or directly in the Kafka configuration.
The recommended location for the dedicated file is ./config/jaas.conf. Ensure that the file is readable by the Kafka user. Keep the JAAS configuration file in sync on all Kafka nodes.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Procedure
Edit or create the ./config/jaas.conf JAAS configuration file to enable the PlainLoginModule and specify the allowed usernames and passwords. Make sure this file is the same on all Kafka brokers.
JAAS configuration
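A minimal sketch of the JAAS configuration, assuming placeholder usernames and passwords (each user_<name> entry defines an allowed user):
KafkaServer {
    org.apache.kafka.common.security.plain.PlainLoginModule required
    user_admin="123456"
    user_user1="123456"
    user_user2="123456";
};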
Edit the Kafka configuration properties file on all cluster nodes as follows:
- Enable SASL PLAIN authentication on specific listeners using the listener.security.protocol.map property. Specify SASL_PLAINTEXT or SASL_SSL.
- Set the sasl.enabled.mechanisms property to PLAIN.

SASL PLAIN configuration
listeners=INSECURE://:9092,AUTHENTICATED://:9093,REPLICATION://:9094
listener.security.protocol.map=INSECURE:PLAINTEXT,AUTHENTICATED:SASL_PLAINTEXT,REPLICATION:PLAINTEXT
sasl.enabled.mechanisms=PLAIN
(Re)start the Kafka brokers using the KAFKA_OPTS environment variable to pass the JAAS configuration to Kafka brokers:

export KAFKA_OPTS="-Djava.security.auth.login.config=./config/jaas.conf"; ./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
6.3.3. Enabling SASL SCRAM client authentication
Enable SASL SCRAM authentication in Kafka to enhance security for connections to Kafka nodes.
SASL authentication is enabled through the Java Authentication and Authorization Service (JAAS) using the KafkaServer JAAS context. You can define the JAAS configuration in a dedicated file or directly in the Kafka configuration.
The recommended location for the dedicated file is ./config/jaas.conf. Ensure that the file is readable by the Kafka user. Keep the JAAS configuration file in sync on all Kafka nodes.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Procedure
Edit or create the ./config/jaas.conf JAAS configuration file to enable the ScramLoginModule. Make sure this file is the same on all Kafka brokers.
JAAS configuration
KafkaServer {
    org.apache.kafka.common.security.scram.ScramLoginModule required;
};

Edit the Kafka configuration properties file on all cluster nodes as follows:
- Enable SASL SCRAM authentication on specific listeners using the listener.security.protocol.map property. Specify SASL_PLAINTEXT or SASL_SSL.
- Set the sasl.enabled.mechanisms option to SCRAM-SHA-256 or SCRAM-SHA-512.

For example:
listeners=INSECURE://:9092,AUTHENTICATED://:9093,REPLICATION://:9094
listener.security.protocol.map=INSECURE:PLAINTEXT,AUTHENTICATED:SASL_PLAINTEXT,REPLICATION:PLAINTEXT
sasl.enabled.mechanisms=SCRAM-SHA-512
(Re)start the Kafka brokers using the KAFKA_OPTS environment variable to pass the JAAS configuration to Kafka brokers:

export KAFKA_OPTS="-Djava.security.auth.login.config=./config/jaas.conf"; ./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
6.3.4. Enabling multiple SASL mechanisms
Kafka can use more than one SASL mechanism simultaneously. When multiple mechanisms are enabled, clients can choose which mechanism to use.
To use more than one mechanism, you set up the configuration required for each mechanism. You can add different KafkaServer JAAS configurations to the same context and enable more than one mechanism in the Kafka configuration as a comma-separated list using the sasl.enabled.mechanisms property.
JAAS configuration for more than one SASL mechanism
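A sketch of a combined KafkaServer context, assuming placeholder PLAIN users alongside the SCRAM login module:
KafkaServer {
    org.apache.kafka.common.security.plain.PlainLoginModule required
    user_admin="123456"
    user_user1="123456";

    org.apache.kafka.common.security.scram.ScramLoginModule required;
};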
SASL mechanisms enabled
sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256,SCRAM-SHA-512
6.3.5. Enabling SASL for inter-broker authentication
Enable SASL SCRAM authentication between Kafka nodes to enhance security for inter-broker connections. As well as using SASL authentication for client connections to a Kafka cluster, you can also use SASL for inter-broker authentication. Unlike SASL for client connections, you can only choose one mechanism for inter-broker communication.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
If you are using a SCRAM mechanism, register SCRAM credentials on the Kafka cluster.
For all nodes in the Kafka cluster, use the kafka-storage.sh tool to add the inter-broker SASL SCRAM user to the __cluster_metadata topic. This ensures that the credentials for authentication are updated for bootstrapping before the Kafka cluster is running.

Registering an inter-broker SASL SCRAM user
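A sketch of the command, run as part of formatting the storage for each node, assuming an inter-broker user named kafka and placeholder cluster ID and password:
./bin/kafka-storage.sh format \
  --config ./config/kraft/server.properties \
  --cluster-id <cluster_id> \
  --add-scram 'SCRAM-SHA-512=[name=kafka,password=changeit]'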
Procedure
Specify an inter-broker SASL mechanism in the Kafka configuration using the sasl.mechanism.inter.broker.protocol property.

Inter-broker SASL mechanism
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512

Specify the username and password for inter-broker communication in the KafkaServer JAAS context using the username and password fields.

Inter-broker JAAS context
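A minimal sketch, assuming the kafka inter-broker user registered earlier and a placeholder password:
KafkaServer {
    org.apache.kafka.common.security.scram.ScramLoginModule required
    username="kafka"
    password="changeit";
};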
6.3.6. Adding SASL SCRAM users
This procedure outlines the steps to register new users for authentication using SASL SCRAM in Kafka. SASL SCRAM authentication enhances the security of client connections.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- SASL SCRAM authentication is enabled.
Procedure
Use the kafka-configs.sh tool to add new SASL SCRAM users.

./bin/kafka-configs.sh \
  --bootstrap-server <broker_host>:<port> \
  --alter \
  --add-config 'SCRAM-SHA-512=[password=<password>]' \
  --entity-type users --entity-name <username>

For example:
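For instance, to register a user named user1 with a placeholder password against a broker on localhost:
./bin/kafka-configs.sh \
  --bootstrap-server localhost:9092 \
  --alter \
  --add-config 'SCRAM-SHA-512=[password=123456]' \
  --entity-type users \
  --entity-name user1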
6.3.7. Deleting SASL SCRAM users
This procedure outlines the steps to remove users registered for authentication using SASL SCRAM in Kafka.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- SASL SCRAM authentication is enabled.
Procedure
Use the kafka-configs.sh tool to delete SASL SCRAM users.

./bin/kafka-configs.sh \
  --bootstrap-server <broker_host>:<port> \
  --alter \
  --delete-config 'SCRAM-SHA-512' \
  --entity-type users \
  --entity-name <username>

For example:
./bin/kafka-configs.sh \
  --bootstrap-server localhost:9092 \
  --alter \
  --delete-config 'SCRAM-SHA-512' \
  --entity-type users \
  --entity-name user1
6.3.8. Enabling Kerberos (GSSAPI) authentication
Streams for Apache Kafka supports the use of the Kerberos (GSSAPI) authentication protocol for secure single sign-on access to your Kafka cluster. GSSAPI is an API wrapper for Kerberos functionality, insulating applications from underlying implementation changes.
Kerberos is a network authentication system that allows clients and servers to authenticate to each other by using symmetric encryption and a trusted third party, the Kerberos Key Distribution Centre (KDC).
This procedure shows how to configure Streams for Apache Kafka so that Kafka clients can access Kafka using Kerberos (GSSAPI) authentication.
The procedure assumes that a Kerberos krb5 resource server has been set up on a Red Hat Enterprise Linux host. For this setup, Kafka is installed in the /opt/kafka/ directory.
The procedure shows, with examples, how to configure:
- Service principals
- Kafka brokers to use the Kerberos login
- Producer and consumer clients to access Kafka using Kerberos authentication
The instructions describe Kerberos set up for a Kafka installation on a single host, with additional configuration for a producer and consumer client.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
To be able to configure Kafka to authenticate and authorize Kerberos credentials, you need the following:
- Access to a Kerberos server
- A Kerberos client on each Kafka broker host
Add service principals for authentication
From your Kerberos server, create service principals (users) for Kafka brokers, and Kafka producer and consumer clients. Service principals must take the form SERVICE-NAME/FULLY-QUALIFIED-HOST-NAME@DOMAIN-REALM.
Create the service principals, and keytabs that store the principal keys, through the Kerberos KDC.
Make sure the domain name in the Kerberos principal is in uppercase.
For example:
- kafka/node1.example.redhat.com@EXAMPLE.REDHAT.COM
- producer1/node1.example.redhat.com@EXAMPLE.REDHAT.COM
- consumer1/node1.example.redhat.com@EXAMPLE.REDHAT.COM
Create a directory on the host and add the keytab files:
For example:
/opt/kafka/krb5/kafka-node1.keytab
/opt/kafka/krb5/kafka-producer1.keytab
/opt/kafka/krb5/kafka-consumer1.keytab
Configure the Kafka broker server to use a Kerberos login
Configure Kafka to use the Kerberos Key Distribution Center (KDC) for authentication using the user principals and keytabs previously created for kafka.
Modify the /opt/kafka/config/jaas.conf file with the following elements:
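A sketch of the KafkaServer context, assuming the kafka service principal and keytab created earlier:
KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/opt/kafka/krb5/kafka-node1.keytab"
    principal="kafka/node1.example.redhat.com@EXAMPLE.REDHAT.COM";
};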
Configure each broker in the Kafka cluster by modifying the listener configuration in the config/server.properties file so the listeners use the SASL/GSSAPI login. Add the SASL protocol to the map of security protocols for the listener, and remove any unwanted protocols.
For example:
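A sketch of the listener-related properties, with comments keyed to the callouts that follow; the listener names, ports, and keystore details are assumptions:
# 1: a secure listener for clients and a replication listener for inter-broker traffic
listeners=SECURE://:9094,REPLICATION://:9092
inter.broker.listener.name=REPLICATION
# 2: SASL_SSL for TLS-enabled listeners, SASL_PLAINTEXT for listeners without TLS
listener.security.protocol.map=SECURE:SASL_SSL,REPLICATION:SASL_SSL
ssl.keystore.location=/opt/kafka/keystore.p12
ssl.keystore.password=changeit
ssl.keystore.type=PKCS12
# 3: GSSAPI is the SASL mechanism for Kerberos authentication
sasl.enabled.mechanisms=GSSAPI
# 4: Kerberos authentication for inter-broker communication
sasl.mechanism.inter.broker.protocol=GSSAPI
# 5: service name matching the Kerberos service principal
sasl.kerberos.service.name=kafka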
- 1
- Two listeners are configured: a secure listener for general-purpose communications with clients (supporting TLS for communications), and a replication listener for inter-broker communications.
- 2
- For TLS-enabled listeners, the protocol name is SASL_SSL. For listeners that do not use TLS, the protocol name is SASL_PLAINTEXT. If SSL is not required, you can remove the ssl.* properties.
- 3
- SASL mechanism for Kerberos authentication is GSSAPI.
- 4
- Kerberos authentication for inter-broker communication.
- 5
- The name of the service used for authentication requests is specified to distinguish it from other services that may also be using the same Kerberos configuration.
Start the Kafka broker, with JVM parameters to specify the Kerberos login configuration:
export KAFKA_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/opt/kafka/config/jaas.conf"; /opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/kraft/server.properties
Configure Kafka producer and consumer clients to use Kerberos authentication.
Configure Kafka producer and consumer clients to use the Kerberos Key Distribution Center (KDC) for authentication using the user principals and keytabs previously created for producer1 and consumer1.
Add the Kerberos configuration to the producer or consumer configuration file.
For example:
Configuration in producer.properties
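A sketch of the Kerberos-related client properties, assuming the producer1 principal and keytab created earlier; the security protocol depends on whether the listener uses TLS:
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
    useKeyTab=true \
    useTicketCache=false \
    storeKey=true \
    keyTab="/opt/kafka/krb5/kafka-producer1.keytab" \
    principal="producer1/node1.example.redhat.com@EXAMPLE.REDHAT.COM";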
Configuration in consumer.properties
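The consumer configuration follows the same pattern, assuming the consumer1 principal and keytab:
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
    useKeyTab=true \
    useTicketCache=false \
    storeKey=true \
    keyTab="/opt/kafka/krb5/kafka-consumer1.keytab" \
    principal="consumer1/node1.example.redhat.com@EXAMPLE.REDHAT.COM";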
Run the clients to verify that you can send and receive messages from the Kafka brokers.
Producer client:
export KAFKA_HEAP_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf -Dsun.security.krb5.debug=true"; /opt/kafka/bin/kafka-console-producer.sh --producer.config /opt/kafka/config/producer.properties --topic topic1 --bootstrap-server node1.example.redhat.com:9094

Consumer client:
export KAFKA_HEAP_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf -Dsun.security.krb5.debug=true"; /opt/kafka/bin/kafka-console-consumer.sh --consumer.config /opt/kafka/config/consumer.properties --topic topic1 --bootstrap-server node1.example.redhat.com:9094
6.4. Authorization
Authorization in Kafka brokers is implemented using authorizer plugins.
In this section we describe how to use the StandardAuthorizer plugin provided with Kafka.
Alternatively, you can use your own authorization plugins. For example, if you are using OAuth 2.0 token-based authentication, you can use OAuth 2.0 authorization.
6.4.1. Enabling an ACL authorizer
Edit the Kafka configuration properties file to add an ACL authorizer. Enable the authorizer by specifying its fully-qualified name in the authorizer.class.name property:
Enabling the authorizer
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
6.4.1.1. ACL rules
An ACL authorizer uses ACL rules to manage access to Kafka brokers.
ACL rules are defined in the following format:
Principal P is allowed / denied <operation> O on <kafka_resource> R from host H
For example, a rule might be set so that user John can view the topic comments from host 127.0.0.1. Host is the IP address of the machine that John is connecting from.
In most cases, the user is a producer or consumer application:
Consumer01 can write to the consumer group accounts from host 127.0.0.1
If ACL rules are not present for a given resource, all actions are denied. This behavior can be changed by setting the property allow.everyone.if.no.acl.found to true in the Kafka configuration file.
6.4.1.2. Principals
A principal represents the identity of a user. The format of the ID depends on the authentication mechanism used by clients to connect to Kafka:
- User:ANONYMOUS when connected without authentication.
- User:<username> when connected using simple authentication mechanisms, such as PLAIN or SCRAM. For example, User:admin or User:user1.
- User:<DistinguishedName> when connected using TLS client authentication. For example, User:CN=user1,O=MyCompany,L=Prague,C=CZ.
- User:<Kerberos username> when connected using Kerberos.
The DistinguishedName is the distinguished name from the client certificate.
The Kerberos username is the primary part of the Kerberos principal, which is used by default when connecting using Kerberos. You can use the sasl.kerberos.principal.to.local.rules property to configure how the Kafka principal is built from the Kerberos principal.
6.4.1.3. Authentication of users
To use authorization, you need to have authentication enabled and used by your clients. Otherwise, all connections will have the principal User:ANONYMOUS.
For more information on methods of authentication, see Section 6.3, “Authentication”.
6.4.1.4. Super users
Super users are allowed to take all actions regardless of the ACL rules.
Super users are defined in the Kafka configuration file using the property super.users.
For example:
super.users=User:admin,User:operator
6.4.1.5. Replica broker authentication
When authorization is enabled, it is applied to all listeners and all connections. This includes the inter-broker connections used for replication of data between brokers. If enabling authorization, therefore, ensure that you use authentication for inter-broker connections and give the users used by the brokers sufficient rights. For example, if authentication between brokers uses the kafka-broker user, then super user configuration must include the username super.users=User:kafka-broker.
For more information on the operations on Kafka resources you can control with ACLs, see the Apache Kafka documentation.
6.4.2. Adding ACL rules
When using an ACL authorizer to control access to Kafka based on Access Control Lists (ACLs), you can add new ACL rules using the kafka-acls.sh utility.
Use kafka-acls.sh parameter options to add, list and remove ACL rules, and perform other functions. The parameters require a double-hyphen convention, such as --add.
Prerequisites
- Users have been created and granted appropriate permissions to access Kafka resources.
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- Authorization is enabled in Kafka brokers.
Procedure
Run kafka-acls.sh with the --add option.

Examples:
Allow user1 and user2 access to read from myTopic using the MyConsumerGroup consumer group:

opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --add --operation Read --topic myTopic --allow-principal User:user1 --allow-principal User:user2
opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --add --operation Describe --topic myTopic --allow-principal User:user1 --allow-principal User:user2
opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --add --operation Read --operation Describe --group MyConsumerGroup --allow-principal User:user1 --allow-principal User:user2

Deny user1 access to read myTopic from IP address host 127.0.0.1:

opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --add --operation Describe --operation Read --topic myTopic --group MyConsumerGroup --deny-principal User:user1 --deny-host 127.0.0.1

Add user1 as the consumer of myTopic with MyConsumerGroup:

opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --add --consumer --topic myTopic --group MyConsumerGroup --allow-principal User:user1
6.4.3. Listing ACL rules
When using an ACL authorizer to control access to Kafka based on Access Control Lists (ACLs), you can list existing ACL rules using the kafka-acls.sh utility.
Prerequisites
Procedure
Run kafka-acls.sh with the --list option.

For example:
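For instance, to list the ACLs currently set on myTopic (following the examples in the previous section):
opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --list --topic myTopic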
6.4.4. Removing ACL rules
When using an ACL authorizer to control access to Kafka based on Access Control Lists (ACLs), you can remove existing ACL rules using the kafka-acls.sh utility.
Prerequisites
Procedure
Run kafka-acls.sh with the --remove option.

Examples:
Remove the ACL allowing user1 and user2 access to read from myTopic using the MyConsumerGroup consumer group:

opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --remove --operation Read --topic myTopic --allow-principal User:user1 --allow-principal User:user2
opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --remove --operation Describe --topic myTopic --allow-principal User:user1 --allow-principal User:user2
opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --remove --operation Read --operation Describe --group MyConsumerGroup --allow-principal User:user1 --allow-principal User:user2

Remove the ACL adding user1 as the consumer of myTopic with MyConsumerGroup:

opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --remove --consumer --topic myTopic --group MyConsumerGroup --allow-principal User:user1

Remove the ACL denying user1 access to read myTopic from IP address host 127.0.0.1:

opt/kafka/bin/kafka-acls.sh --bootstrap-server localhost:9092 --remove --operation Describe --operation Read --topic myTopic --group MyConsumerGroup --deny-principal User:user1 --deny-host 127.0.0.1
Chapter 7. Enabling OAuth 2.0 token-based access
Streams for Apache Kafka supports OAuth 2.0 for securing Kafka clusters by integrating with an OAuth 2.0 authorization server. Kafka brokers and clients both need to be configured to use OAuth 2.0.
OAuth 2.0 enables standardized token-based authentication and authorization between applications, using a central authorization server to issue tokens that grant limited access to resources. You can define specific scopes for fine-grained access control. Scopes correspond to different levels of access to Kafka topics or operations within the cluster.
OAuth 2.0 also supports single sign-on and integration with identity providers.
7.1. Configuring an OAuth 2.0 authorization server
Before you can use OAuth 2.0 token-based access, you must configure an authorization server for integration with Streams for Apache Kafka. The steps are dependent on the chosen authorization server. Consult the product documentation for the authorization server for information on how to set up OAuth 2.0 access.
Prepare the authorization server to work with Streams for Apache Kafka by defining OAuth 2.0 clients for Kafka and each Kafka client component of your application. In relation to the authorization server, the Kafka cluster and Kafka clients are both regarded as OAuth 2.0 clients.
In general, configure OAuth 2.0 clients in the authorization server with the following client credentials enabled:
- Client ID (for example, kafka for the Kafka cluster)
- Client ID and secret as the authentication mechanism
You only need to use a client ID and secret when using a non-public introspection endpoint of the authorization server. The credentials are not typically required when using public authorization server endpoints, as with fast local JWT token validation.
7.2. Using OAuth 2.0 token-based authentication
Streams for Apache Kafka supports the use of OAuth 2.0 for token-based authentication. An OAuth 2.0 authorization server handles the granting of access and inquiries about access. Kafka clients authenticate to Kafka brokers. Brokers and clients communicate with the authorization server, as necessary, to obtain or validate access tokens.
For a deployment of Streams for Apache Kafka, OAuth 2.0 integration provides the following support:
- Server-side OAuth 2.0 authentication for Kafka brokers
- Client-side OAuth 2.0 authentication for Kafka MirrorMaker, Kafka Connect, and the Kafka Bridge
Streams for Apache Kafka on RHEL includes two OAuth 2.0 libraries:
- kafka-oauth-client
- Provides a custom login callback handler class named io.strimzi.kafka.oauth.client.JaasClientOauthLoginCallbackHandler. To handle the OAUTHBEARER authentication mechanism, use the login callback handler with the OAuthBearerLoginModule provided by Apache Kafka.
- kafka-oauth-common
- A helper library that provides some of the functionality needed by the kafka-oauth-client library.
The provided client libraries also have dependencies on some additional third-party libraries, such as: keycloak-core, jackson-databind, and slf4j-api.
We recommend using a Maven project to package your client to ensure that all the dependency libraries are included. Dependency libraries might change in future versions.
7.2.1. Configuring OAuth 2.0 authentication on listeners
To secure Kafka brokers with OAuth 2.0 authentication, configure a Kafka listener to use OAuth 2.0 authentication and a client authentication mechanism in the Kafka server.properties file, and add further configuration depending on the authentication mechanism and type of token validation used in the authentication.
A minimum configuration is required. You can also configure a TLS listener, where TLS is used for inter-broker communication. We recommend using OAuth 2.0 authentication together with TLS encryption. Without encryption, the connection is vulnerable to network eavesdropping and unauthorized access through token theft.
When you have defined the type of authentication as OAuth 2.0, you add configuration based on the type of validation, either as fast local JWT validation or token validation using an introspection endpoint.
Enabling SASL authentication mechanisms
Use one or both of the following SASL mechanisms for clients to exchange credentials and establish authenticated sessions with Kafka.
OAUTHBEARER
Using the OAUTHBEARER authentication mechanism, credentials exchange uses a bearer token provided by an OAuth callback handler. Token provision can be configured to use the following methods:
- Client ID and secret (using the OAuth 2.0 client credentials mechanism)
- Client ID and client assertion
- Long-lived access token
- Long-lived refresh token obtained manually
OAUTHBEARER is recommended as it provides a higher level of security than PLAIN, though it can only be used by Kafka clients that support the OAUTHBEARER mechanism at the protocol level. Client credentials are never shared with Kafka.

PLAIN
PLAIN is a simple authentication mechanism used by all Kafka client tools. Consider using PLAIN only with Kafka clients that do not support OAUTHBEARER. Using the PLAIN authentication mechanism, credentials exchange can be configured to use the following methods:
- Client ID and secret (using the OAuth 2.0 client credentials mechanism)
- Long-lived access token
Regardless of the method used, the client must provide username and password properties to Kafka.
Credentials are handled centrally behind a compliant authorization server, similar to how OAUTHBEARER authentication is used. The username extraction process depends on the authorization server configuration.
Example listener configuration for the OAUTHBEARER mechanism
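A sketch of the kind of server.properties entries the callouts below refer to, with comments keyed to the callout numbers; the listener name, port, and the validation options required under the oauthbearer JAAS configuration (covered in the following sections) are assumptions:
# 1: enable OAUTHBEARER as a SASL mechanism
sasl.enabled.mechanisms=OAUTHBEARER
# 2: listener for client applications, advertised using the system hostname
listeners=CLIENT://0.0.0.0:9092
# 3: channel protocol for the listener
listener.security.protocol.map=CLIENT:SASL_PLAINTEXT
# 4: OAUTHBEARER mechanism for the CLIENT listener
listener.name.client.sasl.enabled.mechanisms=OAUTHBEARER
# 5: OAUTHBEARER for inter-broker communication
sasl.mechanism.inter.broker.protocol=OAUTHBEARER
# 6: listener used for inter-broker communication
inter.broker.listener.name=CLIENT
# 7: OAuth 2.0 authentication on the client listener
listener.name.client.oauthbearer.sasl.server.callback.handler.class=io.strimzi.kafka.oauth.server.JaasServerOauthValidatorCallbackHandler
listener.name.client.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required ;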
- 1
- Enables the OAUTHBEARER mechanism for credentials exchange over SASL.
- 2
- Configures a listener for client applications to connect to. The system hostname is used as an advertised hostname, which clients must resolve in order to reconnect. The listener is named CLIENT in this example.
- 3
- Specifies the channel protocol for the listener. SASL_SSL is for TLS. SASL_PLAINTEXT is used for an unencrypted connection (no TLS), but there is risk of eavesdropping and interception at the TCP connection layer.
- 4
- Specifies the OAUTHBEARER mechanism for the CLIENT listener. The client name (CLIENT) is usually specified in uppercase in the listeners property, in lowercase for listener.name properties (listener.name.client), and in lowercase when part of a listener.name.client.* property.
- 5
- Specifies the OAUTHBEARER mechanism for inter-broker communication.
- 6
- Specifies the listener for inter-broker communication. The specification is required for the configuration to be valid.
- 7
- Configures OAuth 2.0 authentication on the client listener.
Configuring OAuth 2.0 with properties or variables
Configure OAuth 2.0 settings using Java Authentication and Authorization Service (JAAS) properties or environment variables.
- JAAS properties are configured in the server.properties configuration file, and passed as key-value pairs of the listener.name.<listener_name>.oauthbearer.sasl.jaas.config property.
- If using environment variables, you still need to provide the listener.name.<listener_name>.oauthbearer.sasl.jaas.config property in the server.properties file, but you can omit the other JAAS properties. You can use capitalized or upper-case environment variable naming conventions.
The Streams for Apache Kafka OAuth 2.0 libraries use properties that start with:
- oauth. to configure authentication
- strimzi. to configure OAuth 2.0 authorization
Configuring fast local JWT token validation
Fast local JWT token validation involves checking a JWT token signature locally to ensure that the token meets the following criteria:
- Contains a typ (type) or token_type header claim value of Bearer to indicate it is an access token
- Is currently valid and not expired
- Has an issuer that matches a validIssuerURI
You specify a validIssuerURI attribute when you configure the listener, so that any tokens not issued by the authorization server are rejected.
The authorization server does not need to be contacted during fast local JWT token validation. You activate fast local JWT token validation by specifying a jwksEndpointUri attribute, the endpoint exposed by the OAuth 2.0 authorization server. The endpoint contains the public keys used to validate signed JWT tokens, which are sent as credentials by Kafka clients.
All communication with the authorization server should be performed using TLS encryption. You can configure a certificate truststore and point to the truststore file.
You might want to configure a userNameClaim to properly extract a username from the JWT token. If required, you can use a JsonPath expression like "['user.info'].['user.id']" to retrieve the username from nested JSON attributes within a token.
If you want to use Kafka ACL authorization, identify the user by their username during authentication. (The sub claim in JWT tokens is typically a unique ID, not a username.)
Example configuration for fast local JWT token validation
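A sketch of the oauth.* options the callouts below describe, added to the listener JAAS configuration; URIs, claim names, paths, and passwords are placeholders:
listener.name.client.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
    oauth.valid.issuer.uri="https://<auth_server_address>/<issuer_context>" \
    oauth.jwks.endpoint.uri="https://<auth_server_address>/<path_to_jwks_endpoint>" \
    oauth.jwks.refresh.seconds="300" \
    oauth.jwks.refresh.min.pause.seconds="1" \
    oauth.jwks.expiry.seconds="360" \
    oauth.username.claim="preferred_username" \
    oauth.ssl.truststore.location="/path/to/truststore.p12" \
    oauth.ssl.truststore.password="<truststore_password>" \
    oauth.ssl.truststore.type="PKCS12" ;
# 11: optional session expiry and re-authentication
listener.name.client.oauthbearer.connections.max.reauth.ms=3600000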
- 1
- Configures the CLIENT listener for OAuth 2.0. Connectivity with the authorization server should use secure HTTPS connections.
- 2
- A valid issuer URI. Only access tokens issued by this issuer will be accepted. (Always required.)
- 3
- The JWKS endpoint URL.
- 4
- The period between endpoint refreshes (default 300).
- 5
- The minimum pause in seconds between consecutive attempts to refresh JWKS public keys. When an unknown signing key is encountered, the JWKS keys refresh is scheduled outside the regular periodic schedule with at least the specified pause since the last refresh attempt. The refreshing of keys follows the rule of exponential backoff, retrying on unsuccessful refreshes with ever increasing pause, until it reaches oauth.jwks.refresh.seconds. The default value is 1.
- 6
- The duration the JWKS certificates are considered valid before they expire. Default is 360 seconds. If you specify a longer time, consider the risk of allowing access to revoked certificates.
- 7
- The token claim (or key) that contains the actual user name in the token. The user name is the principal used to identify the user. The value will depend on the authentication flow and the authorization server used. If required, you can use a JsonPath expression like "['user.info'].['user.id']" to retrieve the username from nested JSON attributes within a token.
- 8
- The location of the truststore used in the TLS configuration.
- 9
- Password to access the truststore.
- 10
- The truststore type in PKCS #12 format.
- 11
- (Optional) Enforces session expiry when a token expires, and also activates the Kafka re-authentication mechanism. If the specified value is less than the time left for the access token to expire, then the client will have to re-authenticate before the actual token expiry. By default, the session does not expire when the access token expires, and the client does not attempt re-authentication.
Configuring token validation using an introspection endpoint
Token validation using an OAuth 2.0 introspection endpoint treats a received access token as opaque. The Kafka broker sends an access token to the introspection endpoint, which responds with the token information necessary for validation. Importantly, it returns up-to-date information if the specific access token is valid, and also information about when the token expires.
To configure OAuth 2.0 introspection-based validation, you specify an introspection endpoint URI rather than the JWKs endpoint URI specified for fast local JWT token validation. Depending on the authorization server, you typically have to specify a client ID and client secret, because the introspection endpoint is usually protected.
Example token validation configuration using an introspection endpoint
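A sketch matching the callouts below, with placeholder URIs and credentials:
listener.name.client.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
    oauth.introspection.endpoint.uri="https://<auth_server_address>/<path_to_introspection_endpoint>" \
    oauth.client.id="kafka-broker" \
    oauth.client.secret="<client_secret>" \
    oauth.ssl.truststore.location="/path/to/truststore.p12" \
    oauth.ssl.truststore.password="<truststore_password>" \
    oauth.ssl.truststore.type="PKCS12" \
    oauth.username.claim="preferred_username" ;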
- 1
- URI of the token introspection endpoint.
- 2
- Client ID of the Kafka broker.
- 3
- Secret for the Kafka broker.
- 4
- The location of the truststore used in the TLS configuration.
- 5
- Password to access the truststore.
- 6
- The truststore type in PKCS #12 format.
- 7
- The token claim (or key) that contains the actual user name in the token. The user name is the principal used to identify the user. The value will depend on the authentication flow and the authorization server used. If required, you can use a JsonPath expression like "['user.info'].['user.id']" to retrieve the username from nested JSON attributes within a token.
Authenticating brokers to the authorization server protected endpoints
Usually, the certificates endpoint of the authorization server (oauth.jwks.endpoint.uri) is publicly accessible, while the introspection endpoint (oauth.introspection.endpoint.uri) is protected. However, this may vary depending on the authorization server configuration.
The Kafka broker can authenticate to the authorization server’s protected endpoints in one of two ways using HTTP authentication schemes:
- HTTP Basic authentication uses a client ID and secret.
- HTTP Bearer authentication uses a bearer token.
To configure HTTP Basic authentication, set the following properties:
- oauth.client.id
- oauth.client.secret
For HTTP Bearer authentication, set one of the following properties:
- oauth.server.bearer.token.location to specify the file path on disk containing the bearer token.
- oauth.server.bearer.token to specify the bearer token in clear text.
Including additional configuration options
Specify additional settings depending on the authentication requirements and the authorization server you are using. Some of these properties apply only to certain authentication mechanisms or when used in combination with other properties.
For example, when using OAuth over PLAIN, access tokens are passed as password property values with or without an $accessToken: prefix.
- If you configure a token endpoint (oauth.token.endpoint.uri) in the listener configuration, you need the prefix.
- If you don’t configure a token endpoint in the listener configuration, you don’t need the prefix. The Kafka broker interprets the password as a raw access token.
If the password is set as the access token, the username must be set to the same principal name that the Kafka broker obtains from the access token. You can specify username extraction options in your listener using the oauth.username.claim, oauth.username.prefix, oauth.fallback.username.claim, oauth.fallback.username.prefix, and oauth.userinfo.endpoint.uri properties. The username extraction process also depends on your authorization server; in particular, how it maps client IDs to account names.
The PLAIN mechanism does not support password grant authentication. Use either client credentials (client ID + secret) or an access token for authentication.
Example additional configuration settings
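A sketch listing the options that the callouts below describe, in callout order; all values are placeholders and only the options relevant to your setup need to be included:
listener.name.client.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
    oauth.token.endpoint.uri="https://<auth_server_address>/<path_to_token_endpoint>" \
    oauth.custom.claim.check="@.custom == 'custom-value'" \
    oauth.scope="<scope>" \
    oauth.check.audience="true" \
    oauth.audience="<audience>" \
    oauth.client.id="kafka-broker" \
    oauth.client.secret="<client_secret>" \
    oauth.connect.timeout.seconds="60" \
    oauth.read.timeout.seconds="60" \
    oauth.http.retries="2" \
    oauth.http.retry.pause.millis="300" \
    oauth.groups.claim="$.groups" \
    oauth.groups.claim.delimiter="," \
    oauth.include.accept.header="false" \
    oauth.check.issuer="false" \
    oauth.username.prefix="user-account-" \
    oauth.fallback.username.claim="client_id" \
    oauth.fallback.username.prefix="service-account-" \
    oauth.valid.token.type="bearer" \
    oauth.userinfo.endpoint.uri="https://<auth_server_address>/<path_to_userinfo_endpoint>" ;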
- 1
- The OAuth 2.0 token endpoint URL to your authorization server. For production, always use
https://urls. Required whenKeycloakAuthorizeris used, or an OAuth 2.0 enabled listener is used for inter-broker communication. - 2
- (Optional) Custom claim checking. A JsonPath filter query that applies additional custom rules to the JWT access token during validation. If the access token does not contain the necessary data, it is rejected. When using the introspection endpoint method, the custom check is applied to the introspection endpoint response JSON.
- 3
- (Optional) A
scopeparameter passed to the token endpoint. A scope is used when obtaining an access token for inter-broker authentication. It is also used in the name of a client for OAuth 2.0 over PLAIN client authentication using aclientIdandsecret. This only affects the ability to obtain the token, and the content of the token, depending on the authorization server. It does not affect token validation rules by the listener. - 4
- (Optional) Audience checking. If your authorization server provides an
aud(audience) claim, and you want to enforce an audience check, setouath.check.audiencetotrue. Audience checks identify the intended recipients of tokens. As a result, the Kafka broker will reject tokens that do not have itsclientIdin theiraudclaims. Default isfalse. - 5
- (Optional) An
audienceparameter passed to the token endpoint. An audience is used when obtaining an access token for inter-broker authentication. It is also used in the name of a client for OAuth 2.0 over PLAIN client authentication using aclientIdandsecret. This only affects the ability to obtain the token, and the content of the token, depending on the authorization server. It does not affect token validation rules by the listener. - 6
- The configured client ID of the Kafka broker, which is the same for all brokers. This is the client registered with the authorization server as
kafka-broker. Required when an introspection endpoint is used for token validation, or whenKeycloakAuthorizeris used. - 7
- The configured secret for the Kafka broker, which is the same for all brokers. When the broker must authenticate to the authorization server, either a client secret, access token or a refresh token has to be specified.
- 8
- (Optional) The connect timeout in seconds when connecting to the authorization server. The default value is 60.
- 9
- (Optional) The read timeout in seconds when connecting to the authorization server. The default value is 60.
- 10
- The maximum number of times to retry a failed HTTP request to the authorization server. The default value is 0, meaning that no retries are performed. To use this option effectively, consider reducing the timeout times for the
oauth.connect.timeout.secondsandoauth.read.timeout.secondsoptions. However, note that retries may prevent the current worker thread from being available to other requests, and if too many requests stall, it could make the Kafka broker unresponsive. - 11
- The time to wait before attempting another retry of a failed HTTP request to the authorization server. By default, this time is set to zero, meaning that no pause is applied. This is because many issues that cause failed requests are per-request network glitches or proxy issues that can be resolved quickly. However, if your authorization server is under stress or experiencing high traffic, you may want to set this option to a value of 100 ms or more to reduce the load on the server and increase the likelihood of successful retries.
- 12
- A JsonPath query used to extract groups information from JWT token or introspection endpoint response. Not set by default. This can be used by a custom authorizer to make authorization decisions based on user groups.
- 13
- A delimiter used to parse groups information when returned as a single delimited string. The default value is ',' (comma).
- 14
- (Optional) Sets
oauth.include.accept.headertofalseto remove theAcceptheader from requests. You can use this setting if including the header is causing issues when communicating with the authorization server. - 15
- If your authorization server does not provide an
issclaim, it is not possible to perform an issuer check. In this situation, setoauth.check.issuertofalseand do not specify aoauth.valid.issuer.uri. Default istrue. - 16
- The prefix used when constructing the user ID. This only takes effect if
oauth.username.claimis configured. - 17
- An authorization server may not provide a single attribute to identify both regular users and clients. When a client authenticates in its own name, the server might provide a client ID attribute. When a user authenticates using a username and password, to obtain a refresh token or an access token, the server might provide a username attribute in addition to a client ID. Use this fallback option to specify the username claim (attribute) to use if a primary user ID attribute is not available. If required, you can use a JsonPath expression like
"['client.info'].['client.id']"to retrieve the fallback username from nested JSON attributes within a token. - 18
- In situations where
oauth.fallback.username.claimis applicable, it may also be necessary to prevent name collisions between the values of the username claim, and those of the fallback username claim. Consider a situation where a client calledproducerexists, but also a regular user calledproducerexists. In order to differentiate between the two, you can use this property to add a prefix to the user ID of the client. - 19
- (Only applicable when using
oauth.introspection.endpoint.uri) Depending on the authorization server you are using, the introspection endpoint may or may not return the token type attribute, or it may contain different values. You can specify a valid token type value that the response from the introspection endpoint has to contain. - 20
- (Only applicable when using
oauth.introspection.endpoint.uri) The authorization server may be configured or implemented in such a way to not provide any identifiable information in an introspection endpoint response. In order to obtain the user ID, you can configure the URI of theuserinfoendpoint as a fallback. Theoauth.username.claim,oauth.username.prefix,oauth.fallback.username.claim, andoauth.fallback.username.prefixsettings are also applied to the response of theuserinfoendpoint.
Configuring listeners for inter-broker communication
The following example uses the OAUTHBEARER mechanism for fast token validation in a minimum configuration where inter-broker communication goes through the same listener as application clients.
The oauth.client.id, oauth.client.secret, and oauth.token.endpoint.uri properties relate to inter-broker communication.
Example inter-broker configuration using the OAUTHBEARER mechanism
- 1
- Configures authentication settings for client and inter-broker communication.
- 2
- Client ID of the Kafka broker, which is the same for all brokers. This is the client registered with the authorization server as
kafka-broker. - 3
- Secret for the Kafka broker, which is the same for all brokers.
- 4
- The OAuth 2.0 token endpoint URL to your authorization server. For production, always use
https://urls. - 5
- Enables (and is only required for) OAuth 2.0 authentication for inter-broker communication.
The following example shows a minimum configuration for a TLS listener used for inter-broker communication.
Example inter-broker configuration with TLS
- 1
- Separate configurations are required for inter-broker communication and client applications.
- 2
- Configures the REPLICATION listener to use TLS, and the CLIENT listener to use SASL over an unencrypted channel. The client could use an encrypted channel (
SASL_SSL) in a production environment. - 3
- The
ssl.properties define the TLS configuration. - 4
- Random number generator implementation. If not set, the Java platform SDK default is used.
- 5
- Hostname verification. If set to an empty string, the hostname verification is turned off. If not set, the default value is
HTTPS, which enforces hostname verification for server certificates. - 6
- Path to the keystore for the listener.
- 7
- Path to the truststore for the listener.
- 8
- Specifies that clients of the REPLICATION listener have to authenticate with a client certificate when establishing a TLS connection (used for inter-broker connectivity).
The following example uses the PLAIN mechanism for fast token validation in a minimum configuration where inter-broker communication goes through the same listener as application clients.
Example inter-broker configuration using the PLAIN mechanism
- 1
- Enables OAuth 2.0 authentication for inter-broker communication.
- 2
- Configures the server callback handler for
PLAINauthentication. - 3
- Configures authentication settings for client communication using
PLAINauthentication.oauth.token.endpoint.uriis an optional property that enables OAuth 2.0 over PLAIN using the OAuth 2.0 client credentials mechanism. - 4
- The OAuth 2.0 token endpoint URL to your authorization server. If specified, clients can authenticate over PLAIN by passing an access token as the
passwordusing an$accessToken:prefix.
7.2.2. Configuring OAuth 2.0 on client applications
To configure OAuth 2.0 on client applications, you must specify the following:
- SASL (Simple Authentication and Security Layer) security protocols
- SASL mechanisms
- A JAAS (Java Authentication and Authorization Service) module
- Authentication properties to access the authorization server
Configuring SASL protocols
Specify SASL protocols in the client configuration:
- SASL_SSL for authentication over TLS encrypted connections
- SASL_PLAINTEXT for authentication over unencrypted connections
Use SASL_SSL for production and SASL_PLAINTEXT for local development only.
When using SASL_SSL, additional ssl.truststore configuration is needed. The truststore configuration is required for secure connection (https://) to the OAuth 2.0 authorization server. To verify the OAuth 2.0 authorization server, add the CA certificate for the authorization server to the truststore in your client configuration. You can configure a truststore in PEM or PKCS #12 format.
Configuring SASL authentication mechanisms
Specify SASL mechanisms in the client configuration:
- OAUTHBEARER for credentials exchange using a bearer token
- PLAIN to pass client credentials (clientId + secret) or an access token
Configuring a JAAS module
Specify a JAAS module that implements the SASL authentication mechanism as a sasl.jaas.config property value:
- org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule implements the OAUTHBEARER mechanism
- org.apache.kafka.common.security.plain.PlainLoginModule implements the PLAIN mechanism
For the OAUTHBEARER mechanism, Streams for Apache Kafka provides a callback handler for clients that use Kafka Client Java libraries to enable credentials exchange. For clients in other languages, custom code may be required to obtain the access token. For the PLAIN mechanism, Streams for Apache Kafka provides server-side callbacks to enable credentials exchange.
To be able to use the OAUTHBEARER mechanism, you must also add the custom io.strimzi.kafka.oauth.client.JaasClientOauthLoginCallbackHandler class as the callback handler. JaasClientOauthLoginCallbackHandler handles OAuth callbacks to the authorization server for access tokens during client login. This enables automatic token renewal, ensuring continuous authentication without user intervention. Additionally, it handles login credentials for clients using the OAuth 2.0 password grant method.
Configuring authentication properties
Configure the client to use credentials or access tokens for OAuth 2.0 authentication.
- Using client credentials
- Using client credentials involves configuring the client with the necessary credentials (client ID and secret, or client ID and client assertion) to obtain a valid access token from an authorization server. This is the simplest mechanism.
- Using access tokens
- Using access tokens, the client is configured with a valid long-lived access token or refresh token obtained from an authorization server. Using access tokens adds more complexity because there is an additional dependency on authorization server tools. If you are using long-lived access tokens, you may need to configure the client in the authorization server to increase the maximum lifetime of the token.
The only information ever sent to Kafka is the access token. The credentials used to obtain the token are never sent to Kafka. When a client obtains an access token, no further communication with the authorization server is needed.
SASL authentication properties support the following authentication methods:
- OAuth 2.0 client credentials
- Access token or Service account token
- Refresh token
- OAuth 2.0 password grant (deprecated)
Add the authentication properties as JAAS configuration (sasl.jaas.config and sasl.login.callback.handler.class).
If the client application is not configured with an access token directly, the client exchanges one of the following sets of credentials for an access token during Kafka session initiation:
- Client ID and secret
- Client ID and client assertion
- Client ID, refresh token, and (optionally) a secret
- Username and password, with client ID and (optionally) a secret
You can also specify authentication properties as environment variables, or as Java system properties. For Java system properties, you can set them using setProperty and pass them on the command line using the -D option.
Example client credentials configuration using the client secret
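A sketch of the client properties the callouts below describe, using placeholder values:
security.protocol=SASL_SSL
sasl.mechanism=OAUTHBEARER
ssl.truststore.location=/path/to/truststore.p12
ssl.truststore.password=<truststore_password>
ssl.truststore.type=PKCS12
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
    oauth.token.endpoint.uri="<token_endpoint_url>" \
    oauth.client.id="<client_id>" \
    oauth.client.secret="<client_secret>" \
    oauth.ssl.truststore.location="/path/to/oauth-truststore.p12" \
    oauth.ssl.truststore.password="<truststore_password>" \
    oauth.ssl.truststore.type="PKCS12" \
    oauth.scope="<scope>" \
    oauth.audience="<audience>" ;
sasl.login.callback.handler.class=io.strimzi.kafka.oauth.client.JaasClientOauthLoginCallbackHandler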
- 1
- SASL_SSL security protocol for TLS-encrypted connections. Use SASL_PLAINTEXT over unencrypted connections for local development only.
- 2
- The SASL mechanism specified as OAUTHBEARER or PLAIN.
- 3
- The truststore configuration for secure access to the Kafka cluster.
- 4
- URI of the authorization server token endpoint.
- 5
- Client ID, which is the name used when creating the client in the authorization server.
- 6
- Client secret created when creating the client in the authorization server.
- 7
- The location of the truststore containing the public key certificate (truststore.p12) for the authorization server.
- 8
- The password for accessing the truststore.
- 9
- The truststore type.
- 10
- (Optional) The scope for requesting the token from the token endpoint. An authorization server may require a client to specify the scope.
- 11
- (Optional) The audience for requesting the token from the token endpoint. An authorization server may require a client to specify the audience.
Example client credentials configuration using the client assertion
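A sketch of the sasl.jaas.config value for this method, assuming oauth.client.assertion.location and oauth.client.assertion.type as the option names, might look like this; the callouts below describe the settings:
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.token.endpoint.uri="https://<auth_server_address>/auth/realms/<realm_name>/protocol/openid-connect/token" \
  oauth.client.id="<client_id>" \
  oauth.client.assertion.location="/path/to/client-assertion.jwt" \
  oauth.client.assertion.type="urn:ietf:params:oauth:client-assertion-type:jwt-bearer" ;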
- 1
- Path to the client assertion file used for authenticating the client. This file is a private key file as an alternative to the client secret. Alternatively, use the oauth.client.assertion option to specify the client assertion value in clear text.
- 2
- (Optional) Sometimes you may need to specify the client assertion type. If not specified, the default value is urn:ietf:params:oauth:client-assertion-type:jwt-bearer.
Example password grants configuration
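A sketch of the sasl.jaas.config value for password grants, assuming the oauth.password.grant.username and oauth.password.grant.password option names and placeholder values, might look like this; the callouts below describe the settings:
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.token.endpoint.uri="https://<auth_server_address>/auth/realms/<realm_name>/protocol/openid-connect/token" \
  oauth.client.id="<client_id>" \
  oauth.client.secret="<client_secret>" \
  oauth.password.grant.username="<username>" \
  oauth.password.grant.password="<password>" ;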
- 1
- Client ID, which is the name used when creating the client in the authorization server.
- 2
- (Optional) Client secret created when creating the client in the authorization server.
- 3
- Username for password grant authentication. OAuth password grant configuration (username and password) uses the OAuth 2.0 password grant method. To use password grants, create a user account for a client on your authorization server with limited permissions. The account should act like a service account. Use in environments where user accounts are required for authentication, but consider using a refresh token first.
- 4
- Password for password grant authentication.
Note
SASL PLAIN does not support passing a username and password (password grants) using the OAuth 2.0 password grant method.
Example access token configuration
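A sketch of the sasl.jaas.config value for a pre-obtained token, assuming the oauth.access.token option name, might look like this; the callout below describes the setting:
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.access.token="<access_token>" ;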
- 1
- Long-lived access token for Kafka clients. Alternatively, oauth.access.token.location can be used to specify the file that contains the access token.
Example OpenShift service account token configuration
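A sketch, assuming the oauth.access.token.location option name and the standard service account token path inside an OpenShift pod:
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.access.token.location="/var/run/secrets/kubernetes.io/serviceaccount/token" ;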
- 1
- Location of the service account token on the filesystem (assuming that the client is deployed as an OpenShift pod).
Example refresh token configuration
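A sketch of the sasl.jaas.config value for a refresh token, assuming the oauth.refresh.token option name and placeholder values, might look like this:
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.token.endpoint.uri="https://<auth_server_address>/auth/realms/<realm_name>/protocol/openid-connect/token" \
  oauth.client.id="<client_id>" \
  oauth.client.secret="<client_secret>" \
  oauth.refresh.token="<refresh_token>" ;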
SASL extensions for custom OAUTHBEARER implementations
If your Kafka broker uses a custom OAUTHBEARER implementation, you may need to pass additional SASL extension options. These extensions can include attributes or information required as client context by the authorization server. The options are passed as key-value pairs and are sent to the Kafka broker when a new session is started.
Pass SASL extension values using oauth.sasl.extension. as a key prefix.
Example configuration to pass SASL extension values
oauth.sasl.extension.key1="value1" oauth.sasl.extension.key2="value2"
oauth.sasl.extension.key1="value1"
oauth.sasl.extension.key2="value2"
7.2.3. OAuth 2.0 client authentication flows 复制链接链接已复制到粘贴板!
OAuth 2.0 authentication flows depend on the underlying Kafka client and Kafka broker configuration. The flows must also be supported by the authorization server used.
The Kafka broker listener configuration determines how clients authenticate using an access token. The client can pass a client ID and secret to request an access token.
If a listener is configured to use PLAIN authentication, the client can authenticate with a client ID and secret or username and access token. These values are passed as the username and password properties of the PLAIN mechanism.
Listener configuration supports the following token validation options:
- You can use fast local token validation based on JWT signature checking and local token introspection, without contacting an authorization server. The authorization server provides a JWKS endpoint with public certificates that are used to validate signatures on the tokens.
- You can use a call to a token introspection endpoint provided by an authorization server. Each time a new Kafka broker connection is established, the broker passes the access token received from the client to the authorization server. The Kafka broker checks the response to confirm whether the token is valid.
An authorization server might only allow the use of opaque access tokens, which means that local token validation is not possible.
Kafka client credentials can also be configured for the following types of authentication:
- Direct local access using a previously generated long-lived access token
- Contact with the authorization server for a new access token to be issued (using a client ID and credentials, or a refresh token, or a username and a password)
You can use the following communication flows for Kafka authentication using the SASL OAUTHBEARER mechanism.
Client using client ID and credentials, with broker delegating validation to authorization server
- The Kafka client requests an access token from the authorization server using a client ID and credentials, and optionally a refresh token. Alternatively, the client may authenticate using a username and a password.
- The authorization server generates a new access token.
- The Kafka client authenticates with the Kafka broker using the SASL OAUTHBEARER mechanism to pass the access token.
- The Kafka broker validates the access token by calling a token introspection endpoint on the authorization server using its own client ID and secret.
- A Kafka client session is established if the token is valid.
Client using client ID and credentials, with broker performing fast local token validation
- The Kafka client authenticates with the authorization server using the token endpoint, with a client ID and credentials, and optionally a refresh token. Alternatively, the client may authenticate using a username and a password.
- The authorization server generates a new access token.
- The Kafka client authenticates with the Kafka broker using the SASL OAUTHBEARER mechanism to pass the access token.
- The Kafka broker validates the access token locally using a JWT token signature check and local token introspection.
Client using long-lived access token, with broker delegating validation to authorization server
- The Kafka client authenticates with the Kafka broker using the SASL OAUTHBEARER mechanism to pass the long-lived access token.
- The Kafka broker validates the access token by calling a token introspection endpoint on the authorization server, using its own client ID and secret.
- A Kafka client session is established if the token is valid.
Client using long-lived access token, with broker performing fast local validation
- The Kafka client authenticates with the Kafka broker using the SASL OAUTHBEARER mechanism to pass the long-lived access token.
- The Kafka broker validates the access token locally using a JWT token signature check and local token introspection.
Fast local JWT token signature validation is suitable only for short-lived tokens as there is no check with the authorization server if a token has been revoked. Token expiration is written into the token, but revocation can happen at any time, so cannot be accounted for without contacting the authorization server. Any issued token would be considered valid until it expires.
You can use the following communication flows for Kafka authentication using the OAuth PLAIN mechanism.
Client using a client ID and secret, with the broker obtaining the access token for the client
- The Kafka client passes a clientId as a username and a secret as a password.
- The Kafka broker uses a token endpoint to pass the clientId and secret to the authorization server.
- The authorization server returns a fresh access token or an error if the client credentials are not valid.
The Kafka broker validates the token in one of the following ways:
- If a token introspection endpoint is specified, the Kafka broker validates the access token by calling the endpoint on the authorization server. A session is established if the token validation is successful.
- If local token introspection is used, a request is not made to the authorization server. The Kafka broker validates the access token locally using a JWT token signature check.
Client using a long-lived access token without a client ID and secret
- The Kafka client passes a username and password. The password provides the value of an access token that was obtained manually and configured before running the client.
The password is passed with or without an $accessToken: string prefix depending on whether or not the Kafka broker listener is configured with a token endpoint for authentication.
- If the token endpoint is configured, the password should be prefixed by $accessToken: to let the broker know that the password parameter contains an access token rather than a client secret. The Kafka broker interprets the username as the account username.
- If the token endpoint is not configured on the Kafka broker listener (enforcing a no-client-credentials mode), the password should provide the access token without the prefix. The Kafka broker interprets the username as the account username. In this mode, the client doesn’t use a client ID and secret, and the password parameter is always interpreted as a raw access token.
The Kafka broker validates the token in one of the following ways:
- If a token introspection endpoint is specified, the Kafka broker validates the access token by calling the endpoint on the authorization server. A session is established if token validation is successful.
- If local token introspection is used, there is no request made to the authorization server. The Kafka broker validates the access token locally using a JWT token signature check.
7.2.4. Re-authenticating sessions 复制链接链接已复制到粘贴板!
You can configure OAuth listeners to use Kafka session re-authentication for OAuth 2.0 sessions between Kafka clients and Kafka brokers. This mechanism enforces the expiry of an authenticated session between the client and the broker after a defined period of time. When a session expires, the client immediately starts a new session by reusing the existing connection rather than dropping it.
Session re-authentication is disabled by default. To enable it, set a time value for the connections.max.reauth.ms property in the server.properties file. For an example configuration, see Section 7.2.1, “Configuring OAuth 2.0 authentication on listeners”.
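For instance, to expire authenticated sessions after roughly one hour, a broker-wide setting along these lines could be used in server.properties (the value shown is an illustrative assumption):
connections.max.reauth.ms=3600000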
Session re-authentication must be supported by the Kafka client libraries used by the client.
Session re-authentication can be used with fast local JWT or introspection endpoint token validation.
Client re-authentication
When the broker’s authenticated session expires, the client must re-authenticate to the existing session by sending a new, valid access token to the broker, without dropping the connection.
If token validation is successful, a new client session is started using the existing connection. If the client fails to re-authenticate, the broker will close the connection if further attempts are made to send or receive messages. Java clients that use Kafka client library 2.2 or later automatically re-authenticate if the re-authentication mechanism is enabled on the broker.
Session re-authentication also applies to refresh tokens, if used. When the session expires, the client refreshes the access token by using its refresh token. The client then uses the new access token to re-authenticate over the existing connection.
Session expiry for OAUTHBEARER and PLAIN
When session re-authentication is configured, session expiry works differently for OAUTHBEARER and PLAIN authentication.
For OAUTHBEARER and PLAIN, using the client ID and secret method:
- The broker’s authenticated session will expire at the configured connections.max.reauth.ms.
- The session will expire earlier if the access token expires before the configured time.
For PLAIN using the long-lived access token method:
- The broker’s authenticated session will expire at the configured connections.max.reauth.ms.
- Re-authentication will fail if the access token expires before the configured time. Although session re-authentication is attempted, PLAIN has no mechanism for refreshing tokens.
If connections.max.reauth.ms is not configured, OAUTHBEARER and PLAIN clients can remain connected to brokers indefinitely, without needing to re-authenticate. Authenticated sessions do not end with access token expiry.
However, token expiry can be taken into account when configuring authorization, for example, by using Keycloak authorization or installing a custom authorizer.
7.2.5. Example: Enabling OAuth 2.0 authentication 复制链接链接已复制到粘贴板!
This example shows how to configure client access to a Kafka cluster using OAuth 2.0 authentication. The procedures describe the configuration required to set up OAuth 2.0 authentication on Kafka listeners and Kafka Java clients.
7.2.5.1. Configuring OAuth 2.0 support for Kafka brokers 复制链接链接已复制到粘贴板!
This procedure describes how to configure Kafka brokers so that the broker listeners are enabled to use OAuth 2.0 authentication using an authorization server.
We advise use of OAuth 2.0 over an encrypted interface through configuration of TLS listeners. Plain listeners are not recommended.
Configure the Kafka brokers using properties that support your chosen authorization server, and the type of authorization you are implementing.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- An OAuth 2.0 authorization server is deployed.
Procedure
Configure the Kafka broker listener configuration in the server.properties file. For example, using the OAUTHBEARER mechanism:
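A minimal sketch of such a listener configuration, assuming a listener named CLIENT and fast local JWT validation against the authorization server, might look like this:
listeners=CLIENT://0.0.0.0:9092
listener.security.protocol.map=CLIENT:SASL_SSL
listener.name.client.sasl.enabled.mechanisms=OAUTHBEARER
listener.name.client.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.valid.issuer.uri="https://<auth_server_address>/auth/realms/<realm_name>" \
  oauth.jwks.endpoint.uri="https://<auth_server_address>/auth/realms/<realm_name>/protocol/openid-connect/certs" \
  oauth.username.claim="preferred_username" ;
listener.name.client.oauthbearer.sasl.server.callback.handler.class=io.strimzi.kafka.oauth.server.JaasServerOauthValidatorCallbackHandler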
Configure broker connection settings as part of the listener.name.client.oauthbearer.sasl.jaas.config.
If required, configure access to the authorization server.
This step is normally required for a production environment, unless a technology like service mesh is used to configure secure channels outside containers.
Provide a custom truststore for connecting to a secured authorization server. SSL is always required for access to the authorization server.
Set properties to configure the truststore.
For example:
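A PKCS#12 truststore, for instance, could be configured with properties along these lines (path and password are placeholders):
oauth.ssl.truststore.location=/path/to/truststore.p12
oauth.ssl.truststore.password=<truststore_password>
oauth.ssl.truststore.type=PKCS12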
If the certificate hostname does not match the access URL hostname, you can turn off certificate hostname validation:
oauth.ssl.endpoint.identification.algorithm=""
The check ensures that the client connection to the authorization server is authentic. You may wish to turn off the validation in a non-production environment.
What to do next
7.2.5.2. Setting up OAuth 2.0 on Kafka Java clients 复制链接链接已复制到粘贴板!
Configure Kafka producer and consumer APIs to use OAuth 2.0 for interaction with Kafka brokers. Add a callback plugin to your client pom.xml file, then configure your client for OAuth 2.0.
How you configure the authentication properties depends on the authentication method you are using to access the OAuth 2.0 authorization server. In this procedure, the properties are specified in a properties file, then loaded into the client configuration.
Prerequisites
- Streams for Apache Kafka and Kafka are running
- An OAuth 2.0 authorization server is deployed and configured for OAuth access to Kafka brokers
- Kafka brokers are configured for OAuth 2.0
Procedure
Add the client library with OAuth 2.0 support to the pom.xml file for the Kafka client:
<dependency>
    <groupId>io.strimzi</groupId>
    <artifactId>kafka-oauth-client</artifactId>
    <version>0.15.1.redhat-00005</version>
</dependency>
Configure the client depending on the OAuth 2.0 authentication method:
For example, specify the properties for the authentication method in a client.properties file.
Input the client properties for OAuth 2.0 authentication into the Java client code.
Example showing input of client properties
Properties props = new Properties();
try (FileReader reader = new FileReader("client.properties", StandardCharsets.UTF_8)) {
    props.load(reader);
}
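The loaded properties can then be passed to a producer or consumer constructor. A minimal sketch, assuming client.properties also sets bootstrap.servers and that the topic name is a placeholder:
// Sketch only: assumes imports from org.apache.kafka.clients.producer.* and
// org.apache.kafka.common.serialization.StringSerializer
try (KafkaProducer<String, String> producer =
        new KafkaProducer<>(props, new StringSerializer(), new StringSerializer())) {
    producer.send(new ProducerRecord<>("<topic_name>", "Hello from an OAuth-authenticated client"));
    producer.flush();
}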
7.3. Using OAuth 2.0 token-based authorization 复制链接链接已复制到粘贴板!
Streams for Apache Kafka supports the use of OAuth 2.0 token-based authorization through Red Hat build of Keycloak Authorization Services, which lets you manage security policies and permissions centrally.
Security policies and permissions defined in Red Hat build of Keycloak grant access to Kafka resources. Users and clients are matched against policies that permit access to perform specific actions on Kafka brokers.
Kafka allows all users full access to brokers by default, but also provides the AclAuthorizer and StandardAuthorizer plugins to configure authorization based on Access Control Lists (ACLs). The ACL rules managed by these plugins are used to grant or deny access to resources based on username, and these rules are stored within the Kafka cluster itself.
However, OAuth 2.0 token-based authorization with Red Hat build of Keycloak offers far greater flexibility on how you wish to implement access control to Kafka brokers. In addition, you can configure your Kafka brokers to use OAuth 2.0 authorization and ACLs.
7.3.1. Example: Enabling OAuth 2.0 authorization 复制链接链接已复制到粘贴板!
This procedure describes how to configure Kafka brokers to use OAuth 2.0 authorization using Red Hat build of Keycloak Authorization Services.
Red Hat build of Keycloak server Authorization Services REST endpoints extend token-based authentication with Red Hat build of Keycloak by applying defined security policies on a particular user, and providing a list of permissions granted on different resources for that user. Policies use roles and groups to match permissions to users. OAuth 2.0 authorization enforces permissions locally based on the received list of grants for the user from Red Hat build of Keycloak Authorization Services.
A Red Hat build of Keycloak authorizer (KeycloakAuthorizer) is provided with Streams for Apache Kafka. The authorizer fetches a list of granted permissions from the authorization server as needed, and enforces authorization locally on Kafka, making rapid authorization decisions for each client request.
Before you begin
Consider the access you require or want to limit for certain users. You can use a combination of Red Hat build of Keycloak groups, roles, clients, and users to configure access in Red Hat build of Keycloak.
Typically, groups are used to match users based on organizational departments or geographical locations, and roles are used to match users based on their function.
With Red Hat build of Keycloak, you can store users and groups in LDAP, whereas clients and roles cannot be stored this way. Storage and access to user data may be a factor in how you choose to configure authorization policies.
Super users always have unconstrained access to a Kafka broker regardless of the authorization implemented on the Kafka broker.
Prerequisites
- Streams for Apache Kafka must be configured to use OAuth 2.0 with Red Hat build of Keycloak token-based authentication. You use the same Red Hat build of Keycloak endpoint when you set up authorization.
- You need to understand how to manage policies and permissions for Red Hat build of Keycloak Authorization Services, as described in the Red Hat build of Keycloak documentation.
Procedure
- Access the Red Hat build of Keycloak Admin Console or use the Red Hat build of Keycloak Admin CLI to enable Authorization Services for the OAuth 2.0 client for Kafka you created when setting up OAuth 2.0 authentication.
- Use Authorization Services to define resources, authorization scopes, policies, and permissions for the client.
- Bind the permissions to users and clients by assigning them roles and groups.
Configure the Kafka brokers to use Red Hat build of Keycloak authorization.
Add the following to the Kafka server.properties configuration file to install the authorizer in Kafka:
authorizer.class.name=io.strimzi.kafka.oauth.server.authorizer.KeycloakAuthorizer
principal.builder.class=io.strimzi.kafka.oauth.server.OAuthKafkaPrincipalBuilder
Add configuration for the Kafka brokers to access the authorization server and Authorization Services.
Here we show example configuration added as additional properties to server.properties, but you can also define them as environment variables using capitalized or upper-case naming conventions.
strimzi.authorization.token.endpoint.uri="https://<auth_server_address>/auth/realms/REALM-NAME/protocol/openid-connect/token"
strimzi.authorization.client.id="kafka"
(Optional) Add configuration for specific Kafka clusters.
For example:
strimzi.authorization.kafka.cluster.name="kafka-cluster" 1
- 1
- The name of a specific Kafka cluster. Names are used to target permissions, making it possible to manage multiple clusters within the same Red Hat build of Keycloak realm. The default value is kafka-cluster.
(Optional) Delegate to simple authorization:
strimzi.authorization.delegate.to.kafka.acl="true" 1
- 1
- Delegate authorization to Kafka AclAuthorizer if access is denied by Red Hat build of Keycloak Authorization Services policies. The default is false.
(Optional) Add configuration for TLS connection to the authorization server.
For example:
strimzi.authorization.ssl.truststore.location=<path_to_truststore> 1
strimzi.authorization.ssl.truststore.password=<my_truststore_password> 2
strimzi.authorization.ssl.truststore.type=JKS 3
strimzi.authorization.ssl.secure.random.implementation=SHA1PRNG 4
strimzi.authorization.ssl.endpoint.identification.algorithm=HTTPS 5
- 1
- The path to the truststore that contains the certificates.
- 2
- The password for the truststore.
- 3
- The truststore type. If not set, the default Java keystore type is used.
- 4
- Random number generator implementation. If not set, the Java platform SDK default is used.
- 5
- Hostname verification. If set to an empty string, the hostname verification is turned off. If not set, the default value is
HTTPS, which enforces hostname verification for server certificates.
(Optional) Configure the refresh of grants from the authorization server. The grants refresh job works by enumerating the active tokens and requesting the latest grants for each.
For example:
strimzi.authorization.grants.refresh.period.seconds="120" 1
strimzi.authorization.grants.refresh.pool.size="10" 2
strimzi.authorization.grants.max.idle.time.seconds="300" 3
strimzi.authorization.grants.gc.period.seconds="300" 4
strimzi.authorization.reuse.grants="false" 5
- 1
- Specifies how often the list of grants from the authorization server is refreshed (once per minute by default). To turn grants refresh off for debugging purposes, set to "0".
- 2
- Specifies the size of the thread pool (the degree of parallelism) used by the grants refresh job. The default value is "5".
- 3
- The time, in seconds, after which an idle grant in the cache can be evicted. The default value is 300.
- 4
- The time, in seconds, between consecutive runs of a job that cleans stale grants from the cache. The default value is 300.
- 5
- Controls whether the latest grants are fetched for a new session. When disabled, grants are retrieved from Red Hat build of Keycloak and cached for the user. The default value is
true.
(Optional) Configure network timeouts when communicating with the authorization server.
For example:
strimzi.authorization.connect.timeout.seconds="60" 1
strimzi.authorization.read.timeout.seconds="60" 2
strimzi.authorization.http.retries="2" 3
- 1
- The connect timeout in seconds when connecting to the Red Hat build of Keycloak token endpoint. The default value is 60.
- 2
- The read timeout in seconds when connecting to the Red Hat build of Keycloak token endpoint. The default value is 60.
- 3
- The maximum number of times to retry (without pausing) a failed HTTP request to the authorization server. The default value is 0, meaning that no retries are performed. To use this option effectively, consider reducing the timeout times for the strimzi.authorization.connect.timeout.seconds and strimzi.authorization.read.timeout.seconds options. However, note that retries may prevent the current worker thread from being available to other requests, and if too many requests stall, it could make Kafka unresponsive.
(Optional) Enable OAuth 2.0 metrics for token validation and authorization:
oauth.enable.metrics="true" 1
- 1
- Controls whether to enable or disable OAuth metrics. The default value is
false.
(Optional) Remove the Accept header from requests:
oauth.include.accept.header="false" 1
- 1
- Set to false if including the header is causing issues when communicating with the authorization server. The default value is true.
- Verify the configured permissions by accessing Kafka brokers as clients or users with specific roles, ensuring they have the necessary access and do not have unauthorized access.
Chapter 8. Using OPA policy-based authorization 复制链接链接已复制到粘贴板!
Open Policy Agent (OPA) is an open-source policy engine. You can integrate OPA with Streams for Apache Kafka to act as a policy-based authorization mechanism for permitting client operations on Kafka brokers.
When a request is made from a client, OPA will evaluate the request against policies defined for Kafka access, then allow or deny the request.
Red Hat does not support the OPA server.
8.1. Defining OPA policies 复制链接链接已复制到粘贴板!
Before integrating OPA with Streams for Apache Kafka, consider how you will define policies to provide fine-grained access controls.
You can define access control for Kafka clusters, consumer groups and topics. For instance, you can define an authorization policy that allows write access from a producer client to a specific broker topic.
For this, the policy might specify the:
- User principal and host address associated with the producer client
- Operations allowed for the client
- Resource type (topic) and resource name the policy applies to
Allow and deny decisions are written into the policy, and a response is provided based on the request and client identification data provided.
In our example the producer client would have to satisfy the policy to be allowed to write to the topic.
8.2. Connecting to the OPA 复制链接链接已复制到粘贴板!
To enable Kafka to access the OPA policy engine to query access control policies, you configure a custom OPA authorizer plugin (kafka-authorizer-opa-VERSION.jar) in your Kafka server.properties file.
When a request is made by a client, the OPA policy engine is queried by the plugin using a specified URL address and a REST endpoint, which must be the name of the defined policy.
The plugin provides the details of the client request — user principal, operation, and resource — in JSON format to be checked against the policy. The details will include the unique identity of the client; for example, taking the distinguished name from the client certificate if TLS authentication is used.
OPA uses the data to provide a response — either true or false — to the plugin to allow or deny the request.
8.3. Configuring OPA authorization support 复制链接链接已复制到粘贴板!
This procedure describes how to configure Kafka brokers to use OPA authorization.
Before you begin
Consider the access you require or want to limit for certain users. You can use a combination of users and Kafka resources to define OPA policies.
It is possible to set up OPA to load user information from an LDAP data source.
Super users always have unconstrained access to a Kafka broker regardless of the authorization implemented on the Kafka broker.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- An OPA server must be available for connection.
- The OPA authorizer plugin for Kafka.
Procedure
Write the OPA policies required for authorizing client requests to perform operations on the Kafka brokers.
Now configure the Kafka brokers to use OPA.
Install the OPA authorizer plugin for Kafka.
Make sure that the plugin files are included in the Kafka classpath.
Add the following to the Kafka server.properties configuration file to enable the OPA plugin:
authorizer.class.name: com.bisnode.kafka.authorization.OpaAuthorizer
Add further configuration to server.properties for the Kafka brokers to access the OPA policy engine and policies.
For example:
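A sketch of this configuration, assuming the option names used by the OPA authorizer plugin and a policy named allow, might look like the following; check the exact option names and units against the plugin documentation. The callouts below describe the settings:
opa.authorizer.url=https://<opa_address>/v1/data/kafka/authz/allow
opa.authorizer.allow.on.error=false
opa.authorizer.cache.initial.capacity=5000
opa.authorizer.cache.maximum.size=50000
opa.authorizer.cache.expire.after.seconds=600
super.users=User:alice;User:bob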
- 1
- (Required) The URL of the OPA policy the authorizer plugin will query. In this example, the policy is called allow.
- 2
- Flag to specify whether a client is allowed or denied access by default if the authorizer plugin fails to connect with the OPA policy engine.
- 3
- Initial capacity in bytes of the local cache. The cache is used so that the plugin does not have to query the OPA policy engine for every request.
- 4
- Maximum capacity in bytes of the local cache.
- 5
- Time in milliseconds that the local cache is refreshed by reloading from the OPA policy engine.
- 6
- A list of user principals treated as super users, so that they are always allowed without querying the Open Policy Agent policy.
Refer to the Open Policy Agent website for information on authentication and authorization options.
- Verify the configured permissions by accessing Kafka brokers using clients that have and do not have the correct authorization.
Chapter 9. Creating and managing topics 复制链接链接已复制到粘贴板!
Messages in Kafka are always sent to or received from a topic. This chapter describes how to create and manage Kafka topics.
9.1. Partitions and replicas 复制链接链接已复制到粘贴板!
A topic is always split into one or more partitions. Partitions act as shards. That means that every message sent by a producer is always written only into a single partition.
Each partition can have one or more replicas, which will be stored on different brokers in the cluster. When creating a topic you can configure the number of replicas using the replication factor. Replication factor defines the number of copies which will be held within the cluster. One of the replicas for a given partition will be elected as a leader. The leader replica will be used by the producers to send new messages and by the consumers to consume messages. The other replicas will be follower replicas. The followers replicate the leader.
If the leader fails, one of the in-sync followers will automatically become the new leader. Each server acts as a leader for some of its partitions and a follower for others so the load is well balanced within the cluster.
The replication factor determines the number of replicas including the leader and the followers. For example, if you set the replication factor to 3, then there will be one leader and two follower replicas.
9.2. Message retention 复制链接链接已复制到粘贴板!
The message retention policy defines how long the messages will be stored on the Kafka brokers. It can be defined based on time, partition size or both.
For example, you can define that the messages should be kept:
- For 7 days
- Until the partition has 1GB of messages. Once the limit is reached, the oldest messages will be removed.
- For 7 days or until the 1GB limit has been reached. Whatever limit comes first will be used.
Kafka brokers store messages in log segments. The messages which are past their retention policy will be deleted only when a new log segment is created. New log segments are created when the previous log segment exceeds the configured log segment size. Additionally, users can request new segments to be created periodically.
Kafka brokers support a compacting policy.
For a topic with the compacted policy, the broker will always keep only the last message for each key. The older messages with the same key will be removed from the partition. Because compacting is a periodically executed action, it does not happen immediately when the new message with the same key is sent to the partition. Instead it might take some time until the older messages are removed.
For more information about the message retention configuration options, see Section 9.5, “Topic configuration”.
9.3. Topic auto-creation 复制链接链接已复制到粘贴板!
By default, Kafka automatically creates a topic if a producer or consumer attempts to send or receive messages from a non-existent topic. This behavior is governed by the auto.create.topics.enable configuration property, which is set to true by default.
For production environments, it is recommended to disable automatic topic creation. To do so, set auto.create.topics.enable to false in the Kafka configuration properties file:
Disabling automatic topic creation
auto.create.topics.enable=false
9.4. Topic deletion 复制链接链接已复制到粘贴板!
Kafka provides the option to prevent topic deletion, controlled by the delete.topic.enable property. By default, this property is set to true, allowing topics to be deleted.
However, setting it to false in the Kafka configuration properties file will disable topic deletion. In this case, attempts to delete a topic will return a success status, but the topic itself will not be deleted.
Disabling topic deletion
delete.topic.enable=false
9.5. Topic configuration 复制链接链接已复制到粘贴板!
Auto-created topics will use the default topic configuration which can be specified in the broker properties file. However, when creating topics manually, their configuration can be specified at creation time. It is also possible to change a topic’s configuration after it has been created. The main topic configuration options for manually created topics are:
cleanup.policy-
Configures the retention policy to delete or compact. The delete policy will delete old records. The compact policy will enable log compaction. The default value is delete. For more information about log compaction, see the Kafka website.
compression.type-
Specifies the compression which is used for stored messages. Valid values are gzip, snappy, lz4, uncompressed (no compression) and producer (retain the compression codec used by the producer). The default value is producer.
max.message.bytes-
The maximum size of a batch of messages allowed by the Kafka broker, in bytes. The default value is 1000012.
min.insync.replicas-
The minimum number of replicas which must be in sync for a write to be considered successful. The default value is 1.
retention.ms-
Maximum number of milliseconds for which log segments will be retained. Log segments older than this value will be deleted. The default value is 604800000 (7 days).
retention.bytes-
The maximum number of bytes a partition will retain. Once the partition size grows over this limit, the oldest log segments will be deleted. Value of -1 indicates no limit. The default value is -1.
segment.bytes-
The maximum file size of a single commit log segment file in bytes. When the segment reaches its size, a new segment will be started. The default value is 1073741824 bytes (1 gibibyte).
The defaults for auto-created topics can be specified in the Kafka broker configuration using similar options:
log.cleanup.policy-
See cleanup.policy above.
compression.type-
See compression.type above.
message.max.bytes-
See max.message.bytes above.
min.insync.replicas-
See min.insync.replicas above.
log.retention.ms-
See retention.ms above.
log.retention.bytes-
See retention.bytes above.
log.segment.bytes-
See segment.bytes above.
default.replication.factor-
Default replication factor for automatically created topics. Default value is 1.
num.partitions-
Default number of partitions for automatically created topics. Default value is 1.
9.6. Internal topics 复制链接链接已复制到粘贴板!
Internal topics are created and used internally by the Kafka brokers and clients. Kafka has several internal topics, two of which are used to store consumer offsets (__consumer_offsets) and transaction state (__transaction_state).
__consumer_offsets and __transaction_state topics can be configured using dedicated Kafka broker configuration options starting with prefix offsets.topic. and transaction.state.log..
The most important configuration options are:
offsets.topic.replication.factor-
Number of replicas for __consumer_offsets topic. The default value is 3.
offsets.topic.num.partitions-
Number of partitions for __consumer_offsets topic. The default value is 50.
transaction.state.log.replication.factor-
Number of replicas for __transaction_state topic. The default value is 3.
transaction.state.log.num.partitions-
Number of partitions for __transaction_state topic. The default value is 50.
transaction.state.log.min.isr-
Minimum number of replicas that must acknowledge a write to __transaction_state topic to be considered successful. If this minimum cannot be met, then the producer will fail with an exception. The default value is 2.
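For example, on a three-broker cluster the internal topics might be configured in the broker properties as follows:
offsets.topic.replication.factor=3
offsets.topic.num.partitions=50
transaction.state.log.replication.factor=3
transaction.state.log.num.partitions=50
transaction.state.log.min.isr=2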
9.7. Creating a topic 复制链接链接已复制到粘贴板!
Use the kafka-topics.sh tool to manage topics. kafka-topics.sh is part of the Streams for Apache Kafka distribution and is found in the bin directory.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Creating a topic
Create a topic using the kafka-topics.sh utility and specify the following:
- Host and port of the Kafka broker in the --bootstrap-server option.
- The new topic to be created in the --create option.
- Topic name in the --topic option.
- The number of partitions in the --partitions option.
- Topic replication factor in the --replication-factor option.
You can also override some of the default topic configuration options using the option --config. This option can be used multiple times to override different options.
./bin/kafka-topics.sh --bootstrap-server <broker_address> --create --topic <topic_name> --partitions <number_of_partitions> --replication-factor <replication_factor> --config <option_1>=<value_1> --config <option_2>=<value_2>
Example of the command to create a topic named mytopic
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic mytopic --partitions 50 --replication-factor 3 --config cleanup.policy=compact --config min.insync.replicas=2
Verify that the topic exists using kafka-topics.sh.
./bin/kafka-topics.sh --bootstrap-server <broker_address> --describe --topic <topic_name>
Example of the command to describe a topic named mytopic
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic mytopic
9.8. Listing and describing topics 复制链接链接已复制到粘贴板!
The kafka-topics.sh tool can be used to list and describe topics. kafka-topics.sh is part of the Streams for Apache Kafka distribution and can be found in the bin directory.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Describing a topic
Describe a topic using the kafka-topics.sh utility and specify the following:
- Host and port of the Kafka broker in the --bootstrap-server option.
- Use the --describe option to specify that you want to describe a topic.
- Topic name must be specified in the --topic option. When the --topic option is omitted, it describes all available topics.
./bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --describe --topic <topic_name>
Example of the command to describe a topic named mytopic
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic mytopic
The command lists all partitions and replicas which belong to this topic. It also lists all topic configuration options.
9.9. Modifying a topic configuration 复制链接链接已复制到粘贴板!
The kafka-configs.sh tool can be used to modify topic configurations. kafka-configs.sh is part of the Streams for Apache Kafka distribution and can be found in the bin directory.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Modify topic configuration
Use the kafka-configs.sh tool to get the current configuration.
- Specify the host and port of the Kafka broker in the --bootstrap-server option.
- Set the --entity-type as topic and --entity-name to the name of your topic.
- Use the --describe option to get the current configuration.
./bin/kafka-configs.sh --bootstrap-server <broker_host>:<port> --entity-type topics --entity-name <topic_name> --describe
Example of the command to get configuration of a topic named mytopic
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name mytopic --describe
Use the kafka-configs.sh tool to change the configuration.
- Specify the host and port of the Kafka broker in the --bootstrap-server option.
- Set the --entity-type as topic and --entity-name to the name of your topic.
- Use the --alter option to modify the current configuration.
- Specify the options you want to add or change in the --add-config option.
./bin/kafka-configs.sh --bootstrap-server <broker_host>:<port> --entity-type topics --entity-name <topic_name> --alter --add-config <option>=<value>
Example of the command to change configuration of a topic named mytopic
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name mytopic --alter --add-config min.insync.replicas=1
Use the kafka-configs.sh tool to delete an existing configuration option.
- Specify the host and port of the Kafka broker in the --bootstrap-server option.
- Set the --entity-type as topic and --entity-name to the name of your topic.
- Use the --alter option to modify the current configuration.
- Specify the options you want to remove in the --delete-config option.
./bin/kafka-configs.sh --bootstrap-server <broker_host>:<port> --entity-type topics --entity-name <topic_name> --alter --delete-config <option>
Example of the command to remove a configuration option from a topic named mytopic
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name mytopic --alter --delete-config min.insync.replicas
9.10. Deleting a topic 复制链接链接已复制到粘贴板!
The kafka-topics.sh tool can be used to manage topics. kafka-topics.sh is part of the Streams for Apache Kafka distribution and can be found in the bin directory.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Deleting a topic
Delete a topic using the kafka-topics.sh utility.
- Host and port of the Kafka broker in the --bootstrap-server option.
- Use the --delete option to specify that an existing topic should be deleted.
- Topic name must be specified in the --topic option.
./bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --delete --topic <topic_name>
Example of the command to delete a topic named mytopic
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic mytopic
Verify that the topic was deleted using kafka-topics.sh.
./bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --list
Example of the command to list all topics
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
Use Kafka Connect to stream data between Kafka and external systems. Kafka Connect provides a framework for moving large amounts of data while maintaining scalability and reliability. Kafka Connect is typically used to integrate Kafka with database, storage, and messaging systems that are external to your Kafka cluster.
Kafka Connect runs in standalone or distributed modes.
- Standalone mode
- In standalone mode, Kafka Connect runs on a single node. Standalone mode is intended for development and testing.
- Distributed mode
- In distributed mode, Kafka Connect runs across one or more worker nodes and the workloads are distributed among them. Distributed mode is intended for production.
Kafka Connect uses connector plugins that implement connectivity for different types of external systems. There are two types of connector plugins: sink and source. Sink connectors stream data from Kafka to external systems. Source connectors stream data from external systems into Kafka.
You can also use the Kafka Connect REST API to create, manage, and monitor connector instances.
Connector configuration specifies details such as the source or sink connectors and the Kafka topics to read from or write to. How you manage the configuration depends on whether you are running Kafka Connect in standalone or distributed mode.
- In standalone mode, you can provide the connector configuration as JSON through the Kafka Connect REST API or you can use properties files to define the configuration.
- In distributed mode, you can only provide the connector configuration as JSON through the Kafka Connect REST API.
Handling high volumes of messages
You can tune the configuration to handle high volumes of messages. For more information, see Handling high volumes of messages.
10.1. Using Kafka Connect in standalone mode 复制链接链接已复制到粘贴板!
In Kafka Connect standalone mode, connectors run on the same node as the Kafka Connect worker process, which runs as a single process in a single JVM. This means that the worker process and connectors share the same resources, such as CPU, memory, and disk.
10.1.1. Configuring Kafka Connect in standalone mode 复制链接链接已复制到粘贴板!
To configure Kafka Connect in standalone mode, edit the config/connect-standalone.properties configuration file. The following options are the most important.
bootstrap.servers-
A list of Kafka broker addresses used as bootstrap connections to Kafka. For example, kafka0.my-domain.com:9092,kafka1.my-domain.com:9092,kafka2.my-domain.com:9092.
key.converter-
The class used to convert message keys to and from Kafka format. For example, org.apache.kafka.connect.json.JsonConverter.
value.converter-
The class used to convert message payloads to and from Kafka format. For example, org.apache.kafka.connect.json.JsonConverter.
offset.storage.file.filename-
Specifies the file in which the offset data is stored.
Connector plugins open client connections to the Kafka brokers using the bootstrap address. To configure these connections, use the standard Kafka producer and consumer configuration options prefixed by producer. or consumer..
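Putting these options together, a minimal connect-standalone.properties sketch might look like the following; the host names, converters, offset file path, and producer setting are placeholders:
bootstrap.servers=kafka0.my-domain.com:9092,kafka1.my-domain.com:9092,kafka2.my-domain.com:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
offset.storage.file.filename=/tmp/connect.offsets
# Client connections opened by connectors can be tuned with producer. and consumer. prefixes, for example:
producer.acks=all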
10.1.2. Running Kafka Connect in standalone mode 复制链接链接已复制到粘贴板!
Configure and run Kafka Connect in standalone mode.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
You have specified connector configuration in properties files.
You can also use the Kafka Connect REST API to manage connectors.
Procedure
Edit the ./config/connect-standalone.properties Kafka Connect configuration file and set bootstrap.servers to point to your Kafka brokers. For example:
bootstrap.servers=kafka0.my-domain.com:9092,kafka1.my-domain.com:9092,kafka2.my-domain.com:9092
Start Kafka Connect with the configuration file and specify one or more connector configurations.
./bin/connect-standalone.sh ./config/connect-standalone.properties connector1.properties [connector2.properties ...]
Verify that Kafka Connect is running.
jcmd | grep ConnectStandalone
10.2. Using Kafka Connect in distributed mode 复制链接链接已复制到粘贴板!
In distributed mode, Kafka Connect runs as a cluster of worker processes, with each worker running on a separate node. Connectors can run on any worker in the cluster, allowing for greater scalability and fault tolerance. The connectors are managed by the workers, which coordinate with each other to distribute the work and ensure that each connector is running on a single node at any given time.
10.2.1. Configuring Kafka Connect in distributed mode 复制链接链接已复制到粘贴板!
To configure Kafka Connect in distributed mode, edit the config/connect-distributed.properties configuration file. The following options are the most important.
bootstrap.servers-
A list of Kafka broker addresses used as bootstrap connections to Kafka. For example, kafka0.my-domain.com:9092,kafka1.my-domain.com:9092,kafka2.my-domain.com:9092.
key.converter-
The class used to convert message keys to and from Kafka format. For example, org.apache.kafka.connect.json.JsonConverter.
value.converter-
The class used to convert message payloads to and from Kafka format. For example, org.apache.kafka.connect.json.JsonConverter.
group.id-
The name of the distributed Kafka Connect cluster. This must be unique and must not conflict with another consumer group ID. The default value is connect-cluster.
config.storage.topic-
The Kafka topic used to store connector configurations. The default value is connect-configs.
offset.storage.topic-
The Kafka topic used to store offsets. The default value is connect-offset.
status.storage.topic-
The Kafka topic used for worker node statuses. The default value is connect-status.
Streams for Apache Kafka includes an example configuration file for Kafka Connect in distributed mode – see config/connect-distributed.properties in the Streams for Apache Kafka installation directory.
Connector plugins open client connections to the Kafka brokers using the bootstrap address. To configure these connections, use the standard Kafka producer and consumer configuration options prefixed by producer. or consumer..
10.2.2. Running Kafka Connect in distributed mode 复制链接链接已复制到粘贴板!
Configure and run Kafka Connect in distributed mode.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Running the cluster
Edit the ./config/connect-distributed.properties Kafka Connect configuration file on all Kafka Connect worker nodes.
- Set the bootstrap.servers option to point to your Kafka brokers.
- Set the group.id option.
- Set the config.storage.topic option.
- Set the offset.storage.topic option.
- Set the status.storage.topic option.
For example:
bootstrap.servers=kafka0.my-domain.com:9092,kafka1.my-domain.com:9092,kafka2.my-domain.com:9092
group.id=my-group-id
config.storage.topic=my-group-id-configs
offset.storage.topic=my-group-id-offsets
status.storage.topic=my-group-id-status
Start the Kafka Connect workers with the ./config/connect-distributed.properties configuration file on all Kafka Connect nodes.
./bin/connect-distributed.sh ./config/connect-distributed.properties
Verify that Kafka Connect is running.
jcmd | grep ConnectDistributed
- Use the Kafka Connect REST API to manage connectors.
10.3. Managing connectors 复制链接链接已复制到粘贴板!
The Kafka Connect REST API provides endpoints for creating, updating, and deleting connectors directly. You can also use the API to check the status of connectors or change logging levels. When you create a connector through the API, you provide the configuration details for the connector as part of the API call.
You can also add and manage connectors as plugins. Plugins are packaged as JAR files that contain the classes to implement the connectors through the Kafka Connect API. You just need to specify the plugin in the classpath or add it to a plugin path for Kafka Connect to run the connector plugin on startup.
In addition to using the Kafka Connect REST API or plugins to manage connectors, you can also add connector configuration using properties files when running Kafka Connect in standalone mode. To do this, you simply specify the location of the properties file when starting the Kafka Connect worker process. The properties file should contain the configuration details for the connector, including the connector class, source and destination topics, and any required authentication or serialization settings.
10.3.1. Limiting access to the Kafka Connect API 复制链接链接已复制到粘贴板!
The Kafka Connect REST API can be accessed by anyone who has authenticated access and knows the endpoint URL, which includes the hostname/IP address and port number. It is crucial to restrict access to the Kafka Connect API only to trusted users to prevent unauthorized actions and potential security issues.
For improved security, we recommend configuring the following properties for the Kafka Connect API:
- (Kafka 3.4 or later) org.apache.kafka.disallowed.login.modules to specifically exclude insecure login modules
- connector.client.config.override.policy set to NONE to prevent connector configurations from overriding the Kafka Connect configuration and the consumers and producers it uses
10.3.2. Configuring connectors 复制链接链接已复制到粘贴板!
Use the Kafka Connect REST API or properties files to create, manage, and monitor connector instances. You can use the REST API when using Kafka Connect in standalone or distributed mode. You can use properties files when using Kafka Connect in standalone mode.
When using the Kafka Connect REST API, you can create connectors dynamically by sending PUT or POST HTTP requests to the Kafka Connect REST API, specifying the connector configuration details in the request body.
When you use a PUT request, the same request both creates (and starts) a connector and updates its configuration.
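For example, a sketch of creating or updating a connector with a single PUT request, assuming a hypothetical connector named my-connector, the FileStreamSinkConnector example plugin, and a worker listening on the default port 8083:

curl -X PUT -H "Content-Type: application/json" \
  --data '{"connector.class":"org.apache.kafka.connect.file.FileStreamSinkConnector","tasks.max":"1","topics":"my-topic-1","file":"/tmp/my-file-sink.txt"}' \
  http://localhost:8083/connectors/my-connector/config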
The REST interface listens on port 8083 by default and supports the following endpoints:
GET /connectors- Return a list of existing connectors.
POST /connectors- Create a connector. The request body has to be a JSON object with the connector configuration.
GET /connectors/<connector_name>- Get information about a specific connector.
GET /connectors/<connector_name>/config- Get configuration of a specific connector.
PUT /connectors/<connector_name>/config- Update the configuration of a specific connector.
GET /connectors/<connector_name>/status- Get the status of a specific connector.
GET /connectors/<connector_name>/tasks- Get a list of tasks for a specific connector.
GET /connectors/<connector_name>/tasks/<task_id>/status- Get the status of a task for a specific connector.
PUT /connectors/<connector_name>/pause- Pause the connector and all its tasks. The connector will stop processing any messages.
PUT /connectors/<connector_name>/stop- Stop the connector and all its tasks. The connector stops processing any messages. Stopping a connector, rather than pausing it, is more suitable when the connector will not be needed for a longer duration.
PUT /connectors/<connector_name>/resume- Resume a paused connector.
POST /connectors/<connector_name>/restart- Restart a connector in case it has failed.
POST /connectors/<connector_name>/tasks/<task_id>/restart- Restart a specific task.
DELETE /connectors/<connector_name>- Delete a connector.
GET /connectors/<connector_name>/topics- Get the topics for a specific connector.
PUT /connectors/<connector_name>/topics/reset- Empty the set of active topics for a specific connector.
GET /connectors/<connector_name>/offsets- Get the current offsets for a connector.
DELETE /connectors/<connector_name>/offsets- Reset the offsets for a connector, which must be in a stopped state.
PATCH /connectors/<connector_name>/offsets- Adjust the offsets (using an offset property in the request) for a connector, which must be in a stopped state.
GET /connector-plugins- Get a list of all supported connector plugins.
GET /connector-plugins/<connector_plugin_type>/config- Get the configuration for a connector plugin.
PUT /connector-plugins/<connector_type>/config/validate- Validate connector configuration.
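For example, a quick way to check a connector and its tasks through these endpoints, assuming a hypothetical connector named my-connector and the default port 8083:

curl -s http://localhost:8083/connectors/my-connector/status | jq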
10.3.2.2. Specifying connector configuration properties 复制链接链接已复制到粘贴板!
To configure a Kafka Connect connector, you need to specify the configuration details for source or sink connectors. There are two ways to do this: through the Kafka Connect REST API, using JSON to provide the configuration, or by using properties files to define the configuration properties. The specific configuration options available for each type of connector may differ, but both methods provide a flexible way to specify the necessary settings.
The following options apply to all connectors:
name- The name of the connector, which must be unique within the current Kafka Connect instance.
connector.class- The class of the connector plug-in. For example, org.apache.kafka.connect.file.FileStreamSinkConnector.
tasks.max- The maximum number of tasks that the specified connector can use. Tasks enable the connector to perform work in parallel. The connector might create fewer tasks than specified.
key.converter- The class used to convert message keys to and from Kafka format. This overrides the default value set by the Kafka Connect configuration. For example, org.apache.kafka.connect.json.JsonConverter.
value.converter- The class used to convert message payloads to and from Kafka format. This overrides the default value set by the Kafka Connect configuration. For example, org.apache.kafka.connect.json.JsonConverter.

You must set at least one of the following options for sink connectors:

topics- A comma-separated list of topics used as input.
topics.regex- A Java regular expression of topics used as input.
For all other options, see the connector properties in the Apache Kafka documentation.
Streams for Apache Kafka includes the example connector configuration files config/connect-file-sink.properties and config/connect-file-source.properties in the Streams for Apache Kafka installation directory.
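For illustration, a minimal sketch of a sink connector properties file that uses these options; the topic names and file path are hypothetical:

name=my-file-sink
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
topics=my-topic-1,my-topic-2
file=/tmp/my-file-sink.txt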
10.3.3. Creating connectors using the Kafka Connect API 复制链接链接已复制到粘贴板!
Use the Kafka Connect REST API to create a connector to use with Kafka Connect.
Prerequisites
- A Kafka Connect installation.
Procedure
Prepare a JSON payload with the connector configuration and save it to a file (for example, sink-connector.json).
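A minimal sketch of such a payload, using the FileStreamSinkConnector example plugin; the connector name, topics, and file path are hypothetical:

{
  "name": "my-connector",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": "1",
    "topics": "my-topic-1,my-topic-2",
    "file": "/tmp/my-file-sink.txt"
  }
}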
Send a POST request to <KafkaConnectAddress>:8083/connectors to create the connector. The following example uses curl:

curl -X POST -H "Content-Type: application/json" --data @sink-connector.json http://connect0.my-domain.com:8083/connectors
Verify that the connector was deployed by sending a GET request to <KafkaConnectAddress>:8083/connectors. The following example uses curl:

curl http://connect0.my-domain.com:8083/connectors
10.3.4. Deleting connectors using the Kafka Connect API 复制链接链接已复制到粘贴板!
Use the Kafka Connect REST API to delete a connector from Kafka Connect.
Prerequisites
- A Kafka Connect installation.
Deleting connectors
Verify that the connector exists by sending a GET request to <KafkaConnectAddress>:8083/connectors/<ConnectorName>. The following example uses curl:

curl http://connect0.my-domain.com:8083/connectors/my-connector

To delete the connector, send a DELETE request to <KafkaConnectAddress>:8083/connectors/<ConnectorName>. The following example uses curl:

curl -X DELETE http://connect0.my-domain.com:8083/connectors/my-connector

Verify that the connector was deleted by sending a GET request to <KafkaConnectAddress>:8083/connectors. The following example uses curl:

curl http://connect0.my-domain.com:8083/connectors
10.3.5. Adding connector plugins 复制链接链接已复制到粘贴板!
Kafka provides example connectors to use as a starting point for developing connectors. The following example connectors are included with Streams for Apache Kafka:
- FileStreamSink
- Reads data from Kafka topics and writes the data to a file.
- FileStreamSource
- Reads data from a file and sends the data to Kafka topics.
Both connectors are contained in the libs/connect-file-<kafka_version>.redhat-<build>.jar plugin.
To use the connector plugins in Kafka Connect, you can add them to the classpath or specify a plugin path in the Kafka Connect properties file and copy the plugins to the location.
Specifying the example connectors in the classpath

CLASSPATH=/opt/kafka/libs/connect-file-<kafka_version>.redhat-<build>.jar /opt/kafka/bin/connect-distributed.sh

Setting a plugin path

plugin.path=/opt/kafka/connector-plugins,/opt/connectors
The plugin.path configuration option can contain a comma-separated list of paths.
You can add more connector plugins if needed. Kafka Connect searches for and runs connector plugins at startup.
When running Kafka Connect in distributed mode, plugins must be made available on all worker nodes.
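After restarting the workers, you can check that a plugin was picked up by querying the REST API, as in this sketch (assuming the default port 8083):

curl -s http://localhost:8083/connector-plugins | jq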
Use MirrorMaker 2 to replicate data between two or more active Kafka clusters, within or across data centers.
To configure MirrorMaker 2, edit the config/connect-mirror-maker.properties configuration file. If required, you can enable distributed tracing for MirrorMaker 2.
Handling high volumes of messages
You can tune the configuration to handle high volumes of messages. For more information, see Handling high volumes of messages.
MirrorMaker 2 has features not supported by the previous version of MirrorMaker. However, you can configure MirrorMaker 2 to be used in legacy mode.
11.1. Configuring active/active or active/passive modes 复制链接链接已复制到粘贴板!
You can use MirrorMaker 2 in active/passive or active/active cluster configurations.
- active/active cluster configuration
- An active/active configuration has two active clusters replicating data bidirectionally. Applications can use either cluster. Each cluster can provide the same data. In this way, you can make the same data available in different geographical locations. As consumer groups are active in both clusters, consumer offsets for replicated topics are not synchronized back to the source cluster.
- active/passive cluster configuration
- An active/passive configuration has an active cluster replicating data to a passive cluster. The passive cluster remains on standby. You might use the passive cluster for data recovery in the event of system failure.
The expectation is that producers and consumers connect to active clusters only. A MirrorMaker 2 cluster is required at each target destination.
11.1.1. Bidirectional replication (active/active) 复制链接链接已复制到粘贴板!
The MirrorMaker 2 architecture supports bidirectional replication in an active/active cluster configuration.
Each cluster replicates the data of the other cluster using the concept of source and remote topics. As the same topics are stored in each cluster, remote topics are automatically renamed by MirrorMaker 2 to represent the source cluster. The name of the originating cluster is prepended to the name of the topic.
Figure 11.1. Topic renaming
By flagging the originating cluster, topics are not replicated back to that cluster.
The concept of replication through remote topics is useful when configuring an architecture that requires data aggregation. Consumers can subscribe to source and remote topics within the same cluster, without the need for a separate aggregation cluster.
11.1.2. Unidirectional replication (active/passive) 复制链接链接已复制到粘贴板!
The MirrorMaker 2 architecture supports unidirectional replication in an active/passive cluster configuration.
You can use an active/passive cluster configuration to make backups or migrate data to another cluster. In this situation, you might not want automatic renaming of remote topics.
You can override automatic renaming by adding IdentityReplicationPolicy to the source connector configuration. With this configuration applied, topics retain their original names.
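For example, in the MirrorMaker 2 properties file (the same setting is shown again in the dedicated mode procedure later in this chapter):

replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy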
11.2. Configuring MirrorMaker 2 connectors 复制链接链接已复制到粘贴板!
Use MirrorMaker 2 connector configuration for the internal connectors that orchestrate the synchronization of data between Kafka clusters.
MirrorMaker 2 consists of the following connectors:
MirrorSourceConnector- The source connector replicates topics from a source cluster to a target cluster. It also replicates ACLs and is necessary for the MirrorCheckpointConnector to run.
MirrorCheckpointConnector- The checkpoint connector periodically tracks offsets. If enabled, it also synchronizes consumer group offsets between the source and target cluster.
MirrorHeartbeatConnector- The heartbeat connector periodically checks connectivity between the source and target cluster.
The following table shows examples of connector properties and the connectors that use them. For the full list of connector configuration properties, see the MirrorMaker 2 configuration in the Apache Kafka documentation.

| Property | sourceConnector | checkpointConnector | heartbeatConnector |
|---|---|---|---|
| replication.policy.class | ✓ | ✓ | ✓ |
| replication.policy.separator | ✓ | ✓ | ✓ |
| admin.timeout.ms | ✓ | ✓ | ✓ |
| offset-syncs.topic.location | ✓ | ✓ | |
| topic.filter.class | ✓ | ✓ | |
| refresh.topics.interval.seconds | ✓ | | |
| sync.topic.acls.enabled | ✓ | | |
| refresh.groups.interval.seconds | | ✓ | |
| sync.group.offsets.enabled | | ✓ | |
| emit.checkpoints.interval.seconds | | ✓ | |
| emit.heartbeats.interval.seconds | | | ✓ |
MirrorMaker 2 tracks offsets for consumer groups using internal topics.
offset-syncs topic- The offset-syncs topic maps the source and target offsets for replicated topic partitions from record metadata.
checkpoints topic- The checkpoints topic maps the last committed offset in the source and target cluster for replicated topic partitions in each consumer group.
As they are used internally by MirrorMaker 2, you do not interact directly with these topics.
MirrorCheckpointConnector emits checkpoints for offset tracking. Offsets for the checkpoints topic are tracked at predetermined intervals through configuration. Both topics enable replication to be fully restored from the correct offset position on failover.
The location of the offset-syncs topic is the source cluster by default. You can use the offset-syncs.topic.location connector configuration to change this to the target cluster. You need read/write access to the cluster that contains the topic. Using the target cluster as the location of the offset-syncs topic allows you to use MirrorMaker 2 even if you have only read access to the source cluster.
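For example, a one-line sketch that hosts the offset-syncs topic on the target cluster instead of the source cluster:

offset-syncs.topic.location=target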
11.2.2. Synchronizing consumer group offsets 复制链接链接已复制到粘贴板!
The __consumer_offsets topic stores information on committed offsets for each consumer group. Offset synchronization periodically transfers the consumer offsets for the consumer groups of a source cluster into the consumer offsets topic of a target cluster.
Offset synchronization is particularly useful in an active/passive configuration. If the active cluster goes down, consumer applications can switch to the passive (standby) cluster and pick up from the last transferred offset position.
To use topic offset synchronization, enable the synchronization by adding sync.group.offsets.enabled to the checkpoint connector configuration, and setting the property to true. Synchronization is disabled by default.
When using the IdentityReplicationPolicy in the source connector, it also has to be configured in the checkpoint connector configuration. This ensures that the mirrored consumer offsets will be applied for the correct topics.
Consumer offsets are only synchronized for consumer groups that are not active in the target cluster. If the consumer groups are in the target cluster, the synchronization cannot be performed and an UNKNOWN_MEMBER_ID error is returned.
If enabled, the synchronization of offsets from the source cluster is made periodically. You can change the frequency by adding sync.group.offsets.interval.seconds and emit.checkpoints.interval.seconds to the checkpoint connector configuration. The properties specify the frequency in seconds that the consumer group offsets are synchronized, and the frequency of checkpoints emitted for offset tracking. The default for both properties is 60 seconds. You can also change the frequency of checks for new consumer groups using the refresh.groups.interval.seconds property, which is performed every 10 minutes by default.
Because the synchronization is time-based, any switchover by consumers to a passive cluster will likely result in some duplication of messages.
If you have an application written in Java, you can use the RemoteClusterUtils.java utility to synchronize offsets through the application. The utility fetches remote offsets for a consumer group from the checkpoints topic.
11.2.3. Deciding when to use the heartbeat connector 复制链接链接已复制到粘贴板!
The heartbeat connector emits heartbeats to check connectivity between source and target Kafka clusters. An internal heartbeat topic is replicated from the source cluster, which means that the heartbeat connector must be connected to the source cluster. The heartbeat topic is located on the target cluster, which allows it to do the following:
- Identify all source clusters it is mirroring data from
- Verify the liveness and latency of the mirroring process
This helps to make sure that the process has not become stuck or stopped for any reason. While the heartbeat connector can be a valuable tool for monitoring the mirroring processes between Kafka clusters, it’s not always necessary to use it. For example, if your deployment has low network latency or a small number of topics, you might prefer to monitor the mirroring process using log messages or other monitoring tools. If you decide not to use the heartbeat connector, simply omit it from your MirrorMaker 2 configuration.
To ensure that MirrorMaker 2 connectors work properly, make sure to align certain configuration settings across connectors. Specifically, ensure that the following properties have the same value across all applicable connectors:
-
replication.policy.class -
replication.policy.separator -
offset-syncs.topic.location -
topic.filter.class
For example, the value for replication.policy.class must be the same for the source, checkpoint, and heartbeat connectors. Mismatched or missing settings cause issues with data replication or offset syncing, so it’s essential to keep all relevant connectors configured with the same settings.
11.2.5. Listing the offsets of MirrorMaker 2 connectors 复制链接链接已复制到粘贴板!
To list the offset positions of the internal MirrorMaker 2 connectors, use the GET /connectors/{connector}/offsets request with the Kafka Connect API. For more information on the Kafka Connect REST API endpoints, see Section 10.3.2, “Configuring connectors”.
11.3. Connector producer and consumer configuration 复制链接链接已复制到粘贴板!
MirrorMaker 2 connectors use internal producers and consumers. If needed, you can configure these producers and consumers to override the default settings.
Producer and consumer configuration options depend on the MirrorMaker 2 implementation, and may be subject to change.
Producer and consumer configuration applies to all connectors. You specify the configuration in the config/connect-mirror-maker.properties file.
Use the properties file to override any default configuration for the producers and consumers in the following format:
-
<source_cluster_name>.consumer.<property> -
<source_cluster_name>.producer.<property> -
<target_cluster_name>.consumer.<property> -
<target_cluster_name>.producer.<property>
The following example shows how you configure the producers and consumers. Though the properties are set for all connectors, some configuration properties are only relevant to certain connectors.
Example configuration for connector producers and consumers
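A minimal sketch of such overrides, using the cluster aliases from the examples in this chapter and hypothetical values for standard Kafka client options:

clusters=cluster-1,cluster-2
# ...
# Consumer overrides for connectors reading from cluster-1
cluster-1.consumer.fetch.max.bytes=52428800
cluster-1.consumer.max.partition.fetch.bytes=10485760
# Producer overrides for connectors writing to cluster-2
cluster-2.producer.batch.size=327680
cluster-2.producer.request.timeout.ms=30000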
11.4. Specifying a maximum number of tasks 复制链接链接已复制到粘贴板!
Connectors create the tasks that are responsible for moving data in and out of Kafka. Each connector comprises one or more tasks that are distributed across a group of workers that run the tasks. Increasing the number of tasks can help with performance issues when replicating a large number of partitions or synchronizing the offsets of a large number of consumer groups.

Tasks run in parallel. Workers are assigned one or more tasks. A single task is handled by one worker, so you don’t need more workers than tasks. If there are more tasks than workers, workers handle multiple tasks.
You can specify the maximum number of connector tasks in your MirrorMaker configuration using the tasks.max property. Without specifying a maximum number of tasks, the default setting is a single task.
The heartbeat connector always uses a single task.
The number of tasks that are started for the source and checkpoint connectors is the lower value between the maximum number of possible tasks and the value for tasks.max. For the source connector, the maximum number of tasks possible is one for each partition being replicated from the source cluster. For the checkpoint connector, the maximum number of tasks possible is one for each consumer group being replicated from the source cluster. When setting a maximum number of tasks, consider the number of partitions and the hardware resources that support the process.
If the infrastructure supports the processing overhead, increasing the number of tasks can improve throughput and latency. For example, adding more tasks reduces the time taken to poll the source cluster when there is a high number of partitions or consumer groups.
tasks.max configuration for MirrorMaker connectors
clusters=cluster-1,cluster-2
# ...
tasks.max = 10
By default, MirrorMaker 2 checks for new consumer groups every 10 minutes. You can adjust the refresh.groups.interval.seconds configuration to change the frequency. Take care when lowering the interval, as more frequent checks can have a negative impact on performance.
11.5. ACL rules synchronization 复制链接链接已复制到粘贴板!
If AclAuthorizer is being used, ACL rules that manage access to brokers also apply to remote topics. Users that can read a source topic can read its remote equivalent.
OAuth 2.0 authorization does not support access to remote topics in this way.
11.6. Running MirrorMaker 2 in dedicated mode 复制链接链接已复制到粘贴板!
Use MirrorMaker 2 to synchronize data between Kafka clusters through configuration. This procedure shows how to configure and run a dedicated single-node MirrorMaker 2 cluster. Dedicated clusters use Kafka Connect worker nodes to mirror data between Kafka clusters.
It is also possible to run MirrorMaker 2 in distributed mode. MirrorMaker 2 operates as connectors in both dedicated and distributed modes. When you run MirrorMaker 2 in distributed mode, the connectors are configured in a Kafka Connect cluster. This allows direct access to the Kafka Connect cluster, the running of additional connectors, and use of the REST API. For more information, refer to the Apache Kafka documentation.
The configuration must specify:
- Each Kafka cluster
- Connection information for each cluster, including TLS authentication
The replication flow and direction
- Cluster to cluster
- Topic to topic
- Replication rules
- Committed offset tracking intervals
This procedure describes how to implement MirrorMaker 2 by creating the configuration in a properties file, then passing the properties when using the MirrorMaker script file to set up the connections.
You can specify the topics and consumer groups you wish to replicate from a source cluster. You specify the names of the source and target clusters, then specify the topics and consumer groups to replicate.
In the following example, topics and consumer groups are specified for replication from cluster 1 to 2.
Example configuration to replicate specific topics and consumer groups
clusters=cluster-1,cluster-2
cluster-1->cluster-2.topics = topic-1, topic-2
cluster-1->cluster-2.groups = group-1, group-2
You can provide a list of names or use a regular expression. By default, all topics and consumer groups are replicated if you do not set these properties. You can also replicate all topics and consumer groups by using .* as a regular expression. However, try to specify only the topics and consumer groups you need to avoid causing any unnecessary extra load on the cluster.
Before you begin
A sample configuration properties file is provided in ./config/connect-mirror-maker.properties.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Procedure
Open the sample properties file in a text editor, or create a new one, and edit the file to include connection information and the replication flows for each Kafka cluster.
The following example shows a configuration to connect two clusters, cluster-1 and cluster-2, bidirectionally. Cluster names are configurable through the clusters property.

Example MirrorMaker 2 configuration

- 1
- Each Kafka cluster is identified with its alias.
- 2
- Connection information for cluster-1, using the bootstrap address and port 443. Both clusters use port 443 to connect to Kafka using OpenShift Routes.
- 3
- The
ssl.properties define TLS configuration for cluster-1. - 4
- Connection information for cluster-2.
- 5
- The
ssl.properties define the TLS configuration for cluster-2. - 6
- Replication flow enabled from cluster-1 to cluster-2.
- 7
- Replication flow enabled from cluster-2 to cluster-1.
- 8
- Replication of all topics from cluster-1 to cluster-2. The source connector replicates the specified topics. The checkpoint connector tracks offsets for the specified topics.
- 9
- Replication of specific topics from cluster-2 to cluster-1.
- 10
- Replication of all consumer groups from cluster-1 to cluster-2. The checkpoint connector replicates the specified consumer groups.
- 11
- Replication of specific consumer groups from cluster-2 to cluster-1.
- 12
- Defines the separator used for the renaming of remote topics.
- 13
- When enabled, ACLs are applied to synchronized topics. The default is
false. - 14
- The period between checks for new topics to synchronize.
- 15
- The period between checks for new consumer groups to synchronize.
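A minimal sketch of such a configuration, with hypothetical bootstrap addresses, truststore paths, and interval values; the settings correspond to the types of configuration described in the callouts above:

clusters=cluster-1,cluster-2
cluster-1.bootstrap.servers=<cluster_1_address>:443
cluster-1.security.protocol=SSL
cluster-1.ssl.truststore.location=/path/to/cluster-1-truststore.jks
cluster-1.ssl.truststore.password=<password>
cluster-2.bootstrap.servers=<cluster_2_address>:443
cluster-2.security.protocol=SSL
cluster-2.ssl.truststore.location=/path/to/cluster-2-truststore.jks
cluster-2.ssl.truststore.password=<password>
cluster-1->cluster-2.enabled=true
cluster-2->cluster-1.enabled=true
cluster-1->cluster-2.topics=.*
cluster-2->cluster-1.topics=topic-1, topic-2
cluster-1->cluster-2.groups=.*
cluster-2->cluster-1.groups=group-1, group-2
replication.policy.separator=_
sync.topic.acls.enabled=false
refresh.topics.interval.seconds=60
refresh.groups.interval.seconds=60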
OPTION: If required, add a policy that overrides the automatic renaming of remote topics. Instead of prepending the name with the name of the source cluster, the topic retains its original name.
This optional setting is used for active/passive backups and data migration.
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy

OPTION: If you want to synchronize consumer group offsets, add configuration to enable and manage the synchronization:

refresh.groups.interval.seconds=60
sync.group.offsets.enabled=true 1
sync.group.offsets.interval.seconds=60 2
emit.checkpoints.interval.seconds=60 3

- 1
- Optional setting to synchronize consumer group offsets, which is useful for recovery in an active/passive configuration. Synchronization is not enabled by default.
- 2
- If the synchronization of consumer group offsets is enabled, you can adjust the frequency of the synchronization.
- 3
- Adjusts the frequency of checks for offset tracking. If you change the frequency of offset synchronization, you might also need to adjust the frequency of these checks.
Start Kafka in the target clusters:
./bin/kafka-server-start.sh -daemon \
./config/kraft/server.properties

Start MirrorMaker with the cluster connection configuration and replication policies you defined in your properties file:

./bin/connect-mirror-maker.sh \
./config/connect-mirror-maker.properties

MirrorMaker sets up connections between the clusters.

For each target cluster, verify that the topics are being replicated:

./bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --list
11.7. (Deprecated) Using MirrorMaker 2 in legacy mode 复制链接链接已复制到粘贴板!
This procedure describes how to configure MirrorMaker 2 to use it in legacy mode. Legacy mode supports the previous version of MirrorMaker.
The MirrorMaker script ./bin/kafka-mirror-maker.sh can run MirrorMaker 2 in legacy mode.
Kafka MirrorMaker 1 (referred to as just MirrorMaker in the documentation) has been deprecated in Apache Kafka 3.0.0 and will be removed in Apache Kafka 4.0.0. As a result, Kafka MirrorMaker 1 has been deprecated in Streams for Apache Kafka as well. Kafka MirrorMaker 1 will be removed from Streams for Apache Kafka when we adopt Apache Kafka 4.0.0. As a replacement, use MirrorMaker 2 with the IdentityReplicationPolicy.
Prerequisites
You need the properties files you currently use with the legacy version of MirrorMaker.
-
./config/consumer.properties -
./config/producer.properties
Procedure
Edit the MirrorMaker consumer.properties and producer.properties files to turn off MirrorMaker 2 features.

Save the changes and restart MirrorMaker with the properties files you used with the previous version of MirrorMaker:

./bin/kafka-mirror-maker.sh \
--consumer.config ./config/consumer.properties \
--producer.config ./config/producer.properties \
--num.streams=2

The consumer properties provide the configuration for the source cluster and the producer properties provide the target cluster configuration.

MirrorMaker sets up connections between the clusters.

Start Kafka in the target cluster:

./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties

For the target cluster, verify that the topics are being replicated:

./bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --list
Chapter 12. Configuring logging for Kafka components 复制链接链接已复制到粘贴板!
Configure the logging levels of Kafka components directly in the configuration properties. You can also change the logging levels dynamically at runtime for Kafka brokers, Kafka Connect, and MirrorMaker 2.
Increasing the log level detail, such as from INFO to DEBUG, can aid in troubleshooting a Kafka cluster. However, more verbose logs may also negatively impact performance and make it more difficult to diagnose issues.
12.1. Configuring Kafka logging properties 复制链接链接已复制到粘贴板!
Kafka components use the Log4j framework for error logging. By default, logging configuration is read from the classpath or config directory using the following properties files:
- log4j.properties for Kafka
- connect-log4j.properties for Kafka Connect and MirrorMaker 2
If they are not set explicitly, loggers inherit the log4j.rootLogger logging level configuration in each file. You can change the logging level in these files. You can also add and set logging levels for other loggers.
You can change the location and name of logging properties file using the KAFKA_LOG4J_OPTS environment variable, which is used by the start script for the component.
Passing the name and location of the logging properties file used by Kafka nodes
export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/my/path/to/log4j.properties"; \
./bin/kafka-server-start.sh \
./config/kraft/server.properties
Passing the name and location of the logging properties file used by Kafka Connect
export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/my/path/to/connect-log4j.properties"; \
./bin/connect-distributed.sh \
./config/connect-distributed.properties
Passing the name and location of the logging properties file used by MirrorMaker 2
export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/my/path/to/connect-log4j.properties"; \
./bin/connect-mirror-maker.sh \
./config/connect-mirror-maker.properties
Kafka broker logging is provided by broker loggers in each broker. Dynamically change the logging level for broker loggers at runtime without having to restart the broker.
You can also reset broker loggers dynamically to their default logging levels.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- Kafka is running.
Procedure
List all the broker loggers for a broker by using the kafka-configs.sh tool:

./bin/kafka-configs.sh --bootstrap-server <broker_address> --describe --entity-type broker-loggers --entity-name <broker_id>

For example, for broker 0:

./bin/kafka-configs.sh --bootstrap-server localhost:9092 --describe --entity-type broker-loggers --entity-name 0

This returns the logging level for each logger: TRACE, DEBUG, INFO, WARN, ERROR, or FATAL.

For example:

#...
kafka.controller.ControllerChannelManager=INFO sensitive=false synonyms={}
kafka.log.TimeIndex=INFO sensitive=false synonyms={}

Change the logging level for one or more broker loggers. Use the --alter and --add-config options and specify each logger and its level as a comma-separated list in double quotes:

./bin/kafka-configs.sh --bootstrap-server <broker_address> --alter --add-config "LOGGER-ONE=NEW-LEVEL,LOGGER-TWO=NEW-LEVEL" --entity-type broker-loggers --entity-name <broker_id>

For example, for broker 0:

./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --add-config "kafka.controller.ControllerChannelManager=WARN,kafka.log.TimeIndex=WARN" --entity-type broker-loggers --entity-name 0

If successful this returns:

Completed updating config for broker: 0.
Resetting a broker logger
You can reset one or more broker loggers to their default logging levels by using the kafka-configs.sh tool. Use the --alter and --delete-config options and specify each broker logger as a comma-separated list in double quotes:
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --delete-config "LOGGER-ONE,LOGGER-TWO" --entity-type broker-loggers --entity-name <broker_id>
Dynamically change logging levels for Kafka Connect workers or MirrorMaker 2 connectors at runtime without having to restart.
Use the Kafka Connect API to change the log level temporarily for a worker or connector logger. The Kafka Connect API provides an admin/loggers endpoint to get or modify logging levels. When you change the log level using the API, the logger configuration in the connect-log4j.properties configuration file does not change. If required, you can permanently change the logging levels in the configuration file.
You can only change the logging level of MirrorMaker 2 at runtime when in distributed or standalone mode. Dedicated MirrorMaker 2 clusters have no Kafka Connect REST API, so changing the logging level is not possible.
The default listener for the Kafka Connect API is on port 8083, which is used in this procedure. You can change or add more listeners, and also enable TLS authentication, using admin.listeners configuration.
Example listener configuration for the admin endpoint
admin.listeners=https://localhost:8083
admin.listeners.https.ssl.truststore.location=/path/to/truststore.jks
admin.listeners.https.ssl.truststore.password=123456
admin.listeners.https.ssl.keystore.location=/path/to/keystore.jks
admin.listeners.https.ssl.keystore.password=123456
If you do not want the admin endpoint to be available, you can disable it in the configuration by specifying an empty string.
Example listener configuration to disable the admin endpoint
admin.listeners=
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- Kafka is running.
- Kafka Connect or MirrorMaker 2 is running.
Procedure
Check the current logging level for the loggers configured in the connect-log4j.properties file.

Use a curl command to check the logging levels from the admin/loggers endpoint of the Kafka Connect API. Piping the output to jq prints it in JSON format. The list shows standard org and root level loggers, plus any specific loggers with modified logging levels.

If you configure TLS (Transport Layer Security) authentication for the admin.listeners configuration in Kafka Connect, then the address of the loggers endpoint is the value specified for admin.listeners with https as the protocol, such as https://localhost:8083.
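For example, a sketch of listing all loggers on a worker that uses the default listener on port 8083:

curl -s http://localhost:8083/admin/loggers | jq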
You can also get the log level of a specific logger:

curl -s http://localhost:8083/admin/loggers/org.apache.kafka.connect.mirror.MirrorCheckpointConnector | jq
{
  "level": "INFO"
}

Use a PUT method to change the log level for a logger.

If you change the root logger, the logging level for loggers that use the root logging level by default is also changed.
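For example, a sketch of setting the MirrorCheckpointConnector logger to TRACE through the same endpoint; the logger name and level are illustrative:

curl -X PUT -H "Content-Type: application/json" \
  -d '{"level": "TRACE"}' \
  http://localhost:8083/admin/loggers/org.apache.kafka.connect.mirror.MirrorCheckpointConnector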
This procedure describes how to set throughput and storage limits on brokers in your Kafka cluster. Enable the Strimzi Quotas plugin and configure the limits using quota properties.
The plugin provides storage utilization quotas and dynamic distribution of throughput limits.
-
Storage quotas throttle Kafka producers based on disk storage utilization. Limits can be specified in bytes (
storage.per.volume.limit.min.available.bytes) or percentage (storage.per.volume.limit.min.available.ratio) of available disk space, applying to each disk individually. When any broker in the cluster exceeds the configured disk threshold, clients are throttled to prevent disks from filling up too quickly and exceeding capacity. - A total throughput limit is distributed dynamically across all clients. For example, if you set a 40 MBps producer byte-rate threshold, the distribution across two producers is not static. If one producer is using 10 MBps, the other can use up to 30 MBps.
- Specific users (clients) can be excluded from the restrictions.
With the plugin, you see only aggregated quota metrics, not per-client metrics.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
Procedure
Edit the Kafka configuration properties file.
Example plugin configuration
- 1
- Loads the plugin.
- 2
- Sets the producer byte-rate threshold of 1 MBps.
- 3
- Sets the consumer byte-rate threshold of 1 MBps.
- 4
- Sets an available bytes limit of 500 GB.
- 5
- Sets the interval in seconds between checks on storage to 5 seconds. The default is 60 seconds. Set this property to 0 to disable the check.
- 6
- Kafka cluster bootstrap servers address. This property is required if storage.check-interval is >0. All configuration properties starting with the client.quota.callback.static.kafka.admin. prefix are passed to the Kafka Admin client configuration.
- Excludes my-user-1 and my-user-2 from the restrictions. Each principal must be prefixed with User:.
storage.per.volume.limit.min.available.bytes and storage.per.volume.limit.min.available.ratio are mutually exclusive. Only configure one of these parameters.

Note: The full list of supported configuration properties can be found in the plugin documentation.
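Putting these settings together, a minimal sketch of the plugin configuration might look like the following. The property names assume the Strimzi quotas plugin; check the plugin documentation for the exact names and value formats before use:

client.quota.callback.class=io.strimzi.kafka.quotas.StaticQuotaCallback
# Producer and consumer byte-rate thresholds (1 MBps)
client.quota.callback.static.produce=1000000
client.quota.callback.static.fetch=1000000
# Throttle producers when less than 500 GB of disk space remains on any volume
client.quota.callback.static.storage.per.volume.limit.min.available.bytes=500000000000
# Check storage every 5 seconds
client.quota.callback.static.storage.check-interval=5
client.quota.callback.static.kafka.admin.bootstrap.servers=localhost:9092
client.quota.callback.static.excluded.principal.name.list=User:my-user-1;User:my-user-2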
Start the Kafka broker with the default configuration file.
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties

Verify that the Kafka broker is running.

jcmd | grep Kafka
Scaling Kafka clusters by adding brokers can increase the performance and reliability of the cluster. Adding more brokers increases available resources, allowing the cluster to handle larger workloads and process more messages. It can also improve fault tolerance by providing more replicas and backups. Conversely, removing underutilized brokers can reduce resource consumption and improve efficiency. Scaling must be done carefully to avoid disruption or data loss. By redistributing partitions across all brokers in the cluster, the resource utilization of each broker is reduced, which can increase the overall throughput of the cluster.
To increase the throughput of a Kafka topic, you can increase the number of partitions for that topic. This allows the load of the topic to be shared between different brokers in the cluster. However, if every broker is constrained by a specific resource (such as I/O), adding more partitions will not increase the throughput. In this case, you need to add more brokers to the cluster.
Adding brokers when running a multi-node Kafka cluster affects the number of brokers in the cluster that act as replicas. The actual replication factor for topics is determined by settings for the default.replication.factor and min.insync.replicas, and the number of available brokers. For example, a replication factor of 3 means that each partition of a topic is replicated across three brokers, ensuring fault tolerance in the event of a broker failure.
Example replica configuration
default.replication.factor = 3 min.insync.replicas = 2
default.replication.factor = 3
min.insync.replicas = 2
When you add or remove brokers, Kafka does not automatically reassign partitions. The best way to do this is using Cruise Control. You can use Cruise Control’s add_broker and remove_broker modes when scaling a cluster up or down.
- Use the add_broker mode after scaling up a Kafka cluster to move partition replicas from existing brokers to the newly added brokers.
- Use the remove_broker mode before scaling down a Kafka cluster to move partition replicas off the brokers that are going to be removed.
14.1. Scaling controller clusters dynamically 复制链接链接已复制到粘贴板!
Dynamic controller quorums support scaling without requiring system downtime. Dynamic scaling is useful not only for adding or removing controllers; it also supports the following:
- Replacing controllers because of hardware failure
- Migrating clusters onto new machines
- Moving nodes from dedicated controller roles to combined roles or vice versa
A dynamic quorum is specified in the controller configuration using the controller.quorum.bootstrap.servers property to list host:port endpoints for each controller. Only one controller can be added or removed from the cluster at a time, so complex quorum changes are implemented as a series of single changes. New controllers join as observers, replicating the metadata log but not counting towards the quorum. When caught up with the active controller, the new controller is eligible to join the quorum.
When removing controllers, it’s recommended that they are first shut down to avoid unnecessary leader elections. If the removed controller is the active one, it will step down from the quorum only after the new quorum is confirmed. However, it will not include itself when calculating the last commit position in the __cluster_metadata log.
In a dynamic quorum, the active KRaft version is 1 or above for all cluster nodes. Find the active KRaft version using the kafka-features.sh tool:
./bin/kafka-features.sh --bootstrap-controller localhost:9093 describe | grep kraft.version
In this example output, the active version (FinalizedVersionLevel) in the Kafka cluster is 1:
Feature: kraft.version SupportedMinVersion: 0 SupportedMaxVersion: 1 FinalizedVersionLevel: 1 Epoch: 5
If the kraft.version property shows an active version level of 0 or is absent, you are using a static quorum. If it is 1 or above, you are using a dynamic quorum.
It’s possible to configure a static quorum, but it is not a recommended approach as it requires downtime when scaling.
14.2. Adding new controllers 复制链接链接已复制到粘贴板!
To add a new controller to an existing dynamic controller quorum in Kafka, create a new controller, monitor its replication status, and then integrate it into the cluster.
Prerequisites
- Streams for Apache Kafka is installed on the host, and the configuration files and tools are available. This procedure uses the kafka-storage.sh, kafka-server-start.sh, and kafka-metadata-quorum.sh tools.
- Administrative access to the controller nodes.
Procedure
Configure a new controller node using the controller.properties file.

At a minimum, the new controller requires the following configuration:

- A unique node ID
- Listener name used by the controller quorum
- A quorum of controllers

The controller.quorum.bootstrap.servers configuration includes the host and port of the new controller and each other controller already present in the cluster.
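A minimal sketch of a controller.properties file for the new node, using a hypothetical node ID, log directory, and hostnames:

process.roles=controller
node.id=4
listeners=CONTROLLER://0.0.0.0:9093
listener.security.protocol.map=CONTROLLER:PLAINTEXT
controller.listener.names=CONTROLLER
controller.quorum.bootstrap.servers=controller1.example.com:9093,controller2.example.com:9093,controller3.example.com:9093,controller4.example.com:9093
log.dirs=/var/lib/kafka/controller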
Update controller.quorum.bootstrap.servers in the configuration of each node in the cluster with the host and port of the new controller.

Set the log directory ID for the new controller:

./bin/kafka-storage.sh format --cluster-id <cluster_id> --config controller.properties --no-initial-controllers

By using the --no-initial-controllers option, the controller is initialized without it joining the controller quorum.

Start the controller node:

./bin/kafka-server-start.sh ./config/kraft/controller.properties

Monitor the replication progress of the new controller:

./bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication

Wait until the new controller has caught up with the active controller before proceeding.

Add the new controller to the controller quorum:

./bin/kafka-metadata-quorum.sh --command-config controller.properties --bootstrap-controller localhost:9092 add-controller
14.3. Removing controllers 复制链接链接已复制到粘贴板!
To remove a controller from an existing dynamic controller quorum in Kafka, use the kafka-metadata-quorum.sh tool.
Prerequisites
- Streams for Apache Kafka is installed on the host, and the configuration files and tools are available. This procedure uses the kafka-server-stop.sh and kafka-metadata-quorum.sh tools.
- Administrative access to the controller nodes.
Procedure
Stop the controller node:

./bin/kafka-server-stop.sh

Locate the ID of the controller and its directory ID to be able to remove it from the controller quorum. You can find this information in the meta.properties file of the metadata log.

Remove the controller from the controller quorum:

./bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9092 remove-controller --controller-id <id> --controller-directory-id <directory_id>

Update controller.quorum.bootstrap.servers in the configuration of each node in the cluster to remove the host and port of the controller removed from the controller quorum.
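To confirm the change, you can also describe the quorum state, as in this sketch (assuming a controller reachable locally):

./bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9092 describe --status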
14.4. Unregistering nodes after scale-down operations 复制链接链接已复制到粘贴板!
After removing a node from a Kafka cluster, use the kafka-cluster.sh script to unregister the node from the cluster metadata. Failing to unregister removed nodes leads to stale metadata, which causes operational issues.
Prerequisites
Before unregistering a node, ensure the following tasks are completed:
- Reassign the partitions from the node you plan to remove to the remaining brokers using the Cruise Control remove_broker mode.
- Update the cluster configuration, if necessary, to adjust the replication factor for topics (default.replication.factor) and the minimum required number of in-sync replica acknowledgements (min.insync.replicas).
- Stop the Kafka broker service on the node and remove the node from the cluster.
Procedure
Unregister the removed node from the cluster:
./bin/kafka-cluster.sh unregister \
--bootstrap-server <broker_host>:<port> \
--id <node_id_number>

Verify the current state of the cluster by describing the topics:

./bin/kafka-topics.sh \
--bootstrap-server <broker_host>:<port> \
--describe
Chapter 15. Using Cruise Control for cluster rebalancing 复制链接链接已复制到粘贴板!
Cruise Control is an open-source application designed to run alongside Kafka to help optimize use of cluster resources by doing the following:
- Monitoring cluster workload
- Rebalancing partitions based on predefined constraints
Cruise Control operations help with running a more balanced Kafka cluster that uses brokers more efficiently.
As Kafka clusters evolve, some brokers may become overloaded while others remain underutilized. Cruise Control addresses this imbalance by modeling resource utilization at the replica level, including CPU, disk, and network load, and by generating optimization proposals (which you can approve or reject) for balanced partition assignments based on configurable optimization goals.
The cruisecontrol.properties file contains the configuration for Cruise Control. You can specify and configure all the properties listed in the Configurations section of the Cruise Control Wiki.
15.1. Cruise Control components and features 复制链接链接已复制到粘贴板!
Cruise Control comprises four main components:
- Load Monitor
- Load Monitor collects the metrics and analyzes cluster workload data.
- Analyzer
- Analyzer generates optimization proposals based on collected data and configured goals.
- Anomaly Detector
- Anomaly Detector identifies and reports irregularities in cluster behavior.
- Executor
- Executor applies approved optimization proposals to the cluster.
Cruise Control also provides a REST API for client interactions, which Streams for Apache Kafka uses to support these features:
- Generating optimization proposals from optimization goals
- Rebalancing a Kafka cluster based on an optimization proposal
- Changing topic replication factor
Other Cruise Control features are not currently supported, including self healing, notifications, and write-your-own goals.
15.1.1. Optimization goals 复制链接链接已复制到粘贴板!
Optimization goals define objectives for rebalancing, such as distributing topic replicas evenly across brokers.
They are categorized as follows:
- Supported goals are a list of goals supported by the Cruise Control instance that can be used in its operations. By default, this list includes all goals included with Cruise Control. For a goal to be used in other categories, such as default or hard goals, it must first be listed in supported goals. To prevent a goal’s usage, remove it from this list.
- Hard goals are preset and must be satisfied for a proposal to succeed.
- Soft goals are preset goals with objectives that are prioritized during optimization as much as possible, without preventing a proposal from being created if all hard goals are satisfied.
- Default goals refer to the goals used by default when generating proposals. They match the supported goals unless specifically set by the user.
- Intra-broker goals refer to the goals used specifically for rebalances on the same broker.
- Proposal-specific goals are a subset of supported goals configured for specific proposals.
Set proposal-specific goals at runtime. Specify other optimization goals in a configuration properties file using their fully qualified class names and in descending priority order.
The config/cruisecontrol.properties file contains the configuration for Cruise Control. Use the following properties to manage goals:
- Supported goals: goals property
- Hard goals: hard.goals property
- Default goals: default.goals property
- Intra-broker goals: intra.broker.goals property
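For example, a sketch of how these properties appear in cruisecontrol.properties, using a hypothetical subset of goals with their fully qualified class names:

goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal
hard.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal
default.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal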
15.1.1.1. Supported goals 复制链接链接已复制到粘贴板!
Supported goals are predefined and available to use for generating Cruise Control optimization proposals. Goals not listed as supported goals cannot be used in Cruise Control operations. Some supported goals are preset as hard goals.
Configure supported goals in cruisecontrol.properties:
- To modify supported goals, specify the goals in the goals property. You can adjust the priority order in the goals configuration.
- You must specify at least one supported goal.
15.1.1.2. Hard and soft goals 复制链接链接已复制到粘贴板!
Hard goals must be satisfied for optimization proposals to be generated. Soft goals are best-effort objectives that Cruise Control tries to meet after all hard goals are satisfied. The classification of hard and soft goals is fixed in Cruise Control code and cannot be changed.
Cruise Control first prioritizes satisfying hard goals, and then addresses soft goals in the order they are listed. A proposal meeting all hard goals is valid, even if it violates some soft goals.
For example, a soft goal might be to evenly distribute a topic’s replicas. Cruise Control continues to generate an optimization proposal even if the soft goal isn’t completely satisfied.
Configure hard goals in cruisecontrol.properties:
- To modify hard goals, specify a subset of supported goals in the hard.goals property. You can adjust the priority order in the hard goals configuration.
- To exclude a hard goal, ensure it’s not in either default.goals or hard.goals.
Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating optimization proposals.
15.1.1.3. Default goals
Cruise Control uses default goals to generate an optimization proposal. Default goals must be a subset of the supported optimization goals.
The optimization proposal based on the default goals list is then generated and cached.
Configure default goals in cruisecontrol.properties:
- To modify default goals, specify a subset of supported goals in the default.goals property. You can adjust the priority order in the default goals configuration.
- You must specify at least one default goal.
15.1.1.4. Intra-broker goals
Cruise Control uses intra-broker goals to balance data between disks on the same broker, which is useful for deployments with JBOD storage and multiple disks.
Configure intra-broker goals in cruisecontrol.properties:
- To modify intra-broker goals, list the supported goals in the intra.broker.goals property. You can adjust the priority order in the intra-broker goals configuration.
15.1.1.5. Proposal-specific goals
Proposal-specific optimization goals support the creation of optimization proposals based on a specific list of goals. If proposal-specific goals are not set, the default goals are used.
Specify proposal-specific goals at runtime as a subset of supported optimization goals for customization.
For example, you can optimize topic leader replica distribution across the Kafka cluster without considering disk capacity or utilization by defining a single proposal-specific goal.
When specifying proposal-specific goals, include all configured hard goals, or an error occurs.
To ignore the configured hard goals in an optimization proposal, add the skip_hard_goal_check=true parameter to the request.
15.1.1.6. Goals order of priority
Unless you change the configuration, Streams for Apache Kafka inherits goals from Cruise Control.
The following list shows supported goals inherited by Streams for Apache Kafka from Cruise Control in descending priority order. Goals labeled as hard are mandatory constraints that must be satisfied for optimization proposals.
- RackAwareGoal (hard)
- MinTopicLeadersPerBrokerGoal (hard)
- ReplicaCapacityGoal (hard)
- DiskCapacityGoal (hard)
- NetworkInboundCapacityGoal (hard)
- NetworkOutboundCapacityGoal (hard)
- CpuCapacityGoal (hard)
- ReplicaDistributionGoal
- PotentialNwOutGoal
- DiskUsageDistributionGoal
- NetworkInboundUsageDistributionGoal
- NetworkOutboundUsageDistributionGoal
- CpuUsageDistributionGoal
- TopicReplicaDistributionGoal
- LeaderReplicaDistributionGoal
- LeaderBytesInDistributionGoal
- PreferredLeaderElectionGoal
- IntraBrokerDiskCapacityGoal (hard)
- IntraBrokerDiskUsageDistributionGoal
Resource distribution goals are subject to capacity limits on broker resources.
For more information on each optimization goal, see Goals in the Cruise Control Wiki.
"Write your own" goals and Kafka assigner goals are not supported.
Example configuration for default and hard goals
default.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal
hard.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal
Ensure that the supported goals, default.goals, and (unless skip_hard_goal_check is set to true) proposal-specific goals include all hard goals specified in hard.goals. Otherwise, errors occur when generating optimization proposals.
Example request with proposal-specific goals
curl -v -X POST 'http://<cc_host>:<cc_port>/kafkacruisecontrol/rebalance?goals=RackAwareGoal,ReplicaCapacityGoal,ReplicaDistributionGoal&skip_hard_goal_check=true'
15.1.1.7. Skipping hard goal checks
If skip_hard_goal_check=true is specified in a request, Cruise Control does not verify that the proposal-specific goals include all the configured hard goals. This allows for more flexibility in generating optimization proposals, but may lead to proposals that do not satisfy all hard goals.
However, any hard goals included in the proposal-specific goals are still treated as hard goals by Cruise Control, even with skip_hard_goal_check=true.
15.1.2. Optimization proposals
Optimization proposals are summaries of proposed changes based on the defined optimization goals, assessed in a specific order of priority. You can approve or reject proposals and rerun them with adjusted goals if needed.
With Cruise Control deployed for use in Streams for Apache Kafka, the process to generate and approve an optimization proposal is as follows:
- Make a request to generate an optimization proposal. This request triggers Cruise Control to initiate the optimization proposal generation process.
- A Cruise Control Metrics Reporter runs in every Kafka broker, collecting raw metrics and publishing them to a dedicated Kafka topic (__CruiseControlMetrics). Metrics for brokers, topics, and partitions are aggregated, sampled, and stored in other topics automatically created when Cruise Control is deployed.
- Load Monitor collects, processes, and stores the metrics as a workload model, including CPU, disk, and network utilization data, which is used by the Analyzer and Anomaly Detector.
- Anomaly Detector continuously monitors the health and performance of the Kafka cluster, checking for issues such as broker failures or disk capacity problems that could impact cluster stability.
- Analyzer creates optimization proposals based on the workload model from the Load Monitor. Based on configured goals and capacities, it generates an optimization proposal for balancing partitions across brokers. Through the REST API, a summary of the proposal is returned in the response to the request.
- The optimization proposal is approved or rejected based on its alignment with cluster management goals.
- If approved, the Executor applies the optimization proposal to rebalance the Kafka cluster. This involves reassigning partitions and redistributing workload across brokers according to the approved proposal.
Cruise Control optimization process
Optimization proposals comprise a list of partition reassignment mappings. When you approve a proposal, the Cruise Control server applies these partition reassignments to the Kafka cluster.
A partition reassignment consists of either of the following types of operations:
- Partition movement: Involves transferring the partition replica and its data to a new location. Partition movements can take one of two forms:
- Inter-broker movement: The partition replica is moved to a log directory on a different broker.
- Intra-broker movement: The partition replica is moved to a different log directory on the same broker.
- Leadership movement: Involves switching the leader of the partition’s replicas.
Cruise Control issues partition reassignments to the Kafka cluster in batches. The performance of the cluster during the rebalance is affected by the number and magnitude of each type of movement contained in each batch.
15.1.2.1. Rebalancing endpoints
Proposals for rebalances can be generated by making a request to one of three endpoints.
- /rebalance endpoint
- A request to this endpoint runs a full rebalance by moving replicas across all the brokers in the cluster.
- /add_broker endpoint
- This endpoint is used after scaling up a Kafka cluster by adding one or more brokers. Normally, after scaling up a Kafka cluster, new brokers are used to host only the partitions of newly created topics. If no new topics are created, the newly added brokers are not used and the existing brokers remain under the same load. By using the add_broker endpoint immediately after adding brokers to the cluster, the rebalancing operation moves replicas from existing brokers to the newly added brokers. You specify the new brokers in the request as a list of broker IDs.
- /remove_broker endpoint
- This endpoint is used before scaling down a Kafka cluster by removing one or more brokers. The operation moves replicas off the brokers that are going to be removed. When these brokers are not hosting replicas anymore, you can safely run the scaling down operation. You specify the brokers you’re removing as a list of broker IDs.
In general, use the full rebalance endpoint to rebalance a Kafka cluster by spreading the load across brokers. Use the add_broker and remove_broker endpoints only if you want to scale your cluster up or down and rebalance the replicas accordingly.
The procedure to run a rebalance is the same for all three endpoints. The only difference is specifying the endpoint in the request and, if needed, listing the brokers that have been added or will be removed.
15.1.2.2. The results of an optimization proposal
When an optimization proposal is generated, a summary of the changes is returned.
The summary is returned in the response to an HTTP request through the Cruise Control API. It provides an overview of the full optimization proposal, indicating the proposed scope of the cluster rebalance and the scale of the changes involved.
When you make a POST request to the /rebalance endpoint, an optimization proposal summary is returned in the response.
Returning an optimization proposal summary
curl -v -X POST 'http://<cc_host>:<cc_port>/kafkacruisecontrol/rebalance'
Use the summary to decide whether to approve or reject an optimization proposal.
- Approving an optimization proposal
- You approve the optimization proposal by making a POST request to the /rebalance endpoint and setting the dryrun parameter to false (default true). Cruise Control applies the proposal to the Kafka cluster and starts a cluster rebalance operation.
- Rejecting an optimization proposal
- If you choose not to approve an optimization proposal, you can change the optimization goals or update any of the rebalance performance tuning options, and then generate another proposal. You can resend a request without the dryrun parameter to generate a new optimization proposal.
Use optimization proposals to assess the movements required for a rebalance. For example, a summary describes inter-broker and intra-broker movements. Inter-broker rebalancing moves data between separate brokers. Intra-broker rebalancing moves data between disks on the same broker when you are using a JBOD storage configuration. Such information can be useful even if you don’t go ahead and approve the proposal.
You might reject an optimization proposal, or delay its approval, because of the additional load on a Kafka cluster when rebalancing. If the proposal is delayed for too long, the cluster load may change significantly, so it may be better to request a new proposal.
In the following example, the proposal suggests the rebalancing of data between separate brokers. The rebalance involves the movement of 55 partition replicas, totaling 12MB of data, across the brokers. The proposal will also move 24 partition leaders to different brokers. This requires a change to the cluster metadata, which has a low impact on performance.
The balancedness scores are measurements of the overall balance of the Kafka cluster before and after the optimization proposal is approved. A balancedness score is based on optimization goals. If all goals are satisfied, the score is 100. The score is reduced for each goal that will not be met. Compare the balancedness scores to see whether the Kafka cluster is less balanced than it could be following a rebalance.
Example optimization proposal summary
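The original summary output is not reproduced here. Purely as an illustration, and assuming the default plain-text response format, a summary for the rebalance described above might look similar to the following; field names, score values, and exact wording vary by Cruise Control version:

Optimization has 55 inter-broker replica (12 MB) moves, 0 intra-broker replica (0 MB) moves and 24 leadership moves with a cluster model of 15 recent windows and 100.000% of the partitions covered.
Excluded Topics: [].
Provision Status: RIGHT_SIZED.
On-demand Balancedness Score Before (78.01) After (82.29).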
Though the inter-broker movement of partition replicas has a high impact on performance, the total amount of data is not large. If the total data were much larger, you could reject the proposal, or delay approving the rebalance until a time when the impact on the performance of the Kafka cluster is limited.
15.1.2.3. Provision status
The provision status indicates whether the current cluster configuration supports the optimization goals. Check the provision status to see if you should add or remove brokers.
| Status | Description |
|---|---|
| RIGHT_SIZED | The cluster has an appropriate number of brokers to satisfy the optimization goals. |
| UNDER_PROVISIONED | The cluster is under-provisioned and requires more brokers to satisfy the optimization goals. |
| OVER_PROVISIONED | The cluster is over-provisioned and requires fewer brokers to satisfy the optimization goals. |
| UNDECIDED | The status is not relevant or it has not yet been decided. |
15.1.2.4. Optimization proposal summary properties
The following table explains the properties contained in the optimization proposal’s summary.
| Property | Description |
|---|---|
| n inter-broker replica (y MB) moves | <n>: The number of partition replicas that will be moved between separate brokers. Performance impact during rebalance operation: relatively high. <y> MB: The sum of the size of each partition replica that will be moved to a separate broker. Performance impact during rebalance operation: variable. The larger the number of MBs, the longer the cluster rebalance will take to complete. |
| n intra-broker replica (y MB) moves | <n>: The total number of partition replicas that will be transferred between the disks of the cluster’s brokers. Performance impact during rebalance operation: relatively high, but less than that of inter-broker replica moves. <y> MB: The sum of the size of each partition replica that will be moved between disks on the same broker. Performance impact during rebalance operation: variable. The larger the number, the longer the cluster rebalance will take to complete. Moving a large amount of data between disks on the same broker has less impact than between separate brokers. |
| n excluded topics | The number of topics excluded from the calculation of partition replica/leader movements in the optimization proposal. You can exclude topics in one of the following ways: specify a regular expression of topic names in the Cruise Control properties file, or supply the excluded_topics parameter in a POST request to the /rebalance endpoint. Topics that match the regular expression are listed in the response and excluded from the cluster rebalance. |
| n leadership moves | <n>: The number of partitions whose leaders will be switched to different replicas. Performance impact during rebalance operation: relatively low. |
| n recent windows | <n>: The number of metrics windows upon which the optimization proposal is based. |
| n% of the partitions covered | <n>%: The percentage of partitions in the Kafka cluster covered by the optimization proposal. |
| Balancedness scores (Before/After) | Measurements of the overall balance of a Kafka cluster. Cruise Control assigns a balancedness score to the cluster based on the optimization goals. The Before score is based on the current cluster configuration; the After score is based on the generated optimization proposal. |
15.1.2.5. Adjusting the cached proposal refresh rate
Cruise Control maintains a cached optimization proposal based on the configured default optimization goals. This proposal is generated from the workload model and updated every 15 minutes to reflect the current state of the Kafka cluster. When you generate an optimization proposal using the default goals, Cruise Control returns the latest cached version.
For clusters with rapidly changing workloads, you may want to shorten the refresh interval to ensure the optimization proposal reflects the most recent state. However, reducing the interval increases the load on the Cruise Control server. To adjust the refresh rate, modify the proposal.expiration.ms setting in the Cruise Control deployment configuration.
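For example, a minimal sketch of shortening the refresh interval to 5 minutes in cruisecontrol.properties; the value shown is an assumption for illustration only:

# Refresh the cached optimization proposal every 5 minutes (default is 15 minutes)
proposal.expiration.ms=300000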
15.1.3. Tuning options for rebalances
Configuration options allow you to fine-tune cluster rebalance performance. These settings control the movement of partition replicas and leadership, as well as the bandwidth allocated for rebalances.
15.1.3.1. Selecting replica movement strategies
Cluster rebalance performance is also influenced by the replica movement strategy that is applied to the batches of partition reassignment commands. By default, Cruise Control uses the BaseReplicaMovementStrategy, which applies the reassignments in the order they were generated. With this strategy, large partition reassignments that happen to be generated and ordered first can delay all the other partition reassignments in the batch.
Cruise Control provides four alternative replica movement strategies that can be applied to optimization proposals:
- PrioritizeSmallReplicaMovementStrategy: Reassign smaller partitions first.
- PrioritizeLargeReplicaMovementStrategy: Reassign larger partitions first.
- PostponeUrpReplicaMovementStrategy: Prioritize partitions without out-of-sync replicas.
- PrioritizeMinIsrWithOfflineReplicasStrategy: Prioritize reassignments for partitions at or below their minimum in-sync replicas (MinISR) with offline replicas. Set concurrency.adjuster.min.isr.check.enabled in the Cruise Control configuration to enable this strategy.
These strategies can be configured as a sequence. The first strategy attempts to compare two partition reassignments using its internal logic. If the reassignments are equivalent, then it passes them to the next strategy in the sequence to decide the order, and so on.
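As an illustration, a strategy sequence might be configured in cruisecontrol.properties as follows. The property name default.replica.movement.strategies and the com.linkedin.kafka.cruisecontrol.executor.strategy package prefix are assumptions based on upstream Cruise Control, not taken from this document, so verify them against your installation:

# Try small replicas first; for ties, postpone under-replicated partitions
default.replica.movement.strategies=com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeSmallReplicaMovementStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.PostponeUrpReplicaMovementStrategy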
15.1.3.2. Rebalance tuning options
You can set the following rebalance tuning options when configuring Cruise Control or requesting individual rebalances. Set the tuning options using one of the following methods:
- Properties in the cruisecontrol.properties file
- Parameters in POST requests to the /rebalance endpoint
The relevant configurations for both methods are summarized in the following table.
| Cruise Control properties | Rebalance endpoint parameters | Default | Description |
|---|---|---|---|
| num.concurrent.partition.movements.per.broker | concurrent_partition_movements_per_broker | 5 | The maximum number of inter-broker partition movements in each partition reassignment batch |
| num.concurrent.intra.broker.partition.movements | concurrent_intra_broker_partition_movements | 2 | The maximum number of intra-broker partition movements in each partition reassignment batch |
| num.concurrent.leader.movements | concurrent_leader_movements | 1000 | The maximum number of partition leadership changes in each partition reassignment batch |
| default.replication.throttle | replication_throttle | Null (no limit) | The bandwidth (in bytes per second) to assign to partition reassignment |
| default.replica.movement.strategies | replica_movement_strategies | BaseReplicaMovementStrategy | The list of strategies (in priority order) used to determine the order in which partition reassignment commands are executed for generated proposals. For the available strategies, see Section 15.1.3.1, “Selecting replica movement strategies”. |
Changing the default settings affects the length of time that the rebalance takes to complete, as well as the load placed on the Kafka cluster during the rebalance. Using lower values reduces the load but increases the amount of time taken, and vice versa.
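For example, a hedged sketch of passing tuning options on a single rebalance request, using the endpoint parameters from the table above; the values are illustrative only:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance?concurrent_partition_movements_per_broker=2&concurrent_leader_movements=500&replication_throttle=10000000'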
15.2. Downloading Cruise Control
A ZIP file distribution of Cruise Control is available for download from the Red Hat website. You can download the latest version of Red Hat Streams for Apache Kafka from the Streams for Apache Kafka software downloads page.
Procedure
- Download the latest version of the Red Hat Streams for Apache Kafka Cruise Control archive from the Red Hat Customer Portal.
Create the /opt/cruise-control directory:

sudo mkdir /opt/cruise-control

Extract the contents of the Cruise Control ZIP file to the new directory:

unzip amq-streams-<version>-cruise-control-bin.zip -d /opt/cruise-control

Change the ownership of the /opt/cruise-control directory to the Kafka user:

sudo chown -R kafka:kafka /opt/cruise-control
15.3. Deploying the Cruise Control Metrics Reporter
Before starting Cruise Control, you must configure the Kafka brokers to use the provided Cruise Control Metrics Reporter. The file for the Metrics Reporter is supplied with the Streams for Apache Kafka installation artifacts.
When loaded at runtime, the Metrics Reporter sends metrics to the __CruiseControlMetrics topic, one of three auto-created topics. Cruise Control uses these metrics to create and update the workload model and to calculate optimization proposals.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
Procedure
For each broker in the Kafka cluster, one at a time:
Stop the Kafka broker:
./bin/kafka-server-stop.sh
Edit the Kafka configuration properties file to configure the Cruise Control Metrics Reporter.

Add the CruiseControlMetricsReporter class to the metric.reporters configuration option. Do not remove any existing Metrics Reporters.

metric.reporters=com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter

Add the following configuration options and values:

cruise.control.metrics.topic.auto.create=true
cruise.control.metrics.topic.num.partitions=1
cruise.control.metrics.topic.replication.factor=1

These options enable the Cruise Control Metrics Reporter to create the __CruiseControlMetrics topic with a log cleanup policy of DELETE. For more information, see Auto-created topics and Configuring logging and cleanup policy.
Configure SSL, if required.
In the Kafka configuration properties file, configure SSL between the Cruise Control Metrics Reporter and the Kafka broker by setting the relevant client configuration properties.
The Metrics Reporter accepts all standard producer-specific configuration properties with the cruise.control.metrics.reporter prefix. For example: cruise.control.metrics.reporter.ssl.truststore.password.

In the Cruise Control properties file (./cruise-control/config/cruisecontrol.properties), configure SSL between the Kafka broker and the Cruise Control server by setting the relevant client configuration properties.

Cruise Control inherits SSL client property options from Kafka and uses those properties for all Cruise Control server clients.
Restart the Kafka broker:
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
For information on restarting brokers in a multi-node cluster, see Section 3.7, “Performing a graceful rolling restart of Kafka brokers”.
- Repeat steps 1-5 for the remaining brokers.
15.4. Installing Cruise Control
Configure the properties used by Cruise Control and then start the Cruise Control server using the kafka-cruise-control-start.sh script. The server is hosted on a single machine for the whole Kafka cluster.
Three topics are auto-created when Cruise Control starts. For more information, see Section 15.4.1, “Auto-created topics”.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- You have downloaded Cruise Control.
- You have deployed the Cruise Control Metrics Reporter.
Procedure
- Edit the Cruise Control properties file (./cruise-control/config/cruisecontrol.properties). Configure the properties shown in the following example configuration:
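The full example configuration is not reproduced here. The following is a minimal sketch assembled from the numbered notes below; the property names and values are assumptions based on upstream Cruise Control defaults, so check them against your own cruisecontrol.properties file. The numbers in the comments refer to the notes that follow.

# (1) Kafka broker host and port
bootstrap.servers=kafka0.my-cluster.example.com:9092
# (2) Replication factor of the metric sample store topic
sample.store.topic.replication.factor=2
# (3) Capacity configuration file
capacity.config.file=config/capacityJBOD.json
# (4) Default optimization goals (abbreviated list of FQDNs)
default.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal
# (5) Supported optimization goals (abbreviated list of FQDNs)
goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal
# (6) Hard goals (abbreviated list of FQDNs)
hard.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal
# (7) Cached proposal refresh interval in milliseconds
proposal.expiration.ms=60000
# (8) Use the Kafka API to detect broker failures
kafka.broker.failure.detection.enable=true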
1. Host and port numbers of the Kafka broker (always port 9092).
2. Replication factor of the Kafka metric sample store topic. If you are evaluating Cruise Control in a single-node Kafka cluster, set this property to 1. For production use, set this property to 2 or more.
3. The configuration file that sets the maximum capacity limits for broker resources. Use the file that applies to your Kafka deployment configuration. For more information, see Capacity configuration.
4. Comma-separated list of default optimization goals, using fully-qualified domain names (FQDNs). A number of supported optimization goals (see 5) are already set as default optimization goals; you can add or remove goals if desired.
5. Comma-separated list of supported optimization goals, using FQDNs. To completely exclude goals from being used to generate optimization proposals, remove them from the list.
6. Comma-separated list of hard goals, using FQDNs. Seven of the supported optimization goals are already set as hard goals; you can add or remove goals if desired.
7. The interval, in milliseconds, for refreshing the cached optimization proposal that is generated from the default optimization goals.
8. Enables Cruise Control to use the Kafka API to detect broker failures.
Start the Cruise Control server. The server starts on port 9090 by default; optionally, specify a different port.

cd ./cruise-control/
./kafka-cruise-control-start.sh config/cruisecontrol.properties <port_number>

To verify that Cruise Control is running, send a GET request to the /state endpoint of the Cruise Control server:

curl -X GET 'http://<cc_host>:<cc_port>/kafkacruisecontrol/state'
15.4.1. Auto-created topics
The following table shows the three topics that are automatically created when Cruise Control starts. These topics are required for Cruise Control to work properly and must not be deleted or changed.
| Auto-created topic | Created by | Function |
|---|---|---|
| __CruiseControlMetrics | Cruise Control Metrics Reporter | Stores the raw metrics from the Metrics Reporter in each Kafka broker. |
| __KafkaCruiseControlPartitionMetricSamples | Cruise Control | Stores the derived metrics for each partition. These are created by the Metric Sample Aggregator. |
| __KafkaCruiseControlModelTrainingSamples | Cruise Control | Stores the metrics samples used to create the Cluster Workload Model. |
To ensure that log compaction is disabled in the auto-created topics, make sure that you configure the Cruise Control Metrics Reporter as described in Section 15.3, “Deploying the Cruise Control Metrics Reporter”. Log compaction can remove records that are needed by Cruise Control and prevent it from working properly.
15.5. Configuring capacity limits
Cruise Control uses capacity limits to determine if certain resource-based optimization goals are being broken. An attempted optimization fails if one or more of these resource-based goals is set as a hard goal and then broken. This prevents the optimization from being used to generate an optimization proposal.
You specify capacity limits for Kafka broker resources in one of the following three .json files in cruise-control/config:
-
capacityJBOD.json: For use in JBOD Kafka deployments (the default file). -
capacity.json: For use in non-JBOD Kafka deployments where each broker has the same number of CPU cores. -
capacityCores.json: For use in non-JBOD Kafka deployments where each broker has varying numbers of CPU cores.
Set the file in the capacity.config.file property in cruisecontrol.properties. The selected file will be used for broker capacity resolution. For example:
capacity.config.file=config/capacityJBOD.json
Capacity limits can be set for the following broker resources in the described units:
-
DISK: Disk storage in MB -
CPU: CPU utilization as a percentage (0-100) or as a number of cores -
NW_IN: Inbound network throughput in KB per second -
NW_OUT: Outbound network throughput in KB per second
To apply the same capacity limits to every broker monitored by Cruise Control, set capacity limits for broker ID -1. To set different capacity limits for individual brokers, specify each broker ID and its capacity configuration.
Example capacity limits configuration
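The example itself is not shown above. The following is a minimal sketch of a capacity file that applies the same limits to every broker (broker ID -1), using the resource units described earlier. The JSON layout follows the upstream capacity.json format and should be treated as an assumption to verify against the sample files shipped in cruise-control/config:

{
  "brokerCapacities": [
    {
      "brokerId": "-1",
      "capacity": {
        "DISK": "100000",
        "CPU": "100",
        "NW_IN": "10000",
        "NW_OUT": "10000"
      },
      "doc": "Default capacity limits applied to all brokers: DISK in MB, CPU as a percentage, network throughput in KB/s."
    }
  ]
}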
For more information, see Populating the Capacity Configuration File in the Cruise Control Wiki.
15.6. Configuring logging and cleanup policy
Cruise Control uses log4j 1 for all server logging. To change the default configuration, edit the ./cruise-control/config/log4j.properties file.
You must restart the Cruise Control server before the changes take effect.
It is important that the auto-created __CruiseControlMetrics topic (see auto-created topics) has a log cleanup policy of DELETE rather than COMPACT. Otherwise, records that are needed by Cruise Control might be removed.
As described in Section 15.3, “Deploying the Cruise Control Metrics Reporter”, setting the following options in the Kafka configuration file ensures that the DELETE log cleanup policy is correctly set for the metrics topic:
-
cruise.control.metrics.topic.auto.create=true -
cruise.control.metrics.topic.num.partitions=1 -
cruise.control.metrics.topic.replication.factor=1
If topic auto-creation is disabled in the Cruise Control Metrics Reporter (cruise.control.metrics.topic.auto.create=false), but enabled in the Kafka cluster, then the __CruiseControlMetrics topic is still automatically created by the broker. In this case, you must change the log cleanup policy of the __CruiseControlMetrics topic to DELETE using the kafka-configs.sh tool.
Get the current configuration of the __CruiseControlMetrics topic:

./bin/kafka-configs.sh --bootstrap-server <broker_address> --entity-type topics --entity-name __CruiseControlMetrics --describe

Change the log cleanup policy in the topic configuration:

./bin/kafka-configs.sh --bootstrap-server <broker_address> --entity-type topics --entity-name __CruiseControlMetrics --alter --add-config cleanup.policy=delete
If topic auto-creation is disabled in both the Cruise Control Metrics Reporter and the Kafka cluster, you must create the __CruiseControlMetrics topic manually and then configure it to use the DELETE log cleanup policy using the kafka-configs.sh tool.
For more information, see Section 9.9, “Modifying a topic configuration”.
15.7. Generating optimization proposals
When you make a POST request to the /rebalance endpoint, Cruise Control generates an optimization proposal to rebalance the Kafka cluster based on the optimization goals provided. You can use the results of the optimization proposal to rebalance your Kafka cluster.
You can run the optimization proposal using one of the following endpoints:
-
/rebalance -
/add_broker -
/remove_broker
The endpoint you use depends on whether you are rebalancing across all the brokers already running in the Kafka cluster; or you want to rebalance after adding brokers (scaling up) or before removing brokers (scaling down).
The optimization proposal is generated as a dry run, unless the dryrun parameter is supplied and set to false. In "dry run mode", Cruise Control generates the optimization proposal and the estimated result, but doesn’t initiate the proposal by rebalancing the cluster.
You can analyze the information returned in the optimization proposal and decide whether to approve it.
Use the following parameters to make requests to the endpoints:
dryrun
type: boolean, default: true
Informs Cruise Control whether you want to generate an optimization proposal only (true), or generate an optimization proposal and perform a cluster rebalance (false).
When dryrun=true (the default), you can also pass the verbose parameter to return more detailed information about the state of the Kafka cluster. This includes metrics for the load on each Kafka broker before and after the optimization proposal is applied, and the differences between the before and after values.
excluded_topics
type: regex
A regular expression that matches the topics to exclude from the calculation of the optimization proposal.
goals
type: list of strings, default: the configured default.goals list
List of user-provided optimization goals to use to prepare the optimization proposal. If goals are not supplied, the configured default.goals list in the cruisecontrol.properties file is used.
skip_hard_goal_check
type: boolean, default: false
By default, Cruise Control checks that the user-provided optimization goals (in the goals parameter) contain all the configured hard goals (in hard.goals). A request fails if the goals you supply do not include all of the configured hard.goals.
Set skip_hard_goal_check to true if you want to generate an optimization proposal with user-provided optimization goals that do not include all the configured hard.goals.
json
type: boolean, default: false
Controls the type of response returned by the Cruise Control server. If not supplied, or set to false, then Cruise Control returns text formatted for display on the command line. If you want to extract elements of the returned information programmatically, set json=true. This will return JSON formatted text that can be piped to tools such as jq, or parsed in scripts and programs.
verbose
type: boolean, default: false
Controls the level of detail in responses that are returned by the Cruise Control server. Can be used with dryrun=true.
Other parameters are available. For more information, see REST APIs in the Cruise Control Wiki.
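For example, to work with the response programmatically, you might request JSON output and pipe it through jq. This is a sketch only; the exact JSON fields returned depend on the endpoint and the Cruise Control version:

curl -s -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance?json=true' | jq .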
Prerequisites
- Kafka is running.
- You have configured Cruise Control.
- (Optional for scaling up) You have installed new brokers on hosts to include in the rebalance.
Procedure
Generate an optimization proposal using a POST request to the
/rebalance, /add_broker, or /remove_broker endpoint.

Example request to /rebalance using default goals:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance'

The cached optimization proposal is immediately returned.
Note: If NotEnoughValidWindows is returned, Cruise Control has not yet recorded enough metrics data to generate an optimization proposal. Wait a few minutes and then resend the request.

Example request to /rebalance using specified goals:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance?goals=RackAwareGoal,ReplicaCapacityGoal'

If the request satisfies the supplied goals, the cached optimization proposal is immediately returned. Otherwise, a new optimization proposal is generated using the supplied goals; this takes longer to calculate. You can enforce this behavior by adding the ignore_proposal_cache=true parameter to the request.

Example request to /rebalance using specified goals without hard goals:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance?goals=RackAwareGoal,ReplicaCapacityGoal,ReplicaDistributionGoal&skip_hard_goal_check=true'

Example request to /add_broker that includes specified brokers:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/add_broker?brokerid=3,4'

The request includes the IDs of the new brokers only. For example, this request adds brokers with the IDs 3 and 4. Replicas are moved to the new brokers from existing brokers when rebalancing.

Example request to /remove_broker that excludes specified brokers:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/remove_broker?brokerid=3,4'

The request includes the IDs of the brokers being excluded only. For example, this request excludes brokers with the IDs 3 and 4. Replicas are moved from the brokers being removed to other existing brokers when rebalancing.

Note: If a broker that is being removed has excluded topics, replicas are still moved.
Review the optimization proposal contained in the response. The properties describe the pending cluster rebalance operation.
The proposal contains a high level summary of the proposed optimization, followed by summaries for each default optimization goal, and the expected cluster state after the proposal has executed.
Pay particular attention to the following information:
- The Cluster load after rebalance summary. If it meets your requirements, you should assess the impact of the proposed changes using the high level summary.
- n inter-broker replica (y MB) moves indicates how much data will be moved across the network between brokers. The higher the value, the greater the potential performance impact on the Kafka cluster during the rebalance.
- n intra-broker replica (y MB) moves indicates how much data will be moved within the brokers themselves (between disks). The higher the value, the greater the potential performance impact on individual brokers (although less than that of n inter-broker replica (y MB) moves).
- The number of leadership moves. This has a negligible impact on the performance of the cluster during the rebalance.
Asynchronous responses
The Cruise Control REST API endpoints time out after 10 seconds by default, although proposal generation continues on the server. A timeout might occur if the most recent cached optimization proposal is not ready, or if user-provided optimization goals were specified with ignore_proposal_cache=true.
To allow you to retrieve the optimization proposal at a later time, take note of the request’s unique identifier, which is given in the header of responses from the /rebalance endpoint.
To obtain the response using curl, specify the verbose (-v) option:
curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance'
Here is an example header:

User-Task-ID: 274b8095-d739-4840-85b9-f4cfaaf5c201

If an optimization proposal is not ready within the timeout, you can resubmit the POST request, this time including the User-Task-ID of the original request in the header:
curl -v -X POST -H 'User-Task-ID: 274b8095-d739-4840-85b9-f4cfaaf5c201' 'cruise-control-server:9090/kafkacruisecontrol/rebalance'
What to do next
15.8. Approving optimization proposals
If you are satisfied with your most recently generated optimization proposal, you can instruct Cruise Control to initiate a cluster rebalance and begin reassigning partitions.
Leave as little time as possible between generating an optimization proposal and initiating the cluster rebalance. If some time has passed since you generated the original optimization proposal, the cluster state might have changed. Therefore, the cluster rebalance that is initiated might be different to the one you reviewed. If in doubt, first generate a new optimization proposal.
Only one cluster rebalance, with a status of "Active", can be in progress at a time.
Prerequisites
- You have generated an optimization proposal from Cruise Control.
Procedure
Send a POST request to the
/rebalance, /add_broker, or /remove_broker endpoint with the dryrun=false parameter.

If you used the /add_broker or /remove_broker endpoint to generate a proposal that included or excluded brokers, use the same endpoint to perform the rebalance with or without the specified brokers.

Example request to /rebalance:

curl -X POST 'cruise-control-server:9090/kafkacruisecontrol/rebalance?dryrun=false'

Example request to /add_broker:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/add_broker?dryrun=false&brokerid=3,4'

Example request to /remove_broker:

curl -v -X POST 'cruise-control-server:9090/kafkacruisecontrol/remove_broker?dryrun=false&brokerid=3,4'

Cruise Control initiates the cluster rebalance and returns the optimization proposal.
- Check the changes that are summarized in the optimization proposal. If the changes are not what you expect, you can stop the rebalance.
Check the progress of the cluster rebalance using the
/user_tasks endpoint. The cluster rebalance in progress has a status of "Active".

To view all cluster rebalance tasks executed on the Cruise Control server:

curl 'cruise-control-server:9090/kafkacruisecontrol/user_tasks'

To view the status of a particular cluster rebalance task, supply the user_task_ids parameter and the task ID:

curl 'cruise-control-server:9090/kafkacruisecontrol/user_tasks?user_task_ids=c459316f-9eb5-482f-9d2d-97b5a4cd294d'
(Optional) Removing brokers when scaling down
After a successful rebalance you can stop any brokers you excluded in order to scale down the Kafka cluster.
Check that each broker being removed does not have any live partitions in its log (
log.dirs):

ls -l <LogDir> | grep -E '^d' | grep -vE '[a-zA-Z0-9.-]+\.[a-z0-9]+-delete$'

If the output includes any log directories that do not match the -delete$ naming pattern, active partitions are still present. If you have active partitions, check that the rebalance has finished, or check the configuration for the optimization proposal. You can run the proposal again. Make sure that there are no active partitions before moving on to the next step.

Stop the broker:
./bin/kafka-server-stop.sh
Confirm that the broker has stopped:
jcmd | grep kafka
15.9. Stopping rebalances
You can stop the cluster rebalance that is currently in progress.
This instructs Cruise Control to finish the current batch of partition reassignments and then stop the rebalance. When the rebalance has stopped, completed partition reassignments have already been applied; therefore, the state of the Kafka cluster is different when compared to before the start of the rebalance operation. If further rebalancing is required, you should generate a new optimization proposal.
The performance of the Kafka cluster in the intermediate (stopped) state might be worse than in the initial state.
Prerequisites
- A cluster rebalance is in progress (indicated by a status of "Active").
Procedure
Send a POST request to the
/stop_proposal_execution endpoint:

curl -X POST 'cruise-control-server:9090/kafkacruisecontrol/stop_proposal_execution'
15.10. Changing topic replication factor
Make requests to the /topic_configuration endpoint of the Cruise Control REST API to modify topic configurations, including the replication factor.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- You have configured Cruise Control.
- You have deployed the Cruise Control Metrics Reporter.
Procedure
Start the Cruise Control server. The server starts on port 9090 by default; optionally, specify a different port.

cd ./cruise-control/
./kafka-cruise-control-start.sh config/cruisecontrol.properties <port_number>

To verify that Cruise Control is running, send a GET request to the /state endpoint of the Cruise Control server:

curl -X GET 'http://<cc_host>:<cc_port>/kafkacruisecontrol/state'

Run the bin/kafka-topics.sh command with the --describe option to check the current replication factor of the target topic:

./bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --topic <topic_name> \
  --describe

Update the replication factor for the topic:

curl -X POST 'http://<cc_host>:<cc_port>/kafkacruisecontrol/topic_configuration?topic=<topic_name>&replication_factor=<new_replication_factor>&dryrun=false'

For example,
curl -X POST 'localhost:9090/kafkacruisecontrol/topic_configuration?topic=topic1&replication_factor=3&dryrun=false'.

Run the bin/kafka-topics.sh command with the --describe option, as before, to see the results of the change to the topic.
15.11. Reassigning partitions between JBOD disks on the same broker
If you are using JBOD storage and have installed Cruise Control, you can reassign partitions and move data between the JBOD disks used for storage on the same broker. This capability also allows you to remove JBOD disks without data loss.
Use the Kafka kafka-log-dirs.sh tool to check information about Kafka topic partitions and their location on brokers before and after moving them.
Make requests to the remove_disks endpoint of the Cruise Control REST API to demote a disk in the cluster and reassign its partitions to other disk volumes.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- You have configured Cruise Control.
- You have deployed the Cruise Control Metrics Reporter.
- Kafka brokers use JBOD storage.
- More than one JBOD disk must be configured on the broker.
In this procedure, we use a broker configured with three JBOD volumes and a topic replication factor of three.
Example broker configuration with JBOD storage
node.id=1
process.roles=broker
default.replication.factor = 3
log.dirs = /var/lib/kafka/data-0,/var/lib/kafka/data-1,/var/lib/kafka/data-2
# ...
In the procedure, we reassign partitions for broker 1 from volume 0 to volumes 1 and 2.
Procedure
Start the Cruise Control server. The server starts on port 9090 by default; optionally, specify a different port.

cd ./cruise-control/
./kafka-cruise-control-start.sh config/cruisecontrol.properties <port_number>

To verify that Cruise Control is running, send a GET request to the /state endpoint of the Cruise Control server:

curl -X GET 'http://<cc_host>:<cc_port>/kafkacruisecontrol/state'

(Optional) Check the partition replica data on the broker by using the Kafka kafka-log-dirs.sh tool:

kafka-log-dirs.sh --describe --bootstrap-server my-cluster-kafka-bootstrap:9092 --broker-list 1 --topic-list my-topic

The tool returns topic information for each log directory. In this example, we are restricting the information to my-topic to show the steps against a single topic. The JBOD volumes used for log directories are mounted at /var/lib/kafka/<volume_id>.

Example output data for each log directory
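The original output is not reproduced here. As a sketch of the shape of the data only, kafka-log-dirs.sh returns JSON similar to the following, with my-topic-0 shown on volume 0 before the move; the sizes and exact set of fields are illustrative:

{"version":1,"brokers":[{"broker":1,"logDirs":[
  {"logDir":"/var/lib/kafka/data-0","error":null,"partitions":[{"partition":"my-topic-0","size":1024,"offsetLag":0,"isFuture":false}]},
  {"logDir":"/var/lib/kafka/data-1","error":null,"partitions":[{"partition":"my-topic-1","size":1024,"offsetLag":0,"isFuture":false}]},
  {"logDir":"/var/lib/kafka/data-2","error":null,"partitions":[{"partition":"my-topic-2","size":1024,"offsetLag":0,"isFuture":false}]}
]}]}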
Remove the volume from the node:
curl -v -X POST 'http://<cc_host>:<cc_port>/kafkacruisecontrol/remove_disks?dryrun=false&brokerid_and_logdirs=1-/var/lib/kafka/data-0'
The command specifies the broker ID and log directory for the volume being removed. If successful, partitions are reassigned from volume 0 on broker 1.

Note: If required, you can perform a dry run of this operation before applying the changes.

Use the Kafka kafka-log-dirs.sh tool again to verify volume removal and data movement. In this example, volume 0 has been removed and the my-topic-0 partition reassigned to /var/lib/kafka/data-1.

Example output data following reassignment of partitions
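Again as a sketch only, the output after the reassignment would show my-topic-0 now under /var/lib/kafka/data-1 and volume 0 no longer hosting replicas for broker 1; values are illustrative:

{"version":1,"brokers":[{"broker":1,"logDirs":[
  {"logDir":"/var/lib/kafka/data-1","error":null,"partitions":[{"partition":"my-topic-0","size":1024,"offsetLag":0,"isFuture":false},{"partition":"my-topic-1","size":1024,"offsetLag":0,"isFuture":false}]},
  {"logDir":"/var/lib/kafka/data-2","error":null,"partitions":[{"partition":"my-topic-2","size":1024,"offsetLag":0,"isFuture":false}]}
]}]}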
Chapter 18. Using the partition reassignment tool
When scaling a Kafka cluster, you may need to add or remove brokers and update the distribution of partitions or the replication factor of topics. To update partitions and topics, you can use the kafka-reassign-partitions.sh tool.
You can change the replication factor of a topic using the kafka-reassign-partitions.sh tool. The tool can also be used to reassign partitions and balance the distribution of partitions across brokers to improve performance. However, Cruise Control is recommended for automated partition reassignment, cluster rebalancing, and changing the topic replication factor. Cruise Control can move topics from one broker to another without any downtime, and it is the most efficient way to reassign partitions.
18.1. Partition reassignment tool overview
The partition reassignment tool provides the following capabilities for managing Kafka partitions and brokers:
- Redistributing partition replicas
- Scale your cluster up and down by adding or removing brokers, and move Kafka partitions from heavily loaded brokers to under-utilized brokers. To do this, you must create a partition reassignment plan that identifies which topics and partitions to move and where to move them. Cruise Control is recommended for this type of operation as it automates the cluster rebalancing process.
- Scaling topic replication factor up and down
- Increase or decrease the replication factor of your Kafka topics. To do this, you must create a partition reassignment plan that identifies the existing replication assignment across partitions and an updated assignment with the replication factor changes.
- Changing the preferred leader
Change the preferred leader of a Kafka partition. In Kafka, the partition leader is the only partition that accepts writes from message producers, and therefore has the most complete log across all replicas.
Changing the preferred leader can be useful if you want to redistribute load across the brokers in the cluster. If the preferred leader is unavailable, another in-sync replica is automatically elected as leader, or the partition goes offline if there are no in-sync replicas. A background thread moves the leader role to the preferred replica when it is in sync. Therefore, changing the preferred replicas only makes sense in the context of a cluster rebalancing.
To do this, you must create a partition reassignment plan that specifies the new preferred leader for each partition by changing the order of replicas. In Kafka’s leader election process, the preferred leader is prioritized by the order of replicas. The first broker in the order of replicas is designated as the preferred leader. This designation is important for load balancing by distributing partition leaders across the Kafka cluster. However, this alone might not be sufficient for optimal load balancing, as some partitions may have higher usage than others. Cruise Control can help address this by providing more comprehensive cluster rebalancing.
- Changing the log directories to use a specific JBOD volume
- Change the log directories of your Kafka brokers to use a specific JBOD volume. This can be useful if you want to move your Kafka data to a different disk or storage device. To do this, you must create a partition reassignment plan that specifies the new log directory for each topic.
18.1.1. Generating a partition reassignment plan
The partition reassignment tool (kafka-reassign-partitions.sh) works by generating a partition assignment plan that specifies which partitions should be moved from their current broker to a new broker.
If you are satisfied with the plan, you can execute it. The tool then does the following:
- Migrates the partition data to the new broker
- Updates the metadata on the Kafka brokers to reflect the new partition assignments
- Triggers a rolling restart of the Kafka brokers to ensure that the new assignments take effect
The partition reassignment tool has three different modes:
- --generate
- Takes a set of topics and brokers and generates a reassignment JSON file which will result in the partitions of those topics being assigned to those brokers. Because this operates on whole topics, it cannot be used when you only want to reassign some partitions of some topics.
- --execute
- Takes a reassignment JSON file and applies it to the partitions and brokers in the cluster. Brokers that gain partitions as a result become followers of the partition leader. For a given partition, once the new broker has caught up and joined the ISR (in-sync replicas) the old broker will stop being a follower and will delete its replica.
- --verify
- Using the same reassignment JSON file as the --execute step, --verify checks whether all the partitions in the file have been moved to their intended brokers. If the reassignment is complete, --verify also removes any traffic throttles (--throttle) that are in effect. Unless removed, throttles will continue to affect the cluster even after the reassignment has finished.
It is only possible to have one reassignment running in a cluster at any given time, and it is not possible to cancel a running reassignment. If you must cancel a reassignment, wait for it to complete and then perform another reassignment to revert the effects of the first reassignment. The kafka-reassign-partitions.sh tool prints the reassignment JSON for this reversion as part of its output. Very large reassignments should be broken down into a number of smaller reassignments in case there is a need to stop an in-progress reassignment.
The kafka-reassign-partitions.sh tool uses a reassignment JSON file that specifies the topics to reassign. You can generate a reassignment JSON file or create a file manually if you want to move specific partitions.
A basic reassignment JSON file has the structure presented in the following example, which describes three partitions belonging to two Kafka topics. Each partition is reassigned to a new set of replicas, which are identified by their broker IDs. The version, topic, partition, and replicas properties are all required.
Example partition reassignment JSON file structure
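A minimal sketch of this structure, using illustrative topic names, partition numbers, and broker IDs (three partitions across two topics, as described above); the callouts that follow describe the version, partitions, topic, partition, and replicas properties in order:

{
  "version": 1,
  "partitions": [
    {
      "topic": "example-topic-1",
      "partition": 0,
      "replicas": [1, 2, 3]
    },
    {
      "topic": "example-topic-1",
      "partition": 1,
      "replicas": [2, 3, 4]
    },
    {
      "topic": "example-topic-2",
      "partition": 0,
      "replicas": [3, 4, 5]
    }
  ]
}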
- 1
- The version of the reassignment JSON file format. Currently, only version 1 is supported, so this should always be 1.
- 2
- An array that specifies the partitions to be reassigned.
- 3
- The name of the Kafka topic that the partition belongs to.
- 4
- The ID of the partition being reassigned.
- 5
- An ordered array of the IDs of the brokers that should be assigned as replicas for this partition. The first broker in the list is the leader replica.
Partitions not included in the JSON are not changed.
If you specify only topics using a topics array, the partition reassignment tool reassigns all the partitions belonging to the specified topics.
Example reassignment JSON file structure for reassigning all partitions for a topic
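A minimal sketch of this structure, assuming a topic named my-topic:

{
  "version": 1,
  "topics": [
    { "topic": "my-topic" }
  ]
}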
18.1.3. Reassigning partitions between JBOD volumes
When using JBOD storage in your Kafka cluster, you can reassign the partitions between specific volumes and their log directories (each volume has a single log directory).
To reassign a partition to a specific volume, add log_dirs values for each partition in the reassignment JSON file. Each log_dirs array contains the same number of entries as the replicas array, since each replica should be assigned to a specific log directory. The log_dirs array contains either an absolute path to a log directory or the special value any. The any value indicates that Kafka can choose any available log directory for that replica, which can be useful when reassigning partitions between JBOD volumes.
Example reassignment JSON file structure with log directories
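A minimal sketch with illustrative paths, where each log_dirs entry corresponds to the replica at the same position in the replicas array:

{
  "version": 1,
  "partitions": [
    {
      "topic": "example-topic-1",
      "partition": 0,
      "replicas": [1, 2, 3],
      "log_dirs": ["/var/lib/kafka/data-0/kafka-log1", "any", "/var/lib/kafka/data-1/kafka-log2"]
    }
  ]
}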
18.1.4. Throttling partition reassignment
Partition reassignment can be a slow process because it involves transferring large amounts of data between brokers. To avoid a detrimental impact on clients, you can throttle the reassignment process. Use the --throttle parameter with the kafka-reassign-partitions.sh tool to throttle a reassignment. You specify a maximum threshold in bytes per second for the movement of partitions between brokers. For example, --throttle 5000000 sets a maximum threshold for moving partitions of 5 MBps.
Throttling might cause the reassignment to take longer to complete.
- If the throttle is too low, the newly assigned brokers will not be able to keep up with records being published and the reassignment will never complete.
- If the throttle is too high, clients will be impacted.
For example, for producers, this could manifest as higher than normal latency waiting for acknowledgment. For consumers, this could manifest as a drop in throughput caused by higher latency between polls.
18.2. Reassigning partitions after adding brokers
Use a reassignment file generated by the kafka-reassign-partitions.sh tool to reassign partitions after increasing the number of brokers in a Kafka cluster. The reassignment file should describe how partitions are reassigned to brokers in the enlarged Kafka cluster. You apply the reassignment specified in the file to the brokers and then verify the new partition assignments.
This procedure describes a secure scaling process that uses TLS. You’ll need a Kafka cluster that uses TLS encryption and mTLS authentication.
Though you can use the kafka-reassign-partitions.sh tool, Cruise Control is recommended for automated partition reassignments and cluster rebalancing. Cruise Control can move topics from one broker to another without any downtime, and it is the most efficient way to reassign partitions.
Prerequisites
- An existing Kafka cluster.
- A new machine with an additional Kafka broker installed.
You have created a JSON file to specify how partitions should be reassigned to brokers in the enlarged cluster.
In this procedure, we are reassigning all partitions for a topic called
my-topic. A JSON file named topics.json specifies the topic, and is used to generate a reassignment.json file.
Example JSON file specifies my-topic
Procedure
-
Create a configuration file for the new broker using the same settings as for the other brokers in your cluster, except for
broker.id, which should be a number that is not already used by any of the other brokers. Start the new Kafka broker passing the configuration file you created in the previous step as the argument to the
kafka-server-start.sh script:

./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties

Verify that the Kafka broker is running.

jcmd | grep Kafka

- Repeat the above steps for each new broker.
If you haven’t done so, generate a reassignment JSON file named
reassignment.json using the kafka-reassign-partitions.sh tool.

Example command to generate the reassignment JSON file
./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --topics-to-move-json-file topics.json \
  --broker-list 0,1,2,3,4 \
  --generate

Example reassignment JSON file showing the current and proposed replica assignment
Current partition replica assignment

{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[0,1,2],"log_dirs":["any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[2,3,0],"log_dirs":["any","any","any"]}]}

Proposed partition reassignment configuration

{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[0,1,2,3],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[1,2,3,4],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[2,3,4,0],"log_dirs":["any","any","any","any"]}]}

Save a copy of this file locally in case you need to revert the changes later on.
Run the partition reassignment using the
--execute option.

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --execute

If you are going to throttle replication you can also pass the
--throttle option with an inter-broker throttled rate in bytes per second. For example:

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --throttle 5000000 \
  --execute

Verify that the reassignment has completed using the
--verify option.

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --verify

The reassignment has finished when the --verify command reports that each of the partitions being moved has completed successfully. This final --verify will also have the effect of removing any reassignment throttles.
18.3. Reassigning partitions before removing brokers
Use a reassignment file generated by the kafka-reassign-partitions.sh tool to reassign partitions before decreasing the number of brokers in a Kafka cluster. The reassignment file must describe how partitions are reassigned to the remaining brokers in the Kafka cluster. You apply the reassignment specified in the file to the brokers and then verify the new partition assignments. Brokers with the highest numbered broker IDs are typically removed first.
This procedure describes a secure scaling process that uses TLS. You’ll need a Kafka cluster that uses TLS encryption and mTLS authentication.
Though you can use the kafka-reassign-partitions.sh tool, Cruise Control is recommended for automated partition reassignments and cluster rebalancing. Cruise Control can move topics from one broker to another without any downtime, and it is the most efficient way to reassign partitions.
Prerequisites
- An existing Kafka cluster.
You have created a JSON file to specify how partitions should be reassigned to brokers in the reduced cluster.
In this procedure, we are reassigning all partitions for a topic called
my-topic. A JSON file named topics.json specifies the topic, and is used to generate a reassignment.json file.
Example JSON file specifies my-topic
Procedure
If you haven’t done so, generate a reassignment JSON file named
reassignment.json using the kafka-reassign-partitions.sh tool.

Example command to generate the reassignment JSON file
./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --topics-to-move-json-file topics.json \
  --broker-list 0,1,2,3 \
  --generate

Example reassignment JSON file showing the current and proposed replica assignment
Current partition replica assignment

{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[3,4,2,0],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[0,2,3,1],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[1,3,0,4],"log_dirs":["any","any","any","any"]}]}

Proposed partition reassignment configuration

{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[0,1,2],"log_dirs":["any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[2,3,0],"log_dirs":["any","any","any"]}]}

Save a copy of this file locally in case you need to revert the changes later on.
Run the partition reassignment using the
--execute option.

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --execute

If you are going to throttle replication you can also pass the
--throttle option with an inter-broker throttled rate in bytes per second. For example:

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --throttle 5000000 \
  --execute

Verify that the reassignment has completed using the
--verify option.

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --verify

The reassignment has finished when the --verify command reports that each of the partitions being moved has completed successfully. This final --verify will also have the effect of removing any reassignment throttles.

Check that each broker being removed does not have any live partitions in its log (log.dirs).

ls -l <LogDir> | grep -E '^d' | grep -vE '[a-zA-Z0-9.-]+\.[a-z0-9]+-delete$'

If a log directory does not match the regular expression \.[a-z0-9]+-delete$, active partitions are still present. If you have active partitions, check the reassignment has finished or the configuration in the reassignment JSON file. You can run the reassignment again. Make sure that there are no active partitions before moving on to the next step.

Stop the broker.

./bin/kafka-server-stop.sh

Confirm that the Kafka broker has stopped.

jcmd | grep kafka
18.4. Changing the replication factor of topics
Use the kafka-reassign-partitions.sh tool to change the replication factor of topics in a Kafka cluster. This can be done using a reassignment file to describe how the topic replicas should be changed.
Prerequisites
- An existing Kafka cluster.
You have created a JSON file to specify the topics to include in the operation.
In this procedure, a topic called
my-topic has 4 replicas and we want to reduce it to 3. A JSON file named topics.json specifies the topic, and is used to generate a reassignment.json file.
Example JSON file specifies my-topic
Procedure
If you haven’t done so, generate a reassignment JSON file named
reassignment.json using the kafka-reassign-partitions.sh tool.

Example command to generate the reassignment JSON file
./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --topics-to-move-json-file topics.json \
  --broker-list 0,1,2,3,4 \
  --generate

Example reassignment JSON file showing the current and proposed replica assignment
Current partition replica assignment

{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[3,4,2,0],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[0,2,3,1],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[1,3,0,4],"log_dirs":["any","any","any","any"]}]}

Proposed partition reassignment configuration

{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[0,1,2,3],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[1,2,3,4],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[2,3,4,0],"log_dirs":["any","any","any","any"]}]}

Save a copy of this file locally in case you need to revert the changes later on.
Edit the
reassignment.json to remove a replica from each partition. For example, use jq to remove the last replica in the list for each partition of the topic. Write the output to a temporary file first, because redirecting jq output directly back to its input file truncates the file before it is read:

Removing the last topic replica for each partition

jq '.partitions[].replicas |= del(.[-1])' reassignment.json > reassignment.json.tmp && mv reassignment.json.tmp reassignment.json

Example reassignment file showing the updated replicas
{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[0,1,2],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[1,2,3],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[2,3,4],"log_dirs":["any","any","any","any"]}]}{"version":1,"partitions":[{"topic":"my-topic","partition":0,"replicas":[0,1,2],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":1,"replicas":[1,2,3],"log_dirs":["any","any","any","any"]},{"topic":"my-topic","partition":2,"replicas":[2,3,4],"log_dirs":["any","any","any","any"]}]}Copy to Clipboard Copied! Toggle word wrap Toggle overflow Make the topic replica change using the
--execute option.

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --execute

Note: Removing replicas from a broker does not require any inter-broker data movement, so there is no need to throttle replication. If you are adding replicas, then you may want to change the throttle rate.
Verify that the change to the topic replicas has completed using the
--verify option.

./bin/kafka-reassign-partitions.sh \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassignment.json \
  --verify

The reassignment has finished when the --verify command reports that each of the partitions being moved has completed successfully. This final --verify will also have the effect of removing any reassignment throttles.

Run the bin/kafka-topics.sh command with the --describe option to see the results of the change to the topics.

./bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe

Results of reducing the number of replicas for a topic

my-topic  Partition: 0  Leader: 0  Replicas: 0,1,2  Isr: 0,1,2
my-topic  Partition: 1  Leader: 2  Replicas: 1,2,3  Isr: 1,2,3
my-topic  Partition: 2  Leader: 3  Replicas: 2,3,4  Isr: 2,3,4
Chapter 19. Setting up distributed tracing
Distributed tracing allows you to track the progress of transactions between applications in a distributed system. In a microservices architecture, tracing tracks the progress of transactions between services. Trace data is useful for monitoring application performance and investigating issues with target systems and end-user applications.
In Streams for Apache Kafka, tracing facilitates the end-to-end tracking of messages: from source systems to Kafka, and then from Kafka to target systems and applications. It complements the metrics that are available through JMX, as well as the component loggers.
Support for tracing is built in to the following Kafka components:
- Kafka Connect
- MirrorMaker
- MirrorMaker 2
- Streams for Apache Kafka Bridge
Tracing is not supported for Kafka brokers.
You add tracing configuration to the properties file of the component.
To enable tracing, you set environment variables and add the library of the tracing system to the Kafka classpath. For Jaeger tracing, you can add tracing artifacts for OpenTelemetry with the Jaeger Exporter.
Streams for Apache Kafka no longer supports OpenTracing. If you were previously using OpenTracing with Jaeger, we encourage you to transition to using OpenTelemetry instead.
To enable tracing in Kafka producers, consumers, and Kafka Streams API applications, you instrument application code. When instrumented, clients generate trace data; for example, when producing messages or writing offsets to the log.
Setting up tracing for applications and systems beyond Streams for Apache Kafka is outside the scope of this content.
19.1. Outline of procedures
To set up tracing for Streams for Apache Kafka, follow these procedures in order:
Set up tracing for Kafka Connect, MirrorMaker 2, and MirrorMaker:
Set up tracing for clients:
Instrument clients with tracers:
For information on enabling tracing for the Kafka Bridge, see Using the Streams for Apache Kafka Bridge.
19.2. Tracing options
Distributed traces consist of spans, which represent individual units of work performed over a specific time period. When instrumented with tracers, applications generate traces that follow requests as they move through the system, making it easier to identify delays or issues.
OpenTelemetry, a telemetry framework, provides APIs for tracing that are independent of any specific backend tracing system. In Streams for Apache Kafka, the default protocol for transmitting traces between Kafka components and tracing systems is OpenTelemetry’s OTLP (OpenTelemetry Protocol), a vendor-neutral protocol.
While OTLP is the default, Streams for Apache Kafka also supports other tracing systems, such as Jaeger. Jaeger is a distributed tracing system designed for monitoring microservices, and its user interface allows you to query, filter, and analyze trace data in detail.
The Jaeger user interface showing a simple query
19.3. Environment variables for tracing
Use environment variables to enable tracing for Kafka components or to initialize a tracer for Kafka clients.
Tracing environment variables are subject to change. For the latest information, see the OpenTelemetry documentation.
The following table describes the key environment variables for setting up tracing with OpenTelemetry.
| Property | Required | Description |
|---|---|---|
| OTEL_SERVICE_NAME | Yes | The name of the tracing service for OpenTelemetry, such as OTLP or Jaeger. |
| OTEL_EXPORTER_OTLP_ENDPOINT | Yes (if using OTLP exporter) | The OTLP endpoint for exporting trace data to the tracing system. For Jaeger tracing, specify the OTLP endpoint exposed by Jaeger. |
| OTEL_TRACES_EXPORTER | No (unless using a non-OTLP exporter) | The exporter used for tracing. The default is otlp. |
| OTEL_EXPORTER_OTLP_CERTIFICATE | No (required if using TLS with OTLP) | The path to the file containing trusted certificates for TLS authentication. Required to secure communication between Kafka components and the OpenTelemetry endpoint when using TLS with the OTLP exporter. |
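For example, you might export the variables before starting a Kafka component as follows; the service name and endpoint are illustrative, and 4317 is the conventional OTLP gRPC port:

export OTEL_SERVICE_NAME=my-kafka-connect
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317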
19.4. Enabling tracing for Kafka Connect
Enable distributed tracing for Kafka Connect using configuration properties. Only messages produced and consumed by Kafka Connect itself are traced. To trace messages sent between Kafka Connect and external systems, you must configure tracing in the connectors for those systems.
You can enable tracing that uses OpenTelemetry.
Procedure
-
Add the tracing artifacts to the
opt/kafka/libs directory. Configure producer and consumer tracing in the relevant Kafka Connect configuration file.
-
If you are running Kafka Connect in standalone mode, edit the
./config/connect-standalone.properties file.
If you are running Kafka Connect in distributed mode, edit the
./config/connect-distributed.properties file.
Add the following tracing interceptor properties to the configuration file:
Properties for OpenTelemetry
producer.interceptor.classes=io.opentelemetry.instrumentation.kafkaclients.TracingProducerInterceptor consumer.interceptor.classes=io.opentelemetry.instrumentation.kafkaclients.TracingConsumerInterceptor
With tracing enabled, you initialize tracing when you run the Kafka Connect script.
- Save the configuration file.
- Set the environment variables for tracing.
Start Kafka Connect in standalone or distributed mode with the configuration file as a parameter (plus any connector properties):
Running Kafka Connect in standalone mode
./bin/connect-standalone.sh \ ./config/connect-standalone.properties \ connector1.properties \ [connector2.properties ...]
Running Kafka Connect in distributed mode
./bin/connect-distributed.sh ./config/connect-distributed.properties
The internal consumers and producers of Kafka Connect are now enabled for tracing.
19.5. Enabling tracing for MirrorMaker 2
Enable distributed tracing for MirrorMaker 2 by defining the Interceptor properties in the MirrorMaker 2 properties file. Messages are traced between Kafka clusters. The trace data records messages entering and leaving the MirrorMaker 2 component.
You can enable tracing that uses OpenTelemetry.
Procedure
-
Add the tracing artifacts to the
opt/kafka/libs directory.

Configure producer and consumer tracing in the opt/kafka/config/connect-mirror-maker.properties file. Add the following tracing interceptor properties to the configuration file:
Properties for OpenTelemetry
header.converter=org.apache.kafka.connect.converters.ByteArrayConverter producer.interceptor.classes=io.opentelemetry.instrumentation.kafkaclients.TracingProducerInterceptor consumer.interceptor.classes=io.opentelemetry.instrumentation.kafkaclients.TracingConsumerInterceptor
ByteArrayConverter prevents Kafka Connect from converting message headers (containing trace IDs) to base64 encoding. This ensures that messages are the same in both the source and the target clusters.

With tracing enabled, you initialize tracing when you run the Kafka MirrorMaker 2 script.
- Save the configuration file.
- Set the environment variables for tracing.
Start MirrorMaker 2 with the configuration file as a parameter:
./bin/connect-mirror-maker.sh \ ./config/connect-mirror-maker.properties
The internal consumers and producers of MirrorMaker 2 are now enabled for tracing.
19.6. Enabling tracing for MirrorMaker
Enable distributed tracing for MirrorMaker by passing the Interceptor properties as consumer and producer configuration parameters. Messages are traced from the source cluster to the target cluster. The trace data records messages entering and leaving the MirrorMaker component.
You can enable tracing that uses OpenTelemetry.
Procedure
-
Add the tracing artifacts to the
opt/kafka/libs directory.

Configure producer tracing in the ./config/producer.properties file. Add the following tracing interceptor property:
Producer property for OpenTelemetry
producer.interceptor.classes=io.opentelemetry.instrumentation.kafkaclients.TracingProducerInterceptor
- Save the configuration file.
Configure consumer tracing in the
./config/consumer.properties file. Add the following tracing interceptor property:
Consumer property for OpenTelemetry
consumer.interceptor.classes=io.opentelemetry.instrumentation.kafkaclients.TracingConsumerInterceptor
With tracing enabled, you initialize tracing when you run the Kafka MirrorMaker script.
- Save the configuration file.
- Set the environment variables for tracing.
Start MirrorMaker with the producer and consumer configuration files as parameters:
./bin/kafka-mirror-maker.sh \ --producer.config ./config/producer.properties \ --consumer.config ./config/consumer.properties \ --num.streams=2
The internal consumers and producers of MirrorMaker are now enabled for tracing.
19.7. Initializing tracing for Kafka clients
Initialize a tracer for OpenTelemetry, then instrument your client applications for distributed tracing. You can instrument Kafka producer and consumer clients, and Kafka Streams API applications.
Configure and initialize a tracer using a set of tracing environment variables.
Procedure
In each client application add the dependencies for the tracer:
Add the Maven dependencies to the
pom.xml file for the client application:

Dependencies for OpenTelemetry
Create a tracer, which is initialized with the environment variables:
Creating a tracer for OpenTelemetry
OpenTelemetry ot = GlobalOpenTelemetry.get();
Register the tracer as a global tracer:
GlobalTracer.register(tracer);
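Putting the initialization step together, a minimal sketch that uses only the OpenTelemetry API; the instrumentation scope name my-kafka-client is illustrative, and the underlying SDK is configured from the OTEL_* environment variables:

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.trace.Tracer;

public class TracingInit {
    public static Tracer tracer() {
        // Returns the globally registered OpenTelemetry instance
        OpenTelemetry openTelemetry = GlobalOpenTelemetry.get();
        // Obtain a tracer for this application's instrumentation scope
        return openTelemetry.getTracer("my-kafka-client");
    }
}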
Instrument your client:
19.8. Instrumenting producers and consumers for tracing
Instrument application code to enable tracing in Kafka producers and consumers. Use a decorator pattern or interceptors to instrument your Java producer and consumer application code for tracing. You can then record traces when messages are produced or retrieved from a topic.
OpenTelemetry instrumentation project provides classes that support instrumentation of producers and consumers.
- Decorator instrumentation
- For decorator instrumentation, create a modified producer or consumer instance for tracing.
- Interceptor instrumentation
- For interceptor instrumentation, add the tracing capability to the consumer or producer configuration.
Prerequisites
You have initialized tracing for the client.
You enable instrumentation in producer and consumer applications by adding the tracing JARs as dependencies to your project.
Procedure
Perform these steps in the application code of each producer and consumer application. Instrument your client application code using either a decorator pattern or interceptors.
To use a decorator pattern, create a modified producer or consumer instance to send or receive messages.
You pass the original
KafkaProducer or KafkaConsumer class.

Example decorator instrumentation for OpenTelemetry
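A minimal sketch of the decorator approach, assuming the KafkaTelemetry helper from the OpenTelemetry Kafka clients instrumentation library and existing producer and consumer instances:

// Wrap existing clients so that sends and receives are traced
KafkaTelemetry telemetry = KafkaTelemetry.create(GlobalOpenTelemetry.get());

Producer<String, String> tracingProducer = telemetry.wrap(producer);
Consumer<String, String> tracingConsumer = telemetry.wrap(consumer);

tracingProducer.send(new ProducerRecord<>("my-topic", "hello"));
tracingConsumer.subscribe(Collections.singletonList("my-topic"));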
To use interceptors, set the interceptor class in the producer or consumer configuration.
You use the
KafkaProducer and KafkaConsumer classes in the usual way. The TracingProducerInterceptor and TracingConsumerInterceptor interceptor classes take care of the tracing capability.

Example producer configuration using interceptors
senderProps.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, TracingProducerInterceptor.class.getName());
KafkaProducer<Integer, String> producer = new KafkaProducer<>(senderProps);
producer.send(...);

Example consumer configuration using interceptors
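The consumer-side configuration mirrors the producer example; a minimal sketch, where consumerProps and the topic name are illustrative:

consumerProps.put(ConsumerConfig.INTERCEPTOR_CLASSES_CONFIG, TracingConsumerInterceptor.class.getName());

KafkaConsumer<Integer, String> consumer = new KafkaConsumer<>(consumerProps);
consumer.subscribe(Collections.singletonList("messages"));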
19.9. Instrumenting Kafka Streams applications for tracing

Instrument application code to enable tracing in Kafka Streams API applications. Use a decorator pattern or interceptors to instrument your Kafka Streams API applications for tracing. You can then record traces when messages are produced or retrieved from a topic.
- Decorator instrumentation
-
For decorator instrumentation, create a modified Kafka Streams instance for tracing. For OpenTelemetry, you need to create a custom
TracingKafkaClientSupplier class to provide tracing instrumentation for Kafka Streams.
- For interceptor instrumentation, add the tracing capability to the Kafka Streams producer and consumer configuration.
Prerequisites
You have initialized tracing for the client.
You enable instrumentation in Kafka Streams applications by adding the tracing JARs as dependencies to your project.
-
To instrument Kafka Streams with OpenTelemetry, you’ll need to write a custom
TracingKafkaClientSupplier. The custom
TracingKafkaClientSupplier can extend Kafka’s DefaultKafkaClientSupplier, overriding the producer and consumer creation methods to wrap the instances with the telemetry-related code.

Example custom TracingKafkaClientSupplier
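A minimal sketch of such a supplier, assuming the KafkaTelemetry helper from the OpenTelemetry Kafka clients instrumentation library is on the classpath (a constructor that accepts a Tracer can be added to match the usage shown in the procedure below):

import java.util.Map;

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.instrumentation.kafkaclients.KafkaTelemetry;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.streams.processor.internals.DefaultKafkaClientSupplier;

// Supplies producers and consumers to Kafka Streams, wrapped with OpenTelemetry tracing
public class TracingKafkaClientSupplier extends DefaultKafkaClientSupplier {

    @Override
    public Producer<byte[], byte[]> getProducer(Map<String, Object> config) {
        KafkaTelemetry telemetry = KafkaTelemetry.create(GlobalOpenTelemetry.get());
        return telemetry.wrap(super.getProducer(config));
    }

    @Override
    public Consumer<byte[], byte[]> getConsumer(Map<String, Object> config) {
        KafkaTelemetry telemetry = KafkaTelemetry.create(GlobalOpenTelemetry.get());
        return telemetry.wrap(super.getConsumer(config));
    }

    @Override
    public Consumer<byte[], byte[]> getRestoreConsumer(Map<String, Object> config) {
        return this.getConsumer(config);
    }

    @Override
    public Consumer<byte[], byte[]> getGlobalConsumer(Map<String, Object> config) {
        return this.getConsumer(config);
    }
}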
Procedure
Perform these steps for each Kafka Streams API application.
To use a decorator pattern, create an instance of the
TracingKafkaClientSupplier supplier interface, then provide the supplier interface to KafkaStreams.

Example decorator instrumentation
KafkaClientSupplier supplier = new TracingKafkaClientSupplier(tracer); KafkaStreams streams = new KafkaStreams(builder.build(), new StreamsConfig(config), supplier); streams.start();
To use interceptors, set the interceptor class in the Kafka Streams producer and consumer configuration.
The
TracingProducerInterceptor and TracingConsumerInterceptor interceptor classes take care of the tracing capability.

Example producer and consumer configuration using interceptors
props.put(StreamsConfig.PRODUCER_PREFIX + ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, TracingProducerInterceptor.class.getName()); props.put(StreamsConfig.CONSUMER_PREFIX + ConsumerConfig.INTERCEPTOR_CLASSES_CONFIG, TracingConsumerInterceptor.class.getName());
19.10. Specifying tracing systems with OpenTelemetry
Instead of the default OTLP exporter, you can specify other tracing systems that are supported by OpenTelemetry.
If you want to use another tracing system with OpenTelemetry, do the following:
- Add the library of the tracing system to the Kafka classpath.
Add the name of the tracing system as an additional exporter environment variable.
Additional environment variable when not using Jaeger
OTEL_SERVICE_NAME=my-tracing-service OTEL_TRACES_EXPORTER=zipkin OTEL_EXPORTER_ZIPKIN_ENDPOINT=http://localhost:9411/api/v2/spans
19.11. Specifying custom span names for OpenTelemetry
A tracing span is a logical unit of work in Jaeger, with an operation name, start time, and duration. Spans have built-in names, but you can specify custom span names in your Kafka client instrumentation where used.
Specifying custom span names is optional and only applies when using a decorator pattern in producer and consumer client instrumentation or Kafka Streams instrumentation.
Custom span names cannot be specified directly with OpenTelemetry. Instead, you retrieve span names by adding code to your client application to extract additional tags and attributes.
Example code to extract attributes
Chapter 20. Using Kafka Exporter
Kafka Exporter is an open source project to enhance monitoring of Apache Kafka brokers and clients.
Kafka Exporter is provided with Streams for Apache Kafka for deployment with a Kafka cluster to extract additional metrics data from Kafka brokers related to offsets, consumer groups, consumer lag, and topics.
The metrics data is used, for example, to help identify slow consumers.
Lag data is exposed as Prometheus metrics, which can then be presented in Grafana for analysis.
If you are already using Prometheus and Grafana for monitoring of built-in Kafka metrics, you can configure Prometheus to also scrape the Kafka Exporter Prometheus endpoint.
Kafka exposes metrics through JMX, which can then be exported as Prometheus metrics. For more information, see Monitoring your cluster using JMX.
20.1. Consumer lag
Consumer lag indicates the difference in the rate of production and consumption of messages. Specifically, consumer lag for a given consumer group indicates the delay between the last message in the partition and the message being currently picked up by that consumer. The lag reflects the position of the consumer offset in relation to the end of the partition log.
This difference is sometimes referred to as the delta between the producer offset and consumer offset, the read and write positions in the Kafka broker topic partitions.
Suppose a topic streams 100 messages a second. A lag of 1000 messages between the producer offset (the topic partition head) and the last offset the consumer has read means a 10-second delay.
The importance of monitoring consumer lag
For applications that rely on the processing of (near) real-time data, it is critical to monitor consumer lag to check that it does not become too big. The greater the lag becomes, the further the process moves from the real-time processing objective.
Consumer lag, for example, might be a result of consuming too much old data that has not been purged, or through unplanned shutdowns.
Reducing consumer lag
Typical actions to reduce lag include:
- Scaling-up consumer groups by adding new consumers
- Increasing the retention time for a message to remain in a topic
- Adding more disk capacity to increase the message buffer
Actions to reduce consumer lag depend on the underlying infrastructure and the use cases Streams for Apache Kafka is supporting. For instance, a lagging consumer is less likely to benefit from the broker being able to service a fetch request from its disk cache. And in certain cases, it might be acceptable to automatically drop messages until a consumer has caught up.
20.2. Kafka Exporter alerting rule examples
The sample alert notification rules specific to Kafka Exporter are as follows:
- UnderReplicatedPartition
- An alert to warn that a topic is under-replicated and the broker is not replicating enough partitions. The default configuration is for an alert if there are one or more under-replicated partitions for a topic. The alert might signify that a Kafka instance is down or the Kafka cluster is overloaded. A planned restart of the Kafka broker may be required to restart the replication process.
- TooLargeConsumerGroupLag
- An alert to warn that the lag on a consumer group is too large for a specific topic partition. The default configuration is 1000 records. A large lag might indicate that consumers are too slow and are falling behind the producers.
- NoMessageForTooLong
- An alert to warn that a topic has not received messages for a period of time. The default configuration for the time period is 10 minutes. The delay might be a result of a configuration issue preventing a producer from publishing messages to the topic.
You can adapt alerting rules according to your specific needs.
20.3. Kafka Exporter metrics
Lag information is exposed by Kafka Exporter as Prometheus metrics for presentation in Grafana.
Kafka Exporter exposes metrics data for brokers, topics, and consumer groups.
| Name | Information |
|---|---|
|
| Number of brokers in the Kafka cluster |
| Name | Information |
|---|---|
|
| Number of partitions for a topic |
|
| Current topic partition offset for a broker |
|
| Oldest topic partition offset for a broker |
|
| Number of in-sync replicas for a topic partition |
|
| Leader broker ID of a topic partition |
|
|
Shows |
|
| Number of replicas for this topic partition |
|
|
Shows |
| Name | Information |
|---|---|
|
| Current topic partition offset for a consumer group |
|
| Current approximate lag for a consumer group at a topic partition |
20.4. Running Kafka Exporter
Run Kafka Exporter to expose Prometheus metrics for presentation in a Grafana dashboard.
Download and install the Kafka Exporter package to use the Kafka Exporter with Streams for Apache Kafka. You need a Streams for Apache Kafka subscription to be able to download and install the package.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- You have a subscription to Streams for Apache Kafka.
This procedure assumes you already have access to a Grafana user interface and Prometheus is deployed and added as a data source.
Procedure
Install the Kafka Exporter package:
dnf install kafka_exporter
Verify the package has installed:
dnf info kafka_exporter
Run the Kafka Exporter using appropriate configuration parameter values:
kafka_exporter --kafka.server=<kafka_bootstrap_address>:9092 --kafka.version=3.9.0 --<my_other_parameters>
The parameters require a double-hyphen (--) convention.

The --kafka.server parameter specifies a hostname and port to connect to a Kafka instance.
The --kafka.version parameter specifies the Kafka version to ensure compatibility.
Use kafka_exporter --help for information on other available parameters.

Configure Prometheus to monitor the Kafka Exporter metrics.
For more information on configuring Prometheus, see the Prometheus documentation.
Enable Grafana to present the Kafka Exporter metrics data exposed by Prometheus.
For more information, see Presenting Kafka Exporter metrics in Grafana.
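To confirm that the exporter is serving metrics, you can query its HTTP endpoint; a sketch that assumes the exporter's default listen port of 9308 and the kafka_consumergroup_lag metric name:

curl http://localhost:9308/metrics | grep kafka_consumergroup_lag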
Updating Kafka Exporter
Use the latest version of Kafka Exporter with your Streams for Apache Kafka installation.
To check for updates, use:
dnf check-update
To update Kafka Exporter, use:
dnf update kafka_exporter
20.5. Presenting Kafka Exporter metrics in Grafana
Using Kafka Exporter Prometheus metrics as a data source, you can create a dashboard of Grafana charts.
For example, from the metrics you can create the following Grafana charts:
- Message in per second (from topics)
- Message in per minute (from topics)
- Lag by consumer group
- Messages consumed per minute (by consumer groups)
When metrics data has been collected for some time, the Kafka Exporter charts are populated.
Use the Grafana charts to analyze lag and to check if actions to reduce lag are having an impact on an affected consumer group. If, for example, Kafka brokers are adjusted to reduce lag, the dashboard will show the Lag by consumer group chart going down and the Messages consumed per minute chart going up.
Chapter 21. Upgrading Streams for Apache Kafka and Kafka
Upgrade your Kafka cluster with no downtime. Streams for Apache Kafka 2.9 supports and uses Apache Kafka version 3.9.0. Kafka 3.8.0 is supported only for the purpose of upgrading to Streams for Apache Kafka 2.9. You upgrade to the latest supported version of Kafka when you install the latest version of Streams for Apache Kafka.
21.1. Upgrade prerequisites
Before you begin the upgrade process, make sure you are familiar with any upgrade changes described in the Streams for Apache Kafka 2.9 on Red Hat Enterprise Linux Release Notes.
21.2. Streams for Apache Kafka upgrade paths
Two upgrade paths are available for Streams for Apache Kafka.
- Incremental upgrade
- An incremental upgrade involves upgrading Streams for Apache Kafka from the previous minor version to version 2.9.
- Multi-version upgrade
- A multi-version upgrade involves upgrading an older version of Streams for Apache Kafka to version 2.9 within a single upgrade, skipping one or more intermediate versions. For example, you might wish to upgrade from one LTS version to the next LTS version.
The upgrade process is the same for either path; you just need to make sure that the Kafka metadata version is switched to the newer version.
21.3. Strategies for upgrading clients
Upgrading Kafka clients ensures that they benefit from the features, fixes, and improvements that are introduced in new versions of Kafka. Upgraded clients maintain compatibility with other upgraded Kafka components. The performance and stability of the clients might also be improved.
Consider the best approach for upgrading Kafka clients and brokers to ensure a smooth transition. The chosen upgrade strategy depends on whether you are upgrading brokers or clients first. Since Kafka 3.0, you can upgrade brokers and clients independently and in any order. The decision to upgrade clients or brokers first depends on several factors, such as the number of applications that need to be upgraded and how much downtime is tolerable.
If you upgrade clients before brokers, some new features may not work as they are not yet supported by brokers. However, brokers can handle producers and consumers running with different versions and supporting different log message versions.
21.4. Upgrading Kafka clusters
Upgrade a KRaft-based Kafka cluster to a newer supported Kafka version and KRaft metadata version. You update the installation files, then configure and restart all Kafka nodes. After performing these steps, data is transmitted between the Kafka brokers according to the new metadata version. For this setup, Kafka is installed in the /opt/kafka/ directory.
When downgrading a KRaft-based Kafka cluster to a lower version, like moving from 3.9.0 to 3.8.0, ensure that the metadata version used by the Kafka cluster is a version supported by the Kafka version you want to downgrade to. The metadata version currently in use must not be higher than the metadata version supported by the Kafka version you are downgrading to. Before downgrading in production, test your specific scenario in a controlled environment to identify potential issues.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- You have downloaded the installation files.
Procedure
For each Kafka node in your Streams for Apache Kafka cluster, starting with controller nodes and then brokers, and one at a time:
Download the Streams for Apache Kafka archive from the Streams for Apache Kafka software downloads page.
Note: If prompted, log in to your Red Hat account.
On the command line, create a temporary directory and extract the contents of the
amq-streams-<version>-kafka-bin.zip file.

mkdir /tmp/kafka
unzip amq-streams-<version>-kafka-bin.zip -d /tmp/kafka

If running, stop the Kafka broker running on the host.
./bin/kafka-server-stop.sh
jcmd | grep kafka

If you are running Kafka on a multi-node cluster, see Section 3.7, “Performing a graceful rolling restart of Kafka brokers”.
Delete the
libs and bin directories from your existing installation:

rm -rf ./libs ./bin

Copy the
libs and bin directories from the temporary directory:

cp -r /tmp/kafka/kafka_<version>/libs /opt/kafka/
cp -r /tmp/kafka/kafka_<version>/bin /opt/kafka/

-
If required, update the configuration files in the
config directory to reflect any changes in the new Kafka version. Delete the temporary directory.
rm -r /tmp/kafka
Restart the updated Kafka node:
Restarting nodes with combined roles
./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
Restarting controller nodes
./bin/kafka-server-start.sh -daemon ./config/kraft/controller.properties
Restarting nodes with broker roles
./bin/kafka-server-start.sh -daemon ./config/kraft/broker.properties
The Kafka broker starts using the binaries for the latest Kafka version.
For information on restarting brokers in a multi-node cluster, see Section 3.7, “Performing a graceful rolling restart of Kafka brokers”.
Check that Kafka is running:
jcmd | grep kafka
Update the Kafka metadata version:
./bin/kafka-features.sh --bootstrap-server <broker_host>:<port> upgrade --metadata 3.9
Use the correct version for the Kafka version you are upgrading to.
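To check the metadata version currently in use before or after the upgrade, the same tool provides a describe action; a sketch (output format can vary between Kafka versions):

./bin/kafka-features.sh --bootstrap-server <broker_host>:<port> describe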
Verify that a restarted Kafka broker has caught up with the partition replicas it is following using the kafka-topics.sh tool to ensure that all replicas contained in the broker are back in sync. For instructions, see Listing and describing topics.
Upgrading client applications
Ensure all Kafka client applications are updated to use the new version of the client binaries as part of the upgrade process and verify their compatibility with the Kafka upgrade. If needed, coordinate with the team responsible for managing the client applications.
To check that a client is using the latest message format, use the kafka.server:type=BrokerTopicMetrics,name={Produce|Fetch}MessageConversionsPerSec metric. The metric shows 0 if the latest message format is being used.
21.5. Upgrading Kafka components
Upgrade Kafka components on a host machine to use the latest version of Streams for Apache Kafka. You can use the Streams for Apache Kafka installation files to upgrade the following components:
- Kafka Connect
- MirrorMaker
- Kafka Bridge (separate ZIP file)
For this setup, Kafka is installed in the /opt/kafka/ directory.
Prerequisites
- You are logged in to Red Hat Enterprise Linux as the Kafka user.
- You have downloaded the installation files.
You have upgraded Kafka.
If a Kafka component is running on the same host as Kafka, you’ll also need to stop and start Kafka when upgrading.
Procedure
For each host running an instance of the Kafka component:
Download the Streams for Apache Kafka or Kafka Bridge installation files from the Streams for Apache Kafka software downloads page.
Note: If prompted, log in to your Red Hat account.
On the command line, create a temporary directory and extract the contents of the
amq-streams-<version>-kafka-bin.zip file.

mkdir /tmp/kafka
unzip amq-streams-<version>-kafka-bin.zip -d /tmp/kafka

For Kafka Bridge, extract the
amq-streams-<version>-bridge-bin.zip file.

- If running, stop the Kafka component running on the host.
Delete the
libs and bin directories from your existing installation:

rm -rf ./libs ./bin

Copy the libs and bin directories from the temporary directory:

cp -r /tmp/kafka/kafka_<version>/libs /opt/kafka/
cp -r /tmp/kafka/kafka_<version>/bin /opt/kafka/

-
If required, update the configuration files in the config directory to reflect any changes in the new versions.
Delete the temporary directory:
rm -r /tmp/kafka
Start the Kafka component using the appropriate script and properties files.
Starting Kafka Connect in standalone mode
./bin/connect-standalone.sh \
  ./config/connect-standalone.properties <connector1>.properties [<connector2>.properties ...]
Starting Kafka Connect in distributed mode
./bin/connect-distributed.sh \
  ./config/connect-distributed.properties
Starting MirrorMaker 2 in dedicated mode
./bin/connect-mirror-maker.sh \
  ./config/connect-mirror-maker.properties
Starting Kafka Bridge
./bin/kafka_bridge_run.sh \
  --config-file=<path>/application.properties
Verify that the Kafka component is running, and producing or consuming data as expected.
Verifying Kafka Connect in standalone mode is running
jcmd | grep ConnectStandalone
Verifying Kafka Connect in distributed mode is running
jcmd | grep ConnectDistributed
Verifying MirrorMaker 2 in dedicated mode is running
jcmd | grep mirrorMaker
Verifying Kafka Bridge is running by checking the log
HTTP-Kafka Bridge started and listening on port 8080
HTTP-Kafka Bridge bootstrap servers localhost:9092
Chapter 22. Monitoring your cluster using JMX
Collecting metrics is critical for understanding the health and performance of your Kafka deployment. By monitoring metrics, you can actively identify issues before they become critical and make informed decisions about resource allocation and capacity planning. Without metrics, you may be left with limited visibility into the behavior of your Kafka deployment, which can make troubleshooting more difficult and time-consuming. Setting up metrics can save you time and resources in the long run, and help ensure the reliability of your Kafka deployment.
Kafka components use Java Management Extensions (JMX) to share management information through metrics. These metrics are crucial for monitoring a Kafka cluster’s performance and overall health. Like many other Java applications, Kafka employs Managed Beans (MBeans) to supply metric data to monitoring tools and dashboards. JMX operates at the JVM level, allowing external tools to connect and retrieve management information from Kafka components. To connect to the JVM, these tools typically need to run on the same machine and with the same user privileges by default.
22.1. Enabling the JMX agent
Enable JMX monitoring of Kafka components using JVM system properties. Use the KAFKA_JMX_OPTS environment variable to set the JMX system properties required for enabling JMX monitoring. The scripts that run the Kafka component use these properties.
Procedure
Set the KAFKA_JMX_OPTS environment variable with the JMX properties for enabling JMX monitoring.
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=<port> -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
Replace <port> with the number of the port on which you want the Kafka component to listen for JMX connections.
Add org.apache.kafka.common.metrics.JmxReporter to metric.reporters in the server.properties file.
metric.reporters=org.apache.kafka.common.metrics.JmxReporter
Start the Kafka component using the appropriate script, such as bin/kafka-server-start.sh for a broker or bin/connect-distributed.sh for Kafka Connect.
It is recommended that you configure authentication and SSL to secure a remote JMX connection. For more information about the system properties needed to do this, see the Oracle documentation.
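As a minimal sketch of such a configuration, the following example uses the standard JVM system properties for JMX authentication and SSL; the <port>, file paths, and keystore values are placeholders, not values from this guide:
# Example only: secure the remote JMX connection with authentication and SSL
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true \
  -Dcom.sun.management.jmxremote.port=<port> \
  -Dcom.sun.management.jmxremote.authenticate=true \
  -Dcom.sun.management.jmxremote.password.file=<path_to_password_file> \
  -Dcom.sun.management.jmxremote.access.file=<path_to_access_file> \
  -Dcom.sun.management.jmxremote.ssl=true \
  -Djavax.net.ssl.keyStore=<path_to_keystore> \
  -Djavax.net.ssl.keyStorePassword=<keystore_password>"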
22.2. Disabling the JMX agent
Disable JMX monitoring for Kafka components by updating the KAFKA_JMX_OPTS environment variable.
Procedure
Set the KAFKA_JMX_OPTS environment variable to disable JMX monitoring.
export KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote=false
Note: Other JMX properties, like port, authentication, and SSL properties, do not need to be specified when disabling JMX monitoring.
Set auto.include.jmx.reporter to false in the Kafka server.properties file.
auto.include.jmx.reporter=false
Note: The auto.include.jmx.reporter property is deprecated. From Kafka 4, the JmxReporter is only enabled if org.apache.kafka.common.metrics.JmxReporter is added to the metric.reporters configuration in the properties file.
Start the Kafka component using the appropriate script, such as bin/kafka-server-start.sh for a broker or bin/connect-distributed.sh for Kafka Connect.
22.3. Metrics naming conventions
When working with Kafka JMX metrics, it’s important to understand the naming conventions used to identify and retrieve specific metrics. Kafka JMX metrics use the following format:
Metrics format
<metric_group>:type=<type_name>,name=<metric_name><other_attribute>=<value>
- <metric_group> is the name of the metric group
- <type_name> is the name of the type of metric
- <metric_name> is the name of the specific metric
- <other_attribute> represents zero or more additional attributes
For example, the BytesInPerSec metric is a BrokerTopicMetrics type in the kafka.server group:
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec
In some cases, metrics may include the ID of an entity. For instance, when monitoring a specific client, the metric format includes the client ID:
Metrics for a specific client
kafka.consumer:type=consumer-fetch-manager-metrics,client-id=<client_id>
Similarly, a metric can be further narrowed down to a specific client and topic:
Metrics for a specific client and topic
kafka.consumer:type=consumer-fetch-manager-metrics,client-id=<client_id>,topic=<topic_id>
Understanding these naming conventions will allow you to accurately specify the metrics you want to monitor and analyze.
To view the full list of available JMX metrics for a Streams for Apache Kafka installation, you can use a graphical tool like JConsole. JConsole is a Java Monitoring and Management Console that allows you to monitor and manage Java applications, including Kafka. Connect to the JVM running the Kafka component by using its process ID, and the tool's user interface displays the list of available metrics.
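For example, assuming JConsole is installed on the host, you might locate the process ID with jcmd and then attach JConsole to it; <process_id> is a placeholder:
jcmd | grep kafka
jconsole <process_id>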
22.4. Analyzing Kafka JMX metrics for troubleshooting
JMX provides a way to gather metrics about Kafka brokers for monitoring and managing their performance and resource usage. By analyzing these metrics, common broker issues such as high CPU usage, memory leaks, thread contention, and slow response times can be diagnosed and resolved. Certain metrics can pinpoint the root cause of these issues.
JMX metrics also provide insights into the overall health and performance of a Kafka cluster. They help monitor the system’s throughput, latency, and availability, diagnose issues, and optimize performance. This section explores the use of JMX metrics to help identify common issues and provides insights into the performance of a Kafka cluster.
Collecting and graphing these metrics using tools like Prometheus and Grafana allows you to visualize the information returned. This can be particularly helpful in detecting issues or optimizing performance. Graphing metrics over time can also help with identifying trends and forecasting resource consumption.
22.4.1. Checking for under-replicated partitions
A balanced Kafka cluster is important for optimal performance. In a balanced cluster, partitions and leaders are evenly distributed across all brokers, and I/O metrics reflect this. As well as using metrics, you can use the kafka-topics.sh tool to get a list of under-replicated partitions and identify the problematic brokers. If the number of under-replicated partitions is fluctuating or many brokers show high request latency, this typically indicates a performance issue in the cluster that requires investigation. On the other hand, a steady (unchanging) number of under-replicated partitions reported by many of the brokers in a cluster normally indicates that one of the brokers in the cluster is offline.
Use the describe --under-replicated-partitions option from the kafka-topics.sh tool to show information about partitions that are currently under-replicated in the cluster. These are the partitions that have fewer replicas than the configured replication factor.
If the output is blank, the Kafka cluster has no under-replicated partitions. Otherwise, the output shows replicas that are not in sync or available.
In the following example, only 2 of the 3 replicas are in sync for each partition, with a replica missing from the ISR (in-sync replica).
Returning information on under-replicated partitions from the command line
bin/kafka-topics.sh --bootstrap-server :9092 --describe --under-replicated-partitions
Topic: topic-1 Partition: 0 Leader: 4 Replicas: 4,2,3 Isr: 4,3
Topic: topic-1 Partition: 1 Leader: 3 Replicas: 2,3,4 Isr: 3,4
Topic: topic-1 Partition: 2 Leader: 3 Replicas: 3,4,2 Isr: 3,4
Here are some metrics to check for I/O and under-replicated partitions:
Metrics to check for under-replicated partitions
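The following listing is a representative set of standard Kafka JMX metric names for these checks, not values specific to this guide; the numbered callouts below refer to these lines in order:
kafka.server:type=ReplicaManager,name=PartitionCount
kafka.server:type=ReplicaManager,name=LeaderCount
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec
kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec
kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount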
- 1
- Total number of partitions across all topics in the cluster.
- 2
- Total number of leaders across all topics in the cluster.
- 3
- Rate of incoming bytes per second for each broker.
- 4
- Rate of outgoing bytes per second for each broker.
- 5
- Number of under-replicated partitions across all topics in the cluster.
- 6
- Number of partitions below the minimum ISR.
If topic configuration is set for high availability, with a replication factor of at least 3 for topics and a minimum number of in-sync replicas being 1 less than the replication factor, under-replicated partitions can still be usable. Conversely, partitions below the minimum ISR have reduced availability. You can monitor these using the kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount metric and the under-min-isr-partitions option from the kafka-topics.sh tool.
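For example, a minimal sketch of listing partitions below the minimum ISR with the kafka-topics.sh tool:
bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --describe --under-min-isr-partitions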
Use Cruise Control to automate the task of monitoring and rebalancing a Kafka cluster to ensure that the partition load is evenly distributed. For more information, see Chapter 15, Using Cruise Control for cluster rebalancing.
Spikes in cluster metrics may indicate a broker issue, which is often related to slow or failing storage devices or compute constraints from other processes. If there is no issue at the operating system or hardware level, an imbalance in the load of the Kafka cluster is likely, with some partitions receiving disproportionate traffic compared to others in the same Kafka topic.
To anticipate performance problems in a Kafka cluster, it’s useful to monitor the RequestHandlerAvgIdlePercent metric. RequestHandlerAvgIdlePercent provides a good overall indicator of how the cluster is behaving. The value of this metric is between 0 and 1. A value below 0.7 indicates that threads are busy more than 30% of the time and performance is starting to degrade. If the value drops below 0.5, problems are likely to occur, especially if the cluster needs to scale or rebalance. At 0.3, a cluster is barely usable.
Another useful metric is kafka.network:type=Processor,name=IdlePercent, which you can use to monitor the extent (as a percentage) to which network processors in a Kafka cluster are idle. The metric helps identify whether the processors are over or underutilized.
To ensure optimal performance, set the num.io.threads property equal to the number of processors in the system, including hyper-threaded processors. If the cluster is balanced, but a single client has changed its request pattern and is causing issues, reduce the load on the cluster or increase the number of brokers.
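As a sketch, on a broker host with 8 processors (an assumed value), the setting in server.properties would look like this:
num.io.threads=8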
It’s important to note that a single disk failure on a single broker can severely impact the performance of an entire cluster. Since producer clients connect to all brokers that lead partitions for a topic, and those partitions are evenly spread over the entire cluster, a poorly performing broker will slow down produce requests and cause back pressure in the producers, slowing down requests to all brokers. A RAID (Redundant Array of Inexpensive Disks) storage configuration that combines multiple physical disk drives into a single logical unit can help prevent this issue.
Here are some metrics to check the performance of a Kafka cluster:
Metrics to check the performance of a Kafka cluster
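The following listing is a representative set of standard Kafka JMX metric names for these checks; <listener_name> and <processor_id> are placeholders, and the numbered callouts below refer to these lines in order:
kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent
kafka.server:type=socket-server-metrics,listener=<listener_name>,networkProcessor=<processor_id>
kafka.network:type=RequestChannel,name=RequestQueueSize
kafka.network:type=RequestChannel,name=ResponseQueueSize
kafka.network:type=Processor,name=IdlePercent,networkProcessor=<processor_id>
kafka.server:type=KafkaServer,name=linux-disk-read-bytes
kafka.server:type=KafkaServer,name=linux-disk-write-bytes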
- 1
- Average idle percentage of the request handler threads in the Kafka broker’s thread pool. The OneMinuteRate and FifteenMinuteRate attributes show the request rate of the last one minute and fifteen minutes, respectively.
- 2
- Rate at which new connections are being created on a specific network processor of a specific listener in the Kafka broker. The listener attribute refers to the name of the listener, and the networkProcessor attribute refers to the ID of the network processor. The connection-creation-rate attribute shows the rate of connection creation in connections per second.
- 3
- Current size of the request queue.
- 4
- Current sizes of the response queue.
- 5
- Percentage of time the specified network processor is idle. The networkProcessor attribute specifies the ID of the network processor to monitor.
- 6
- Total number of bytes read from disk by a Kafka server.
- 7
- Total number of bytes written to disk by a Kafka server.
The Kafka controller is responsible for managing the overall state of the cluster, such as broker registration, partition reassignment, and topic management. Problems with the controller in the Kafka cluster are difficult to diagnose and often fall into the category of bugs in Kafka itself. Controller issues might manifest as broker metadata being out of sync, offline replicas when the brokers appear to be fine, or actions on topics like topic creation not happening correctly.
There are not many ways to monitor the controller, but you can monitor the active controller count and the controller queue size. Monitoring these metrics gives a high-level indicator if there is a problem. Although spikes in the queue size are expected, if this value continuously increases, or stays steady at a high value and does not drop, it indicates that the controller may be stuck. If you encounter this problem, you can move the controller to a different broker, which requires shutting down the broker that is currently the controller.
Here are some metrics to check the performance of a Kafka controller:
Metrics to check the performance of a Kafka controller
kafka.controller:type=KafkaController,name=ActiveControllerCount
kafka.controller:type=KafkaController,name=OfflinePartitionsCount
kafka.controller:type=ControllerEventManager,name=EventQueueSize
- 1
- Number of active controllers in the Kafka cluster. A value of 1 indicates that there is only one active controller, which is the desired state.
- 2
- Number of partitions that are currently offline. If this value is continuously increasing or stays at a high value, there may be a problem with the controller.
- 3
- Size of the event queue in the controller. Events are actions that must be performed by the controller, such as creating a new topic or moving a partition to a new broker. If the value continuously increases or stays at a high value, the controller may be stuck and unable to perform the required actions.
22.4.4. Identifying problems with requests
You can use the RequestHandlerAvgIdlePercent metric to determine if requests are slow. Additionally, request metrics can identify which specific requests are experiencing delays and other issues.
To effectively monitor Kafka requests, it is crucial to collect two key metrics: count and 99th percentile latency, also known as tail latency.
The count metric represents the number of requests processed within a specific time interval. It provides insights into the volume of requests handled by your Kafka cluster and helps identify spikes or drops in traffic.
The 99th percentile latency metric measures request latency, which is the time taken for a request to be processed. It represents the duration within which 99% of requests are handled; the remaining 1% may take longer, and their exact duration is not captured. The choice of the 99th percentile focuses on the majority of requests and excludes outliers that can skew the results.
This metric is particularly useful for identifying performance issues and bottlenecks related to the majority of requests, but it does not give a complete picture of the maximum latency experienced by a small fraction of requests.
By collecting and analyzing both count and 99th percentile latency metrics, you can gain an understanding of the overall performance and health of your Kafka cluster, as well as the latency of the requests being processed.
Here are some metrics to check the performance of Kafka requests:
Metrics to check the performance of requests
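The following listing is a representative set of standard Kafka request metrics; <request_type> is a placeholder for the request type being measured, such as Produce, FetchConsumer, or FetchFollower (callout 1), and each of these MBeans exposes the Count and 99thPercentile attributes (callout 10):
kafka.network:type=RequestMetrics,name=RequestsPerSec,request=<request_type>
kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request=<request_type>
kafka.network:type=RequestMetrics,name=TotalTimeMs,request=<request_type>
kafka.network:type=RequestMetrics,name=LocalTimeMs,request=<request_type>
kafka.network:type=RequestMetrics,name=RemoteTimeMs,request=<request_type>
kafka.network:type=RequestMetrics,name=ThrottleTimeMs,request=<request_type>
kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=<request_type>
kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=<request_type>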
- 1
- Request types to break down the request metrics.
- 2
- Rate at which requests are being processed by the Kafka broker per second.
- 3
- Time (in milliseconds) that a request spends waiting in the broker’s request queue before being processed.
- 4
- Total time (in milliseconds) that a request takes to complete, from the time it is received by the broker to the time the response is sent back to the client.
- 5
- Time (in milliseconds) that a request spends being processed by the broker on the local machine.
- 6
- Time (in milliseconds) that a request spends being processed by other brokers in the cluster.
- 7
- Time (in milliseconds) that a request spends being throttled by the broker. Throttling occurs when the broker determines that a client is sending too many requests too quickly and needs to be slowed down.
- 8
- Time (in milliseconds) that a response spends waiting in the broker’s response queue before being sent back to the client.
- 9
- Time (in milliseconds) that a response takes to be sent back to the client after it has been generated by the broker.
- 10
- For all of the request metrics, the Count and 99thPercentile attributes show the total number of requests that have been processed and the time it takes for the slowest 1% of requests to complete, respectively.
By analyzing client metrics, you can monitor the performance of the Kafka clients (producers and consumers) connected to a broker. This can help identify issues highlighted in broker logs, such as consumers being frequently kicked off their consumer groups, high request failure rates, or frequent disconnections.
Here are some metrics to check the performance of Kafka clients:
Metrics to check the performance of client requests
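The following listing is a representative set of standard Kafka client MBeans for these checks; <client_id> is a placeholder, and the numbered callouts below refer to these lines in order:
kafka.consumer:type=consumer-metrics,client-id=<client_id>
kafka.consumer:type=consumer-coordinator-metrics,client-id=<client_id>
kafka.producer:type=producer-metrics,client-id=<client_id>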
- 1
- (Consumer) Average and maximum time between poll requests, which can help determine if the consumers are polling for messages frequently enough to keep up with the message flow. The time-between-poll-avg and time-between-poll-max attributes show the average and maximum time in milliseconds between successive polls by a consumer, respectively.
- 2
- (Consumer) Metrics to monitor the coordination process between Kafka consumers and the broker coordinator. Attributes relate to the heartbeat, join, and rebalance process.
- 3
- (Producer) Metrics to monitor the performance of Kafka producers. Attributes relate to buffer usage, request latency, in-flight requests, transactional processing, and record handling.
Metrics for topics and partitions can also be helpful in diagnosing issues in a Kafka cluster. You can also use them to debug issues with a specific client when you are unable to collect client metrics.
Here are some metrics to check the performance of a specific topic and partition:
Metrics to check the performance of topics and partitions
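The following listing is a representative set of standard Kafka topic and partition metrics; <topic_name> and <partition_id> are placeholders, and the numbered callouts below refer to these lines in order:
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=<topic_name>
kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=<topic_name>
kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec,topic=<topic_name>
kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec,topic=<topic_name>
kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=<topic_name>
kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec,topic=<topic_name>
kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec,topic=<topic_name>
kafka.log:type=Log,name=Size,topic=<topic_name>,partition=<partition_id>
kafka.log:type=Log,name=NumLogSegments,topic=<topic_name>,partition=<partition_id>
kafka.log:type=Log,name=LogEndOffset,topic=<topic_name>,partition=<partition_id>
kafka.log:type=Log,name=LogStartOffset,topic=<topic_name>,partition=<partition_id>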
- 1
- Rate of incoming bytes per second for a specific topic.
- 2
- Rate of outgoing bytes per second for a specific topic.
- 3
- Rate of fetch requests that failed per second for a specific topic.
- 4
- Rate of produce requests that failed per second for a specific topic.
- 5
- Incoming message rate per second for a specific topic.
- 6
- Total rate of fetch requests (successful and failed) per second for a specific topic.
- 7
- Total rate of produce requests (successful and failed) per second for a specific topic.
- 8
- Size of a specific partition’s log in bytes.
- 9
- Number of log segments in a specific partition.
- 10
- Offset of the last message in a specific partition’s log.
- 11
- Offset of the first message in a specific partition’s log.
Appendix A. Using your subscription
Streams for Apache Kafka is provided through a software subscription. To manage your subscriptions, access your account at the Red Hat Customer Portal.
Accessing Your Account
- Go to access.redhat.com.
- If you do not already have an account, create one.
- Log in to your account.
Activating a Subscription
- Go to access.redhat.com.
- Navigate to My Subscriptions.
- Navigate to Activate a subscription and enter your 16-digit activation number.
Downloading Zip and Tar Files
To access zip or tar files, use the customer portal to find the relevant files for download. If you are using RPM packages, this step is not required.
- Open a browser and log in to the Red Hat Customer Portal Product Downloads page at access.redhat.com/downloads.
- Locate the Streams for Apache Kafka entries in the INTEGRATION AND AUTOMATION category.
- Select the desired Streams for Apache Kafka product. The Software Downloads page opens.
- Click the Download link for your component.
Installing packages with DNF
To install a package and all the package dependencies, use:
dnf install <package_name>
To install a previously-downloaded package from a local directory, use:
dnf install <path_to_download_package>
Revised on 2025-10-23 15:00:23 UTC