Chapter 5. Configuring Streams for Apache Kafka
Use the Kafka configuration properties files to configure Streams for Apache Kafka.
The properties file is in Java format, with each property on a separate line in the following format:
<option> = <value>
Lines starting with # or ! are treated as comments and are ignored by Streams for Apache Kafka components.
# This is a comment
Values can be split into multiple lines by using \ directly before the newline/carriage return.
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="bob" \
password="bobs-password";
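The format rules above can be illustrated with a small parser. The following Python sketch is a simplified stand-in for the Java properties loader, not the code Kafka uses. It handles only `=` as the separator, `#` and `!` comments, and backslash line continuation, and it ignores escape sequences. The function name `parse_properties` is hypothetical.

```python
def parse_properties(text):
    """Parse a simplified subset of the Java properties format into a dict."""
    props = {}
    logical = ""         # logical line being assembled from continuation lines
    continuing = False   # True while the previous physical line ended with "\"

    def flush(line):
        key, _, value = line.partition("=")
        props[key.strip()] = value.lstrip()

    for raw in text.splitlines():
        # Leading whitespace is dropped, including on continuation lines.
        line = raw.lstrip()
        if not continuing and (not line or line[0] in "#!"):
            continue  # blank line or comment
        # An odd number of trailing backslashes marks a continuation.
        trailing = len(line) - len(line.rstrip("\\"))
        if trailing % 2 == 1:
            logical += line[:-1]
            continuing = True
            continue
        flush(logical + line)
        logical = ""
        continuing = False

    if continuing and logical:
        flush(logical)  # file ended in the middle of a continued line
    return props
```

Applied to the `sasl.jaas.config` example above, the three physical lines collapse into a single value, because the leading whitespace on each continuation line is dropped.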
After saving the changes in the properties file, you need to restart the Kafka node. In a multi-node environment, repeat the process on each node in the cluster.
5.1. Using standard Kafka configuration properties
Use standard Kafka configuration properties to configure Kafka components.
The properties provide options to control and tune the configuration of the following Kafka components:
- Brokers
- Topics
- Producer, consumer, and management clients
- Kafka Connect
- Kafka Streams
Broker and client parameters include options to configure authorization, authentication and encryption.
For further information on Kafka configuration properties and how to use the properties to tune your deployment, see the following guides:
5.2. Loading configuration values from environment variables
Use the Environment Variables Configuration Provider plugin to load configuration data from environment variables. For example, you can use it to load certificates or JAAS configuration.
You can use the provider to load configuration data for all Kafka components, including producers and consumers. Use the provider, for example, to provide the credentials for Kafka Connect connector configuration.
Prerequisites
- Streams for Apache Kafka is installed on each host, and the configuration files are available.
- The Environment Variables Configuration Provider JAR file.
  The JAR file is available from the Streams for Apache Kafka archive.
Procedure
- Add the Environment Variables Configuration Provider JAR file to the Kafka libs directory.
- Initialize the Environment Variables Configuration Provider in the configuration properties file of the Kafka component. For example, to initialize the provider for Kafka, add the configuration to the server.properties file.
  Configuration to enable the Environment Variables Configuration Provider
  config.providers.env.class=org.apache.kafka.common.config.provider.EnvVarConfigProvider
- Add configuration to the properties file to load data from environment variables.
  Configuration to load data from an environment variable
  option=${env:<MY_ENV_VAR_NAME>}
  Use capitalized or upper-case environment variable naming conventions, such as MY_ENV_VAR_NAME.
- Save the changes.
- Restart the Kafka component.
For information on restarting brokers in a multi-node cluster, see Section 3.8, “Performing a graceful rolling restart of Kafka brokers”.
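Before restarting a component, it can help to confirm that every environment variable referenced in its properties file is actually set. The following Python sketch is a preflight check under stated assumptions. It does not reproduce how Kafka resolves placeholders, the function names are hypothetical, and the restriction of variable names to letters, digits, and underscores is a choice made for this example.

```python
import re

# Matches ${env:NAME}. Limiting NAME to letters, digits and underscores
# is an assumption of this sketch.
ENV_PLACEHOLDER = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def referenced_env_vars(properties_text):
    """Return the sorted names of all ${env:NAME} placeholders in the text."""
    names = set()
    for line in properties_text.splitlines():
        if line.lstrip().startswith(("#", "!")):
            continue  # skip comment lines
        names.update(ENV_PLACEHOLDER.findall(line))
    return sorted(names)


def missing_env_vars(properties_text, environ):
    """Return referenced variable names that are absent from environ."""
    return [name for name in referenced_env_vars(properties_text)
            if name not in environ]
```

Passing the contents of the properties file and `os.environ` to `missing_env_vars` returns an empty list when all referenced variables are present.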
5.3. Configuring Kafka
Kafka uses properties files to store static configuration. The recommended location for the configuration files is ./config/kraft/. The configuration files have to be readable by the Kafka user.
Streams for Apache Kafka ships example configuration files that highlight various basic and advanced features of the product. They can be found under config/kraft/ in the Streams for Apache Kafka installation directory as follows:
- (default) config/kraft/server.properties for nodes running in combined mode
- config/kraft/broker.properties for nodes running as brokers
- config/kraft/controller.properties for nodes running as controllers
This chapter explains the most important configuration options.
5.3.1. Listeners
Listeners are used to connect to Kafka brokers. Each Kafka broker can be configured to use multiple listeners. Each listener requires a different configuration so it can listen on a different port or network interface.
To configure listeners, edit the listeners property in the Kafka configuration properties file. Add listeners to the listeners property as a comma-separated list. Specify each listener in the following format:
<listener_name>://<hostname>:<port>
If <hostname> is empty, Kafka uses the value returned by the java.net.InetAddress.getCanonicalHostName() method as the hostname.
Example configuration for multiple listeners
listeners=internal-1://:9092,internal-2://:9093,replication://:9094
When a Kafka client wants to connect to a Kafka cluster, it first connects to the bootstrap server, which is one of the cluster nodes. The bootstrap server provides the client with a list of all the brokers in the cluster, and the client connects to each one individually. The list of brokers is based on the configured listeners.
Advertised listeners
Optionally, you can use the advertised.listeners property to provide the client with a different set of listener addresses than those given in the listeners property. This is useful if additional network infrastructure, such as a proxy, is between the client and the broker, or an external DNS name is being used instead of an IP address.
The advertised.listeners property is formatted in the same way as the listeners property.
Example configuration for advertised listeners
listeners=internal-1://:9092,internal-2://:9093
advertised.listeners=internal-1://my-broker-1.my-domain.com:1234,internal-2://my-broker-1.my-domain.com:1235
The names of the advertised listeners must match those listed in the listeners property.
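The listener format and the name-matching rule can be checked mechanically. The following Python sketch parses a `listeners` or `advertised.listeners` value into names, hosts, and ports, and reports advertised names that have no counterpart in `listeners`. The function names are hypothetical, and comparing names exactly as written is an assumption of this sketch rather than a statement about how Kafka compares them.

```python
def parse_listeners(value):
    """Parse '<name>://<host>:<port>,...' into {name: (host, port)}."""
    result = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, address = entry.partition("://")
        if not sep:
            raise ValueError(f"missing '://' in listener {entry!r}")
        # Split on the last colon so the host part is kept intact.
        host, sep, port = address.rpartition(":")
        if not sep or not port.isdigit():
            raise ValueError(f"missing or invalid port in listener {entry!r}")
        result[name] = (host, int(port))
    return result


def unmatched_advertised_names(listeners, advertised_listeners):
    """Return advertised listener names that do not appear in listeners."""
    configured = set(parse_listeners(listeners))
    advertised = set(parse_listeners(advertised_listeners))
    return sorted(advertised - configured)
```

For the advertised listeners example above, `unmatched_advertised_names` returns an empty list, because both `internal-1` and `internal-2` appear in the `listeners` property.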
Inter-broker listeners
Inter-broker listeners are used for communication between Kafka brokers. Inter-broker communication is required for:
- Coordinating workloads between different brokers
- Replicating messages between partitions stored on different brokers
The inter-broker listener can be assigned to a port of your choice. When multiple listeners are configured, you can define the name of the inter-broker listener in the inter.broker.listener.name property of your broker configuration.
Here, the inter-broker listener is named REPLICATION:
listeners=REPLICATION://0.0.0.0:9091
inter.broker.listener.name=REPLICATION
Controller listeners
Controller configuration is used to connect and communicate with the controller that coordinates the cluster and manages the metadata used to track the status of brokers and partitions.
By default, communication between the controllers and brokers uses a dedicated controller listener. Controllers are responsible for coordinating administrative tasks, such as partition leadership changes, so one or more of these listeners is required.
Specify listeners to use for controllers using the controller.listener.names property. You can specify a dynamic quorum of controllers using the controller.quorum.bootstrap.servers property. The quorum enables a leader-follower structure for administrative tasks, with the leader actively managing operations and followers as hot standbys, ensuring metadata consistency in memory and facilitating failover.
listeners=CONTROLLER://0.0.0.0:9090
controller.listener.names=CONTROLLER
controller.quorum.bootstrap.servers=localhost:9090
The format for the controller quorum is <hostname>:<port>.
5.3.2. Data logs
Apache Kafka stores all records it receives from producers in logs. The logs contain the actual data, in the form of records, that Kafka needs to deliver. Note that these records differ from application log files, which detail the broker’s activities.
Log directories
You can configure log directories using the log.dirs property in the server configuration properties file to store logs in one or multiple log directories. Set the property to the /var/lib/kafka directory created during installation:
Data log configuration
log.dirs=/var/lib/kafka
To improve disk I/O performance, you can set log.dirs to multiple directories and place each one on a different physical device. For example:
Configuration for multiple directories
log.dirs=/var/lib/kafka1,/var/lib/kafka2,/var/lib/kafka3
5.3.3. Metadata log
Controllers use a metadata log stored as a single-partition topic (__cluster_metadata) on every node. It records the state of the cluster, storing information on brokers, replicas, topics, and partitions, including the state of in-sync replicas and partition leadership.
Metadata log directory
You can configure the directory for storing the metadata log using the metadata.log.dir property. By default, if this property is not set, Kafka uses the log.dirs property to determine the storage directory for both data logs and metadata logs. The metadata log is placed in the first directory specified for log.dirs.
Isolating metadata operations from data operations can improve manageability and potentially lead to performance gains. To set a specific directory for the metadata log, include the metadata.log.dir property in the server configuration properties file.
For example:
Metadata log configuration
log.dirs=/var/lib/kafka
metadata.log.dir=/var/lib/kafka-metadata
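The fallback rule described above can be expressed in a few lines. The following Python sketch takes a dictionary of already parsed properties and returns the directory in which the metadata log would be stored according to that rule. The function name is hypothetical, and raising an error when neither property is set is a choice made for this sketch, not Kafka behavior.

```python
def effective_metadata_log_dir(props):
    """Return metadata.log.dir if set, else the first entry of log.dirs."""
    explicit = props.get("metadata.log.dir", "").strip()
    if explicit:
        return explicit
    log_dirs = [d.strip() for d in props.get("log.dirs", "").split(",") if d.strip()]
    if not log_dirs:
        # Sketch-only behavior: this example does not model Kafka defaults.
        raise ValueError("neither metadata.log.dir nor log.dirs is set")
    return log_dirs[0]
```

With only `log.dirs=/var/lib/kafka1,/var/lib/kafka2,/var/lib/kafka3` set, the function returns `/var/lib/kafka1`. With the example configuration above, it returns `/var/lib/kafka-metadata`.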
Kafka tools are available for inspecting and debugging the metadata log. For more information, see the Apache Kafka documentation.
5.3.4. Node ID
The node ID is a unique identifier for each node (broker or controller) in the cluster. You can assign any integer greater than or equal to 0 as the node ID. The node ID identifies a node after restarts or crashes, so it must be stable and must not change over time.
The node ID is configured in the Kafka configuration properties file:
node.id=1
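Because each node ID must be unique and non-negative, a quick check across the configuration of all nodes can catch mistakes before startup. The following Python sketch assumes the properties of each node have already been parsed into a dictionary. The function name and the labels used to identify each node are hypothetical.

```python
def node_id_problems(configs):
    """Return a list of problems found in the node.id values.

    configs maps a label (for example a file name) to the parsed
    properties of one node. Both are illustrative inputs for this sketch.
    """
    seen = {}       # node ID -> label of the first node that uses it
    problems = []
    for label, props in configs.items():
        raw = props.get("node.id", "")
        try:
            node_id = int(raw)
        except ValueError:
            problems.append(f"{label}: node.id is not an integer: {raw!r}")
            continue
        if node_id < 0:
            problems.append(f"{label}: node.id must be >= 0, got {node_id}")
            continue
        if node_id in seen:
            problems.append(
                f"{label}: node.id {node_id} is already used by {seen[node_id]}")
        else:
            seen[node_id] = label
    return problems
```

An empty result means every node has a valid ID that no other node in the checked set uses.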
5.4. Transitioning to separate broker and controller roles
If your Kafka cluster uses nodes with dual controller and broker roles, you can transition to nodes with separate roles. This procedure describes how.
To do this, add new controllers, scale down the controllers on the dual-role nodes, and then switch the dual-role nodes to broker-only.
In this example, we update three dual-role nodes.
Prerequisites
- Streams for Apache Kafka (minimum 2.9) is installed on each host, and the configuration files are available.
- The controller quorum is configured for dynamic scaling using the controller.quorum.bootstrap.servers property.
- Cruise Control is installed.
- A backup of the cluster is recommended.
Procedure
- Add a quorum of three new controller-only nodes.
  Integrate the controllers into the controller quorum by updating the controller.quorum.bootstrap.servers property.
  For more information, see Section 14.2, “Adding new controllers”.
- Using the kafka-metadata-quorum.sh tool, remove the dual-role controllers from the controller quorum.
  For more information, see Section 14.3, “Removing controllers”.
- For each dual-role node, and one at a time:
  - Stop the dual-role node:
    ./bin/kafka-server-stop.sh
  - Configure the dual-role node to serve as a broker-only node by switching process.roles=broker, controller in the node configuration to process.roles=broker.
    Example broker configuration
    node.id=1
    process.roles=broker
    log.dirs=/var/lib/kafka
    listeners=PLAINTEXT://0.0.0.0:9092
    controller.listener.names=CONTROLLER
    listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
    controller.quorum.bootstrap.servers=localhost:9090, localhost:9091, localhost:9092
    inter.broker.listener.name=PLAINTEXT
    num.partitions=1
    auto.create.topics.enable=false
    default.replication.factor=3
    min.insync.replicas=2
    #...
  - Restart the broker node that was previously serving a dual role:
    ./bin/kafka-server-start.sh -daemon ./config/kraft/server.properties
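When several nodes need the same change, the edit to process.roles can be scripted. The following Python sketch rewrites the process.roles line in the text of a properties file and leaves all other lines untouched. The function name `set_process_roles` is hypothetical, and the sketch assumes that the property sits on a single line without continuation.

```python
def set_process_roles(properties_text, roles):
    """Return the properties text with process.roles set to the given roles."""
    new_line = "process.roles=" + ",".join(roles)
    output = []
    replaced = False
    for line in properties_text.splitlines():
        # Comment lines never match, because their key starts with # or !.
        key = line.partition("=")[0].strip()
        if key == "process.roles":
            output.append(new_line)
            replaced = True
        else:
            output.append(line)
    if not replaced:
        output.append(new_line)  # property was absent, so add it
    return "\n".join(output) + "\n"
```

For example, calling `set_process_roles(text, ["broker"])` on a dual-role configuration produces the broker-only setting. Write the result back to the file only while the node is stopped, as in the procedure above.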
5.5. Transitioning to dual-role nodes
If your Kafka cluster uses dedicated broker-only and controller-only nodes, you can transition to nodes that serve both roles. This procedure describes how.
To do this, add dual-role configuration to the nodes, then rebalance the cluster to move partition replicas to the nodes that previously served as controllers only.
A dual-role configuration is suitable for development or testing. In a typical production environment, use dedicated broker and controller nodes.
Prerequisites
- Streams for Apache Kafka (minimum 2.9) is installed on each host, and the configuration files are available.
- The controller quorum is configured for dynamic scaling using the controller.quorum.bootstrap.servers property.
- Cruise Control is installed.
- A backup of the cluster is recommended.
Procedure
- For each controller node, and one at a time:
  - Stop the controller node:
    ./bin/kafka-server-stop.sh
  - Configure the controller-only node to serve as a dual-role node by adding broker-specific configuration.
    At a minimum, do the following:
    - Switch process.roles=controller to process.roles=broker, controller.
    - Add or update the broker log directory using log.dirs.
    - Add a listener for the broker to handle client requests. In this example, PLAINTEXT://:9092 is added.
    - Update mappings between listener names and security protocols using listener.security.protocol.map.
    - Configure a listener for inter-broker communication using inter.broker.listener.name.
    Example dual-role configuration
    process.roles=broker, controller
    node.id=1
    log.dirs=/var/lib/kafka
    metadata.log.dir=/var/lib/kafka-metadata
    listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9090
    controller.listener.names=CONTROLLER
    listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
    controller.quorum.bootstrap.servers=localhost:9090
    inter.broker.listener.name=PLAINTEXT
    num.partitions=1
    auto.create.topics.enable=false
    default.replication.factor=3
    min.insync.replicas=2
    # ...
  - Restart the node that is now operating in a dual role:
    ./bin/kafka-server-start.sh -daemon ./config/kraft/dual-role.properties
- Use the Cruise Control remove_broker endpoint to reassign partition replicas from broker-only nodes to the nodes that now serve as dual-role nodes.
  The reassignment can take some time depending on the number of topics and partitions in the cluster.
  For more information, see Section 15.7, “Generating optimization proposals”.
- Unregister the broker nodes:
  ./bin/kafka-cluster.sh unregister \
    --bootstrap-server <broker_host>:<port> \
    --id <node_id_number>
  For more information, see Section 14.4, “Unregistering nodes after scale-down operations”.
- Stop the broker nodes:
  ./bin/kafka-server-stop.sh
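The minimum configuration changes listed in the procedure can be turned into a checklist that runs against a node's parsed properties before the restart. The following Python sketch is deliberately stricter than Kafka itself, because it requires every item from the list to be set explicitly and ignores any defaults Kafka might apply. The function name `dual_role_problems` is hypothetical.

```python
def _split(value):
    """Split a comma-separated property value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def dual_role_problems(props):
    """Return the checklist items that a dual-role configuration is missing."""
    problems = []

    if set(_split(props.get("process.roles", ""))) != {"broker", "controller"}:
        problems.append("process.roles must list both broker and controller")

    if not _split(props.get("log.dirs", "")):
        problems.append("log.dirs is not set")

    listener_names = [entry.partition("://")[0]
                      for entry in _split(props.get("listeners", ""))]
    controller_names = set(_split(props.get("controller.listener.names", "")))
    if not [name for name in listener_names if name not in controller_names]:
        problems.append("listeners has no listener for client requests")

    mapped = {pair.partition(":")[0].strip()
              for pair in _split(props.get("listener.security.protocol.map", ""))}
    for name in listener_names:
        if name not in mapped:
            problems.append(
                f"listener.security.protocol.map has no entry for {name}")

    if not props.get("inter.broker.listener.name", "").strip():
        problems.append("inter.broker.listener.name is not set")

    return problems
```

The example dual-role configuration from the procedure passes all checks, while an unmodified controller-only configuration is reported with one problem per missing item.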