Chapter 3. Getting started

Streams for Apache Kafka is distributed in a ZIP file that contains installation artifacts for the Kafka components.

Note

The Kafka Bridge has separate installation files. For information on installing and using the Kafka Bridge, see Using the Streams for Apache Kafka Bridge.

3.1. Installation environment

Streams for Apache Kafka runs on Red Hat Enterprise Linux. The host (node) can be a physical or virtual machine (VM). Use the installation files provided with Streams for Apache Kafka to install Kafka components. You can install Kafka in a single-node or multi-node environment.

Single-node environment
A single-node Kafka cluster runs instances of Kafka components on a single host. This configuration is not suitable for a production environment.
Multi-node environment
A multi-node Kafka cluster runs instances of Kafka components on multiple hosts.

We recommend that you run Kafka and other Kafka components, such as Kafka Connect, on separate hosts. Running the components in this way makes each component easier to maintain and upgrade.

Kafka clients establish a connection to the Kafka cluster using the bootstrap.servers configuration property. If you are using Kafka Connect, for example, the Kafka Connect configuration properties must include a bootstrap.servers value that specifies the hostname and port of the hosts where the Kafka brokers are running. If the Kafka cluster is running on more than one host with multiple Kafka brokers, you specify a hostname and port for each broker. Each Kafka broker is identified by a node.id.
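For example, a Kafka Connect worker connecting to a three-broker cluster might include a configuration line like the following. This is a sketch; the hostnames are placeholders for your own broker hosts, and the brokers are assumed to listen on the default port 9092:

```properties
# Hypothetical broker hostnames; replace with your own
bootstrap.servers=kafka0.example.com:9092,kafka1.example.com:9092,kafka2.example.com:9092
```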

3.1.1. Data storage considerations

An efficient data storage infrastructure is essential to the optimal performance of Streams for Apache Kafka.

Block storage is required. File storage, such as NFS, does not work with Kafka.

Choose one of the following options for your block storage:

  • Cloud-based block storage solutions, such as Amazon Elastic Block Store (EBS)
  • Local storage
  • Storage Area Network (SAN) volumes accessed by a protocol such as Fibre Channel or iSCSI

3.1.2. File systems

Kafka uses a file system for storing messages. Streams for Apache Kafka is compatible with the XFS and ext4 file systems, which are commonly used with Kafka. Consider the underlying architecture and requirements of your deployment when choosing and setting up your file system.

For more information, refer to Filesystem Selection in the Kafka documentation.
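As an illustration, an XFS data volume for Kafka is commonly mounted with the noatime option, which avoids access-time writes that Kafka does not need. The device name and mount point below are assumptions for this example:

```
# Hypothetical /etc/fstab entry; /dev/sdb1 is an example device
/dev/sdb1  /var/lib/kafka  xfs  defaults,noatime  0 0
```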

3.2. Downloading Streams for Apache Kafka

A ZIP file distribution of Streams for Apache Kafka is available for download from the Red Hat website. You can download the latest version of Red Hat Streams for Apache Kafka from the Streams for Apache Kafka software downloads page.

  • For Kafka and other Kafka components, download the amq-streams-<version>-bin.zip file.
  • For Kafka Bridge, download the amq-streams-<version>-bridge-bin.zip file.

    For installation instructions, see Using the Streams for Apache Kafka Bridge.

3.3. Installing Kafka

Use the Streams for Apache Kafka ZIP files to install Kafka on Red Hat Enterprise Linux. You can install Kafka in a single-node or multi-node environment. In this procedure, a single Kafka instance is installed on a single host (node).

The Streams for Apache Kafka installation files include the binaries for running other Kafka components, like Kafka Connect, Kafka MirrorMaker 2, and Kafka Bridge. In a single-node environment, you can run these components from the same host where you installed Kafka. However, we recommend that you add the installation files and run other Kafka components on separate hosts.

Prerequisites

  • You have downloaded the Streams for Apache Kafka installation files (see Section 3.2, “Downloading Streams for Apache Kafka”).

Procedure

Install Kafka on your host.

  1. Add a new kafka user and group:

    groupadd kafka
    useradd -g kafka kafka
    passwd kafka
  2. Extract and move the contents of the amq-streams-<version>-bin.zip file into the /opt/kafka directory:

    unzip amq-streams-<version>-bin.zip -d /opt
    mv /opt/kafka*redhat* /opt/kafka
  3. Change the ownership of the /opt/kafka directory to the kafka user:

    chown -R kafka:kafka /opt/kafka
  4. Create directory /var/lib/kafka for storing Kafka data and set its ownership to the kafka user:

    mkdir /var/lib/kafka
    chown -R kafka:kafka /var/lib/kafka

    You can now run a default configuration of Kafka as a single-node cluster.

    You can also use the installation to run other Kafka components, like Kafka Connect, on the same host.

    To run other components, specify the hostname and port to connect to the Kafka broker using the bootstrap.servers property in the component configuration.

    Example bootstrap servers configuration pointing to a single Kafka broker on the same host

    bootstrap.servers=localhost:9092

    However, we recommend installing and running Kafka components on separate hosts.

  5. (Optional) Install Kafka components on separate hosts.

    1. Extract the installation files to the /opt/kafka directory on each host.
    2. Change the ownership of the /opt/kafka directory to the kafka user.
    3. Add bootstrap.servers configuration that connects the component to the host (or hosts in a multi-node environment) running the Kafka brokers.

      Example bootstrap servers configuration pointing to Kafka brokers on different hosts

      bootstrap.servers=kafka0.<host_ip_address>:9092,kafka1.<host_ip_address>:9092,kafka2.<host_ip_address>:9092

      You can use this configuration for Kafka Connect, MirrorMaker 2, and the Kafka Bridge.

3.4. Running a Kafka cluster in KRaft mode

Configure and run Kafka in KRaft mode. You can run Kafka as a single-node or multi-node Kafka cluster. For stability and availability, run a minimum of three broker nodes and three controller nodes, with topic replication across the brokers.

Kafka nodes perform the role of broker, controller, or both.

Broker role
A broker, sometimes referred to as a node or server, orchestrates the storage and passing of messages.
Controller role
A controller coordinates the cluster and manages the metadata used to track the status of brokers and partitions.
Note

Cluster metadata is stored in the internal __cluster_metadata topic.

You can use combined broker and controller nodes, though you might want to separate these functions. Combined nodes can be more convenient in simpler deployments.

To identify a cluster, you create an ID. The ID is used when creating logs for the nodes you add to the cluster.

Specify the following in the configuration of each node:

  • A node ID
  • The roles the node performs (broker, controller, or both)
  • A list of nodes (or voters) that act as controllers

You specify a list of controllers, configured as voters, using the node ID and connection details (hostname and port) for each controller.

You apply the configuration for each node, including its roles, using a configuration properties file. The configuration differs according to role. Three example configuration properties files are provided:

  • /opt/kafka/config/kraft/broker.properties has example configuration for a broker role
  • /opt/kafka/config/kraft/controller.properties has example configuration for a controller role
  • /opt/kafka/config/kraft/server.properties has example configuration for a combined role

You can base your broker configuration on these example properties files. In this procedure, the example server.properties configuration is used.
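Bringing the settings from the procedure below together, a minimal combined-role configuration for a single node on localhost might look as follows. This is a sketch based on the example server.properties file; the log directory is an assumption (see step 3 for the default):

```properties
# Minimal combined-role KRaft configuration (sketch)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
inter.broker.listener.name=PLAINTEXT
controller.listener.names=CONTROLLER
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/var/lib/kafka/kraft-combined-logs
```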

Prerequisites

  • Streams for Apache Kafka is installed on the host (see Section 3.3, “Installing Kafka”).

Procedure

  1. Generate a unique ID for the Kafka cluster.

    You can use the kafka-storage tool to do this:

    /opt/kafka/bin/kafka-storage.sh random-uuid

    The command returns an ID. A cluster ID is required in KRaft mode.

  2. Create a configuration properties file for each node in the cluster.

    You can base the file on the examples provided with Kafka.

    1. Specify the role as broker, controller, or both (broker,controller).

      For example, specify process.roles=broker,controller for a combined role.

    2. Specify a unique node.id for each node in the cluster starting from 0.

      For example, node.id=1.

    3. Specify a list of controller.quorum.voters in the format <node_id>@<hostname>:<port>.

      For example, controller.quorum.voters=1@localhost:9093.

    4. Specify listeners:

      • Configure the name, hostname and port for each listener.

        For example, listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093.

      • Configure the listener names used for inter-broker communication.

        For example, inter.broker.listener.name=PLAINTEXT.

      • Configure the listener names used by the controller quorum.

        For example, controller.listener.names=CONTROLLER.

      • Configure the name, hostname and port for each listener that is advertised to clients for connection to Kafka.

        For example, advertised.listeners=PLAINTEXT://localhost:9092.

  3. Set up log directories for each node in your Kafka cluster:

    /opt/kafka/bin/kafka-storage.sh format -t <uuid> -c /opt/kafka/config/kraft/server.properties

    Returns:

    Formatting /tmp/kraft-combined-logs

    Replace <uuid> with the cluster ID you generated. Use the same ID for each node in your cluster.

    Apply the broker configuration using the properties file you created for the broker.

    By default, the log directory (log.dirs) specified in the server.properties configuration file is set to /tmp/kraft-combined-logs. The /tmp directory is typically cleared on each system reboot, making it suitable for development environments only.

    You can set log.dirs to a comma-separated list of directories to use multiple log directories.

  4. Start each Kafka node.

    /opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
  5. Check that Kafka is running:

    jcmd | grep kafka

    Returns:

    process ID kafka.Kafka /opt/kafka/config/kraft/server.properties

    Check the logs of each node to ensure that they have successfully joined the KRaft cluster:

    tail -f /opt/kafka/logs/server.log

You can now create topics, and send and receive messages from the brokers.

For brokers passing messages, you can use topic replication across the brokers in a cluster for data durability. Configure topics to have a replication factor of at least three and a minimum number of in-sync replicas set to 1 less than the replication factor. For more information, see Section 7.7, “Creating a topic”.
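For example, the replication settings described above (a replication factor of 3 with min.insync.replicas set to one less) can be derived and passed to kafka-topics.sh. This sketch only prints the command to run against a live cluster; the topic name is hypothetical:

```shell
# Sketch: derive min.insync.replicas (RF - 1) and print the
# kafka-topics.sh command to create a topic with those settings.
RF=3
MIN_ISR=$((RF - 1))
echo "/opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my-topic --partitions 3 --replication-factor $RF --config min.insync.replicas=$MIN_ISR"
```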

3.5. Stopping the Streams for Apache Kafka services

You can stop Kafka services by running a script. After running the script, all connections to the Kafka services are terminated.

Procedure

  1. Stop the Kafka node.

    su - kafka
    /opt/kafka/bin/kafka-server-stop.sh
  2. Confirm that the Kafka node is stopped.

    jcmd | grep kafka

3.6. Performing a graceful rolling restart of Kafka brokers

This procedure shows how to do a graceful rolling restart of brokers in a multi-node cluster. A rolling restart is usually required following an upgrade or change to the Kafka cluster configuration properties.

Note

Some broker configurations do not need a restart of the broker. For more information, see Updating Broker Configs in the Apache Kafka documentation.

After you perform a restart of a broker, check for under-replicated topic partitions to make sure that replica partitions have caught up.

To achieve a graceful restart with no loss of availability, ensure that you are replicating topics and that at least the minimum number of in-sync replicas (min.insync.replicas) are in sync. The min.insync.replicas configuration determines the minimum number of replicas that must acknowledge a write for the write to be considered successful.

For a multi-node cluster, the standard approach is to have a topic replication factor of at least 3 and a minimum number of in-sync replicas set to 1 less than the replication factor. If you are using acks=all in your producer configuration for data durability, check that the broker you restarted is in sync with all the partitions it’s replicating before restarting the next broker.
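For reference, a producer configured for durability in this way might include the following properties. This is a sketch; other settings are left at their defaults:

```properties
# Producer durability settings (sketch)
acks=all
enable.idempotence=true
```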

Single-node clusters are unavailable during a restart, since all partitions are on the same broker.

Prerequisites

  • Streams for Apache Kafka is installed on each host, and the configuration files are available.
  • The Kafka cluster is operating as expected.

    Check for under-replicated partitions or any other issues affecting broker operation. The steps in this procedure describe how to check for under-replicated partitions.

Procedure

Perform the following steps on each Kafka broker. Complete the steps on the first broker before moving on to the next. Perform the steps on the brokers that also act as controllers last. Otherwise, the active controller has to change more than once during the rolling restart.

  1. Stop the Kafka broker:

    /opt/kafka/bin/kafka-server-stop.sh
  2. Make any changes to the broker configuration that require a restart after completion.

  3. Restart the Kafka broker:

    /opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/kraft/server.properties
  4. Check that Kafka is running:

    jcmd | grep kafka

    Returns:

    process ID kafka.Kafka /opt/kafka/config/kraft/server.properties

    Check the logs of each node to ensure that they have successfully joined the KRaft cluster:

    tail -f /opt/kafka/logs/server.log
  5. Wait until the broker has zero under-replicated partitions. You can check from the command line or use metrics.

    • Use the kafka-topics.sh command with the --under-replicated-partitions parameter:

      /opt/kafka/bin/kafka-topics.sh --bootstrap-server <broker_host>:<port> --describe --under-replicated-partitions

      For example:

      /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions

      The command provides a list of topics with under-replicated partitions in a cluster.

      Topics with under-replicated partitions

      Topic: topic3 Partition: 4 Leader: 2 Replicas: 2,3 Isr: 2
      Topic: topic3 Partition: 5 Leader: 1 Replicas: 1,2 Isr: 1
      Topic: topic1 Partition: 1 Leader: 3 Replicas: 1,3 Isr: 3
      # …

      Under-replicated partitions are listed if the ISR (in-sync replica) count is less than the number of replicas. If a list is not returned, there are no under-replicated partitions.

    • Use the UnderReplicatedPartitions metric:

      kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions

      The metric provides a count of partitions where replicas have not caught up. Wait until the count is zero.

      Tip

      Use the Kafka Exporter to create an alert when there are one or more under-replicated partitions for a topic.

Checking logs when restarting

If a broker fails to start, check the application logs for information. You can also check the status of a broker shutdown and restart in the /opt/kafka/logs/server.log application log.
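The under-replicated check in step 5 can be scripted. The following sketch counts partitions in kafka-topics.sh --describe --under-replicated-partitions output. Here the output is simulated with a here-document; against a live cluster you would capture the command's output instead:

```shell
# Simulated output of:
#   /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 \
#     --describe --under-replicated-partitions
cat <<'EOF' > /tmp/under-replicated.txt
Topic: topic3 Partition: 4 Leader: 2 Replicas: 2,3 Isr: 2
Topic: topic1 Partition: 1 Leader: 3 Replicas: 1,3 Isr: 3
EOF

# Each matching line is one under-replicated partition; a count of
# zero means it is safe to restart the next broker.
URP_COUNT=$(grep -c '^Topic:' /tmp/under-replicated.txt)
echo "Under-replicated partitions: $URP_COUNT"
```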

© 2024 Red Hat, Inc.