Rechercher

Ce contenu n'est pas disponible dans la langue sélectionnée.

Chapter 3. Getting started

download PDF

Streams for Apache Kafka is distributed in a ZIP file that contains installation artifacts for the Kafka components.

Note

The Kafka Bridge has separate installation files. For information on installing and using the Kafka Bridge, see Using the Streams for Apache Kafka Bridge.

3.1. Installation environment

Streams for Apache Kafka runs on Red Hat Enterprise Linux. The host (node) can be a physical or virtual machine (VM). Use the installation files provided with Streams for Apache Kafka to install Kafka components. You can install Kafka in a single-node or multi-node environment.

Single-node environment
A single-node Kafka cluster runs instances of Kafka components on a single host. This configuration is not suitable for a production environment.
Multi-node environment
A multi-node Kafka cluster runs instances of Kafka components on multiple hosts.

We recommended that you run Kafka and other Kafka components, such as Kafka Connect, on separate hosts. By running the components in this way, it’s easier to maintain and upgrade each component.

Kafka clients establish a connection to the Kafka cluster using the bootstrap.servers configuration property. If you are using Kafka Connect, for example, the Kafka Connect configuration properties must include a bootstrap.servers value that specifies the hostname and port of the hosts where the Kafka brokers are running. If the Kafka cluster is running on more than one host with multiple Kafka brokers, you specify a hostname and port for each broker. Each Kafka broker is identified by a node.id.

3.1.1. Data storage considerations

An efficient data storage infrastructure is essential to the optimal performance of Streams for Apache Kafka.

Block storage is required. File storage, such as NFS, does not work with Kafka.

Choose from one of the following options for your block storage:

  • Cloud-based block storage solutions, such as Amazon Elastic Block Store (EBS)
  • Local storage
  • Storage Area Network (SAN) volumes accessed by a protocol such as Fibre Channel or iSCSI

3.1.2. File systems

Kafka uses a file system for storing messages. Streams for Apache Kafka is compatible with the XFS and ext4 file systems, which are commonly used with Kafka. Consider the underlying architecture and requirements of your deployment when choosing and setting up your file system.

For more information, refer to Filesystem Selection in the Kafka documentation.

3.1.3. Apache Kafka and ZooKeeper storage

Use separate disks for Apache Kafka and ZooKeeper.

Kafka supports JBOD (Just a Bunch of Disks) storage, a data storage configuration of multiple disks or volumes. JBOD provides increased data storage for Kafka brokers. It can also improve performance.

Solid-state drives (SSDs), though not essential, can improve the performance of Kafka in large clusters where data is sent to and received from multiple topics asynchronously. SSDs are particularly effective with ZooKeeper, which requires fast, low latency data access.

Note

You do not need to provision replicated storage because Kafka and ZooKeeper both have built-in data replication.

3.2. Downloading Streams for Apache Kafka

A ZIP file distribution of Streams for Apache Kafka is available for download from the Red Hat website. You can download the latest version of Red Hat Streams for Apache Kafka from the Streams for Apache Kafka software downloads page.

  • For Kafka and other Kafka components, download the amq-streams-<version>-bin.zip file
  • For Kafka Bridge, download the amq-streams-<version>-bridge-bin.zip file.

    For installation instructions, see Using the Streams for Apache Kafka Bridge.

3.3. Installing Kafka

Use the Streams for Apache Kafka ZIP files to install Kafka on Red Hat Enterprise Linux. You can install Kafka in a single-node or multi-node environment. In this procedure, a single Kafka broker and ZooKeeper instance are installed on a single host (node).

The Streams for Apache Kafka installation files include the binaries for running other Kafka components, like Kafka Connect, Kafka MirrorMaker 2, and Kafka Bridge. In a single-node environment, you can run these components from the same host where you installed Kafka. However, we recommend that you add the installation files and run other Kafka components on separate hosts.

Apache ZooKeeper provides a cluster coordination service for highly reliable distributed coordination. Kafka uses ZooKeeper for storing configuration data and for cluster coordination. Before running Kafka, a ZooKeeper cluster has to be ready.

Note

If you are using a multi-node environment, you install Kafka brokers and ZooKeeper instances on more than one host. Repeat the installation steps for each host. To identify each ZooKeeper instance and broker, you add a unique ID in the configuration. For more information, see Chapter 4, Running a multi-node environment.

Prerequisites

Procedure

Install Kafka with ZooKeeper on your host.

  1. Add a new kafka user and group:

    groupadd kafka
    useradd -g kafka kafka
    passwd kafka
  2. Extract and move the contents of the amq-streams-<version>-bin.zip file into the /opt/kafka directory:

    unzip amq-streams-<version>-bin.zip -d /opt
    mv /opt/kafka*redhat* /opt/kafka
  3. Change the ownership of the /opt/kafka directory to the kafka user:

    chown -R kafka:kafka /opt/kafka
  4. Create directory /var/lib/zookeeper for storing ZooKeeper data and set its ownership to the kafka user:

    mkdir /var/lib/zookeeper
    chown -R kafka:kafka /var/lib/zookeeper
  5. Create directory /var/lib/kafka for storing Kafka data and set its ownership to the kafka user:

    mkdir /var/lib/kafka
    chown -R kafka:kafka /var/lib/kafka

    You can now run a default configuration of Kafka as a single-node cluster.

    You can also use the installation to run other Kafka components, like Kafka Connect, on the same host.

    To run other components, specify the hostname and port to connect to the Kafka broker using the bootstrap.servers property in the component configuration.

    Example bootstrap servers configuration pointing to a single Kafka broker on the same host

    bootstrap.servers=localhost:9092

    However, we recommend installing and running Kafka components on separate hosts.

  6. (Optional) Install Kafka components on separate hosts.

    1. Repeat the steps to extract and install the installation files to the /opt/kafka directory on each host.
    2. Add bootstrap.servers configuration that connects the component to the host (or hosts in a multi-node environment) running the Kafka brokers.

      Example bootstrap servers configuration pointing to Kafka brokers on different hosts

      bootstrap.servers=kafka0.<host_ip_address>:9092,kafka1.<host_ip_address>:9092,kafka2.<host_ip_address>:9092

      You can use this configuration for Kafka Connect, MirrorMaker 2, and the Kafka Bridge.

3.4. Running a single-node Kafka cluster

This procedure shows how to run a basic Streams for Apache Kafka cluster consisting of a single Apache ZooKeeper node and a single Apache Kafka node, both running on the same host. The default configuration files are used for Kafka.

Warning

A single node Streams for Apache Kafka cluster does not provide reliability and high availability and is suitable only for development purposes.

Prerequisites

  • Streams for Apache Kafka is installed on the host

Running the cluster

  1. Generate a unique ID for the Kafka cluster.

    You can use the kafka-storage tool to do this:

    /opt/kafka/bin/kafka-storage.sh random-uuid

    The command returns an ID.

    Note

    A cluster ID is required in KRaft mode.

  2. Edit the Kafka configuration file /opt/kafka/config/server.properties. Set the log.dirs option to /var/lib/kafka/:

    log.dirs=/var/lib/kafka/
  3. Switch to the kafka user:

    su - kafka
  4. Start Kafka:

    /opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
  5. Check that Kafka is running:

    jcmd | grep kafka

    Returns:

    process ID kafka.Kafka /opt/kafka/config/server.properties

3.5. Sending and receiving messages from a topic

This procedure describes how to start the Kafka console producer and consumer clients and use them to send and receive several messages.

A new topic is automatically created in step one. Topic auto-creation is controlled using the auto.create.topics.enable configuration property (set to true by default). Alternatively, you can configure and create topics before using the cluster. For more information, see Topics.

Procedure

  1. Start the Kafka console producer and configure it to send messages to a new topic:

    /opt/kafka/bin/kafka-console-producer.sh --broker-list <bootstrap_address> --topic <topic-name>

    For example:

    /opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-topic
  2. Enter several messages into the console. Press Enter to send each individual message to your new topic:

    >message 1
    >message 2
    >message 3
    >message 4

    When Kafka creates a new topic automatically, you might receive a warning that the topic does not exist:

    WARN Error while fetching metadata with correlation id 39 :
    {4-3-16-topic1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

    The warning should not reappear after you send further messages.

  3. In a new terminal window, start the Kafka console consumer and configure it to read messages from the beginning of your new topic.

    /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server <bootstrap_address> --topic <topic-name> --from-beginning

    For example:

    /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my-topic --from-beginning

    The incoming messages display in the consumer console.

  4. Switch to the producer console and send additional messages. Check that they display in the consumer console.
  5. Stop the Kafka console producer and then the consumer by pressing Ctrl+C.

3.6. Stopping the Streams for Apache Kafka services

You can stop the Kafka and ZooKeeper services by running a script. All connections to the Kafka and ZooKeeper services will be terminated.

Prerequisites

  • Streams for Apache Kafka is installed on the host
  • ZooKeeper and Kafka are up and running

Procedure

  1. Stop the Kafka broker.

    su - kafka
    /opt/kafka/bin/kafka-server-stop.sh
  2. Confirm that the Kafka broker is stopped.

    jcmd | grep kafka
  3. Stop ZooKeeper.

    su - kafka
    /opt/kafka/bin/zookeeper-server-stop.sh
Red Hat logoGithubRedditYoutubeTwitter

Apprendre

Essayez, achetez et vendez

Communautés

À propos de la documentation Red Hat

Nous aidons les utilisateurs de Red Hat à innover et à atteindre leurs objectifs grâce à nos produits et services avec un contenu auquel ils peuvent faire confiance.

Rendre l’open source plus inclusif

Red Hat s'engage à remplacer le langage problématique dans notre code, notre documentation et nos propriétés Web. Pour plus de détails, consultez leBlog Red Hat.

À propos de Red Hat

Nous proposons des solutions renforcées qui facilitent le travail des entreprises sur plusieurs plates-formes et environnements, du centre de données central à la périphérie du réseau.

© 2024 Red Hat, Inc.