Chapter 2. Streams for Apache Kafka deployment of Kafka


Streams for Apache Kafka enables the deployment of Apache Kafka components to an OpenShift cluster, typically running as clusters for high availability.

A standard Kafka deployment using Streams for Apache Kafka might include the following components:

  • Kafka cluster of broker nodes as the core component
  • Kafka Connect cluster for external data connections
  • Kafka MirrorMaker cluster to mirror data to another Kafka cluster
  • Kafka Exporter to extract additional Kafka metrics data for monitoring
  • Kafka Bridge to enable HTTP-based communication with Kafka
  • Cruise Control to rebalance topic partitions across brokers

Not all of these components are required, though you need Kafka as a minimum for a Streams for Apache Kafka-managed Kafka cluster. Depending on your use case, you can deploy the additional components as needed. These components can also be used with Kafka clusters that are not managed by Streams for Apache Kafka.
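
A Kafka cluster is declared as a Kafka custom resource that the Cluster Operator reconciles into pods, services, and related resources. The following is a minimal sketch, assuming the Streams for Apache Kafka operator is installed and a version that supports KRaft mode and node pools; the cluster name, Kafka version, and listener settings are illustrative:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  metadata:
    name: my-cluster                     # illustrative cluster name
    annotations:
      strimzi.io/node-pools: enabled     # node roles are defined in KafkaNodePool resources (see Section 2.1)
      strimzi.io/kraft: enabled          # KRaft mode, no ZooKeeper
  spec:
    kafka:
      version: 3.7.0                     # use a version supported by your operator
      listeners:
        - name: plain
          port: 9092
          type: internal
          tls: false
      config:
        offsets.topic.replication.factor: 3
        transaction.state.log.replication.factor: 3
    entityOperator:                      # optional operators for managing topics and users
      topicOperator: {}
      userOperator: {}
    cruiseControl: {}                    # optional: deploys Cruise Control for rebalancing

Applying the resource, for example with oc apply -f kafka.yaml, prompts the Cluster Operator to create the cluster.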

2.1. Kafka component architecture

A KRaft-based Kafka cluster consists of broker nodes, which are responsible for message delivery, and controller nodes, which manage cluster metadata and coordinate the cluster. These roles can be configured using node pools in Streams for Apache Kafka.
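
Node pools are declared as KafkaNodePool resources that reference the Kafka cluster by label. The following is a minimal sketch with separate controller and broker pools; the replica counts and storage sizes are illustrative:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaNodePool
  metadata:
    name: controller
    labels:
      strimzi.io/cluster: my-cluster     # links the pool to the Kafka resource
  spec:
    replicas: 3
    roles:
      - controller                       # manages metadata and coordinates the cluster
    storage:
      type: persistent-claim
      size: 20Gi
      deleteClaim: false
  ---
  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaNodePool
  metadata:
    name: broker
    labels:
      strimzi.io/cluster: my-cluster
  spec:
    replicas: 3
    roles:
      - broker                           # handles message delivery
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false

A single pool can also combine both roles for smaller clusters.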

Other Kafka components interact with the Kafka cluster for specific tasks.

Figure: Kafka component interaction. Data flows between several Kafka components and the Kafka cluster.

Kafka Connect
Kafka Connect is an integration toolkit for streaming data between Kafka brokers and other systems using connector plugins. It provides a framework for integrating Kafka with external data sources and targets, such as databases, for the import or export of data. Connectors are plugins that provide the connection configuration needed.

  • A source connector pushes external data into Kafka.
  • A sink connector extracts data out of Kafka.

External data is translated and transformed into the appropriate format.

Kafka Connect can be configured to build new container images that include the required connector plugins, as shown in the sketch below.
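
The following sketch declares a Connect cluster that builds its own image and a declaratively managed connector; the registry, image name, artifact URL, and connector class are hypothetical placeholders:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaConnect
  metadata:
    name: my-connect
    annotations:
      strimzi.io/use-connector-resources: "true"   # manage connectors as KafkaConnector resources
  spec:
    replicas: 1
    bootstrapServers: my-cluster-kafka-bootstrap:9092
    build:
      output:
        type: docker
        image: image-registry.example.com/my-org/my-connect:latest   # hypothetical registry and image
      plugins:
        - name: my-source-plugin                                     # hypothetical connector plugin
          artifacts:
            - type: tgz
              url: https://example.com/my-source-connector.tgz       # hypothetical artifact URL
  ---
  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaConnector
  metadata:
    name: my-source-connector
    labels:
      strimzi.io/cluster: my-connect       # targets the Connect cluster above
  spec:
    class: com.example.MySourceConnector   # hypothetical class provided by the plugin
    tasksMax: 1
    config:
      topic: my-topic                      # Kafka topic the source connector writes to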

Kafka MirrorMaker
Kafka MirrorMaker replicates data between two Kafka clusters, either in the same data center or across different locations.
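
With Streams for Apache Kafka, mirroring is configured through a KafkaMirrorMaker2 resource, which runs on the Kafka Connect framework. The following is a minimal sketch that mirrors all topics and consumer groups; the aliases and bootstrap addresses are illustrative:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaMirrorMaker2
  metadata:
    name: my-mirror-maker2
  spec:
    replicas: 1
    connectCluster: "target"               # alias of the cluster that hosts the Connect workers
    clusters:
      - alias: "source"
        bootstrapServers: source-kafka-bootstrap:9092
      - alias: "target"
        bootstrapServers: target-kafka-bootstrap:9092
    mirrors:
      - sourceCluster: "source"
        targetCluster: "target"
        sourceConnector:
          config:
            replication.factor: -1         # inherit the target cluster's default
        topicsPattern: ".*"                # mirror all topics
        groupsPattern: ".*"                # mirror all consumer groups
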
Kafka Bridge
Kafka Bridge provides an API for integrating HTTP-based clients with a Kafka cluster.
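
The bridge is deployed through a KafkaBridge resource. A minimal sketch, with an illustrative bootstrap address and port:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaBridge
  metadata:
    name: my-bridge
  spec:
    replicas: 1
    bootstrapServers: my-cluster-kafka-bootstrap:9092   # service address of the Kafka cluster
    http:
      port: 8080                                        # port for the HTTP API

HTTP clients can then produce and consume messages through the REST endpoints exposed by the bridge service.
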
Kafka Exporter
Kafka Exporter extracts data for analysis as Prometheus metrics, primarily data relating to offsets, consumer groups, consumer lag, and topics. Consumer lag is the delay between the last message written to a partition and the message currently being picked up from that partition by a consumer.
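
Kafka Exporter is deployed by adding a kafkaExporter section to the Kafka resource. A minimal sketch; the regular expressions are illustrative and control which topics and consumer groups are exported:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  metadata:
    name: my-cluster
  spec:
    # ... broker and listener configuration as shown earlier ...
    kafkaExporter:
      topicRegex: ".*"    # export metrics for all topics
      groupRegex: ".*"    # export metrics for all consumer groups
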
Apache ZooKeeper (optional)
Apache ZooKeeper provides a cluster coordination service, storing and tracking the status of brokers and consumers. ZooKeeper is also used for controller election. If ZooKeeper is used, the ZooKeeper cluster must be ready before Kafka runs. However, since the introduction of KRaft, ZooKeeper is no longer required, as Kafka nodes handle cluster coordination and controller election natively.