Pesquisar

Este conteúdo não está disponível no idioma selecionado.

Chapter 2. About Kafka

download PDF

Apache Kafka is an open-source distributed publish-subscribe messaging system for fault-tolerant real-time data feeds.

Additional resources

2.1. Kafka concepts

Knowledge of the key concepts of Kafka is important in understanding how AMQ Streams works.

A Kafka cluster comprises multiple brokers. Topics are used to receive and store data in a Kafka cluster. Topics are split by partitions, where the data is written. Partitions are replicated across topics for fault tolerance.

Kafka brokers and topics

Kafka brokers and topics inside a Kafka cluster showing the partition leader of each topic

Broker
A broker, sometimes referred to as a server or node, orchestrates the storage and passing of messages.
Topic
A topic provides a destination for the storage of data. Each topic is split into one or more partitions.
Cluster
A group of broker instances.
Partition
The number of topic partitions is defined by a topic partition count.
Partition leader
A partition leader handles all producer requests for a topic.
Partition follower

A partition follower replicates the partition data of a partition leader, optionally handling consumer requests.

Topics use a replication factor to configure the number of replicas of each partition within the cluster. A topic comprises at least one partition.

An in-sync replica has the same number of messages as the leader. Configuration defines how many replicas must be in-sync to be able to produce messages, ensuring that a message is committed only after it has been successfully copied to the replica partition. In this way, if the leader fails the message is not lost.

In the Kafka brokers and topics diagram, we can see each numbered partition has a leader and two followers in replicated topics.

2.2. Producers and consumers

Producers and consumers send and receive messages (publish and subscribe) through brokers. Messages comprise an optional key and a value that contains the message data, plus headers and related metadata. The key is used to identify the subject of the message, or a property of the message. Messages are delivered in batches, and batches and records contain headers and metadata that provide details that are useful for filtering and routing by clients, such as the timestamp and offset position for the record.

Producers and consumers

A producer sends messages through a broker to a topic containing three partitions. Three consumers in a consumer group read the messages from the partitions

Producer
A producer sends messages to a broker topic to be written to the end offset of a partition. Messages are written to partitions by a producer on a round robin basis, or to a specific partition based on the message key.
Consumer
A consumer subscribes to a topic and reads messages according to topic, partition and offset.
Consumer group
Consumer groups are used to share a typically large data stream generated by multiple producers from a given topic. Consumers are grouped using a group.id, allowing messages to be spread across the members. Consumers within a group do not read data from the same partition, but can receive data from one or more partitions.
Offsets

Offsets describe the position of messages within a partition. Each message in a given partition has a unique offset, which helps identify the position of a consumer within the partition to track the number of records that have been consumed.

Committed offsets are written to an offset commit log. A __consumer_offsets topic stores information on committed offsets, the position of last and next offset, according to consumer group.

Producing and consuming data

A producer sends a message to a broker topic; the message is written to the end offset (7). A consumer reads messages from offset 5

Red Hat logoGithubRedditYoutubeTwitter

Aprender

Experimente, compre e venda

Comunidades

Sobre a documentação da Red Hat

Ajudamos os usuários da Red Hat a inovar e atingir seus objetivos com nossos produtos e serviços com conteúdo em que podem confiar.

Tornando o open source mais inclusivo

A Red Hat está comprometida em substituir a linguagem problemática em nosso código, documentação e propriedades da web. Para mais detalhes veja oBlog da Red Hat.

Sobre a Red Hat

Fornecemos soluções robustas que facilitam o trabalho das empresas em plataformas e ambientes, desde o data center principal até a borda da rede.

© 2024 Red Hat, Inc.