Chapter 14. Scaling clusters by adding or removing brokers
Scaling Kafka clusters by adding brokers can increase the performance and reliability of the cluster. Adding more brokers increases available resources, allowing the cluster to handle larger workloads and process more messages. It can also improve fault tolerance by providing more brokers to host replicas. Conversely, removing underutilized brokers can reduce resource consumption and improve efficiency. Scaling must be done carefully to avoid disruption or data loss. Redistributing partitions across all brokers in the cluster reduces the resource utilization of each broker, which can increase the overall throughput of the cluster.
To increase the throughput of a Kafka topic, you can increase the number of partitions for that topic. This allows the load of the topic to be shared between different brokers in the cluster. However, if every broker is constrained by a specific resource (such as I/O), adding more partitions will not increase the throughput. In this case, you need to add more brokers to the cluster.
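As a minimal sketch of the partition increase described above, the following helper wraps the kafka-topics.sh --alter command. The KAFKA_BIN variable, the function name, and the example topic name are assumptions for illustration only; adjust them to match your installation.

```shell
# Sketch: raise the partition count of an existing topic.
# KAFKA_BIN and increase_partitions are illustrative names, not part of Kafka.
KAFKA_BIN="${KAFKA_BIN:-/opt/kafka/bin}"

# Usage: increase_partitions <bootstrap_server> <topic> <new_partition_count>
increase_partitions() {
  local bootstrap="$1"
  local topic="$2"
  local partitions="$3"
  # The partition count of a topic can only be increased, never decreased.
  "$KAFKA_BIN/kafka-topics.sh" \
    --bootstrap-server "$bootstrap" \
    --alter \
    --topic "$topic" \
    --partitions "$partitions"
}

# Example (hypothetical topic name):
#   increase_partitions localhost:9092 my-topic 12
```

Note that adding partitions changes how keyed messages map to partitions, so consumers that rely on per-key ordering may be affected.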
Adding brokers when running a multi-node Kafka cluster affects the number of brokers in the cluster that can host replicas. The replication factor for topics is determined by the default.replication.factor setting and limited by the number of available brokers, while min.insync.replicas sets the minimum number of in-sync replicas that must acknowledge a write. For example, a replication factor of 3 means that each partition of a topic is replicated across three brokers, ensuring fault tolerance in the event of a broker failure.
Example replica configuration
default.replication.factor = 3
min.insync.replicas = 2
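Before scaling down, it can help to check that the replica settings still fit the planned number of brokers. The following is a minimal sketch of such a check; the check_replication function is a hypothetical helper, not a Kafka tool. It assumes producers use acks=all, in which case writes continue as long as at least min.insync.replicas replicas remain in sync.

```shell
# Sketch: sanity-check replica settings against a planned broker count.
# check_replication is a hypothetical helper, not part of Kafka.
# Usage: check_replication <broker_count> <replication_factor> <min_insync_replicas>
check_replication() {
  local brokers="$1"
  local rf="$2"
  local min_isr="$3"
  # A partition cannot have more replicas than there are brokers.
  if [ "$rf" -gt "$brokers" ]; then
    echo "error: replication factor $rf exceeds broker count $brokers" >&2
    return 1
  fi
  # If min.insync.replicas exceeds the replication factor, acks=all writes fail.
  if [ "$min_isr" -gt "$rf" ]; then
    echo "error: min.insync.replicas $min_isr exceeds replication factor $rf" >&2
    return 1
  fi
  # Number of replicas that can be lost while acks=all writes still succeed.
  echo $((rf - min_isr))
}

# Example: with the configuration above on a three-broker cluster,
#   check_replication 3 3 2
# reports how many replica failures the topic can absorb.
```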
When you add or remove brokers, Kafka does not automatically reassign partitions. The best way to reassign partitions is to use Cruise Control. You can use Cruise Control’s add-brokers and remove-brokers modes when scaling a cluster up or down.
- Use the add-brokers mode after scaling up a Kafka cluster to move partition replicas from existing brokers to the newly added brokers.
- Use the remove-brokers mode before scaling down a Kafka cluster to move partition replicas off the brokers that are going to be removed.
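If Cruise Control is reachable through its REST API, the two modes above roughly correspond to the add_broker and remove_broker endpoints. The following sketch assumes that setup; the CC_URL value, the function names, and the broker IDs are placeholders for illustration. Requests default to a dry run so that the proposal can be reviewed before any replicas are moved.

```shell
# Sketch: call the Cruise Control REST API when scaling brokers.
# CC_URL, cc_scale_url, and cc_scale are illustrative names and values.
CC_URL="${CC_URL:-http://localhost:9090/kafkacruisecontrol}"

# Build the request URL.
# Usage: cc_scale_url <add_broker|remove_broker> <comma_separated_ids> [dryrun]
cc_scale_url() {
  echo "$CC_URL/$1?brokerid=$2&dryrun=${3:-true}"
}

# Send the request; pass "false" as the third argument to execute the proposal.
cc_scale() {
  curl -s -X POST "$(cc_scale_url "$@")"
}

# Examples (hypothetical broker IDs):
#   cc_scale add_broker 3,4            # after scaling up, dry run
#   cc_scale add_broker 3,4 false      # after scaling up, execute
#   cc_scale remove_broker 3,4 false   # before scaling down, execute
```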
14.1. Unregistering nodes after scale-down operations
After removing a node from a Kafka cluster, use the kafka-cluster.sh script to unregister the node from the cluster metadata. Failing to unregister removed nodes leads to stale metadata, which causes operational issues.
Prerequisites
Before unregistering a node, ensure the following tasks are completed:
- Reassign the partitions from the node you plan to remove to the remaining brokers using the Cruise Control remove-brokers operation.
- Update the cluster configuration, if necessary, to adjust the replication factor for topics (default.replication.factor) and the minimum required number of in-sync replica acknowledgements (min.insync.replicas).
- Stop the Kafka broker service on the node and remove the node from the cluster.
Procedure
Unregister the removed node from the cluster:
/opt/kafka/bin/kafka-cluster.sh unregister \
  --bootstrap-server <broker_host>:<port> \
  --id <node_id_number>
Verify the current state of the cluster by describing the topics:
/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server <broker_host>:<port> \
  --describe
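On a cluster with many topics, the describe output can be long. As a rough sketch, the following filter prints only the partition lines whose replica list still includes a given node ID. The partitions_on_node function is a hypothetical helper, and it assumes the usual describe layout in which each partition line contains a "Replicas:" label followed by a comma-separated list of IDs.

```shell
# Sketch: list partitions that still have a replica on a given node.
# partitions_on_node is a hypothetical helper; it reads describe output on stdin.
# Usage: kafka-topics.sh ... --describe | partitions_on_node <node_id_number>
partitions_on_node() {
  awk -v id="$1" '
    {
      # Find the "Replicas:" label and inspect the ID list that follows it.
      for (i = 1; i < NF; i++) {
        if ($i == "Replicas:") {
          n = split($(i + 1), r, ",")
          for (j = 1; j <= n; j++) {
            if (r[j] == id) { print; break }
          }
        }
      }
    }'
}

# Example:
#   /opt/kafka/bin/kafka-topics.sh \
#     --bootstrap-server <broker_host>:<port> \
#     --describe | partitions_on_node <node_id_number>
```

Empty output suggests that no partition replicas remain assigned to the removed node.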