Chapter 7. Scaling Clusters

7.1. Scaling Kafka clusters

7.1.1. Adding brokers to a cluster

The primary way of increasing throughput for a topic is to increase the number of partitions for that topic. That works because the partitions allow the load for that topic to be shared between the brokers in the cluster. However, when all of the brokers are constrained by some resource (typically I/O), adding partitions does not increase throughput. Instead, you must add brokers to the cluster.

When you add an extra broker to the cluster, AMQ Streams does not assign any partitions to it automatically. You have to decide which partitions to move from the existing brokers to the new broker.

Once the partitions have been redistributed between all brokers, each broker should have a lower resource utilization.

7.1.2. Removing brokers from the cluster

Before you remove a broker from a cluster, you must ensure that it is not assigned to any partitions. You should decide which remaining brokers will be responsible for each of the partitions on the broker being decommissioned. Once the broker has no assigned partitions, you can stop it.

7.2. Reassignment of partitions

The kafka-reassign-partitions.sh utility is used to reassign partitions to different brokers.

It has three different modes:

--generate: Takes a set of topics and brokers and generates a reassignment JSON file which will result in the partitions of those topics being assigned to those brokers. It is an easy way to generate a reassignment JSON file, but it operates on whole topics, so its use is not always appropriate.
--execute: Takes a reassignment JSON file and applies it to the partitions and brokers in the cluster. Brokers that gain partitions become followers of the partition leader. For a given partition, once the new broker has caught up and joined the in-sync replicas (ISR), the old broker stops being a follower and deletes its replica.
--verify: Using the same reassignment JSON file as the --execute step, --verify checks whether all of the partitions in the file have been moved to their intended brokers. If the reassignment is complete it will also remove any throttles which are in effect. Unless removed, throttles will continue to affect the cluster even after the reassignment has finished.

Only one reassignment can run in the cluster at any given time, and a running reassignment cannot be canceled. If you need to cancel a reassignment, you have to wait for it to complete and then perform another reassignment to revert the effects of the first one. The kafka-reassign-partitions.sh tool prints the reassignment JSON for this reversion as part of its output. Break very large reassignments down into a number of smaller reassignments in case you need to stop an in-progress reassignment.
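
One way to break a large reassignment into smaller ones is to split the "partitions" list into fixed-size batches and run each batch as its own reassignment. The following is a minimal sketch of that idea; the function name split_reassignment, the batch size, and the sample data are illustrative assumptions, not part of the Kafka tooling.

```python
import json


def split_reassignment(reassignment, batch_size):
    """Split one reassignment document into several smaller ones."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    partitions = reassignment["partitions"]
    return [
        {
            "version": reassignment["version"],
            "partitions": partitions[i:i + batch_size],
        }
        for i in range(0, len(partitions), batch_size)
    ]


# Hypothetical input: five partitions of topic-a, split two per batch.
full = {
    "version": 1,
    "partitions": [
        {"topic": "topic-a", "partition": p, "replicas": [2, 4, 7]}
        for p in range(5)
    ],
}
batches = split_reassignment(full, 2)
for number, batch in enumerate(batches):
    print(f"batch {number}: {json.dumps(batch)}")
```

Each batch is a complete reassignment JSON document, so it can be saved to its own file and executed and verified before the next batch is started.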

7.2.1. Reassignment JSON file

The reassignment JSON file has a specific structure:

{
  "version": 1,
  "partitions": [
    <PartitionObjects>
  ]
}

Where <PartitionObjects> is a comma-separated list of objects like:

{
  "topic": <TopicName>,
  "partition": <Partition>,
  "replicas": [ <AssignedBrokerIds> ],
  "log_dirs": [<LogDirs>]
}

The "log_dirs" property is optional and is used to move the partition to a specific log directory.

The following is an example reassignment JSON file that assigns topic topic-a, partition 4 to brokers 2, 4 and 7, and topic topic-b partition 2 to brokers 1, 5 and 7:

{
  "version": 1,
  "partitions": [
    {
      "topic": "topic-a",
      "partition": 4,
      "replicas": [2,4,7]
    },
    {
      "topic": "topic-b",
      "partition": 2,
      "replicas": [1,5,7]
    }
  ]
}

Partitions not included in the JSON are not changed.
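
Because a malformed file is easier to catch before it reaches the cluster, it can help to check the structure described above programmatically. The following is a rough sketch, assuming only the fields shown in this section; validate_reassignment is a hypothetical helper, not something shipped with Kafka, and the checks are deliberately basic.

```python
def validate_reassignment(doc):
    """Return a list of problems found in a reassignment document."""
    problems = []
    if doc.get("version") != 1:
        problems.append("version must be 1")
    partitions = doc.get("partitions")
    if not isinstance(partitions, list):
        problems.append("partitions must be a list")
        return problems
    for i, entry in enumerate(partitions):
        if not isinstance(entry.get("topic"), str):
            problems.append(f"entry {i}: topic must be a string")
        if not isinstance(entry.get("partition"), int):
            problems.append(f"entry {i}: partition must be an integer")
        replicas = entry.get("replicas")
        if not isinstance(replicas, list) or not replicas:
            problems.append(f"entry {i}: replicas must be a non-empty list")
        elif len(set(replicas)) != len(replicas):
            problems.append(f"entry {i}: replicas contains a duplicate broker ID")
    return problems


# The example file from this section.
example = {
    "version": 1,
    "partitions": [
        {"topic": "topic-a", "partition": 4, "replicas": [2, 4, 7]},
        {"topic": "topic-b", "partition": 2, "replicas": [1, 5, 7]},
    ],
}
print(validate_reassignment(example))
```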

7.2.2. Generating reassignment JSON files

The easiest way to assign all the partitions for a given set of topics to a given set of brokers is to generate a reassignment JSON file using the kafka-reassign-partitions.sh --generate command.

bin/kafka-reassign-partitions.sh --zookeeper <ZooKeeper> --topics-to-move-json-file <TopicsFile> --broker-list <BrokerList> --generate

The <TopicsFile> is a JSON file which lists the topics to move. It has the following structure:

{
  "version": 1,
  "topics": [
    <TopicObjects>
  ]
}

Where <TopicObjects> is a comma-separated list of objects like:

{
  "topic": <TopicName>
}

For example, to move all the partitions of topic-a and topic-b to brokers 4 and 7:

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file topics-to-be-moved.json --broker-list 4,7 --generate

Where topics-to-be-moved.json has the following contents:

{
  "version": 1,
  "topics": [
    { "topic": "topic-a"},
    { "topic": "topic-b"}
  ]
}
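
When the list of topics is long, the topics file can be produced from a plain list of names rather than written by hand. The following is a small sketch under that assumption; the helper name topics_to_move is made up for illustration.

```python
import json


def topics_to_move(topic_names):
    """Build the topics-to-move document for the --generate mode."""
    return {"version": 1, "topics": [{"topic": name} for name in topic_names]}


doc = topics_to_move(["topic-a", "topic-b"])
text = json.dumps(doc, indent=2)
print(text)
```

The printed text can be saved as topics-to-be-moved.json and passed to the --topics-to-move-json-file option.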

7.2.3. Creating reassignment JSON files manually

You can manually create the reassignment JSON file if you want to move specific partitions.
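
A common case for a hand-built file is moving replicas off a single broker, for example before decommissioning it. The following sketch shows one possible approach, assuming you already have the current assignment in the same JSON structure; replace_broker and the broker IDs are hypothetical and chosen only for illustration.

```python
import json


def replace_broker(current, old_broker, new_broker):
    """Build a reassignment that swaps old_broker for new_broker.

    Only partitions with a replica on old_broker are included, so all
    other partitions are left unchanged.
    """
    moved = []
    for entry in current["partitions"]:
        replicas = entry["replicas"]
        if old_broker not in replicas:
            continue
        if new_broker in replicas:
            # Swapping would list the same broker twice for this partition.
            raise ValueError(
                f"{entry['topic']}-{entry['partition']} already has a replica "
                f"on broker {new_broker}"
            )
        moved.append({
            "topic": entry["topic"],
            "partition": entry["partition"],
            "replicas": [new_broker if b == old_broker else b for b in replicas],
        })
    return {"version": 1, "partitions": moved}


# Hypothetical current assignment; broker 4 is replaced by broker 8.
current = {
    "version": 1,
    "partitions": [
        {"topic": "topic-a", "partition": 4, "replicas": [2, 4, 7]},
        {"topic": "topic-b", "partition": 2, "replicas": [1, 5, 7]},
    ],
}
plan = replace_broker(current, old_broker=4, new_broker=8)
print(json.dumps(plan, indent=2))
```

Because topic-b partition 2 has no replica on broker 4, it is omitted from the generated file and its assignment stays as it is.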

7.3. Reassignment throttles

Reassigning partitions can be a slow process because it can require moving lots of data between brokers. To avoid this having a detrimental impact on clients it is possible to throttle the reassignment. Using a throttle can mean the reassignment takes longer. If the throttle is too low then the newly assigned brokers will not be able to keep up with records being published and the reassignment will never complete. If the throttle is too high then clients will be impacted. For example, for producers, this could manifest as higher than normal latency waiting for acknowledgement. For consumers, this could manifest as a drop in throughput caused by higher latency between polls.
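
To get a feel for the trade-off, you can estimate how long a throttled reassignment might take. The following back-of-the-envelope sketch rests on a simplifying assumption: replication runs at the throttle rate, and new records arriving for the moving partitions use up part of that rate. The function name and all figures are illustrative, and real durations depend on many other factors.

```python
def estimate_reassignment_seconds(bytes_to_move, throttle_bytes_per_sec,
                                  inbound_bytes_per_sec=0):
    """Estimate the duration of a throttled reassignment in seconds.

    Returns None when the throttle leaves no headroom, meaning the new
    replicas would never catch up under this simplified model.
    """
    catch_up_rate = throttle_bytes_per_sec - inbound_bytes_per_sec
    if catch_up_rate <= 0:
        return None
    return bytes_to_move / catch_up_rate


# Hypothetical figures: 50 GB to move, a 5000000 bytes/s throttle, and
# 1000000 bytes/s of new records for the partitions being moved.
seconds = estimate_reassignment_seconds(50_000_000_000, 5_000_000, 1_000_000)
print(f"about {seconds / 3600:.1f} hours")

# A throttle equal to the inbound rate leaves no headroom at all.
too_low = estimate_reassignment_seconds(50_000_000_000, 1_000_000, 1_000_000)
print("never completes" if too_low is None else f"{too_low} seconds")
```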

7.4. Scaling up a Kafka cluster

This procedure describes how to increase the number of brokers in a Kafka cluster.

Prerequisites

An existing Kafka cluster.
A new machine with the AMQ broker installed.
A reassignment JSON file of how partitions should be reassigned to brokers in the enlarged cluster.

Procedure

Create a configuration file for the new broker using the same settings as for the other brokers in your cluster, except for broker.id which should be a number that is not already used by any of the other brokers.
Start the new Kafka broker passing the configuration file you created in the previous step as the argument to the kafka-server-start.sh script:
```
su - kafka
/opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
```
Verify that the Kafka broker is running.
```
jcmd | grep Kafka
```
Repeat the above steps for each new broker.
Execute the partition reassignment using the kafka-reassign-partitions.sh command line tool.
```
kafka-reassign-partitions.sh --zookeeper <ZooKeeperHostAndPort> --reassignment-json-file <ReassignmentJsonFile> --execute
```
If you are going to throttle replication you can also pass the --throttle option with an inter-broker throttled rate in bytes per second. For example:
```
kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 --reassignment-json-file reassignment.json --throttle 5000000 --execute
```
This command will print out two reassignment JSON objects. The first records the current assignment for the partitions being moved. You should save this to a file in case you need to revert the reassignment later on. The second JSON object is the target reassignment you have passed in your reassignment JSON file.

If you need to change the throttle during reassignment you can use the same command line with a different throttled rate. For example:

kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 --reassignment-json-file reassignment.json --throttle 10000000 --execute

Periodically verify whether the reassignment has completed using the kafka-reassign-partitions.sh command line tool. This is the same command as the previous step but with the --verify option instead of the --execute option.

kafka-reassign-partitions.sh --zookeeper <ZooKeeperHostAndPort> --reassignment-json-file <ReassignmentJsonFile> --verify

For example:

kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 --reassignment-json-file reassignment.json --verify

The reassignment has finished when the --verify command reports each of the partitions being moved as completed successfully. This final --verify also removes any reassignment throttles. If you saved the JSON for reverting the partitions to their original brokers, you can now delete that file.
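
Saving the first JSON object printed by the --execute step can be scripted if you capture the command's output as text. The following is a minimal sketch that pulls every JSON object out of a block of text using the standard json module; the sample text is a made-up stand-in, and the exact wording printed by the tool may differ between versions.

```python
import json


def extract_json_objects(text):
    """Return every top-level JSON object embedded in text, in order."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, keep scanning
            continue
        objects.append(obj)
        pos = end
    return objects


# Illustrative stand-in for captured --execute output (an assumption).
sample = (
    "Current assignment:\n"
    '{"version":1,"partitions":[{"topic":"topic-a","partition":4,"replicas":[1,2,3]}]}\n'
    "Target assignment:\n"
    '{"version":1,"partitions":[{"topic":"topic-a","partition":4,"replicas":[2,4,7]}]}\n'
)
current, target = extract_json_objects(sample)
print(json.dumps(current))  # the object to keep for a possible revert
```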

7.5. Scaling down a Kafka cluster

This procedure describes how to decrease the number of brokers in a Kafka cluster.

Prerequisites

An existing Kafka cluster.
A reassignment JSON file of how partitions should be reassigned to brokers in the cluster once the broker(s) have been removed.

Procedure

Execute the partition reassignment using the kafka-reassign-partitions.sh command line tool.
```
kafka-reassign-partitions.sh --zookeeper <ZooKeeperHostAndPort> --reassignment-json-file <ReassignmentJsonFile> --execute
```
If you are going to throttle replication you can also pass the --throttle option with an inter-broker throttled rate in bytes per second. For example:
```
kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 --reassignment-json-file reassignment.json --throttle 5000000 --execute
```
This command will print out two reassignment JSON objects. The first records the current assignment for the partitions being moved. You should save this to a file in case you need to revert the reassignment later on. The second JSON object is the target reassignment you have passed in your reassignment JSON file.

If you need to change the throttle during reassignment you can use the same command line with a different throttled rate. For example:

kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 --reassignment-json-file reassignment.json --throttle 10000000 --execute

Periodically verify whether the reassignment has completed using the kafka-reassign-partitions.sh command line tool. This is the same command as the previous step but with the --verify option instead of the --execute option.

kafka-reassign-partitions.sh --zookeeper <ZooKeeperHostAndPort> --reassignment-json-file <ReassignmentJsonFile> --verify

For example:

kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 --reassignment-json-file reassignment.json --verify

The reassignment has finished when the --verify command reports each of the partitions being moved as completed successfully. This final --verify also removes any reassignment throttles. If you saved the JSON for reverting the partitions to their original brokers, you can now delete that file.
Once all the partition reassignments have finished, the broker being removed should not have responsibility for any of the partitions in the cluster. You can verify this by checking each of the directories given in the broker’s log.dirs configuration parameter. If any of the log directories on the broker contains a directory that does not match the extended regular expression [a-zA-Z0-9.-]+\.[a-z0-9]+-delete$, then the broker still has live partitions and it should not be stopped.
You can check this by executing the command:
```
ls -l <LogDir> | grep -E '^d' | grep -vE '[a-zA-Z0-9.-]+\.[a-z0-9]+-delete$'
```
If the above command prints any output then the broker still has live partitions. In this case, either the reassignment has not finished, or the reassignment JSON file was incorrect.
Once you have confirmed that the broker has no live partitions you can stop it.
```
su - kafka
/opt/kafka/bin/kafka-server-stop.sh
```
Confirm that the Kafka broker is stopped.
```
jcmd | grep kafka
```
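
The log directory check in this procedure can also be scripted, which is convenient when a broker has several log directories. The following sketch mirrors the regular expression used by the grep command in this procedure; the function name and the directory names created in the example are hypothetical.

```python
import os
import re
import tempfile

# Same pattern as the grep -vE filter used in this procedure.
DELETE_SUFFIX = re.compile(r"[a-zA-Z0-9.-]+\.[a-z0-9]+-delete$")


def live_partition_dirs(log_dir):
    """Return subdirectories of log_dir that are not marked for deletion."""
    live = []
    for entry in sorted(os.scandir(log_dir), key=lambda e: e.name):
        if entry.is_dir() and not DELETE_SUFFIX.search(entry.name):
            live.append(entry.name)
    return live


# Demonstration against a temporary directory with made-up names.
with tempfile.TemporaryDirectory() as log_dir:
    os.mkdir(os.path.join(log_dir, "topic-a-4"))
    os.mkdir(os.path.join(log_dir, "topic-b-2.0123456789abcdef-delete"))
    result = live_partition_dirs(log_dir)
    print(result)
```

An empty list corresponds to the grep command printing no output, which is the condition for stopping the broker.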

7.6. Scaling up a ZooKeeper cluster

This procedure describes how to add servers (nodes) to a ZooKeeper cluster. The dynamic reconfiguration feature of ZooKeeper maintains a stable ZooKeeper cluster during the scale up process.

Prerequisites

Dynamic reconfiguration is enabled in the ZooKeeper configuration file (reconfigEnabled=true).
ZooKeeper authentication is enabled and you can access the new server using the authentication mechanism.

Procedure

Perform the following steps for each ZooKeeper server, one at a time:

Add a server to the ZooKeeper cluster as described in Section 3.3, “Running multi-node ZooKeeper cluster” and then start ZooKeeper.
Note the IP address and configured access ports of the new server.
Start a zookeeper-shell session for the server. Run the following command from a machine that has access to the cluster (this might be one of the ZooKeeper nodes or your local machine, if it has access).
```
su - kafka
/opt/kafka/bin/zookeeper-shell.sh <ip-address>:<zk-port>
```
In the shell session, with the ZooKeeper node running, enter the following line to add the new server to the quorum as a voting member:
```
reconfig -add server.<positive-id> = <address1>:<port1>:<port2>[:role];[<client-port-address>:]<client-port>
```
For example:
```
reconfig -add server.4=172.17.0.4:2888:3888:participant;172.17.0.4:2181
```
Where <positive-id> is the new server ID 4.
For the two ports, <port1> 2888 is for communication between ZooKeeper servers, and <port2> 3888 is for leader election.
The new configuration propagates to the other servers in the ZooKeeper cluster; the new server is now a full member of the quorum.
Repeat steps 1-4 for the other servers that you want to add.
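
When adding several servers, it can be convenient to generate the reconfig -add line for each one from its ID, address, and ports. The following is a small sketch that follows the syntax shown in the procedure above; the helper name is hypothetical, and it assumes the client port is reached on the same address as the quorum ports.

```python
def reconfig_add_command(server_id, address, quorum_port, election_port,
                         client_port, role="participant"):
    """Format a reconfig -add line for the zookeeper-shell session."""
    if server_id < 1:
        raise ValueError("server_id must be a positive integer")
    return (
        f"reconfig -add server.{server_id}="
        f"{address}:{quorum_port}:{election_port}:{role};"
        f"{address}:{client_port}"
    )


# Reproduces the example from the procedure for server ID 4.
command = reconfig_add_command(4, "172.17.0.4", 2888, 3888, 2181)
print(command)
```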

Additional resources

Section 7.7, “Scaling down a ZooKeeper cluster”

7.7. Scaling down a ZooKeeper cluster

This procedure describes how to remove servers (nodes) from a ZooKeeper cluster. The dynamic reconfiguration feature of ZooKeeper maintains a stable ZooKeeper cluster during the scale down process.

Prerequisites

Dynamic reconfiguration is enabled in the ZooKeeper configuration file (reconfigEnabled=true).
ZooKeeper authentication is enabled and you can access the servers using the authentication mechanism.

Procedure

Perform the following steps for each ZooKeeper server, one at a time:

Log in to the zookeeper-shell on one of the servers that will be retained after the scale down (for example, server 1).
Note
Access the server using the authentication mechanism configured for the ZooKeeper cluster.
Remove a server, for example server 5.
```
reconfig -remove 5
```
Deactivate the server that you removed.
Repeat steps 1-3 to reduce the cluster size.

Additional resources

Section 7.6, “Scaling up a ZooKeeper cluster”
Removing servers in the ZooKeeper documentation
