
Now create a new topic with a replication factor of three:

> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Now that we have a cluster, how can we know which broker is doing what? To see how the replicas are distributed across the brokers in the cluster, run the "describe topics" command:

> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0


"leader" is the node responsible for all reads and writes for the given partition.
Each node will be the leader for a randomly selected portion of the partitions.


"replicas" is the list of nodes that replicate the log for this partition
regardless of whether they are the leader or even if they are currently alive.


"isr" is the set of "in-sync" replicas.
This is the subset of the replicas list that is currently alive and caught up to the leader.
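
As a rough illustration of how these three fields can be read programmatically, the sketch below parses the partition line from the sample output above. The function name `parse_partition_line` is hypothetical, and the regular expression assumes the "Key: value" layout shown in this output; other Kafka versions may format the line differently.

```python
import re

def parse_partition_line(line):
    """Parse one partition line of the describe output into a dict.

    Assumes whitespace-separated "Key: value" pairs, as in the sample above.
    """
    pairs = re.findall(r"(\w+):\s*(\S+)", line)
    fields = {key.lower(): value for key, value in pairs}
    return {
        "topic": fields["topic"],
        "partition": int(fields["partition"]),
        "leader": int(fields["leader"]),
        # Broker ids are comma-separated; order matters for "replicas".
        "replicas": [int(b) for b in fields["replicas"].split(",")],
        "isr": [int(b) for b in fields["isr"].split(",")],
    }

line = "Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0"
info = parse_partition_line(line)
print(info["leader"], info["replicas"], info["isr"])
```

Here every replica is also in the isr, so all three brokers are alive and caught up.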



With replication, each partition can have multiple replicas. The list of replicas for a partition is called the "assigned replicas".
The first replica in this list is the "preferred replica". When topics and partitions are created,
Kafka ensures that the preferred replicas of the partitions across all topics are evenly distributed among the brokers in the cluster.
In an ideal scenario, the leader for a given partition is its preferred replica.
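
To make the idea concrete, here is a small sketch that compares the ideal leadership load (every partition led by its preferred replica) with the actual one. The assignment and leader data are made up for illustration, as if broker 0 had been restarted and lost leadership of partition 2.

```python
from collections import Counter

# Hypothetical data: partition -> assigned replicas (first one is preferred).
assignments = {0: [1, 2, 0], 1: [2, 0, 1], 2: [0, 1, 2]}
# Hypothetical current leaders: partition 2 is led by broker 1, not broker 0.
leaders = {0: 1, 1: 2, 2: 1}

# The preferred replica is the first entry of the assigned replicas list.
preferred = {p: replicas[0] for p, replicas in assignments.items()}

ideal_load = Counter(preferred.values())   # leaders per broker if all preferred
actual_load = Counter(leaders.values())    # leaders per broker right now
off_preferred = [p for p in assignments if leaders[p] != preferred[p]]

print("ideal:", dict(ideal_load))
print("actual:", dict(actual_load))
print("partitions not on preferred replica:", off_preferred)
```

In this made-up state broker 1 leads two partitions and broker 0 leads none, even though the preferred replicas are spread one per broker.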


When every leader is the preferred replica, the leadership load across the brokers in a cluster is evenly balanced.
However, over time the leadership load can become imbalanced due to broker shutdowns (controlled shutdowns, crashes, machine failures, etc.).
The preferred replica election tool helps to restore the leadership balance between the brokers in the cluster.
A summary of the steps involved is shown below:


1. The tool updates the zookeeper path "/admin/preferred_replica_election" with the list of topic partitions
whose leader needs to be moved to the preferred replica.


2. The controller listens to the path above.
When a data change update is triggered,
the controller reads the list of topic partitions from zookeeper.


3. For each topic partition,
the controller gets the preferred replica
(the first replica in the assigned replicas list).
If the preferred replica is not already the leader and it is present in the isr,
the controller issues a request to the broker
that owns the preferred replica to become the leader for the partition.
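
The controller's decision for each partition can be sketched as follows. This is only a model of the rule described above, not Kafka's actual controller code; the function name `elect_preferred_leaders` and the sample partition states are hypothetical.

```python
def elect_preferred_leaders(partitions):
    """Return {(topic, partition): new_leader} for partitions whose leader should move.

    `partitions` maps (topic, partition) to a dict with "replicas", "leader" and "isr".
    """
    moves = {}
    for key, state in partitions.items():
        preferred = state["replicas"][0]  # first assigned replica
        # Move only if the preferred replica is not the leader and is in sync.
        if state["leader"] != preferred and preferred in state["isr"]:
            moves[key] = preferred
    return moves

# Hypothetical cluster state for three partitions of one topic.
state = {
    ("topic1", 0): {"replicas": [1, 2, 0], "leader": 2, "isr": [1, 2, 0]},
    ("topic1", 1): {"replicas": [2, 0, 1], "leader": 2, "isr": [2, 0, 1]},
    ("topic1", 2): {"replicas": [0, 1, 2], "leader": 1, "isr": [1, 2]},
}
moves = elect_preferred_leaders(state)
print(moves)
```

Only partition 0 of topic1 is moved: partition 1 is already led by its preferred replica, and the preferred replica of partition 2 (broker 0) is not in the isr.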

Note that the tool only updates the zookeeper path and exits.
The controller moves the leader for a partition to the preferred replica asynchronously.

How to use the tool?

bin/kafka-preferred-replica-election.sh --zookeeper localhost:/kafka --path-to-json-file topicPartitionList.json


The tool takes a mandatory list of zookeeper hosts and an optional list of topic partitions provided as a JSON file. If the list is not provided, the tool queries zookeeper and
gets all the topic partitions for the cluster. The tool exits after updating the zookeeper path "/admin/preferred_replica_election" with the topic partition list.


Example JSON file (optional; specify it to move the leader to the preferred replica only for specific topic partitions):
{"topic": "topic1", "partition": ""},
{"topic": "topic1", "partition": ""},
{"topic": "topic1", "partition": ""},
{"topic": "topic2", "partition": ""},
{"topic": "topic2", "partition": ""}
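
A file like this can also be generated rather than written by hand. The sketch below is a minimal example under two assumptions that the listing above does not show: that the entries are wrapped in a top-level "partitions" array, and that partition ids are integers. The helper name `build_election_file` and the partition numbers are hypothetical; check the format against your Kafka version before using it.

```python
import json

def build_election_file(topic_partitions):
    """Build the JSON text for a list of (topic, partition) pairs.

    Assumption: entries live under a top-level "partitions" key.
    """
    entries = [{"topic": t, "partition": p} for t, p in topic_partitions]
    return json.dumps({"partitions": entries}, indent=2)

# Hypothetical partition numbers, for illustration only.
text = build_election_file([("topic1", 0), ("topic1", 1), ("topic2", 0)])
print(text)
```

The printed text can be saved as topicPartitionList.json and passed to the tool with --path-to-json-file.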

