Kafka Message Distribution Strategy Analysis

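When a producer sends messages to a topic that has multiple partitions, Kafka guarantees that, within any one consumer group, the messages of a given partition are consumed by exactly one consumer, no matter how many consumers the group contains. The consumers in a group therefore split the topic's partitions among themselves, which gives load balancing automatically.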

1. Message Construction

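This article looks at how a partition is chosen when a message is sent to a topic through the Kafka producer API, and walks through the source code that implements it.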

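Start with the constructors of ProducerRecord, the message class in the Kafka API. When building a record you can specify the target topic, the partition, the key, the value and, optionally, headers. Each constructor delegates to a six-argument constructor whose third argument (passed as null here) is the record timestamp.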

    /**
     * Creates a record to be sent to a specified topic and partition
     *
     * @param topic The topic the record will be appended to
     * @param partition The partition to which the record should be sent
     * @param key The key that will be included in the record
     * @param value The record contents
     * @param headers The headers that will be included in the record
     */
    public ProducerRecord(String topic, Integer partition, K key, V value, Iterable<Header> headers) {
        this(topic, partition, null, key, value, headers);
    }

    /**
     * Creates a record to be sent to a specified topic and partition
     *
     * @param topic The topic the record will be appended to
     * @param partition The partition to which the record should be sent
     * @param key The key that will be included in the record
     * @param value The record contents
     */
    public ProducerRecord(String topic, Integer partition, K key, V value) {
        this(topic, partition, null, key, value, null);
    }

    /**
     * Create a record to be sent to Kafka
     *
     * @param topic The topic the record will be appended to
     * @param key The key that will be included in the record
     * @param value The record contents
     */
    public ProducerRecord(String topic, K key, V value) {
        this(topic, null, null, key, value, null);
    }

2. Distribution Strategy

In practice we rarely specify the target partition explicitly. At most we pass a key, like this:

producer.send(new ProducerRecord<Object, Object>(topic, key, data));

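Kafka then hashes the serialized key and takes the result modulo the topic's total partition count. Records with the same key always land on the same partition (as long as the partition count does not change), and distinct keys are spread fairly evenly across the partitions. Only records without a key are restricted to the currently available partitions.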
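The hash-then-modulo step can be sketched with the standard library alone. This is an illustration, not Kafka's code: the class name `KeyedPartitionSketch` is hypothetical, and `Arrays.hashCode` is an assumed stand-in for Kafka's murmur2 hash, so the partition numbers it prints will not match what a real producer computes. Only the sign-bit mask mirrors what `Utils.toPositive` does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Stdlib-only sketch of hash-then-modulo partition selection for keyed records. */
class KeyedPartitionSketch {

    /** Clears the sign bit so the result is never negative (same idea as Kafka's Utils.toPositive). */
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    /**
     * Maps a serialized key to a partition index in [0, numPartitions).
     * Assumption: Arrays.hashCode stands in for Kafka's murmur2 hash.
     */
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        return toPositive(Arrays.hashCode(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 6; // hypothetical partition count
        for (String key : new String[] {"order-1", "order-2", "order-1"}) {
            byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
            // The two "order-1" lines always show the same partition.
            System.out.println(key + " -> partition " + partitionFor(keyBytes, numPartitions));
        }
    }
}
```

Because the partition count is the modulus, adding partitions to a topic changes where existing keys map, which is why per-key ordering only holds while the partition count stays fixed.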

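Below is Kafka's built-in default partitioner. The listing is the round-robin version found in older client releases; since Kafka 2.4, records without a key are handled by a sticky strategy instead.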

public class DefaultPartitioner implements Partitioner {

    private final ConcurrentMap<String, AtomicInteger> topicCounterMap = new ConcurrentHashMap<>();

    public void configure(Map<String, ?> configs) {}

    /**
     * Compute the partition for the given record.
     *
     * @param topic The topic name
     * @param key The key to partition on (or null if no key)
     * @param keyBytes serialized key to partition on (or null if no key)
     * @param value The value to partition on or null
     * @param valueBytes serialized value to partition on or null
     * @param cluster The current cluster metadata
     */
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        // Get all partitions of the topic
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int numPartitions = partitions.size();
        // The record has no key
        if (keyBytes == null) {
            // Fetch the per-topic counter from the ConcurrentHashMap and atomically increment it
            int nextValue = nextValue(topic);
            // Get the partitions of the topic that are currently available
            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
            if (availablePartitions.size() > 0) { // at least one partition is available
                // Take the remainder so the record lands on an available partition
                int part = Utils.toPositive(nextValue) % availablePartitions.size();
                return availablePartitions.get(part).partition();
            } else {
                // No partition is available: return one that is currently unavailable
                return Utils.toPositive(nextValue) % numPartitions;
            }
        } else {
            // Hash the key bytes to choose the partition
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }
    }

    private int nextValue(String topic) {
        // Look up the AtomicInteger counter for this topic
        AtomicInteger counter = topicCounterMap.get(topic);
        if (null == counter) { // first record for this topic
            // Start the counter at a random value
            counter = new AtomicInteger(ThreadLocalRandom.current().nextInt());
            // Store it in topicCounterMap; another thread may have won the race
            AtomicInteger currentCounter = topicCounterMap.putIfAbsent(topic, counter);
            if (currentCounter != null) {
                counter = currentCounter;
            }
        }
        // Return the current value, then increment
        return counter.getAndIncrement();
    }

    public void close() {}

}
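The key-less branch of the listing above is a round-robin over the available partitions, driven by an atomic counter. A stdlib-only sketch of that idea follows. The class name `RoundRobinSketch`, the fixed starting value and the list of available partition ids are assumptions made for illustration; the real partitioner starts each topic's counter at a random value and reads availability from cluster metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Stdlib-only sketch of counter-based round-robin over available partitions. */
class RoundRobinSketch {

    private final AtomicInteger counter;

    /** Assumption: a caller-supplied start value, so the output is reproducible. */
    RoundRobinSketch(int start) {
        this.counter = new AtomicInteger(start);
    }

    /** Picks the next partition id from the given list of available partition ids. */
    int nextPartition(List<Integer> availablePartitions) {
        // Mask the sign bit so the index stays valid after the counter overflows.
        int next = counter.getAndIncrement() & 0x7fffffff;
        return availablePartitions.get(next % availablePartitions.size());
    }

    public static void main(String[] args) {
        // Hypothetical topic with partitions 0..3 where partition 1 is offline.
        List<Integer> available = Arrays.asList(0, 2, 3);
        RoundRobinSketch roundRobin = new RoundRobinSketch(0);
        List<Integer> chosen = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            chosen.add(roundRobin.nextPartition(available));
        }
        System.out.println(chosen); // prints [0, 2, 3, 0, 2, 3]
    }
}
```

Note that the index is taken into the list of available partitions and then translated to a partition id, which is why the offline partition 1 never appears in the output.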

3. Custom Load-Balancing Strategy

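We can also define our own distribution strategy by implementing the Partitioner interface. A concrete implementation follows.

Implementing the Partitioner interface: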

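import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

/**
 * Custom implementation of the Partitioner interface.
 */
public class KeyPartitioner implements Partitioner {

    /**
     * Implements the distribution strategy.
     */
    @Override
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        // Use the topic's real partition count rather than a hard-coded number
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (key == null || "".equals(key)) {
            // No usable key: pick one of the available partitions at random
            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
            if (availablePartitions.isEmpty()) {
                // Nothing is available: fall back to any partition of the topic
                return ThreadLocalRandom.current().nextInt(numPartitions);
            }
            int part = ThreadLocalRandom.current().nextInt(availablePartitions.size());
            return availablePartitions.get(part).partition();
        }
        // floorMod keeps the result in [0, numPartitions) even for negative hash codes
        return Math.floorMod(key.toString().hashCode(), numPartitions);
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // No configuration needed
    }

    @Override
    public void close() {
        // Nothing to release
    }

}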
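Any custom partitioner that turns a hash code into a partition number has to cope with negative hashes, which is the job `Utils.toPositive` does in the default partitioner. The stdlib-only sketch below compares three common approaches on the one input that trips up the naive version. The class and method names are hypothetical, and the partition count of 6 is just an example value.

```java
/** Stdlib-only sketch comparing ways to map a possibly negative hash to a partition index. */
class HashToPartitionSketch {

    /** Naive version: Math.abs(Integer.MIN_VALUE) is still negative, so this can return a negative index. */
    static int absThenMod(int hash, int numPartitions) {
        return Math.abs(hash) % numPartitions;
    }

    /** Clears the sign bit first, the same idea as Kafka's Utils.toPositive. */
    static int maskThenMod(int hash, int numPartitions) {
        return (hash & 0x7fffffff) % numPartitions;
    }

    /** Uses Math.floorMod, which always returns a value in [0, numPartitions) for a positive modulus. */
    static int floorMod(int hash, int numPartitions) {
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE; // worst-case hash code
        int numPartitions = 6;        // example partition count
        System.out.println(absThenMod(hash, numPartitions));  // prints -2 (not a valid partition)
        System.out.println(maskThenMod(hash, numPartitions)); // prints 0
        System.out.println(floorMod(hash, numPartitions));    // prints 4
    }
}
```

The last two variants are both safe but map the same hash to different partitions, so a partitioner should pick one and keep it; switching later would move existing keys.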

Then register the custom partitioner in the configuration when creating the Kafka producer:

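Properties properties = new Properties();
// bootstrap.servers and the key/value serializers must also be set (omitted here)
properties.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, KeyPartitioner.class); // register the custom partitioner
producer = new KafkaProducer<Object, Object>(properties);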

4. Summary

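This article analyzed Kafka's message distribution strategy and showed how to extend it with a custom partitioner. I hope it helps when you work with Kafka. Corrections to any errors or omissions are welcome.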
