flink-connector-kafka: topic partition assignment in the consumer (source code walkthrough)
Please credit the original post when reposting: http://www.cnblogs.com/dongxiao-yang/p/7200599.html
Flink ships an official connector for Kafka. While debugging I noticed that some of its consumption behavior did not match my expectations, so I went through the source code.
flink-connector-kafka currently provides implementations for Kafka 0.8, 0.9 and 0.10; this article uses the FlinkKafkaConsumer010 code as its example.
The parent-class hierarchy of FlinkKafkaConsumer010 is shown below; FlinkKafkaConsumerBase contains most of the implementation.
FlinkKafkaConsumer010<T> extends FlinkKafkaConsumer09<T> extends FlinkKafkaConsumerBase<T>
Each connector version also ships a matching AbstractFetcher implementation that actually pulls data from Kafka; its inheritance chain is:
Kafka010Fetcher<T> extends Kafka09Fetcher<T> extends AbstractFetcher<T, TopicPartition>
FlinkKafkaConsumerBase is declared as follows; it extends RichParallelSourceFunction and implements interfaces such as CheckpointListener and CheckpointedFunction.
public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements
        CheckpointListener,
        ResultTypeQueryable<T>,
        CheckpointedFunction,
        CheckpointedRestoring<HashMap<KafkaTopicPartition, Long>> {
How the main methods inside FlinkKafkaConsumer execute
initializeState
public void initializeState(FunctionInitializationContext context) throws Exception {

    OperatorStateStore stateStore = context.getOperatorStateStore();
    offsetsStateForCheckpoint = stateStore.getSerializableListState(DefaultOperatorStateBackend.DEFAULT_OPERATOR_STATE_NAME);

    if (context.isRestored()) {
        if (restoredState == null) {
            restoredState = new HashMap<>();
            for (Tuple2<KafkaTopicPartition, Long> kafkaOffset : offsetsStateForCheckpoint.get()) {
                restoredState.put(kafkaOffset.f0, kafkaOffset.f1);
            }

            LOG.info("Setting restore state in the FlinkKafkaConsumer.");
            if (LOG.isDebugEnabled()) {
                LOG.debug("Using the following offsets: {}", restoredState);
            }
        }
        if (restoredState != null && restoredState.isEmpty()) {
            restoredState = null;
        }
    } else {
        LOG.info("No restore state for FlinkKafkaConsumer.");
    }
}
According to the runtime logs, initializeState is the first method invoked when the FlinkKafkaConsumer is initialized. Through the FunctionInitializationContext it calls getOperatorStateStore and getSerializableListState to obtain the state object kept in the checkpoint. If this task is being restored after a failure or similar recovery, context.isRestored() evaluates to true and the consumer tries to recover from the Flink checkpoint the Kafka partitions it had been assigned together with the last successfully committed offsets.
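Note that this restore branch can only be taken when checkpointing is enabled for the job, since otherwise there is no operator state to recover from. Below is a minimal job-setup sketch under that assumption; the broker address, topic, group id and checkpoint interval are made-up illustration values, and the imports assume the 1.3.x connector artifacts:

import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaSourceJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // enable checkpointing so that initializeState can later restore the
        // (partition -> offset) map from operator state after a failure
        env.enableCheckpointing(60_000L);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1:9092"); // made-up address
        props.setProperty("group.id", "demo-group");            // made-up group id

        FlinkKafkaConsumer010<String> consumer =
                new FlinkKafkaConsumer010<>("demo-topic", new SimpleStringSchema(), props);
        // true is already the default; combined with checkpointing this yields ON_CHECKPOINTS
        consumer.setCommitOffsetsOnCheckpoints(true);

        env.addSource(consumer).print();
        env.execute("kafka partition assignment demo");
    }
}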
open
public void open(Configuration configuration) {
    // determine the offset commit mode
    offsetCommitMode = OffsetCommitModes.fromConfiguration(
            getIsAutoCommitEnabled(),
            enableCommitOnCheckpoints,
            ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

    switch (offsetCommitMode) {
        case ON_CHECKPOINTS:
            LOG.info("Consumer subtask {} will commit offsets back to Kafka on completed checkpoints.",
                getRuntimeContext().getIndexOfThisSubtask());
            break;
        case KAFKA_PERIODIC:
            LOG.info("Consumer subtask {} will commit offsets back to Kafka periodically using the Kafka client's auto commit.",
                getRuntimeContext().getIndexOfThisSubtask());
            break;
        default:
        case DISABLED:
            LOG.info("Consumer subtask {} has disabled offset committing back to Kafka." +
                " This does not compromise Flink's checkpoint integrity.",
                getRuntimeContext().getIndexOfThisSubtask());
    }

    // initialize subscribed partitions
    List<KafkaTopicPartition> kafkaTopicPartitions = getKafkaPartitions(topics);
    Preconditions.checkNotNull(kafkaTopicPartitions, "TopicPartitions must not be null.");

    subscribedPartitionsToStartOffsets = new HashMap<>(kafkaTopicPartitions.size());

    if (restoredState != null) {
        for (KafkaTopicPartition kafkaTopicPartition : kafkaTopicPartitions) {
            if (restoredState.containsKey(kafkaTopicPartition)) {
                subscribedPartitionsToStartOffsets.put(kafkaTopicPartition, restoredState.get(kafkaTopicPartition));
            }
        }

        LOG.info("Consumer subtask {} will start reading {} partitions with offsets in restored state: {}",
            getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets.size(), subscribedPartitionsToStartOffsets);
    } else {
        initializeSubscribedPartitionsToStartOffsets(
            subscribedPartitionsToStartOffsets,
            kafkaTopicPartitions,
            getRuntimeContext().getIndexOfThisSubtask(),
            getRuntimeContext().getNumberOfParallelSubtasks(),
            startupMode,
            specificStartupOffsets);

        if (subscribedPartitionsToStartOffsets.size() != 0) {
            switch (startupMode) {
                case EARLIEST:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the earliest offsets: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        subscribedPartitionsToStartOffsets.keySet());
                    break;
                case LATEST:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the latest offsets: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        subscribedPartitionsToStartOffsets.keySet());
                    break;
                case SPECIFIC_OFFSETS:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the specified startup offsets {}: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        specificStartupOffsets,
                        subscribedPartitionsToStartOffsets.keySet());

                    List<KafkaTopicPartition> partitionsDefaultedToGroupOffsets = new ArrayList<>(subscribedPartitionsToStartOffsets.size());
                    for (Map.Entry<KafkaTopicPartition, Long> subscribedPartition : subscribedPartitionsToStartOffsets.entrySet()) {
                        if (subscribedPartition.getValue() == KafkaTopicPartitionStateSentinel.GROUP_OFFSET) {
                            partitionsDefaultedToGroupOffsets.add(subscribedPartition.getKey());
                        }
                    }

                    if (partitionsDefaultedToGroupOffsets.size() > 0) {
                        LOG.warn("Consumer subtask {} cannot find offsets for the following {} partitions in the specified startup offsets: {}" +
                            "; their startup offsets will be defaulted to their committed group offsets in Kafka.",
                            getRuntimeContext().getIndexOfThisSubtask(),
                            partitionsDefaultedToGroupOffsets.size(),
                            partitionsDefaultedToGroupOffsets);
                    }
                    break;
                default:
                case GROUP_OFFSETS:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the committed group offsets in Kafka: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        subscribedPartitionsToStartOffsets.keySet());
            }
        }
    }
}
open is invoked after initializeState finishes. Its main logic consists of two steps:
1. Determine the offsetCommitMode. Based on the combination of three settings (Kafka's auto-commit flag, the value of setCommitOffsetsOnCheckpoints(), which defaults to true, and whether checkpointing is enabled in the Flink runtime), offsetCommitMode takes one of three values: ON_CHECKPOINTS (commit offsets after each completed checkpoint), KAFKA_PERIODIC (rely on the Kafka consumer's built-in periodic auto commit), or DISABLED (never commit). The job-setup sketch in the initializeState section shows where these settings typically come from.
2. Assign Kafka partitions. If the partitions previously stored in state were recovered during initializeState, the task simply keeps reading those partitions; on a first-time initialization it calls initializeSubscribedPartitionsToStartOffsets to compute the partition list for the current task (a short usage sketch of the startup-mode setters follows, and then the method's source).
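The startupMode and specificStartupOffsets passed into that method come from the consumer's public setters, which are called before the job starts. A hedged usage sketch follows (the setter names are those of the 1.3.x FlinkKafkaConsumerBase API as I understand it; the topic name, partitions and offsets are invented):

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;

public class StartupModeExamples {

    // pick exactly one startup mode per consumer instance
    static void configureStartupMode(FlinkKafkaConsumer010<String> consumer) {
        consumer.setStartFromEarliest();        // StartupMode.EARLIEST
        // consumer.setStartFromLatest();       // StartupMode.LATEST
        // consumer.setStartFromGroupOffsets(); // StartupMode.GROUP_OFFSETS, the default
    }

    static void configureSpecificOffsets(FlinkKafkaConsumer010<String> consumer) {
        // pin individual partitions to explicit offsets; partitions missing from the map
        // fall back to their committed group offsets (the GROUP_OFFSET sentinel below)
        Map<KafkaTopicPartition, Long> specificOffsets = new HashMap<>();
        specificOffsets.put(new KafkaTopicPartition("demo-topic", 0), 23L);
        specificOffsets.put(new KafkaTopicPartition("demo-topic", 1), 31L);
        consumer.setStartFromSpecificOffsets(specificOffsets);
    }
}

With those options in mind, here is the initializeSubscribedPartitionsToStartOffsets source: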
protected static void initializeSubscribedPartitionsToStartOffsets(
        Map<KafkaTopicPartition, Long> subscribedPartitionsToStartOffsets,
        List<KafkaTopicPartition> kafkaTopicPartitions,
        int indexOfThisSubtask,
        int numParallelSubtasks,
        StartupMode startupMode,
        Map<KafkaTopicPartition, Long> specificStartupOffsets) {

    for (int i = 0; i < kafkaTopicPartitions.size(); i++) {
        if (i % numParallelSubtasks == indexOfThisSubtask) {
            if (startupMode != StartupMode.SPECIFIC_OFFSETS) {
                subscribedPartitionsToStartOffsets.put(kafkaTopicPartitions.get(i), startupMode.getStateSentinel());
            } else {
                if (specificStartupOffsets == null) {
                    throw new IllegalArgumentException(
                        "Startup mode for the consumer set to " + StartupMode.SPECIFIC_OFFSETS +
                        ", but no specific offsets were specified");
                }

                KafkaTopicPartition partition = kafkaTopicPartitions.get(i);
                Long specificOffset = specificStartupOffsets.get(partition);
                if (specificOffset != null) {
                    // since the specified offsets represent the next record to read, we subtract
                    // it by one so that the initial state of the consumer will be correct
                    subscribedPartitionsToStartOffsets.put(partition, specificOffset - 1);
                } else {
                    subscribedPartitionsToStartOffsets.put(partition, KafkaTopicPartitionStateSentinel.GROUP_OFFSET);
                }
            }
        }
    }
}
As the code shows, Flink assigns partitions by taking each partition index modulo the number of parallel subtasks: if i % numParallelSubtasks == indexOfThisSubtask, then partition i belongs to the current subtask.
The assignment result is recorded in the private field Map<KafkaTopicPartition, Long> subscribedPartitionsToStartOffsets, which is used later to initialize the consumer.
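To make the rule concrete, the following standalone sketch reproduces just the modulo assignment outside of Flink; the partition count and parallelism are made-up numbers:

import java.util.ArrayList;
import java.util.List;

public class PartitionAssignmentDemo {
    public static void main(String[] args) {
        int numPartitions = 10;      // e.g. a topic with 10 partitions
        int numParallelSubtasks = 3; // e.g. the source runs with parallelism 3

        for (int subtask = 0; subtask < numParallelSubtasks; subtask++) {
            List<Integer> assigned = new ArrayList<>();
            for (int i = 0; i < numPartitions; i++) {
                // the same check FlinkKafkaConsumerBase uses:
                // i % numParallelSubtasks == indexOfThisSubtask
                if (i % numParallelSubtasks == subtask) {
                    assigned.add(i);
                }
            }
            System.out.println("subtask " + subtask + " -> partitions " + assigned);
        }
        // prints:
        // subtask 0 -> partitions [0, 3, 6, 9]
        // subtask 1 -> partitions [1, 4, 7]
        // subtask 2 -> partitions [2, 5, 8]
    }
}

Note that the rule only yields a disjoint and complete assignment if every subtask iterates the partition list in the same order.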
run
@Override
public void run(SourceContext<T> sourceContext) throws Exception {
    if (subscribedPartitionsToStartOffsets == null) {
        throw new Exception("The partitions were not set for the consumer");
    }

    // we need only do work, if we actually have partitions assigned
    if (!subscribedPartitionsToStartOffsets.isEmpty()) {

        // create the fetcher that will communicate with the Kafka brokers
        final AbstractFetcher<T, ?> fetcher = createFetcher(
                sourceContext,
                subscribedPartitionsToStartOffsets,
                periodicWatermarkAssigner,
                punctuatedWatermarkAssigner,
                (StreamingRuntimeContext) getRuntimeContext(),
                offsetCommitMode);

        // publish the reference, for snapshot-, commit-, and cancel calls
        // IMPORTANT: We can only do that now, because only now will calls to
        //            the fetchers 'snapshotCurrentState()' method return at least
        //            the restored offsets
        this.kafkaFetcher = fetcher;
        if (!running) {
            return;
        }

        // (3) run the fetcher' main work method
        fetcher.runFetchLoop();
    }
    else {
        // this source never completes, so emit a Long.MAX_VALUE watermark
        // to not block watermark forwarding
        sourceContext.emitWatermark(new Watermark(Long.MAX_VALUE));

        // wait until this is canceled
        final Object waitLock = new Object();
        while (running) {
            try {
                //noinspection SynchronizationOnLocalVariableOrMethodParameter
                synchronized (waitLock) {
                    waitLock.wait();
                }
            }
            catch (InterruptedException e) {
                if (!running) {
                    // restore the interrupted state, and fall through the loop
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
}
As shown above, the computed subscribedPartitionsToStartOffsets map is passed into the AbstractFetcher instance that owns the consumerThread. KafkaConsumerThread then calls consumerCallBridge.assignPartitions(consumer, convertKafkaPartitions(subscribedPartitionStates)), which ultimately invokes consumer.assign(topicPartitions), manually assigning the topic partitions to the KafkaConsumer instance.
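For reference, KafkaConsumer#assign is the plain kafka-clients call that this bridge ends up making; unlike subscribe(), it bypasses the broker-side group coordinator and rebalancing entirely, which is what lets Flink do its own modulo-based assignment per subtask. A minimal sketch against the 0.10.x client (broker address, topic, group id and partition numbers are made up):

import java.util.Arrays;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ManualAssignDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // made-up address
        props.put("group.id", "demo-group");            // only relevant for offset commits
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // assign() pins the consumer to explicit partitions; no rebalance listener
            // is involved and no partitions are ever revoked by the coordinator
            List<TopicPartition> partitions = Arrays.asList(
                    new TopicPartition("demo-topic", 0),
                    new TopicPartition("demo-topic", 3));
            consumer.assign(partitions);

            ConsumerRecords<String, String> records = consumer.poll(1000L);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.partition() + " @ " + record.offset() + ": " + record.value());
            }
        }
    }
}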