When does Kafka commit in Flink?
https://ci.apache.org/projects/flink/flink-docs-release-1.6/internals/stream_checkpointing.html

Short answer: with offset committing in ON_CHECKPOINTS mode, offsets are committed back to Kafka (or ZooKeeper for older Kafka versions) only when a checkpoint completes, in FlinkKafkaConsumerBase#notifyCheckpointComplete:
@Override
public final void notifyCheckpointComplete(long checkpointId) throws Exception {
    if (!running) {
        LOG.debug("notifyCheckpointComplete() called on closed source");
        return;
    }

    final AbstractFetcher<?, ?> fetcher = this.kafkaFetcher;
    if (fetcher == null) {
        LOG.debug("notifyCheckpointComplete() called on uninitialized source");
        return;
    }

    if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS) {
        // only one commit operation must be in progress
        if (LOG.isDebugEnabled()) {
            LOG.debug("Committing offsets to Kafka/ZooKeeper for checkpoint " + checkpointId);
        }

        try {
            final int posInMap = pendingOffsetsToCommit.indexOf(checkpointId);
            if (posInMap == -1) {
                LOG.warn("Received confirmation for unknown checkpoint id {}", checkpointId);
                return;
            }

            @SuppressWarnings("unchecked")
            Map<KafkaTopicPartition, Long> offsets =
                (Map<KafkaTopicPartition, Long>) pendingOffsetsToCommit.remove(posInMap);

            // remove older checkpoints in map
            for (int i = 0; i < posInMap; i++) {
                pendingOffsetsToCommit.remove(0);
            }

            if (offsets == null || offsets.size() == 0) {
                LOG.debug("Checkpoint state was empty.");
                return;
            }

            fetcher.commitInternalOffsetsToKafka(offsets, offsetCommitCallback);
        } catch (Exception e) {
            if (running) {
                throw e;
            }
            // else ignore exception if we are no longer running
        }
    }
}
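For this ON_CHECKPOINTS path to be taken at all, checkpointing has to be enabled on the job and the consumer must be configured to commit on checkpoints. A minimal sketch of such a job, assuming the Kafka 0.11 connector that ships with Flink 1.6 (the topic name, bootstrap address and group id are placeholders):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class OnCheckpointsCommitExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Without a checkpoint interval the commit mode falls back to
        // KAFKA_PERIODIC or DISABLED (see the enum below).
        env.enableCheckpointing(5000);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        FlinkKafkaConsumer011<String> consumer =
            new FlinkKafkaConsumer011<>("my-topic", new SimpleStringSchema(), props);
        // true is the default; shown explicitly: offsets go back to Kafka only
        // from notifyCheckpointComplete(), never on a timer.
        consumer.setCommitOffsetsOnCheckpoints(true);

        env.addSource(consumer).print();
        env.execute("commit-on-checkpoints");
    }
}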
/**
 * The offset commit mode represents the behaviour of how offsets are externally committed
 * back to Kafka brokers / ZooKeeper.
 *
 * <p>The exact value of this is determined at runtime in the consumer subtasks.
 */
@Internal
public enum OffsetCommitMode {

    /** Completely disable offset committing. */
    DISABLED,

    /** Commit offsets back to Kafka only when checkpoints are completed. */
    ON_CHECKPOINTS,

    /** Commit offsets periodically back to Kafka, using the auto commit functionality of internal Kafka clients. */
    KAFKA_PERIODIC;
}
/**
 * Commits the given partition offsets to the Kafka brokers (or to ZooKeeper for
 * older Kafka versions). This method is only ever called when the offset commit mode of
 * the consumer is {@link OffsetCommitMode#ON_CHECKPOINTS}.
 *
 * <p>The given offsets are the internal checkpointed offsets, representing
 * the last processed record of each partition. Version-specific implementations of this method
 * need to hold the contract that the given offsets must be incremented by 1 before
 * committing them, so that committed offsets to Kafka represent "the next record to process".
 *
 * @param offsets The offsets to commit to Kafka (implementations must increment offsets by 1 before committing).
 * @param commitCallback The callback that the user should trigger when a commit request completes or fails.
 * @throws Exception This method forwards exceptions.
 */
public final void commitInternalOffsetsToKafka(
        Map<KafkaTopicPartition, Long> offsets,
        @Nonnull KafkaCommitCallback commitCallback) throws Exception {
    // Ignore sentinels. They might appear here if snapshot has started before actual offsets values
    // replaced sentinels
    doCommitInternalOffsetsToKafka(filterOutSentinels(offsets), commitCallback);
}
/**
* Invoking this method makes all buffered records immediately available to send (even if <code>linger.ms</code> is
* greater than 0) and blocks on the completion of the requests associated with these records. The post-condition
* of <code>flush()</code> is that any previously sent record will have completed (e.g. <code>Future.isDone() == true</code>).
* A request is considered completed when it is successfully acknowledged
* according to the <code>acks</code> configuration you have specified or else it results in an error.
* <p>
* Other threads can continue sending records while one thread is blocked waiting for a flush call to complete,
* however no guarantee is made about the completion of records sent after the flush call begins.
* <p>
* This method can be useful when consuming from some input system and producing into Kafka. The <code>flush()</code> call
* gives a convenient way to ensure all previously sent messages have actually completed.
* <p>
* This example shows how to consume from one Kafka topic and produce to another Kafka topic:
* <pre>
* {@code
* for (ConsumerRecord<String, String> record : consumer.poll(100))
*     producer.send(new ProducerRecord("my-topic", record.key(), record.value()));
* producer.flush();
* consumer.commitSync();
* }
* </pre>
*
* Note that the above example may drop records if the produce request fails. If we want to ensure that this does not occur
* we need to set <code>retries=<large_number></code> in our config.
* </p>
* <p>
* Applications don't need to call this method for transactional producers, since the {@link #commitTransaction()} will
* flush all buffered records before performing the commit. This ensures that all the {@link #send(ProducerRecord)}
* calls made since the previous {@link #beginTransaction()} are completed before the commit.
* </p>
*
* @throws InterruptException If the thread is interrupted while blocked
*/
@Override
public void flush() {
    log.trace("Flushing accumulated records in producer.");
    this.accumulator.beginFlush();
    this.sender.wakeup();
    try {
        this.accumulator.awaitFlushCompletion();
    } catch (InterruptedException e) {
        throw new InterruptException("Flush interrupted.", e);
    }
}
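Putting the javadoc example on a runnable footing: a consume-then-produce loop in which flush() guarantees that everything sent has completed before the consumed offsets are committed. This is a sketch under assumptions — topic names and the bootstrap address are placeholders, and retries is raised as the note above advises:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CopyTopicExample {
    public static void main(String[] args) {
        Properties cp = new Properties();
        cp.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        cp.setProperty("group.id", "copy-group");              // placeholder
        cp.setProperty("enable.auto.commit", "false");         // we commit manually below
        cp.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cp.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties pp = new Properties();
        pp.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        pp.setProperty("retries", "2147483647");               // "large number", per the javadoc note
        pp.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pp.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cp);
                KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
            consumer.subscribe(Collections.singletonList("in-topic"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(100)) {
                    producer.send(new ProducerRecord<>("out-topic", record.key(), record.value()));
                }
                // Block until every send above is acknowledged (or has failed)...
                producer.flush();
                // ...and only then mark the input records as consumed.
                consumer.commitSync();
            }
        }
    }
}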