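Kafka heartbeats and rebalance
Kafka heartbeats are the health check between a Kafka consumer and the broker. A consumer sends heartbeats only while the broker-side group coordinator is known and healthy.
Two consumer configuration parameters are related to rebalance:
- parameter name --> MemberMetadata field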
- session.timeout.ms --> MemberMetadata.sessionTimeoutMs
- max.poll.interval.ms --> MemberMetadata.rebalanceTimeoutMs
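To make the mapping above concrete, here is a minimal sketch of how these timeouts might be set on the consumer side. The class name `ConsumerTimeoutConfig` and the numeric values are assumptions chosen for illustration only; the property keys are the standard consumer configuration names. The guard mirrors the check in the `Heartbeat` constructor quoted later in this post, which rejects a heartbeat interval that is not lower than the session timeout.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds the timeout-related consumer settings.
final class ConsumerTimeoutConfig {

    static Properties build(int sessionTimeoutMs, int heartbeatIntervalMs, int maxPollIntervalMs) {
        // Same constraint as the Heartbeat constructor: interval must be below the session timeout.
        if (heartbeatIntervalMs >= sessionTimeoutMs) {
            throw new IllegalArgumentException(
                "heartbeat.interval.ms must be lower than session.timeout.ms");
        }
        Properties props = new Properties();
        // Becomes MemberMetadata.sessionTimeoutMs on the broker.
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        // How often the background heartbeat thread sends a heartbeat.
        props.setProperty("heartbeat.interval.ms", Integer.toString(heartbeatIntervalMs));
        // Becomes MemberMetadata.rebalanceTimeoutMs on the broker.
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only; tune them for the actual workload.
        Properties props = build(10_000, 3_000, 300_000);
        props.list(System.out);
    }
}
```

The returned `Properties` could then be merged with the remaining consumer settings (bootstrap servers, group id, deserializers) before constructing a consumer.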
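Broker side: the sessionTimeoutMs parameter
The broker's heartbeat handling lives in the GroupCoordinator class: if a member's heartbeat expires, the group coordinator removes that consumer from the group and triggers a rebalance.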
- private def completeAndScheduleNextHeartbeatExpiration(group: GroupMetadata, member: MemberMetadata) {
- // complete current heartbeat expectation
- member.latestHeartbeat = time.milliseconds()
- val memberKey = MemberKey(member.groupId, member.memberId)
- heartbeatPurgatory.checkAndComplete(memberKey)
- // reschedule the next heartbeat expiration deadline
- // compute the next heartbeat deadline
- val newHeartbeatDeadline = member.latestHeartbeat + member.sessionTimeoutMs
- val delayedHeartbeat = new DelayedHeartbeat(this, group, member, newHeartbeatDeadline, member.sessionTimeoutMs)
- heartbeatPurgatory.tryCompleteElseWatch(delayedHeartbeat, Seq(memberKey))
- }
- // the heartbeat has expired
- def onExpireHeartbeat(group: GroupMetadata, member: MemberMetadata, heartbeatDeadline: Long) {
- group.inLock {
- if (!shouldKeepMemberAlive(member, heartbeatDeadline)) {
- info(s"Member ${member.memberId} in group ${group.groupId} has failed, removing it from the group")
- removeMemberAndUpdateGroup(group, member)
- }
- }
- }
- private def shouldKeepMemberAlive(member: MemberMetadata, heartbeatDeadline: Long) =
- member.awaitingJoinCallback != null ||
- member.awaitingSyncCallback != null ||
- member.latestHeartbeat + member.sessionTimeoutMs > heartbeatDeadline
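The expiration check above can be traced with concrete numbers. The following is a simplified sketch in Java, not the broker's actual code: the class `MemberLiveness` and its flattened parameter list are assumptions made for illustration, with the two booleans standing in for `awaitingJoinCallback != null` and `awaitingSyncCallback != null`.

```java
// Hypothetical stand-alone model of the broker-side liveness check.
final class MemberLiveness {

    static boolean shouldKeepMemberAlive(boolean awaitingJoin,
                                         boolean awaitingSync,
                                         long latestHeartbeatMs,
                                         long sessionTimeoutMs,
                                         long heartbeatDeadlineMs) {
        // A member in the middle of a join or sync is never expired; otherwise it
        // survives only if a newer heartbeat has pushed its deadline past the one
        // that just fired.
        return awaitingJoin
            || awaitingSync
            || latestHeartbeatMs + sessionTimeoutMs > heartbeatDeadlineMs;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 10_000;
        long deadline = 0 + sessionTimeoutMs; // deadline scheduled by a heartbeat at t = 0

        // No further heartbeat arrived: 0 + 10000 > 10000 is false, so the member is removed.
        System.out.println(shouldKeepMemberAlive(false, false, 0, sessionTimeoutMs, deadline));

        // A heartbeat arrived at t = 4000: 4000 + 10000 > 10000 is true, so the member stays.
        System.out.println(shouldKeepMemberAlive(false, false, 4_000, sessionTimeoutMs, deadline));
    }
}
```

Each successful heartbeat updates latestHeartbeat and schedules a fresh deadline, so an older DelayedHeartbeat that fires afterwards finds the member still alive.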
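Consumer side: the sessionTimeoutMs and rebalanceTimeoutMs parameters
If the client sees that the session timeout has elapsed without a successful heartbeat response, it marks the coordinator as unknown, and the heartbeat thread then waits (backing off) until the coordinator has been found again. If the interval between two poll() calls exceeds rebalanceTimeoutMs (max.poll.interval.ms), the consumer tells the broker that it is leaving the group, which also triggers a rebalance.
Code excerpt from org.apache.kafka.clients.consumer.internals.AbstractCoordinator.HeartbeatThread: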
- if (coordinatorUnknown()) {
- if (findCoordinatorFuture != null || lookupCoordinator().failed())
- // the immediate future check ensures that we backoff properly in the case that no
- // brokers are available to connect to.
- AbstractCoordinator.this.wait(retryBackoffMs);
- } else if (heartbeat.sessionTimeoutExpired(now)) {
- // the session timeout has expired without seeing a successful heartbeat, so we should
- // probably make sure the coordinator is still healthy.
- markCoordinatorUnknown();
- } else if (heartbeat.pollTimeoutExpired(now)) {
- // the poll timeout has expired, which means that the foreground thread has stalled
- // in between calls to poll(), so we explicitly leave the group.
- maybeLeaveGroup();
- } else if (!heartbeat.shouldHeartbeat(now)) {
- // poll again after waiting for the retry backoff in case the heartbeat failed or the
- // coordinator disconnected
- AbstractCoordinator.this.wait(retryBackoffMs);
- } else {
- heartbeat.sentHeartbeat(now);
- sendHeartbeatRequest().addListener(new RequestFutureListener<Void>() {
- @Override
- public void onSuccess(Void value) {
- synchronized (AbstractCoordinator.this) {
- heartbeat.receiveHeartbeat(time.milliseconds());
- }
- }
- @Override
- public void onFailure(RuntimeException e) {
- synchronized (AbstractCoordinator.this) {
- if (e instanceof RebalanceInProgressException) {
- // it is valid to continue heartbeating while the group is rebalancing. This
- // ensures that the coordinator keeps the member in the group for as long
- // as the duration of the rebalance timeout. If we stop sending heartbeats,
- // however, then the session timeout may expire before we can rejoin.
- heartbeat.receiveHeartbeat(time.milliseconds());
- } else {
- heartbeat.failHeartbeat();
- // wake up the thread if it's sleeping to reschedule the heartbeat
- AbstractCoordinator.this.notify();
- }
- }
- }
- });
- }
- /**
- * A helper class for managing the heartbeat to the coordinator
- */
- public final class Heartbeat {
- private final long sessionTimeout;
- private final long heartbeatInterval;
- private final long maxPollInterval;
- private final long retryBackoffMs;
- private volatile long lastHeartbeatSend; // volatile since it is read by metrics
- private long lastHeartbeatReceive;
- private long lastSessionReset;
- private long lastPoll;
- private boolean heartbeatFailed;
- public Heartbeat(long sessionTimeout,
- long heartbeatInterval,
- long maxPollInterval,
- long retryBackoffMs) {
- if (heartbeatInterval >= sessionTimeout)
- throw new IllegalArgumentException("Heartbeat must be set lower than the session timeout");
- this.sessionTimeout = sessionTimeout;
- this.heartbeatInterval = heartbeatInterval;
- this.maxPollInterval = maxPollInterval;
- this.retryBackoffMs = retryBackoffMs;
- }
- public void poll(long now) {
- this.lastPoll = now;
- }
- public void sentHeartbeat(long now) {
- this.lastHeartbeatSend = now;
- this.heartbeatFailed = false;
- }
- public void failHeartbeat() {
- this.heartbeatFailed = true;
- }
- public void receiveHeartbeat(long now) {
- this.lastHeartbeatReceive = now;
- }
- public boolean shouldHeartbeat(long now) {
- return timeToNextHeartbeat(now) == 0;
- }
- public long lastHeartbeatSend() {
- return this.lastHeartbeatSend;
- }
- public long timeToNextHeartbeat(long now) {
- long timeSinceLastHeartbeat = now - Math.max(lastHeartbeatSend, lastSessionReset);
- final long delayToNextHeartbeat;
- if (heartbeatFailed)
- delayToNextHeartbeat = retryBackoffMs;
- else
- delayToNextHeartbeat = heartbeatInterval;
- if (timeSinceLastHeartbeat > delayToNextHeartbeat)
- return 0;
- else
- return delayToNextHeartbeat - timeSinceLastHeartbeat;
- }
- public boolean sessionTimeoutExpired(long now) {
- return now - Math.max(lastSessionReset, lastHeartbeatReceive) > sessionTimeout;
- }
- public long interval() {
- return heartbeatInterval;
- }
- public void resetTimeouts(long now) {
- this.lastSessionReset = now;
- this.lastPoll = now;
- this.heartbeatFailed = false;
- }
- public boolean pollTimeoutExpired(long now) {
- return now - lastPoll > maxPollInterval;
- }
- }
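The order of the checks in the heartbeat thread matters: the session timeout is tested before the poll timeout, and a heartbeat is sent only when none of the earlier conditions applies. The sketch below condenses that decision order into a single pure function. It is a simplified model, not Kafka code: the class `HeartbeatDecision`, the `Action` enum and the flattened parameters are assumptions for illustration, and it deliberately ignores `lastSessionReset` and the retry backoff applied after a failed heartbeat.

```java
// Hypothetical, simplified model of one iteration of the consumer heartbeat thread.
final class HeartbeatDecision {

    enum Action { WAIT_FOR_COORDINATOR, MARK_COORDINATOR_UNKNOWN, LEAVE_GROUP, WAIT, SEND_HEARTBEAT }

    static Action next(boolean coordinatorUnknown,
                       long now,
                       long lastHeartbeatSend,
                       long lastHeartbeatReceive,
                       long lastPoll,
                       long sessionTimeout,
                       long heartbeatInterval,
                       long maxPollInterval) {
        if (coordinatorUnknown) {
            return Action.WAIT_FOR_COORDINATOR;      // back off until a coordinator is found
        }
        if (now - lastHeartbeatReceive > sessionTimeout) {
            return Action.MARK_COORDINATOR_UNKNOWN;  // no successful heartbeat within the session timeout
        }
        if (now - lastPoll > maxPollInterval) {
            return Action.LEAVE_GROUP;               // the application stalled between poll() calls
        }
        if (now - lastHeartbeatSend <= heartbeatInterval) {
            return Action.WAIT;                      // not yet time for the next heartbeat
        }
        return Action.SEND_HEARTBEAT;
    }

    public static void main(String[] args) {
        // Illustrative settings: session 10 s, heartbeat interval 3 s, max poll interval 300 s.
        System.out.println(next(false, 7_000, 3_000, 3_000, 6_500, 10_000, 3_000, 300_000));
        System.out.println(next(false, 400_000, 399_000, 399_000, 0, 10_000, 3_000, 300_000));
    }
}
```

With these inputs the first call falls through to sending a heartbeat, while the second hits the poll-timeout branch even though heartbeats are still succeeding. That is the case where a slow message handler causes the consumer to leave the group on its own.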
Join-group handling logic: kafka.coordinator.group.GroupCoordinator#onCompleteJoin