Kafka's heartbeat is a health check between a Kafka consumer and the broker. The consumer sends heartbeats only while it considers the broker-side group coordinator to be available.

Two consumer configuration parameters are related to rebalancing:

  Parameter name --> MemberMetadata field
  session.timeout.ms --> MemberMetadata.sessionTimeoutMs
  max.poll.interval.ms --> MemberMetadata.rebalanceTimeoutMs
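
As a rough illustration, the two settings could be supplied to a consumer as shown below. The sketch uses only java.util.Properties with the literal configuration keys. The class name RebalanceConfigSketch, the helper rebalanceProps, and the concrete values (10 s and 5 min) are assumptions made for this example, not recommendations.

```java
import java.util.Properties;

public class RebalanceConfigSketch {
    // Builds consumer properties holding the two rebalance-related settings.
    // The keys are the consumer configuration names from the mapping above;
    // the helper itself is hypothetical and not part of any Kafka API.
    public static Properties rebalanceProps(long sessionTimeoutMs, long maxPollIntervalMs) {
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", Long.toString(sessionTimeoutMs));
        props.setProperty("max.poll.interval.ms", Long.toString(maxPollIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only: 10 s session timeout, 5 min poll interval.
        Properties props = rebalanceProps(10_000L, 300_000L);
        System.out.println(props.getProperty("session.timeout.ms"));
        System.out.println(props.getProperty("max.poll.interval.ms"));
    }
}
```

In a real application the resulting Properties object would be passed, together with the usual bootstrap and deserializer settings, to the consumer constructor.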

Broker side: the sessionTimeoutMs parameter

The broker handles heartbeats in the GroupCoordinator class: if a member's heartbeat expires, the broker coordinator removes that consumer from the group and triggers a rebalance.

    private def completeAndScheduleNextHeartbeatExpiration(group: GroupMetadata, member: MemberMetadata) {
      // complete current heartbeat expectation
      member.latestHeartbeat = time.milliseconds()
      val memberKey = MemberKey(member.groupId, member.memberId)
      heartbeatPurgatory.checkAndComplete(memberKey)

      // reschedule the next heartbeat expiration deadline
      // (the deadline is the latest heartbeat time plus the session timeout)
      val newHeartbeatDeadline = member.latestHeartbeat + member.sessionTimeoutMs
      val delayedHeartbeat = new DelayedHeartbeat(this, group, member, newHeartbeatDeadline, member.sessionTimeoutMs)
      heartbeatPurgatory.tryCompleteElseWatch(delayedHeartbeat, Seq(memberKey))
    }

    // called when a heartbeat deadline expires
    def onExpireHeartbeat(group: GroupMetadata, member: MemberMetadata, heartbeatDeadline: Long) {
      group.inLock {
        if (!shouldKeepMemberAlive(member, heartbeatDeadline)) {
          info(s"Member ${member.memberId} in group ${group.groupId} has failed, removing it from the group")
          removeMemberAndUpdateGroup(group, member)
        }
      }
    }

    private def shouldKeepMemberAlive(member: MemberMetadata, heartbeatDeadline: Long) =
      member.awaitingJoinCallback != null ||
        member.awaitingSyncCallback != null ||
        member.latestHeartbeat + member.sessionTimeoutMs > heartbeatDeadline
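
To make the expiration check concrete, here is a small worked example of the arithmetic in shouldKeepMemberAlive. It is only a sketch: the class HeartbeatExpirationSketch is hypothetical, and the two boolean flags stand in for the non-null checks on awaitingJoinCallback and awaitingSyncCallback in the broker code.

```java
public class HeartbeatExpirationSketch {
    // Simplified stand-in for the broker's shouldKeepMemberAlive.
    // awaitingJoin / awaitingSync replace the "callback != null" checks (an assumption
    // made to keep the example self-contained).
    public static boolean shouldKeepMemberAlive(boolean awaitingJoin,
                                                boolean awaitingSync,
                                                long latestHeartbeat,
                                                long sessionTimeoutMs,
                                                long heartbeatDeadline) {
        return awaitingJoin
            || awaitingSync
            || latestHeartbeat + sessionTimeoutMs > heartbeatDeadline;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 10_000L;                   // illustrative value
        long firstHeartbeat = 0L;
        long deadline = firstHeartbeat + sessionTimeoutMs; // deadline scheduled at t = 10000

        // No newer heartbeat arrived: 0 + 10000 > 10000 is false, so the member is removed.
        System.out.println(shouldKeepMemberAlive(false, false, firstHeartbeat, sessionTimeoutMs, deadline));

        // A heartbeat arrived at t = 3000: 3000 + 10000 > 10000 is true, so the member stays.
        System.out.println(shouldKeepMemberAlive(false, false, 3_000L, sessionTimeoutMs, deadline));
    }
}
```

The first call prints false and the second prints true: a delayed heartbeat that fires at its original deadline only removes the member if no newer heartbeat has moved latestHeartbeat forward in the meantime.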

Consumer side: the sessionTimeoutMs and rebalanceTimeoutMs parameters

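If the client finds that the session timeout has elapsed without a successful heartbeat, it marks the coordinator as unknown, and the heartbeat thread then waits (with retry backoff) until a coordinator is found again. If the interval between poll() calls exceeds rebalanceTimeoutMs, the consumer tells the broker that it is leaving the consumer group, which also triggers a rebalance.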

Code excerpt from org.apache.kafka.clients.consumer.internals.AbstractCoordinator.HeartbeatThread:

    if (coordinatorUnknown()) {
        if (findCoordinatorFuture != null || lookupCoordinator().failed())
            // the immediate future check ensures that we backoff properly in the case that no
            // brokers are available to connect to.
            AbstractCoordinator.this.wait(retryBackoffMs);
    } else if (heartbeat.sessionTimeoutExpired(now)) {
        // the session timeout has expired without seeing a successful heartbeat, so we should
        // probably make sure the coordinator is still healthy.
        markCoordinatorUnknown();
    } else if (heartbeat.pollTimeoutExpired(now)) {
        // the poll timeout has expired, which means that the foreground thread has stalled
        // in between calls to poll(), so we explicitly leave the group.
        maybeLeaveGroup();
    } else if (!heartbeat.shouldHeartbeat(now)) {
        // poll again after waiting for the retry backoff in case the heartbeat failed or the
        // coordinator disconnected
        AbstractCoordinator.this.wait(retryBackoffMs);
    } else {
        heartbeat.sentHeartbeat(now);

        sendHeartbeatRequest().addListener(new RequestFutureListener<Void>() {
            @Override
            public void onSuccess(Void value) {
                synchronized (AbstractCoordinator.this) {
                    heartbeat.receiveHeartbeat(time.milliseconds());
                }
            }

            @Override
            public void onFailure(RuntimeException e) {
                synchronized (AbstractCoordinator.this) {
                    if (e instanceof RebalanceInProgressException) {
                        // it is valid to continue heartbeating while the group is rebalancing. This
                        // ensures that the coordinator keeps the member in the group for as long
                        // as the duration of the rebalance timeout. If we stop sending heartbeats,
                        // however, then the session timeout may expire before we can rejoin.
                        heartbeat.receiveHeartbeat(time.milliseconds());
                    } else {
                        heartbeat.failHeartbeat();

                        // wake up the thread if it's sleeping to reschedule the heartbeat
                        AbstractCoordinator.this.notify();
                    }
                }
            }
        });
    }

The Heartbeat helper class used by the excerpt above:

    /**
     * A helper class for managing the heartbeat to the coordinator
     */
    public final class Heartbeat {
        private final long sessionTimeout;
        private final long heartbeatInterval;
        private final long maxPollInterval;
        private final long retryBackoffMs;

        private volatile long lastHeartbeatSend; // volatile since it is read by metrics
        private long lastHeartbeatReceive;
        private long lastSessionReset;
        private long lastPoll;
        private boolean heartbeatFailed;

        public Heartbeat(long sessionTimeout,
                         long heartbeatInterval,
                         long maxPollInterval,
                         long retryBackoffMs) {
            if (heartbeatInterval >= sessionTimeout)
                throw new IllegalArgumentException("Heartbeat must be set lower than the session timeout");

            this.sessionTimeout = sessionTimeout;
            this.heartbeatInterval = heartbeatInterval;
            this.maxPollInterval = maxPollInterval;
            this.retryBackoffMs = retryBackoffMs;
        }

        public void poll(long now) {
            this.lastPoll = now;
        }

        public void sentHeartbeat(long now) {
            this.lastHeartbeatSend = now;
            this.heartbeatFailed = false;
        }

        public void failHeartbeat() {
            this.heartbeatFailed = true;
        }

        public void receiveHeartbeat(long now) {
            this.lastHeartbeatReceive = now;
        }

        public boolean shouldHeartbeat(long now) {
            return timeToNextHeartbeat(now) == 0;
        }

        public long lastHeartbeatSend() {
            return this.lastHeartbeatSend;
        }

        public long timeToNextHeartbeat(long now) {
            long timeSinceLastHeartbeat = now - Math.max(lastHeartbeatSend, lastSessionReset);
            final long delayToNextHeartbeat;
            if (heartbeatFailed)
                delayToNextHeartbeat = retryBackoffMs;
            else
                delayToNextHeartbeat = heartbeatInterval;

            if (timeSinceLastHeartbeat > delayToNextHeartbeat)
                return 0;
            else
                return delayToNextHeartbeat - timeSinceLastHeartbeat;
        }

        public boolean sessionTimeoutExpired(long now) {
            return now - Math.max(lastSessionReset, lastHeartbeatReceive) > sessionTimeout;
        }

        public long interval() {
            return heartbeatInterval;
        }

        public void resetTimeouts(long now) {
            this.lastSessionReset = now;
            this.lastPoll = now;
            this.heartbeatFailed = false;
        }

        public boolean pollTimeoutExpired(long now) {
            return now - lastPoll > maxPollInterval;
        }

    }
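
The order in which the heartbeat thread evaluates these conditions matters, so the following sketch condenses it into a single function that returns the next action for a given set of timestamps. It is a simplified model rather than the client's actual code: the class HeartbeatLoopSketch, the Action enum, and nextAction are hypothetical names, and the model deliberately ignores lastSessionReset and the retry-backoff path taken after a failed heartbeat.

```java
public class HeartbeatLoopSketch {
    public enum Action { FIND_COORDINATOR, MARK_COORDINATOR_UNKNOWN, LEAVE_GROUP, WAIT, SEND_HEARTBEAT }

    // Evaluates the same checks, in the same order, as the heartbeat thread excerpt.
    // Simplifying assumptions: no session reset and no previously failed heartbeat.
    public static Action nextAction(boolean coordinatorUnknown,
                                    long now,
                                    long lastHeartbeatSend,
                                    long lastHeartbeatReceive,
                                    long lastPoll,
                                    long sessionTimeoutMs,
                                    long heartbeatIntervalMs,
                                    long maxPollIntervalMs) {
        if (coordinatorUnknown)
            return Action.FIND_COORDINATOR;            // look up the coordinator, back off on failure
        if (now - lastHeartbeatReceive > sessionTimeoutMs)
            return Action.MARK_COORDINATOR_UNKNOWN;    // no successful heartbeat within the session timeout
        if (now - lastPoll > maxPollIntervalMs)
            return Action.LEAVE_GROUP;                 // foreground thread stalled between poll() calls
        if (now - lastHeartbeatSend <= heartbeatIntervalMs)
            return Action.WAIT;                        // not yet time for the next heartbeat
        return Action.SEND_HEARTBEAT;
    }

    public static void main(String[] args) {
        // Illustrative settings: 10 s session timeout, 3 s heartbeat interval, 5 min poll interval.
        long session = 10_000L, interval = 3_000L, maxPoll = 300_000L;
        System.out.println(nextAction(false, 2_000L, 0L, 0L, 1_500L, session, interval, maxPoll));
        System.out.println(nextAction(false, 4_000L, 0L, 0L, 3_500L, session, interval, maxPoll));
        System.out.println(nextAction(false, 12_000L, 9_000L, 0L, 11_000L, session, interval, maxPoll));
        System.out.println(nextAction(false, 400_000L, 399_000L, 399_000L, 0L, session, interval, maxPoll));
    }
}
```

With these inputs the four calls yield WAIT, SEND_HEARTBEAT, MARK_COORDINATOR_UNKNOWN, and LEAVE_GROUP in turn. Because the session-timeout check comes before the poll-timeout check, a consumer that has both lost the coordinator and stalled in processing first re-discovers the coordinator before it can leave the group.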

The join-group handling logic is in kafka.coordinator.group.GroupCoordinator#onCompleteJoin.
