Rebalance from the Consumer's Perspective

The Kafka Java client relies heavily on RequestFuture when sending requests, so that class is explained first.

RequestFuture has a member field, listeners, which is a collection of RequestFutureListener objects. Calling complete triggers the onSuccess method of each registered listener.

public void complete(T value) {

    try {

        if (value instanceof RuntimeException)

            throw new IllegalArgumentException("The argument to complete can not be an instance of RuntimeException");

        if (!result.compareAndSet(INCOMPLETE_SENTINEL, value))

            throw new IllegalStateException("Invalid attempt to complete a request future which is already complete");

        fireSuccess();

    } finally {

        completedLatch.countDown();

    }

}

private void fireSuccess() {

    T value = value();

    while (true) {

        RequestFutureListener<T> listener = listeners.poll();

        if (listener == null)

            break;

        listener.onSuccess(value);

    }

}
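The behaviour of complete and fireSuccess can be tried out in isolation. The following is a minimal sketch, not the Kafka class: MiniFuture and MiniFutureDemo are hypothetical names, only the success path is modelled, and firing a listener immediately when it is added after completion is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class MiniFutureDemo {
    /** Hypothetical, simplified stand-in for RequestFuture (success path only). */
    static final class MiniFuture<T> {
        private static final Object INCOMPLETE = new Object();
        private final AtomicReference<Object> result = new AtomicReference<>(INCOMPLETE);
        private final ConcurrentLinkedQueue<Consumer<T>> listeners = new ConcurrentLinkedQueue<>();

        void addListener(Consumer<T> listener) {
            listeners.add(listener);
            // Assumption of this sketch: a listener added after completion fires right away
            if (result.get() != INCOMPLETE)
                fireSuccess();
        }

        void complete(T value) {
            // Only the first call may move the future out of the INCOMPLETE state
            if (!result.compareAndSet(INCOMPLETE, value))
                throw new IllegalStateException("already complete");
            fireSuccess();
        }

        @SuppressWarnings("unchecked")
        private void fireSuccess() {
            T value = (T) result.get();
            Consumer<T> listener;
            // Drain the queue so that every listener is invoked exactly once
            while ((listener = listeners.poll()) != null)
                listener.accept(value);
        }
    }

    static List<String> run() {
        List<String> seen = new ArrayList<>();
        MiniFuture<String> future = new MiniFuture<>();
        future.addListener(v -> seen.add("first:" + v));
        future.addListener(v -> seen.add("second:" + v));
        future.complete("ok");
        future.addListener(v -> seen.add("late:" + v));
        try {
            future.complete("again");   // a second completion is rejected
        } catch (IllegalStateException e) {
            seen.add("rejected");
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Running main prints [first:ok, second:ok, late:ok, rejected]: listeners fire in registration order, and the compareAndSet guard is what makes a second complete call fail.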

The compose and chain methods deserve attention. Both add a listener to the current RequestFuture, and that listener's callbacks in turn call methods on another RequestFuture.

public <S> RequestFuture<S> compose(final RequestFutureAdapter<T, S> adapter) {

    // create a new RequestFuture

    final RequestFuture<S> adapted = new RequestFuture<>();

    // add a listener to the original RequestFuture

    addListener(new RequestFutureListener<T>() {

        @Override

        public void onSuccess(T value) {

            adapter.onSuccess(value, adapted);

        }

        @Override

        public void onFailure(RuntimeException e) {

            adapter.onFailure(e, adapted);

        }

    });

    // return the new RequestFuture

    return adapted;

}

public void chain(final RequestFuture<T> future) {

    // add a listener to the current RequestFuture

    addListener(new RequestFutureListener<T>() {

        @Override

        public void onSuccess(T value) {

            future.complete(value);

        }

        @Override

        public void onFailure(RuntimeException e) {

            future.raise(e);

        }

    });

}
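The difference between the two methods is easier to see in a toy model: compose creates and returns a new future that an adapter completes, while chain forwards the result into a future that already exists. This is a hedged sketch under simplifying assumptions: MiniFuture and ComposeChainDemo are hypothetical names, the adapter is modelled as a BiConsumer instead of RequestFutureAdapter, the code is single-threaded, and the failure path is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

class ComposeChainDemo {
    /** Hypothetical single-threaded stand-in for RequestFuture; failure path omitted. */
    static final class MiniFuture<T> {
        private final List<Consumer<T>> listeners = new ArrayList<>();
        private T value;
        private boolean done;

        void addListener(Consumer<T> listener) {
            if (done) listener.accept(value);
            else listeners.add(listener);
        }

        void complete(T v) {
            if (done) throw new IllegalStateException("already complete");
            value = v;
            done = true;
            for (Consumer<T> l : listeners) l.accept(v);
            listeners.clear();
        }

        // Like compose: returns a NEW future that the adapter is responsible for completing
        <S> MiniFuture<S> compose(BiConsumer<T, MiniFuture<S>> adapter) {
            MiniFuture<S> adapted = new MiniFuture<>();
            addListener(v -> adapter.accept(v, adapted));
            return adapted;
        }

        // Like chain: forwards this future's result into an EXISTING future
        void chain(MiniFuture<T> target) {
            addListener(target::complete);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        MiniFuture<String> raw = new MiniFuture<>();   // plays the role of the raw response future
        // compose: adapt the String result into an Integer result on a new future
        MiniFuture<Integer> parsed = raw.compose((text, out) -> out.complete(text.length()));
        // chain: whatever completes 'parsed' also completes 'outer'
        MiniFuture<Integer> outer = new MiniFuture<>();
        parsed.chain(outer);
        outer.addListener(n -> log.add("outer got " + n));
        log.add("before complete");
        raw.complete("hello");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Completing raw ripples through all three futures in one call stack, which is why main prints [before complete, outer got 5].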

The entry point for a rebalance is ConsumerCoordinator#poll.

The client first decides whether it needs to rejoin the group, that is, whether to rebalance.

//ConsumerCoordinator#needRejoin

public boolean needRejoin() {

    if (!subscriptions.partitionsAutoAssigned())

        return false;

    // the number of partitions of a subscribed topic has changed

    // we need to rejoin if we performed the assignment and metadata has changed

    if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot))

        return true;

    // the set of subscribed topics has changed

    // we need to join if our subscription has changed since the last join

    if (joinedSubscription != null && !joinedSubscription.equals(subscriptions.subscription()))

        return true;

    // a consumer has joined or left the group; the heartbeat thread sets rejoinNeeded = true

    return super.needRejoin();

}
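The three checks in needRejoin can be exercised with plain collections. This is only an illustrative sketch: NeedRejoinDemo is a hypothetical name, the snapshots are modelled as a topic-to-partition-count map (an assumption about their shape), and the partitionsAutoAssigned check is omitted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NeedRejoinDemo {
    static boolean needRejoin(Map<String, Integer> assignmentSnapshot,
                              Map<String, Integer> metadataSnapshot,
                              Set<String> joinedSubscription,
                              Set<String> currentSubscription,
                              boolean rejoinNeeded) {
        // 1. metadata changed since this member performed the assignment (null if it never did)
        if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot))
            return true;
        // 2. the subscription changed since the last join
        if (joinedSubscription != null && !joinedSubscription.equals(currentSubscription))
            return true;
        // 3. flag raised elsewhere, for example after group membership changed
        return rejoinNeeded;
    }

    public static void main(String[] args) {
        Map<String, Integer> atAssignment = Collections.singletonMap("orders", 3);
        Set<String> joined = new HashSet<>(Arrays.asList("orders"));

        // nothing changed
        System.out.println(needRejoin(atAssignment, Collections.singletonMap("orders", 3),
                joined, joined, false));
        // "orders" grew from 3 to 4 partitions
        System.out.println(needRejoin(atAssignment, Collections.singletonMap("orders", 4),
                joined, joined, false));
        // the application subscribed to an additional topic
        System.out.println(needRejoin(null, Collections.singletonMap("orders", 3),
                joined, new HashSet<>(Arrays.asList("orders", "payments")), false));
    }
}
```

The first call returns false and the other two return true; in the second case the partition-count comparison fires, in the third the subscription comparison does.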

The consumer then starts the rebalance.

// AbstractCoordinator#joinGroupIfNeeded

void joinGroupIfNeeded() {

    while (needRejoin() || rejoinIncomplete()) {

        ensureCoordinatorReady();

        if (needsJoinPrepare) {

            // invoke the user-supplied ConsumerRebalanceListener

            onJoinPrepare(generation.generationId, generation.memberId);

            needsJoinPrepare = false;

        }

        // send the JoinGroup request

        RequestFuture<ByteBuffer> future = initiateJoinGroup();

        client.poll(future);

        if (future.succeeded()) {

            onJoinComplete(generation.generationId, generation.memberId, generation.protocol, future.value());

            resetJoinGroupFuture();

            needsJoinPrepare = true;

        } else {

            resetJoinGroupFuture();

            RuntimeException exception = future.exception();

            if (exception instanceof UnknownMemberIdException ||

                    exception instanceof RebalanceInProgressException ||

                    exception instanceof IllegalGenerationException)

                continue;

            else if (!future.isRetriable())

                throw exception;

            time.sleep(retryBackoffMs);

        }

    }

}
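The failure branch of the loop above sorts errors into three groups: retry at once, back off and retry, or rethrow. A hedged sketch of just that control flow follows. JoinRetryDemo and its three exception classes are hypothetical stand-ins, the join attempts are scripted through a queue instead of going over the network, and the sleep is replaced by a trace entry.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class JoinRetryDemo {
    // Hypothetical exception types; they only mirror the three branches of the loop
    static class RejoinImmediately extends RuntimeException {}
    static class Retriable extends RuntimeException {}
    static class Fatal extends RuntimeException {}

    // Each queue entry is the failure of one join attempt; an empty queue means success
    static void joinLoop(Deque<RuntimeException> scriptedFailures, List<String> trace) {
        while (true) {
            RuntimeException failure = scriptedFailures.poll();
            if (failure == null) {
                trace.add("joined");
                return;
            }
            if (failure instanceof RejoinImmediately) {
                trace.add("retry immediately");     // corresponds to 'continue' without sleeping
                continue;
            }
            if (!(failure instanceof Retriable)) {
                trace.add("rethrow");
                throw failure;                      // non-retriable errors go to the caller
            }
            trace.add("back off, then retry");      // stands in for time.sleep(retryBackoffMs)
        }
    }

    static List<String> run(RuntimeException... failures) {
        List<String> trace = new ArrayList<>();
        try {
            joinLoop(new ArrayDeque<>(Arrays.asList(failures)), trace);
        } catch (Fatal e) {
            trace.add("caller sees fatal error");
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(new RejoinImmediately(), new Retriable()));
        System.out.println(run(new Fatal()));
    }
}
```

The first scenario ends with "joined" after one immediate retry and one backoff; the second never retries and surfaces the error to the caller.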

AbstractCoordinator#initiateJoinGroup

private synchronized RequestFuture<ByteBuffer> initiateJoinGroup() {

    if (joinFuture == null) {

        disableHeartbeatThread();

        state = MemberState.REBALANCING;

        joinFuture = sendJoinGroupRequest();

        joinFuture.addListener(new RequestFutureListener<ByteBuffer>() {

            @Override

            public void onSuccess(ByteBuffer value) {

                // handle join completion in the callback so that the callback will be invoked

                // even if the consumer is woken up before finishing the rebalance

                synchronized (AbstractCoordinator.this) {

                    log.info("Successfully joined group with generation {}", generation.generationId);

                    state = MemberState.STABLE;

                    rejoinNeeded = false;

                    if (heartbeatThread != null)

                        heartbeatThread.enable();

                }

            }

            @Override

            public void onFailure(RuntimeException e) {

                // we handle failures below after the request finishes. if the join completes

                // after having been woken up, the exception is ignored and we will rejoin

                synchronized (AbstractCoordinator.this) {

                    state = MemberState.UNJOINED;

                }

            }

        });

    }

    return joinFuture;

}

AbstractCoordinator#sendJoinGroupRequest

private RequestFuture<ByteBuffer> sendJoinGroupRequest() {

    if (coordinatorUnknown())

        return RequestFuture.coordinatorNotAvailable();

    // send a join group request to the coordinator

    log.info("(Re-)joining group");

    JoinGroupRequest.Builder requestBuilder = new JoinGroupRequest.Builder(

            groupId,

            this.sessionTimeoutMs,

            this.generation.memberId,

            protocolType(),

            metadata()).setRebalanceTimeout(this.rebalanceTimeoutMs);

    log.debug("Sending JoinGroup ({}) to coordinator {}", requestBuilder, this.coordinator);

    return client.send(coordinator, requestBuilder)

            .compose(new JoinGroupResponseHandler());

}

Pay close attention to client.send(coordinator, requestBuilder).compose(new JoinGroupResponseHandler()): it adds a listener to the original RequestFuture (the one returned by send) and returns a new RequestFuture.

ConsumerNetworkClient#send

public RequestFuture<ClientResponse> send(Node node, AbstractRequest.Builder<?> requestBuilder) {

    long now = time.milliseconds();

    // use a RequestFutureCompletionHandler as the callback

    RequestFutureCompletionHandler completionHandler = new RequestFutureCompletionHandler();

    ClientRequest clientRequest = client.newClientRequest(node.idString(), requestBuilder, now, true,

            completionHandler);

    unsent.put(node, clientRequest);

    // wakeup the client in case it is blocking in poll so that we can send the queued request

    client.wakeup();

    return completionHandler.future;

}

JoinGroupResponseHandler#handle

public void handle(JoinGroupResponse joinResponse, RequestFuture<ByteBuffer> future) {

    Errors error = joinResponse.error();

    if (error == Errors.NONE) {

        log.debug("Received successful JoinGroup response: {}", joinResponse);

        sensors.joinLatency.record(response.requestLatencyMs());

        synchronized (AbstractCoordinator.this) {

            if (state != MemberState.REBALANCING) {

                // if the consumer was woken up before a rebalance completes, we may have already left

                // the group. In this case, we do not want to continue with the sync group.

                future.raise(new UnjoinedGroupException());

            } else {

                AbstractCoordinator.this.generation = new Generation(joinResponse.generationId(),

                        joinResponse.memberId(), joinResponse.groupProtocol());

                if (joinResponse.isLeader()) {

                    onJoinLeader(joinResponse).chain(future);

                } else {

                    onJoinFollower().chain(future);

                }

            }

        }

    } else if (error == Errors.COORDINATOR_LOAD_IN_PROGRESS) {

        log.debug("Attempt to join group rejected since coordinator {} is loading the group.", coordinator());

        // backoff and retry

        future.raise(error);

    } else if (error == Errors.UNKNOWN_MEMBER_ID) {

        // reset the member id and retry immediately

        resetGeneration();

        log.debug("Attempt to join group failed due to unknown member id.");

        future.raise(Errors.UNKNOWN_MEMBER_ID);

    } else if (error == Errors.COORDINATOR_NOT_AVAILABLE

            || error == Errors.NOT_COORDINATOR) {

        // re-discover the coordinator and retry with backoff

        markCoordinatorUnknown();

        log.debug("Attempt to join group failed due to obsolete coordinator information: {}", error.message());

        future.raise(error);

    } else if (error == Errors.INCONSISTENT_GROUP_PROTOCOL

            || error == Errors.INVALID_SESSION_TIMEOUT

            || error == Errors.INVALID_GROUP_ID) {

        // log the error and re-throw the exception

        log.error("Attempt to join group failed due to fatal error: {}", error.message());

        future.raise(error);

    } else if (error == Errors.GROUP_AUTHORIZATION_FAILED) {

        future.raise(new GroupAuthorizationException(groupId));

    } else {

        // unexpected error, throw the exception

        future.raise(new KafkaException("Unexpected error in join group response: " + error.message()));

    }

}

Once the response arrives, the execution flow is RequestFutureCompletionHandler -> JoinGroupResponseHandler#handle.

private RequestFuture<ByteBuffer> onJoinLeader(JoinGroupResponse joinResponse) {

    try {

        // perform the leader synchronization and send back the assignment for the group

        Map<String, ByteBuffer> groupAssignment = performAssignment(joinResponse.leaderId(), joinResponse.groupProtocol(),

                joinResponse.members());

        SyncGroupRequest.Builder requestBuilder =

                new SyncGroupRequest.Builder(groupId, generation.generationId, generation.memberId, groupAssignment);

        log.debug("Sending leader SyncGroup to coordinator {}: {}", this.coordinator, requestBuilder);

        return sendSyncGroupRequest(requestBuilder);

    } catch (RuntimeException e) {

        return RequestFuture.failure(e);

    }

}

private RequestFuture<ByteBuffer> onJoinFollower() {

    // send follower's sync group with an empty assignment

    SyncGroupRequest.Builder requestBuilder =

            new SyncGroupRequest.Builder(groupId, generation.generationId, generation.memberId,

                    Collections.<String, ByteBuffer>emptyMap());

    log.debug("Sending follower SyncGroup to coordinator {}: {}", this.coordinator, requestBuilder);

    return sendSyncGroupRequest(requestBuilder);

}

private RequestFuture<ByteBuffer> sendSyncGroupRequest(SyncGroupRequest.Builder requestBuilder) {

    if (coordinatorUnknown())

        return RequestFuture.coordinatorNotAvailable();

    return client.send(coordinator, requestBuilder)

            .compose(new SyncGroupResponseHandler());

}

In this way, RequestFuture links JoinGroupResponseHandler and SyncGroupResponseHandler together.
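How the two handlers end up linked can be traced with a toy model of the whole flow. Everything here is a simplification under stated assumptions: JoinSyncFlowDemo, MiniFuture, send, and the string payloads are hypothetical, the "network" is a queue of pending deliveries, the adapter is a BiConsumer, and only the success path is shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

class JoinSyncFlowDemo {
    /** Hypothetical single-threaded stand-in for RequestFuture; failure path omitted. */
    static final class MiniFuture<T> {
        private final List<Consumer<T>> listeners = new ArrayList<>();
        private T value;
        private boolean done;

        void addListener(Consumer<T> l) {
            if (done) l.accept(value);
            else listeners.add(l);
        }

        void complete(T v) {
            if (done) throw new IllegalStateException("already complete");
            value = v;
            done = true;
            for (Consumer<T> l : listeners) l.accept(v);
            listeners.clear();
        }

        <S> MiniFuture<S> compose(BiConsumer<T, MiniFuture<S>> adapter) {
            MiniFuture<S> adapted = new MiniFuture<>();
            addListener(v -> adapter.accept(v, adapted));
            return adapted;
        }

        void chain(MiniFuture<T> target) {
            addListener(target::complete);
        }
    }

    static final List<String> trace = new ArrayList<>();
    // Fake network: every send queues a task that delivers the response later
    static final Deque<Runnable> inFlight = new ArrayDeque<>();

    static MiniFuture<String> send(String request) {
        MiniFuture<String> response = new MiniFuture<>();
        trace.add("send " + request);
        inFlight.add(() -> response.complete(request + "Response"));
        return response;
    }

    static MiniFuture<String> sendSyncGroup() {
        return send("SyncGroup").compose((resp, future) -> {
            trace.add("sync handler");
            future.complete("assignment");
        });
    }

    static MiniFuture<String> sendJoinGroup() {
        return send("JoinGroup").compose((resp, future) -> {
            trace.add("join handler");
            // The join future stays open until the sync future completes it through chain
            sendSyncGroup().chain(future);
        });
    }

    static List<String> run() {
        trace.clear();
        inFlight.clear();
        MiniFuture<String> joinFuture = sendJoinGroup();
        joinFuture.addListener(v -> trace.add("joined with " + v));
        while (!inFlight.isEmpty())
            inFlight.poll().run();      // plays the role of client.poll(future)
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The trace comes out as send JoinGroup, join handler, send SyncGroup, sync handler, joined with assignment: the listener on the join future only fires after the second request has completed, even though the caller holds a single future.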

private class SyncGroupResponseHandler extends CoordinatorResponseHandler<SyncGroupResponse, ByteBuffer> {

    @Override

    public void handle(SyncGroupResponse syncResponse,

                       RequestFuture<ByteBuffer> future) {

        Errors error = syncResponse.error();

        if (error == Errors.NONE) {

            sensors.syncLatency.record(response.requestLatencyMs());

            future.complete(syncResponse.memberAssignment());

        } else {

            requestRejoin();

            if (error == Errors.GROUP_AUTHORIZATION_FAILED) {

                future.raise(new GroupAuthorizationException(groupId));

            } else if (error == Errors.REBALANCE_IN_PROGRESS) {

                log.debug("SyncGroup failed because the group began another rebalance");

                future.raise(error);

            } else if (error == Errors.UNKNOWN_MEMBER_ID

                    || error == Errors.ILLEGAL_GENERATION) {

                log.debug("SyncGroup failed: {}", error.message());

                resetGeneration();

                future.raise(error);

            } else if (error == Errors.COORDINATOR_NOT_AVAILABLE

                    || error == Errors.NOT_COORDINATOR) {

                log.debug("SyncGroup failed: {}", error.message());

                markCoordinatorUnknown();

                future.raise(error);

            } else {

                future.raise(new KafkaException("Unexpected error from SyncGroup: " + error.message()));

            }

        }

    }

}

The last listener to run in the rebalance process is the one registered on joinFuture in initiateJoinGroup:

joinFuture.addListener(new RequestFutureListener<ByteBuffer>() {

    @Override

    public void onSuccess(ByteBuffer value) {

        // handle join completion in the callback so that the callback will be invoked

        // even if the consumer is woken up before finishing the rebalance

        synchronized (AbstractCoordinator.this) {

            log.info("Successfully joined group with generation {}", generation.generationId);

            state = MemberState.STABLE;

            rejoinNeeded = false;

            if (heartbeatThread != null)

                heartbeatThread.enable();

        }

    }

    @Override

    public void onFailure(RuntimeException e) {

        // we handle failures below after the request finishes. if the join completes

        // after having been woken up, the exception is ignored and we will rejoin

        synchronized (AbstractCoordinator.this) {

            state = MemberState.UNJOINED;

        }

    }

});
