Why do Kafka consumers connect to zookeeper, and producers get metadata from brokers?


|
Why is it that consumers connect to zookeeper to retrieve the partition locations? And kafka producers have to connect to one of the brokers to retrieve metadata. My point is, what exactly is the use of zookeeper when every broker already has all the necessary metadata to tell producers the location to send their messages? Couldn't the brokers send this same information to the consumers? I can understand why brokers have the metadata, to not have to make a connection to zookeeper each time a new message is sent to them. Is there a function that zookeeper has that I'm missing? I'm finding it hard to think of a reason why zookeeper is really needed within a kafka cluster. |
|||
|
First of all, zookeeper is needed only for high level consumer. The main reason zookeeper is needed for a high level consumer is to track consumed offsets and handle load balancing. Now in more detail. Regarding offset tracking, imagine following scenario: you start a consumer, consume 100 messages and shut the consumer down. Next time you start your consumer you'll probably want to resume from your last consumed offset (which is 100), and that means you have to store the maximum consumed offset somewhere. Here's where zookeeper kicks in: it stores offsets for every group/topic/partition. So this way next time you start your consumer it may ask "hey zookeeper, what's the offset I should start consuming from?". Kafka is actually moving towards being able to store offsets not only in zookeeper, but in other storages as well (for now only Regarding load balancing, the amount of messages produced can be quite large to be handled by 1 machine and you'll probably want to add computing power at some point. Lets say you have a topic with 100 partitions and to handle this amount of messages you have 10 machines. There are several questions that arise here actually:
And again, here's where zookeeper kicks in: it tracks all consumers in group and each high level consumer is subscribed for changes in this group. The point is that when a consumer appears or disappears, zookeeper notifies all consumers and triggers rebalance so that they split partitions near-equally (e.g. to balance load). This way it guarantees if one of consumer dies others will continue processing partitions that were owned by this consumer. |
|||||
|


Why do Kafka consumers connect to zookeeper, and producers get metadata from brokers?的更多相关文章
- kafka集群和zookeeper集群的部署,kafka的java代码示例
来自:http://doc.okbase.net/QING____/archive/19447.html 也可参考: http://blog.csdn.net/21aspnet/article/det ...
- 使用不同的namespace让不同的kafka/Storm连接同一个zookeeper
背景介绍: 需要部署2个kafka独立环境,但是只有一个zookeeper集群. 需要部署2个独立的storm环境,但是只有一个zookeeper集群. ----------------------- ...
- org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeo ...
- Unable to connect to zookeeper server within timeout: 5000
错误 严重: StandardWrapper.Throwable org.springframework.beans.factory.BeanCreationException: Error crea ...
- CentOS7 搭建Kafka(一)zookeeper篇
CentOS7 搭建Kafka(一)zookeeper篇 近几年当红小生Kafka备受各路英雄好汉追捧,一点不比老前辈RabbitMQ和ActiveMQ差,因为流行,所以你就得学啊:我这么懒,肯定是不 ...
- Caused by: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 5000
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'brandControl ...
- apache kafka系列之在zookeeper中存储结构
1.topic注册信息 /brokers/topics/[topic] : 存储某个topic的partitions所有分配信息 Schema: { "version": ...
- Kafka学习之(五)搭建kafka集群之Zookeeper集群搭建
Zookeeper是一种在分布式系统中被广泛用来作为:分布式状态管理.分布式协调管理.分布式配置管理.和分布式锁服务的集群.kafka增加和减少服务器都会在Zookeeper节点上触发相应的事件kaf ...
- kafka集群与zookeeper集群 配置过程
Kafka的集群配置一般有三种方法,即 (1)Single node – single broker集群: (2)Single node – multiple broker集群: (3)Mult ...
随机推荐
- nginx支持跨域访问
1,进入nginx的html目录 vim ./crossdomain.xml 具体路径: /usr/local/nginx/html/crossdomain.xml 2,在crossdomain.xm ...
- [nodejs] nodejs开发个人博客(七)后台登陆
定义后台路径 访问这个路径进入后台页面 http://localhost:8888/admin/login 在后台路由控制器里面(/admin/index.js)调用登陆控制器(/admin/logi ...
- No application encryption key has been specified.
环境:php7.1.10laravel5.5出现: 解决:在根目录下执行: php artisan key:generate OK问题解决
- mediainfo使用
1.linux安装mediainfo yum install mediainfo(epel源安装) 2.输出文件信息到xml文件 mediainfo --OUTPUT=XML -f ftp ...
- Web Worker 初探
什么是Web Worker? Web Worker 是Html5 提出的能够在后台运行javascript的对象,独立于其他脚本,不会影响页面的性能,也不会影响你继续对于页面进行操作.通俗点讲,就是后 ...
- JavaScript主流框架3月趋势总结
原文: What’s New in JavaScript Frameworks-March 2018 译者: Fundebug 为了保证可读性,本文采用意译而非直译.另外,本文版权归原作者所有,翻译仅 ...
- 【 js 工具 】如何在Github Pages搭建自己写的页面?
最近发现 github 改版了,已没有像原来的 Launch automatic page generator 这样的按钮等,所以我对我的文章也进行了修正,对于新版来说,步骤更加简单了.欢迎享用. - ...
- CSS3 Transform、Transition和Animation属性总结
CSS3的三个与变形和动画啊相关的属性: Transform 浏览器支持情况: Internet Explorer 10.Firefox.Opera 支持 transform 属性. Internet ...
- iOS----------开发中常用的宏有那些
OC对象判断是否为空? 字符串是否为空 #define kStringIsEmpty(str) ([str isKindOfClass:[NSNull class]] || str == nil || ...
- 深入理解Java虚拟机05--虚拟机类加载机制
一.前言 我们一定心里有个疑问,我们那个多态是怎么回事?我们指定的一个接口,却可以等到运行时可以对应于不同的实现类.这是因为,Java有个特性就是依赖运行期动态加载和动态连接,这样实现了Java可以动 ...
