Note: one of the major changes in the Kafka 0.9 release is the redesign of the consumer and producer APIs.

This Kafka document lays out the functionality that the consumer API redesign set out to deliver. The 0.9 release did implement it; the specifics are covered in several other documents, which will be translated later.

Motivation

We've received quite a lot of feedback on consumer-side features over the past few months. Some of it consists of improvements to the current consumer design, and some of it is simply new feature/API requests. I have attempted to write up the requirements that I've heard on this wiki - Kafka 0.9 Consumer Rewrite Design.
This would involve some significant changes to the consumer APIs, so we would like to collect feedback on the proposal from our community. Since the list of changes is not small, we would like to understand whether some features are preferred over others, and more importantly, whether some features are not required at all.

Thin consumer client:

  1. We have a lot of users who have expressed interest in using and writing non-Java clients. Currently, this is pretty straightforward for the SimpleConsumer but not for the high-level consumer. The high-level consumer does some complex failure detection and rebalancing, which is non-trivial to re-implement correctly.
  2. The goal is to have a very thin consumer client, with minimal dependencies, to make this easy for users.

Central co-ordination:

(Translator's note: it is called "central" co-ordination to contrast it with the current "distributed" co-ordination done through ZooKeeper.)

  1. The current version of the high-level consumer suffers from herd and split-brain problems, where multiple consumers in a group run a distributed algorithm to agree on the same partition ownership decision. Due to different views of the ZooKeeper data, they run into conflicts that make the rebalancing attempt fail. And there is no way for a consumer to verify whether a rebalancing operation completed successfully on the entire group. This also leads to some potential bugs in the rebalancing logic, for example https://issues.apache.org/jira/browse/KAFKA-242
  2. This can be mitigated by moving the failure detection and rebalancing logic to a centralized, highly available co-ordinator - Kafka 0.9 Consumer Rewrite Design

We think the first two requirements are prerequisites for the rest. So we have proceeded by trying to design a centralized coordinator for consumer rebalancing without ZooKeeper; for details, please read here.

Allow manual partition assignment

  1. There are a number of stateful data systems that would like to manually assign partitions to consumers. The main motive is to enable them to keep some local per-partition state, since the mapping from consumer to partition never changes; there are also some use cases where it makes sense to co-locate brokers and consumer processes, so it would be nice for the automatic partition assignment algorithm to take co-location into account. Examples of such systems are databases, search indexers, etc. (a sketch of manual assignment follows this list)
  2. A side effect of this requirement is wanting to turn off automatic rebalancing in the high level consumer.
  3. This feature depends on the central co-ordination feature, since it cannot be correctly and easily implemented with the current distributed co-ordination model.
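
For concreteness, here is a minimal sketch of what manual partition assignment looks like in the KafkaConsumer API that eventually shipped in 0.9 (the "events" topic and the broker address are placeholders); with assign(), the consumer owns the listed partitions directly and no automatic rebalancing occurs:

```java
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ManualAssignmentSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // Pin this process to fixed partitions instead of subscribing to the
        // topic; no group rebalancing happens for manually assigned partitions.
        consumer.assign(Arrays.asList(
                new TopicPartition("events", 0),   // hypothetical topic
                new TopicPartition("events", 1)));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                // Local per-partition state can safely be keyed by
                // record.partition(), because the mapping never changes.
                System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
            }
        }
    }
}
```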

Allow manual offset management

  1. Some systems require offset management in a custom database, at intervals of their choosing. Overall, the requirement is to have access to message metadata such as the topic, partition, and offset of each message, and to be able to provide per-partition offsets on consumer startup.
  2. This would require designing new consumer APIs that allow providing offsets on startup and that return message metadata with the consumer iterator (see the sketch after this list).
  3. One thing that needs to be thought through is whether the consumer client can be allowed to pick manual offset management for some, but not all, topics. One option is to allow the consumer to pick only one style of offset management, which could make the API a bit simpler.
  4. This feature depends on the central co-ordination feature, since it cannot be correctly and easily implemented with the current distributed co-ordination model.
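
As a sketch of how this turned out in the 0.9 API (the in-memory offsetStore map below is a stand-in for the custom database mentioned above): the application disables automatic commits, seeks to its externally stored offsets on startup, and checkpoints the topic/partition/offset metadata carried by each record at whatever interval it chooses:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ManualOffsetSketch {
    // Stand-in for a custom offset database.
    private static final Map<TopicPartition, Long> offsetStore = new HashMap<>();

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("enable.auto.commit", "false"); // offsets managed by the application
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        TopicPartition tp = new TopicPartition("events", 0); // hypothetical topic
        consumer.assign(Arrays.asList(tp));

        // Provide the per-partition offset on startup.
        consumer.seek(tp, offsetStore.getOrDefault(tp, 0L));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                // Each record exposes its topic, partition, and offset, so the
                // application can checkpoint to its own store (here: per record).
                offsetStore.put(
                        new TopicPartition(record.topic(), record.partition()),
                        record.offset() + 1); // next offset to resume from
            }
        }
    }
}
```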

Invocation of user specified callback on rebalance

  1. Some applications maintain transient per-partition state in memory. On a rebalance operation, they would need to "flush" the transient state to some persistent storage.
  2. The requirement is to let the user plug in some sort of callback that the high-level consumer invokes when a rebalance operation is triggered (see the sketch after this list).
  3. This requirement has some overlap with the manual partition assignment requirement. If we allow manual partition assignment, such applications might be able to leverage that to flush transient state. The issue, however, is that these applications do want automatic rebalancing and might not want to use the manual partition assignment feature.
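
A minimal sketch using the ConsumerRebalanceListener callback that the 0.9 consumer eventually exposed; flushState() is a hypothetical helper standing in for persisting the transient per-partition state:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RebalanceCallbackSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "stateful-app");            // hypothetical group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // Automatic rebalancing stays on; the listener is invoked around each
        // rebalance so the application can save and restore its state.
        consumer.subscribe(Arrays.asList("events"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // Called before ownership is taken away: flush transient
                // per-partition state to persistent storage.
                for (TopicPartition tp : partitions) {
                    flushState(tp); // hypothetical helper
                }
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Called once new ownership is established: reload or rebuild
                // per-partition state here.
            }
        });
    }

    private static void flushState(TopicPartition tp) { /* persist state for tp */ }
}
```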

Non blocking consumer APIs

    1. This requirement comes from stream processing applications that implement high-level stream processing primitives like filter-by, group-by, and join operations on Kafka streams.
    2. To facilitate stream join operations, it is desirable that Kafka provide non-blocking consumer APIs. Today, since the consumer streams are essentially blocking, these sorts of stream join operations are not possible (a rough sketch of a poll-based alternative follows this list).
    3. This requirement seems to involve some significant redesign of the consumer APIs and the consumer stream logic. So it will be good to give this some more thought.
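
The design that eventually shipped addresses this with a poll() call that takes a timeout instead of blocking indefinitely. A rough sketch, assuming two already-configured consumers subscribed to the two sides of the join (the buffering helpers are hypothetical):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NonBlockingJoinSketch {
    static void joinLoop(KafkaConsumer<String, String> left,
                         KafkaConsumer<String, String> right) {
        while (true) {
            // poll(timeout) returns within the timeout even when no data has
            // arrived, so neither stream can starve the other the way a
            // blocking consumer iterator would.
            ConsumerRecords<String, String> l = left.poll(100);
            ConsumerRecords<String, String> r = right.poll(100);
            for (ConsumerRecord<String, String> rec : l) bufferLeft(rec);
            for (ConsumerRecord<String, String> rec : r) bufferRight(rec);
            emitMatches();
        }
    }

    // Hypothetical join-buffer logic: match buffered records by key.
    private static void bufferLeft(ConsumerRecord<String, String> rec)  { /* ... */ }
    private static void bufferRight(ConsumerRecord<String, String> rec) { /* ... */ }
    private static void emitMatches() { /* ... */ }
}
```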

Summary:

Overall, the new consumer API:

1. Is more robust in its distributed co-ordination

2. Gives users more metadata about messages

3. Lets users manage offsets and partition assignment themselves

4. Provides non-blocking APIs

5. Achieves this by replacing the previous distributed co-ordination with a central co-ordination system.

In this way, the new consumer API can fully replace the previous high-level consumer while also providing some capabilities that previously only the simple API offered (yet hiding some of the complexity of using the simple API directly), so it will be a more general-purpose API.
