不多说,直接上干货!

  一切来源于官网

http://kafka.apache.org/documentation/

Topics and Logs

话题和日志 (Topic和Log)

Let's first dive into the core abstraction Kafka provides for a stream of records—the topic.

让我们更深入的了解Kafka中的Topic。

  A topic is a category or feed name to which records are published. Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.

  For each topic, the Kafka cluster maintains a partitioned log that looks like this:

Topic是发布的消息的类别或者种子Feed名。对于每一个Topic,Kafka集群维护这一个分区的log,就像下图中的示例:

                  

  Each partition is an ordered, immutable sequence of records that is continually appended to—a structured commit log. The records in the partitions are each assigned a sequential id number called the offset that uniquely identifies each record within the partition.

每一个分区都是一个顺序的、不可变的消息队列, 并且可以持续的添加。分区中的消息都被分了一个序列号,称之为偏移量(offset),在每个分区中此偏移量都是唯一的。

  The Kafka cluster retains all published records—whether or not they have been consumed—using a configurable retention period. For example, if the retention policy is set to two days, then for the two days after a record is published, it is available for consumption, after which it will be discarded to free up space. Kafka's performance is effectively constant with respect to data size so storing data for a long time is not a problem.

 Kafka集群保持所有的消息,直到它们过期, 无论消息是否被消费了。 实际上消费者所持有的仅有的元数据就是这个偏移量,也就是消费者在这个log中的位置。
 这个偏移量由消费者控制:正常情况当消费者消费消息的时候,偏移量也线性的的增加。
 但是实际偏移量由消费者控制,消费者可以将偏移量重置为更老的一个偏移量,重新读取消息。
 可以看到这种设计对消费者来说操作自如, 一个消费者的操作不会影响其它消费者对此log的处理。
 再说说分区。Kafka中采用分区的设计有几个目的。一是可以处理更多的消息,不受单台服务器的限制。
 Topic拥有多个分区意味着它可以不受限的处理更多的数据。第二,分区可以作为并行处理的单元,稍后会谈到这一点。

              

  In fact, the only metadata retained on a per-consumer basis is the offset or position of that consumer in the log. This offset is controlled by the consumer: normally a consumer will advance its offset linearly as it reads records, but, in fact, since the position is controlled by the consumer it can consume records in any order it likes. For example a consumer can reset to an older offset to reprocess data from the past or skip ahead to the most recent record and start consuming from "now".

  This combination of features means that Kafka consumers are very cheap—they can come and go without much impact on the cluster or on other consumers. For example, you can use our command line tools to "tail" the contents of any topic without changing what is consumed by any existing consumers.

  The partitions in the log serve several purposes. First, they allow the log to scale beyond a size that will fit on a single server. Each individual partition must fit on the servers that host it, but a topic may have many partitions so it can handle an arbitrary amount of data. Second they act as the unit of parallelism—more on that in a bit.

1.1 Introduction中 Topics and Logs官网剖析(博主推荐)的更多相关文章

  1. Flume Channel Selectors官网剖析(博主推荐)

    不多说,直接上干货! Flume Sources官网剖析(博主推荐) Flume Channels官网剖析(博主推荐) 一切来源于flume官网 http://flume.apache.org/Flu ...

  2. Flume Channels官网剖析(博主推荐)

    不多说,直接上干货! Flume Sources官网剖析(博主推荐) 一切来源于flume官网 http://flume.apache.org/FlumeUserGuide.html Flume Ch ...

  3. Flume Source官网剖析(博主推荐)

    不多说,直接上干货! 一切来源于flume官网 http://flume.apache.org/FlumeUserGuide.html Flume Sources Avro Source Thrift ...

  4. 1.2 Use Cases中 Website Activity Tracking官网剖析(博主推荐)

    不多说,直接上干货! 一切来源于官网 http://kafka.apache.org/documentation/ Website Activity Tracking 网站活动追踪 The origi ...

  5. Flume Interceptors官网剖析(博主推荐)

    不多说,直接上干货! Flume Sources官网剖析(博主推荐) Flume Channels官网剖析(博主推荐) Flume Channel Selectors官网剖析(博主推荐) Flume ...

  6. Event Serializers官网剖析(博主推荐)

    不多说,直接上干货! Flume Sources官网剖析(博主推荐) Flume Channels官网剖析(博主推荐) Flume Channel Selectors官网剖析(博主推荐) Flume ...

  7. Flume Sink Processors官网剖析(博主推荐)

    不多说,直接上干货! Flume Sources官网剖析(博主推荐) Flume Channels官网剖析(博主推荐) Flume Channel Selectors官网剖析(博主推荐) Flume ...

  8. Flume Sinks官网剖析(博主推荐)

    不多说,直接上干货! Flume Sources官网剖析(博主推荐) Flume Channels官网剖析(博主推荐) Flume Channel Selectors官网剖析(博主推荐) 一切来源于f ...

  9. 怎样取消老毛桃软件赞助商---只需在输入框中输入老毛桃官网网址“laomaotao.org”

    来源:www.laomaotao.org 时间:2015-01-29 在众多网友和赞助商的支持下,迄今为止,老毛桃u盘启动盘制作工具已经推出了多个版本.如果有用户希望取消显示老毛桃软件中的赞助商,那不 ...

随机推荐

  1. 解决Esxi5下安装Windows 8的问题

    在VM8工作站版下安装windows 8没有问题,可是到了Esxi5下,非得安装补丁不可.补丁下载地址: http://kb.vmware.com/selfservice/microsites/sea ...

  2. POSTGRESQL NO TABLE

    POSTGRESQL EXTENDING SQL GRIGGER PROCEDURAL

  3. Vuejs2.0构建一个彩票查询WebAPP(3)

    整个工程的目录及截图如下,源码下载    使用心得: 1.了解Vue的生命周期很有必要,详情参见博文Vue2.0 探索之路——生命周期和钩子函数的一些理解 2.Vuex全局状态管理真是美味不可言 st ...

  4. USB摄像头驱动框架分析(五)

    一.USB摄像头驱动框架如下所示:1.构造一个usb_driver2.设置   probe:        2.1. 分配video_device:video_device_alloc        ...

  5. 摄像头驱动——V4L2框架分析

    一.概述 Video for Linux 2,简称V4l2,是Linux内核中关于视频设备的内核驱动框架,为上层的访问底层的视频设备提供了统一的接口. 摄像头驱动是属于字符设备驱动程序.(分析linu ...

  6. mysql not in用法

    select * from zan where uid not in(select uid from zan where zhongjiang !=0) group by uid order by r ...

  7. CCF模拟 无线网络

    无线网络 时间限制: 1.0s 内存限制: 256.0MB   问题描述 目前在一个很大的平面房间里有 n 个无线路由器,每个无线路由器都固定在某个点上.任何两个无线路由器只要距离不超过 r 就能互相 ...

  8. HDU 4588 Count The Carries 数位DP || 打表找规律

    2013年南京邀请赛的铜牌题...做的非常是伤心.另外有两个不太好想到的地方.. ..a 能够等于零,另外a到b的累加和比較大.大约在2^70左右. 首先说一下解题思路. 首先统计出每一位的1的个数, ...

  9. BZOJ 3210 花神的浇花集会 计算几何- -?

    题目大意:给定平面上的n个点,求一个点到这n个点的切比雪夫距离之和最小 与3170不同的是这次选择的点无需是n个点中的一个 首先将每一个点(x,y)变为(x+y,x-y) 这样新点之间的曼哈顿距离的一 ...

  10. 浏览器(BOM)对象的一些内置方法总结

    浏览器(BOM)对象的一些内置方法总结 一.总结 1.bom就是浏览器那端执行的代码,dom就是服务器那端操作html的代码 2.记好bom的几个对象,那就很好理解很多代码了,也很好写很多代码了 二. ...