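1. Installation steps

A pseudo-distributed Kafka installation follows exactly the same approach as a pseudo-distributed ZooKeeper installation, and is slightly simpler because no myid file has to be created.

Each Kafka broker gets its own copy of server.properties. The three brokers use server.properties, server.1.properties, and server.2.properties respectively: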

cp server.properties server.1.properties

cp server.properties server.2.properties

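Edit server.1.properties and server.2.properties. Three properties have to change in each file, and each broker needs its own distinct values. For server.1.properties: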

broker.id=1

port=9093

log.dirs=/tmp/kafka-logs-1

port is the port the Kafka broker listens on.
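
The copy-and-edit step above can also be scripted. The following is only a sketch: make_broker_config is a hypothetical helper name, and the rule "port = 9092 + broker id, log.dirs = /tmp/kafka-logs-<id>" is an assumption extrapolated from the broker 1 values shown above (the post does not list the values used for broker 2). It also assumes the base file contains uncommented broker.id, port, and log.dirs lines.

```shell
#!/bin/sh
# make_broker_config is a hypothetical helper, not part of Kafka.
# Usage: make_broker_config <base server.properties> <broker id>
make_broker_config() {
  base="$1"
  id="$2"
  out="${base%.properties}.${id}.properties"
  # Assumption: port = 9092 + id and log.dirs = /tmp/kafka-logs-<id>.
  # Every other line of the base file is copied unchanged.
  sed -e "s|^broker\.id=.*|broker.id=${id}|" \
      -e "s|^port=.*|port=$((9092 + id))|" \
      -e "s|^log\.dirs=.*|log.dirs=/tmp/kafka-logs-${id}|" \
      "$base" > "$out"
  echo "$out"
}

# Demo on a throwaway base file so no real config is touched.
workdir="$(mktemp -d)"
printf 'broker.id=0\nport=9092\nlog.dirs=/tmp/kafka-logs\n' > "$workdir/server.properties"
make_broker_config "$workdir/server.properties" 1
make_broker_config "$workdir/server.properties" 2
cat "$workdir/server.2.properties"
```

To use it on a real installation, pass the path of the actual server.properties instead of the demo file.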

Start the three Kafka brokers:

bin/kafka-server-start.sh server.properties

bin/kafka-server-start.sh server.1.properties

bin/kafka-server-start.sh server.2.properties

2. Common options of the Kafka scripts

2.1 kafka-console-consumer.sh

--from-beginning If the consumer does not already have an established offset to consume from, start with the earliest message present in the log rather than the latest message.

--topic <topic> The topic id to consume on

--zookeeper <urls> REQUIRED: The connection string for the zookeeper connection in the form host:port. Multiple URLS can be given to allow fail-over.

--group <gid> The group id to consume on. (default: console-consumer-37803)

On the consumer side there is no need to specify a broker list. The consumer uses ZooKeeper and the topic name to find every broker that holds messages for the topic.

2.2 kafka-console-producer.sh

--topic <topic> REQUIRED: The topic id to produce messages to.

--broker-list <broker-list> REQUIRED: The broker list string in the form HOST1:PORT1,HOST2:PORT2.

2.3 kafka-topics.sh

--create Create a new topic.

--describe List details for the given topics.

--list List all available topics.

--partitions <Integer: # of partitions> The number of partitions for the topic being created or altered (WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected)

--replication-factor <Integer: replication factor> The replication factor for each partition in the topic being created

--zookeeper <urls> REQUIRED: The connection string for the zookeeper connection in the form host:port. Multiple URLS can be given to allow fail-over.

--topic <topic> The topic to be create, alter or describe. Can also accept a regular expression except for --create option

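3. Pseudo-cluster test

Before testing, list the points to test.

The one that comes to mind so far: a partition has the notion of a leader. What does a partition leader mean, and what is it for?

3.1 Create a topic

Create a topic with 10 partitions and a replication factor of 3. With three brokers, every partition then has a replica on each broker, so every broker holds the data of all 10 partitions.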
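
The post does not show the command used to create this topic. By analogy with the topic_p10_r1 command in section 3.4 it was presumably the one printed below; treat it as an assumption. The sketch also works out the replica arithmetic: 10 partitions times 3 replicas gives 30 replica logs, which on 3 brokers is 10 per broker.

```shell
#!/bin/sh
# Sketch: print the (assumed) create command and the replica count per broker.
partitions=10
replication_factor=3
brokers=3

# Assumed by analogy with the topic_p10_r1 command in section 3.4.
create_cmd="./kafka-topics.sh --create --topic topic_p10_r3 --partitions $partitions --replication-factor $replication_factor --zookeeper localhost:2181"
echo "$create_cmd"

# Total replica logs in the cluster, and the share each broker stores
# when the replication factor equals the number of brokers.
total_replicas=$((partitions * replication_factor))
per_broker=$((total_replicas / brokers))
echo "total replicas: $total_replicas, per broker: $per_broker"
```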

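3.2 Directories created on each broker

Once the topic has been created, each broker creates one directory per partition under its configured log directory (for example /tmp/kafka-logs):

topic_p10_r3-0

topic_p10_r3-1

...

topic_p10_r3-9

Each directory contains two files: 00000000000000000000.index and 00000000000000000000.log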
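
The file names are not arbitrary: a segment file is named after the offset of the first message it contains, zero-padded to 20 digits, which is why a fresh partition starts at 00000000000000000000. A minimal sketch of that naming rule (segment_file is a hypothetical helper used only for illustration):

```shell
#!/bin/sh
# segment_file is a hypothetical helper: it formats a base offset the way
# Kafka names segment files (20 digits, zero-padded) plus an extension.
segment_file() {
  printf '%020d.%s\n' "$1" "$2"
}

segment_file 0 log      # prints 00000000000000000000.log
segment_file 0 index    # prints 00000000000000000000.index
```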

3.3 Topic details

./kafka-topics.sh --describe --topic topic_p10_r3  --zookeeper localhost:2181

The output is:

Topic:topic_p10_r3    PartitionCount:10    ReplicationFactor:3    Configs:

    Topic: topic_p10_r3    Partition: 0    Leader: 2    Replicas: 2,0,1    Isr: 2,0,1

    Topic: topic_p10_r3    Partition: 1    Leader: 0    Replicas: 0,1,2    Isr: 0,1,2

    Topic: topic_p10_r3    Partition: 2    Leader: 1    Replicas: 1,2,0    Isr: 1,2,0

    Topic: topic_p10_r3    Partition: 3    Leader: 2    Replicas: 2,1,0    Isr: 2,1,0

    Topic: topic_p10_r3    Partition: 4    Leader: 0    Replicas: 0,2,1    Isr: 0,2,1

    Topic: topic_p10_r3    Partition: 5    Leader: 1    Replicas: 1,0,2    Isr: 1,0,2

    Topic: topic_p10_r3    Partition: 6    Leader: 2    Replicas: 2,0,1    Isr: 2,0,1

    Topic: topic_p10_r3    Partition: 7    Leader: 0    Replicas: 0,1,2    Isr: 0,1,2

    Topic: topic_p10_r3    Partition: 8    Leader: 1    Replicas: 1,2,0    Isr: 1,2,0

    Topic: topic_p10_r3    Partition: 9    Leader: 2    Replicas: 2,1,0    Isr: 2,1,0

The fields mean the following:

Here is an explanation of output. The first line gives a summary of all the partitions, each additional line gives information about one partition

"leader" is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
"replicas" is the list of nodes that replicate the log for this partition regardless of whether they are the leader or even if they are currently alive.
"isr" is the set of "in-sync" replicas. This is the subset of the replicas list that is currently alive and caught-up to the leader.

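3.4 Question: if the replication factor is 1, does each partition exist only once in the cluster (that is, on exactly one broker), so that the leader is simply the broker that holds the partition?

Create a topic with 10 partitions and a single replica: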

./kafka-topics.sh --create  --topic  topic_p10_r1 --partitions 10 --replication-factor 1  --zookeeper localhost:2181

[hadoop@hadoop bin]$ ./kafka-topics.sh --describe --topic topic_p10_r1  --zookeeper localhost:2181

Topic:topic_p10_r1  PartitionCount:10   ReplicationFactor:1 Configs:

    Topic: topic_p10_r1 Partition: 0    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 1    Leader: 2   Replicas: 2 Isr: 2

    Topic: topic_p10_r1 Partition: 2    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 3    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 4    Leader: 2   Replicas: 2 Isr: 2

    Topic: topic_p10_r1 Partition: 5    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 6    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 7    Leader: 2   Replicas: 2 Isr: 2

    Topic: topic_p10_r1 Partition: 8    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 9    Leader: 1   Replicas: 1 Isr: 1

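So the understanding is correct: partitions have different leaders, and the broker that leads a partition is also the broker listed under Replicas (same ID).

From this we can conclude:

1. Every partition's replica set has one leader.

2. The leader is the replica that handles reads and writes for the partition; the follower replicas on the other brokers copy the data from it.

3. The partitions of a topic are spread fairly evenly across the brokers.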
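
The third point can be checked mechanically by counting how many partitions each broker leads. The sketch below runs awk over the topic_p10_r1 output shown above; leaders_per_broker is a hypothetical helper name, and the field positions assume the --describe layout printed in this post.

```shell
#!/bin/sh
# leaders_per_broker is a hypothetical helper: it reads kafka-topics.sh
# --describe output on stdin and prints "broker <id>: <count>" per leader.
leaders_per_broker() {
  awk '
    # Partition lines: Topic: <t> Partition: <n> Leader: <id> Replicas: ... Isr: ...
    $3 == "Partition:" && $5 == "Leader:" { count[$6]++ }
    END { for (id in count) print "broker " id ": " count[id] }
  ' | sort
}

# The topic_p10_r1 partition lines from the output above.
sample='    Topic: topic_p10_r1 Partition: 0    Leader: 1   Replicas: 1 Isr: 1
    Topic: topic_p10_r1 Partition: 1    Leader: 2   Replicas: 2 Isr: 2
    Topic: topic_p10_r1 Partition: 2    Leader: 0   Replicas: 0 Isr: 0
    Topic: topic_p10_r1 Partition: 3    Leader: 1   Replicas: 1 Isr: 1
    Topic: topic_p10_r1 Partition: 4    Leader: 2   Replicas: 2 Isr: 2
    Topic: topic_p10_r1 Partition: 5    Leader: 0   Replicas: 0 Isr: 0
    Topic: topic_p10_r1 Partition: 6    Leader: 1   Replicas: 1 Isr: 1
    Topic: topic_p10_r1 Partition: 7    Leader: 2   Replicas: 2 Isr: 2
    Topic: topic_p10_r1 Partition: 8    Leader: 0   Replicas: 0 Isr: 0
    Topic: topic_p10_r1 Partition: 9    Leader: 1   Replicas: 1 Isr: 1'

printf '%s\n' "$sample" | leaders_per_broker
# prints:
# broker 0: 3
# broker 1: 4
# broker 2: 3
```

Against a live cluster the same function can be fed directly: ./kafka-topics.sh --describe --topic topic_p10_r1 --zookeeper localhost:2181 | leaders_per_broker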

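3.5 Kafka fault tolerance when a broker dies

Two topics now exist: one with 10 partitions and 3 replicas, the other with 10 partitions and 1 replica. What happens to each of them if one broker dies?

jps shows three Kafka processes.

ps -ef|grep server.2.properties finds the Kafka process with broker id 2; kill it with kill -9. As soon as it is killed, the console starts flooding with the same exception over and over: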

[2015-02-23 02:14:00,037] WARN Reconnect due to socket error: null (kafka.consumer.SimpleConsumer)

[2015-02-23 02:14:00,039] ERROR [ReplicaFetcherThread-0-2], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4325; ClientId: ReplicaFetcherThread-0-2; ReplicaId: 1; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [topic_p10_r3,3] -> PartitionFetchInfo(0,1048576),[topic_p10_r3,9] -> PartitionFetchInfo(0,1048576),[topic_p10_r3,6] -> PartitionFetchInfo(0,1048576),[topic_p10_r3,0] -> PartitionFetchInfo(0,1048576) (kafka.server.ReplicaFetcherThread)

java.net.ConnectException: Connection refused

    at sun.nio.ch.Net.connect0(Native Method)

    at sun.nio.ch.Net.connect(Net.java:465)

    at sun.nio.ch.Net.connect(Net.java:457)

    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)

    at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)

    at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44)

    at kafka.consumer.SimpleConsumer.reconnect(SimpleConsumer.scala:57)

    at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:79)

    at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)

    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)

    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)

    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)

    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)

    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)

    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)

    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)

    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)

    at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)

    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)

    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)

    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

[2015-02-23 02:14:00,040] WARN Reconnect due to socket error: null (kafka.consumer.SimpleConsumer)

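Partitions 3, 9, 6, and 0 are exactly the topic_p10_r3 partitions that broker 2 was leading, so Kafka has to hand their leadership over. Here is the state of topic_p10_r3 and topic_p10_r1 now that broker 2 has been killed.

topic_p10_r3 (leadership has moved to other brokers; Replicas still lists all three brokers, but Isr no longer contains broker 2)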

[hadoop@hadoop bin]$ ./kafka-topics.sh --describe --topic topic_p10_r3  --zookeeper localhost:2181

Topic:topic_p10_r3  PartitionCount:10   ReplicationFactor:3 Configs:

    Topic: topic_p10_r3 Partition: 0    Leader: 0   Replicas: 2,0,1 Isr: 0,1

    Topic: topic_p10_r3 Partition: 1    Leader: 0   Replicas: 0,1,2 Isr: 0,1

    Topic: topic_p10_r3 Partition: 2    Leader: 1   Replicas: 1,2,0 Isr: 1,0

    Topic: topic_p10_r3 Partition: 3    Leader: 1   Replicas: 2,1,0 Isr: 1,0

    Topic: topic_p10_r3 Partition: 4    Leader: 0   Replicas: 0,2,1 Isr: 0,1

    Topic: topic_p10_r3 Partition: 5    Leader: 1   Replicas: 1,0,2 Isr: 1,0

    Topic: topic_p10_r3 Partition: 6    Leader: 0   Replicas: 2,0,1 Isr: 0,1

    Topic: topic_p10_r3 Partition: 7    Leader: 0   Replicas: 0,1,2 Isr: 0,1

    Topic: topic_p10_r3 Partition: 8    Leader: 1   Replicas: 1,2,0 Isr: 1,0

    Topic: topic_p10_r3 Partition: 9    Leader: 1   Replicas: 2,1,0 Isr: 1,0

topic_p10_r1: no failover is possible. Partitions 1, 4, and 7 now have Leader -1, which means they have no leader and are unavailable.

[hadoop@hadoop bin]$ ./kafka-topics.sh --describe --topic topic_p10_r1  --zookeeper localhost:2181

Topic:topic_p10_r1  PartitionCount:10   ReplicationFactor:1 Configs:

    Topic: topic_p10_r1 Partition: 0    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 1    Leader: -1  Replicas: 2 Isr:

    Topic: topic_p10_r1 Partition: 2    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 3    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 4    Leader: -1  Replicas: 2 Isr:

    Topic: topic_p10_r1 Partition: 5    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 6    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 7    Leader: -1  Replicas: 2 Isr:

    Topic: topic_p10_r1 Partition: 8    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 9    Leader: 1   Replicas: 1 Isr: 1
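
The two failure modes seen above (a shrunken Isr versus no leader at all) can be told apart with a small filter over the --describe output. This is a sketch only: partition_health and its three state labels are hypothetical names, and the field positions assume the output layout shown in this post. As far as I know, kafka-topics.sh also offers --under-replicated-partitions and --unavailable-partitions together with --describe for the same purpose.

```shell
#!/bin/sh
# partition_health is a hypothetical helper: for every partition line on
# stdin it prints "<topic> <partition> <state>".
partition_health() {
  awk '
    $3 == "Partition:" && $5 == "Leader:" {
      # $8 is the comma-separated Replicas list; $10 is the Isr list,
      # which is missing entirely when the Isr is empty.
      replicas = split($8, r, ",")
      isr = (NF >= 10) ? split($10, i, ",") : 0
      if ($6 == "-1")          state = "NO_LEADER"
      else if (isr < replicas) state = "UNDER_REPLICATED"
      else                     state = "OK"
      print $2, $4, state
    }
  '
}

# Three lines taken from the outputs above, after broker 2 was killed.
partition_health <<'EOF'
    Topic: topic_p10_r3 Partition: 0    Leader: 0   Replicas: 2,0,1 Isr: 0,1
    Topic: topic_p10_r1 Partition: 0    Leader: 1   Replicas: 1 Isr: 1
    Topic: topic_p10_r1 Partition: 1    Leader: -1  Replicas: 2 Isr:
EOF
# prints:
# topic_p10_r3 0 UNDER_REPLICATED
# topic_p10_r1 0 OK
# topic_p10_r1 1 NO_LEADER
```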

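After restarting broker 2 the result is as follows. For topic_p10_r3 the leaders do not change: every partition already has a leader, so the rejoining broker can only be a follower (it is back in Isr). For topic_p10_r1, a leader is elected again for the partitions that had none.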

[hadoop@hadoop bin]$ ./kafka-topics.sh --describe --topic topic_p10_r3  --zookeeper localhost:2181

Topic:topic_p10_r3  PartitionCount:10   ReplicationFactor:3 Configs:

    Topic: topic_p10_r3 Partition: 0    Leader: 0   Replicas: 2,0,1 Isr: 0,1,2

    Topic: topic_p10_r3 Partition: 1    Leader: 0   Replicas: 0,1,2 Isr: 0,1,2

    Topic: topic_p10_r3 Partition: 2    Leader: 1   Replicas: 1,2,0 Isr: 1,0,2

    Topic: topic_p10_r3 Partition: 3    Leader: 1   Replicas: 2,1,0 Isr: 1,0,2

    Topic: topic_p10_r3 Partition: 4    Leader: 0   Replicas: 0,2,1 Isr: 0,1,2

    Topic: topic_p10_r3 Partition: 5    Leader: 1   Replicas: 1,0,2 Isr: 1,0,2

    Topic: topic_p10_r3 Partition: 6    Leader: 0   Replicas: 2,0,1 Isr: 0,1,2

    Topic: topic_p10_r3 Partition: 7    Leader: 0   Replicas: 0,1,2 Isr: 0,1,2

    Topic: topic_p10_r3 Partition: 8    Leader: 1   Replicas: 1,2,0 Isr: 1,0,2

    Topic: topic_p10_r3 Partition: 9    Leader: 1   Replicas: 2,1,0 Isr: 1,0,2
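
In the output above, broker 2 is back in every Isr but leads nothing. The first broker in a partition's Replicas list is its preferred leader, so partitions whose Leader differs from that first entry are the ones that would move back to broker 2 after a preferred replica election (Kafka ships bin/kafka-preferred-replica-election.sh for this; the post itself does not run it). The sketch below lists such partitions; non_preferred_leaders is a hypothetical helper name.

```shell
#!/bin/sh
# non_preferred_leaders is a hypothetical helper: it prints the partitions
# whose current leader is not the first broker in the Replicas list.
non_preferred_leaders() {
  awk '
    $3 == "Partition:" && $5 == "Leader:" {
      split($8, replicas, ",")   # replicas[1] is the preferred leader
      if ($6 != replicas[1])
        print $2, $4, "leader=" $6, "preferred=" replicas[1]
    }
  '
}

# Three lines taken from the topic_p10_r3 output after broker 2 restarted.
non_preferred_leaders <<'EOF'
    Topic: topic_p10_r3 Partition: 0    Leader: 0   Replicas: 2,0,1 Isr: 0,1,2
    Topic: topic_p10_r3 Partition: 1    Leader: 0   Replicas: 0,1,2 Isr: 0,1,2
    Topic: topic_p10_r3 Partition: 3    Leader: 1   Replicas: 2,1,0 Isr: 1,0,2
EOF
# prints:
# topic_p10_r3 0 leader=0 preferred=2
# topic_p10_r3 3 leader=1 preferred=2
```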

[hadoop@hadoop bin]$ ./kafka-topics.sh --describe --topic topic_p10_r1  --zookeeper localhost:2181

Topic:topic_p10_r1  PartitionCount:10   ReplicationFactor:1 Configs:

    Topic: topic_p10_r1 Partition: 0    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 1    Leader: 2   Replicas: 2 Isr: 2

    Topic: topic_p10_r1 Partition: 2    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 3    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 4    Leader: 2   Replicas: 2 Isr: 2

    Topic: topic_p10_r1 Partition: 5    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 6    Leader: 1   Replicas: 1 Isr: 1

    Topic: topic_p10_r1 Partition: 7    Leader: 2   Replicas: 2 Isr: 2

    Topic: topic_p10_r1 Partition: 8    Leader: 0   Replicas: 0 Isr: 0

    Topic: topic_p10_r1 Partition: 9    Leader: 1   Replicas: 1 Isr: 1
