Kafka ships with a number of management scripts, all located in the $KAFKA_HOME/bin directory; the classes behind them live in the source tree under kafka/core/src/main/scala/kafka/tools/.

Consumer Offset Checker

  Consumer Offset Checker runs the kafka.tools.ConsumerOffsetChecker class, wrapped by the kafka-consumer-offset-checker.sh script. It displays a Consumer's Group, Topic, partition ID, the Offset already consumed in each partition, the logSize, the Lag, and the Owner.

Running the kafka-consumer-offset-checker.sh script without any arguments prints the following:

[iteblog@www.iteblog.com /]$ bin/kafka-consumer-offset-checker.sh
Check the offset of your consumers.
Option                                  Description                           
------                                  -----------                           
--broker-info                           Print broker info                     
--group                                 Consumer group.                       
--help                                  Print this message.                   
--retry.backoff.ms <Integer>            Retry back-off to use for failed      
                                          offset queries. (default: 3000)     
--socket.timeout.ms <Integer>           Socket timeout to use when querying   
                                          for offsets. (default: 6000)        
--topic                                 Comma-separated list of consumer      
                                          topics (all topics if absent).      
--zookeeper                             ZooKeeper connect string. (default:   
                                          localhost:2181)

Following the usage information, we enter this command:

[iteblog@www.iteblog.com /]$ bin/kafka-consumer-offset-checker.sh --zookeeper www.iteblog.com:2181 --topic test --group spark --broker-info
Group           Topic      Pid Offset          logSize         Lag             Owner
spark           test       0   34666914        34674392        7478            none
spark           test       1   34670481        34678029        7548            none
spark           test       2   34670547        34678002        7455            none
spark           test       3   34664512        34671961        7449            none
spark           test       4   34680143        34687562        7419            none
spark           test       5   34672309        34679823        7514            none
spark           test       6   34674660        34682220        7560            none
BROKER INFO
2 -> www.iteblog.com:9092
5 -> www.iteblog.com:9093
4 -> www.iteblog.com:9094
7 -> www.iteblog.com:9095
1 -> www.iteblog.com:9096
3 -> www.iteblog.com:9097
6 -> www.iteblog.com:9098
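
In this output, Lag = logSize − Offset, i.e. the number of messages that have been written but not yet consumed. For partition 0 above: 34674392 − 34666914 = 7478.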

Dump Log Segment

  Sometimes we need to verify that a log index is correct, or we simply want to print messages directly from a log file. The kafka.tools.DumpLogSegments class handles both cases. First, let's look at the parameters it takes:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.DumpLogSegments
Parse a log file and dump its contents to the console, useful for debugging a seemingly corrupt log segment.
Option                                  Description                           
------                                  -----------                           
--deep-iteration                        if set, uses deep instead of shallow  
                                          iteration                           
--files <file1, file2, ...>             REQUIRED: The comma separated list of 
                                          data and index log files to be dumped
--key-decoder-class                     if set, used to deserialize the keys. 
                                          This class should implement kafka.  
                                          serializer.Decoder trait. Custom jar
                                          should be available in kafka/libs   
                                          directory. (default: kafka.         
                                          serializer.StringDecoder)           
--max-message-size <Integer: size>      Size of largest message. (default:    
                                          5242880)                            
--print-data-log                        if set, printing the messages content 
                                          when dumping data logs              
--value-decoder-class                   if set, used to deserialize the       
                                          messages. This class should         
                                          implement kafka.serializer.Decoder  
                                          trait. Custom jar should be         
                                          available in kafka/libs directory.  
                                          (default: kafka.serializer.         
                                          StringDecoder)                      
--verify-index-only                     if set, just verify the index log     
                                          without printing its content

  Clearly, kafka.tools.DumpLogSegments requires --files, which takes the absolute paths of files inside a Topic partition directory. The partition directories live under the location set by the log.dirs parameter in config/server.properties. For example, to examine the log file /home/q/kafka/kafka_2.10-0.8.2.1/data/test-4/00000000000034245135.log we can use the following command:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files /home/q/kafka/kafka_2.10-0.8.2.1/data/test-4/00000000000034245135.log
Dumping /home/q/kafka/kafka_2.10-0.8.2.1/data/test-4/00000000000034245135.log
Starting offset: 34245135
offset: 34245135 position: 0 isvalid: true payloadsize: 4213 magic: 0 compresscodec: NoCompressionCodec crc: 865449274 keysize: 4213
offset: 34245136 position: 8452 isvalid: true payloadsize: 4657 magic: 0 compresscodec: NoCompressionCodec crc: 4123037760 keysize: 4657
offset: 34245137 position: 17792 isvalid: true payloadsize: 3921 magic: 0 compresscodec: NoCompressionCodec crc: 541297511 keysize: 3921
offset: 34245138 position: 25660 isvalid: true payloadsize: 2290 magic: 0 compresscodec: NoCompressionCodec crc: 1346104996 keysize: 2290
offset: 34245139 position: 30266 isvalid: true payloadsize: 2284 magic: 0 compresscodec: NoCompressionCodec crc: 1930558677 keysize: 2284
offset: 34245140 position: 34860 isvalid: true payloadsize: 268 magic: 0 compresscodec: NoCompressionCodec crc: 57847488 keysize: 268
offset: 34245141 position: 35422 isvalid: true payloadsize: 263 magic: 0 compresscodec: NoCompressionCodec crc: 2964399224 keysize: 263
offset: 34245142 position: 35974 isvalid: true payloadsize: 1875 magic: 0 compresscodec: NoCompressionCodec crc: 647039113 keysize: 1875
offset: 34245143 position: 39750 isvalid: true payloadsize: 648 magic: 0 compresscodec: NoCompressionCodec crc: 865445580 keysize: 648
offset: 34245144 position: 41072 isvalid: true payloadsize: 556 magic: 0 compresscodec: NoCompressionCodec crc: 1174686061 keysize: 556
offset: 34245145 position: 42210 isvalid: true payloadsize: 4211 magic: 0 compresscodec: NoCompressionCodec crc: 3691302513 keysize: 4211
offset: 34245146 position: 50658 isvalid: true payloadsize: 2299 magic: 0 compresscodec: NoCompressionCodec crc: 2367114411 keysize: 2299
offset: 34245147 position: 55282 isvalid: true payloadsize: 642 magic: 0 compresscodec: NoCompressionCodec crc: 4122061921 keysize: 642
offset: 34245148 position: 56592 isvalid: true payloadsize: 4211 magic: 0 compresscodec: NoCompressionCodec crc: 3257991653 keysize: 4211
offset: 34245149 position: 65040 isvalid: true payloadsize: 2278 magic: 0 compresscodec: NoCompressionCodec crc: 2103489307 keysize: 2278
offset: 34245150 position: 69622 isvalid: true payloadsize: 269 magic: 0 compresscodec: NoCompressionCodec crc: 792857391 keysize: 269
offset: 34245151 position: 70186 isvalid: true payloadsize: 640 magic: 0 compresscodec: NoCompressionCodec crc: 791599616 keysize: 640

As shown, the command prints each Message's header fields and offset, but not the message payload itself; add --print-data-log to include it. To inspect several log files at once, separate their paths with commas, as in the example below.
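
For instance, to also print the payloads while dumping two segment files in one run (both flags appear in the help output above; the second file name below is purely illustrative):

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.DumpLogSegments --print-data-log --files /home/q/kafka/kafka_2.10-0.8.2.1/data/test-4/00000000000034245135.log,/home/q/kafka/kafka_2.10-0.8.2.1/data/test-4/00000000000034445678.log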

Exporting a Consumer Group's Offsets from ZooKeeper

  Sometimes we need to export the per-partition offsets of a Consumer group. Kafka's kafka.tools.ExportZkOffsets class does exactly that. Here are the parameters it takes:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.ExportZkOffsets
Export consumer offsets to an output file.
Option                                  Description                           
------                                  -----------                           
--group                                 Consumer group.                       
--help                                  Print this message.                   
--output-file                           Output file                           
--zkconnect                             ZooKeeper connect string. (default:   
                                          localhost:2181)

We need to supply the Consumer group, the ZooKeeper connect string, and the path of the output file:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.ExportZkOffsets --group spark --zkconnect www.iteblog.com:2181 --output-file ~/offset
 
[iteblog@www.iteblog.com /]$ vim ~/offset
/consumers/spark/offsets/test/3:34846274
/consumers/spark/offsets/test/2:34852378
/consumers/spark/offsets/test/1:34852360
/consumers/spark/offsets/test/0:34848170
/consumers/spark/offsets/test/6:34857010
/consumers/spark/offsets/test/5:34854268
/consumers/spark/offsets/test/4:34861572

Note that the --output-file option must be specified, otherwise the tool reports an error.

Fetching Metrics via JMX

  We can print Kafka's metrics with the kafka.tools.JmxTool class.

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.JmxTool
Dump JMX values to standard output.
Option                                  Description                           
------                                  -----------                           
--attributes <name>                     The whitelist of attributes to query. 
                                          This is a comma-separated list. If  
                                          no attributes are specified all     
                                          objects will be queried.            
--date-format <format>                  The date format to use for formatting 
                                          the time field. See java.text.      
                                          SimpleDateFormat for options.       
--help                                  Print usage information.              
--jmx-url <service-url>                 The url to connect to to poll JMX     
                                          data. See Oracle javadoc for        
                                          JMXServiceURL for details. (default:
                                          service:jmx:rmi:///jndi/rmi://:     
                                          9999/jmxrmi)                        
--object-name <name>                    A JMX object name to use as a query.  
                                          This can contain wild cards, and    
                                          this option can be given multiple   
                                          times to specify more than one      
                                          query. If no objects are specified  
                                          all objects will be queried.        
--reporting-interval <Integer: ms>      Interval in MS with which to poll jmx 
                                          stats. (default: 2000)

It can be used like this:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.JmxTool --jmx-url service:jmx:rmi:///jndi/rmi://www.iteblog.com:1099/jmxrmi

The prerequisite for running the command above is that JMX was enabled when the Kafka cluster was started, by setting export JMX_PORT=<port>; only then can the command print all of Kafka's metrics.
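
To narrow the output to a single metric, combine --object-name and --reporting-interval, both shown in the help above. The MBean name below is an assumption for illustration only; list the exact names available on your broker with a tool such as jconsole:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.JmxTool --jmx-url service:jmx:rmi:///jndi/rmi://www.iteblog.com:1099/jmxrmi --object-name "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec" --reporting-interval 1000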

Kafka Data Migration Tools

  There are two main tools here: kafka.tools.KafkaMigrationTool and kafka.tools.MirrorMaker. The former migrates data from Kafka 0.7 to Kafka 0.8 (https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8); the latter mirrors data between two Kafka clusters (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330). Both consume Messages from the source and publish them to the target.

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.KafkaMigrationTool --kafka.07.jar kafka-0.7.19.jar --zkclient.01.jar zkclient-0.2.0.jar --num.producers 16 --consumer.config=sourceCluster2Consumer.config --producer.config=targetClusterProducer.config --whitelist=.*
 
[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config sourceCluster1Consumer.config --consumer.config sourceCluster2Consumer.config --num.streams 2 --producer.config targetClusterProducer.config --whitelist=".*"
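
Both tools take ordinary consumer and producer property files. A minimal sketch of what they might contain, assuming the Kafka 0.8 old consumer/producer property names (host names are illustrative; adapt to your version):

# sourceCluster1Consumer.config -- the cluster to consume from
zookeeper.connect=source-zk.example.com:2181
group.id=mirror-maker-group

# targetClusterProducer.config -- the cluster to publish to
metadata.broker.list=target-kafka.example.com:9092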

Log Replay Tool

  This tool reads messages for a specified Topic from one Kafka cluster and sends them to a specified topic in another cluster:

[iteblog@www.iteblog.com /]$ bin/kafka-replay-log-producer.sh
Missing required argument "[broker-list]"
Option                                  Description                           
------                                  -----------                           
--broker-list <hostname:port>           REQUIRED: the broker list must be     
                                          specified.                          
--inputtopic <input-topic>              REQUIRED: The topic to consume from.  
--messages <Integer: count>             The number of messages to send.       
                                          (default: -1)                       
--outputtopic <output-topic>            REQUIRED: The topic to produce to     
--property <producer properties>        A mechanism to pass properties in the 
                                          form key=value to the producer. This
                                          allows the user to override producer
                                          properties that are not exposed by  
                                          the existing command line arguments 
--reporting-interval <Integer: size>    Interval at which to print progress   
                                          info. (default: 5000)               
--sync                                  If set message send requests to the   
                                          brokers are synchronously, one at a 
                                          time as they arrive.                
--threads <Integer: threads>            Number of sending threads. (default: 1)
--zookeeper <zookeeper url>             REQUIRED: The connection string for   
                                          the zookeeper connection in the form
                                          host:port. Multiple URLS can be     
                                          given to allow fail-over. (default: 
                                          127.0.0.1:2181)
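
A typical invocation consumes a topic via the source cluster's ZooKeeper and produces to a topic on the target brokers; all four flags appear in the help above, while the host and topic names below are illustrative:

[iteblog@www.iteblog.com /]$ bin/kafka-replay-log-producer.sh --zookeeper source-zk.example.com:2181 --broker-list target-kafka.example.com:9092 --inputtopic test --outputtopic test-copy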

Simple Consumer Shell

  The kafka-simple-consumer-shell.sh tool uses the Simple Consumer API to read data from a given partition of a Topic and print it to the terminal:

[iteblog@www.iteblog.com /]$ bin/kafka-simple-consumer-shell.sh --broker-list www.iteblog.com:9092 --topic test --partition 0
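
The shell also accepts options controlling where reading starts and how many messages are printed; for example, starting from the earliest offset and stopping after 10 messages. The --offset and --max-messages flags below are assumptions based on the 0.8 tool's help output, so confirm them with --help on your version (-2 denotes the earliest offset):

[iteblog@www.iteblog.com /]$ bin/kafka-simple-consumer-shell.sh --broker-list www.iteblog.com:9092 --topic test --partition 0 --offset -2 --max-messages 10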

Updating Offsets in ZooKeeper

  The kafka.tools.UpdateOffsetsInZK tool updates the offsets of all partitions of a specified Topic in ZooKeeper, setting them to either earliest or latest:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.UpdateOffsetsInZK
USAGE: kafka.tools.UpdateOffsetsInZK$ [earliest | latest] consumer.properties topic

We must specify earliest or latest, the path to a consumer.properties file, and the topic name.
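
Following the USAGE line above, a concrete invocation that rewinds the group defined in config/consumer.properties (which typically contains at least zookeeper.connect and group.id) to the beginning of topic test looks like this:

[iteblog@www.iteblog.com /]$ bin/kafka-run-class.sh kafka.tools.UpdateOffsetsInZK earliest config/consumer.properties test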

Source

Kafka管理工具介绍 – 过往记忆
https://www.iteblog.com/archives/1605.html
