Kafka Tuning Recommendations
- Kafka Brokers per Server
- Recommend 1 Kafka broker per server- Kafka not only disk-intensive but can be network intensive so if you run multiple broker in a single host network I/O can be the bottleneck . Running single broker per host and having a cluster will give you better availability.
- Increase Disks allocated to Kafka Broker
- Kafka parallelism is largely driven by the number of disks and partitions per topic.
- From the Kafka documentation: “We recommend using multiple drives to get good throughput and not sharing the same drives used for Kafka data with application logs or other OS filesystem activity to ensure good latency. As of 0.8 you can format and mount each drive as its own directory. If you configure multiple data directories partitions will be assigned round-robin to data directories. Each partition will be entirely in one of the data directories. If data is not well balanced among partitions this can lead to load imbalance between disks.”
- Number of Threads
- Make sure you set num.io.threads to at least no.of disks you are going to use by default its 8. It be can higher than the number of disks.
- Set num.network.threads higher based on number of concurrent producers, consumers, and replication factor.
- Number of partitions
- Ideally you want to assign the default number of partitions (num.partitions) to at least n-1 servers. This can break up the write workload and it allows for greater parallelism on the consumer side. Remember that Kafka does total ordering within a partition, not over multiple partitions, so make sure you partition intelligently on the producer side to parcel up units of work that might span multiple messages/events.
- Message Size
- Kafka is designed for small messages. I recommend you to avoid using kafka for larger messages. If thats not avoidable there are several ways to go about sending larger messages like 1MB. Use compression if the original message is json, xml or text using compression is the best option to reduce the size. Large messages will affect your performance and throughput. Check your topic partitions and replica.fetch.size to make sure it doesn’t go over your physical ram.
- Large Messages
- Another approach is to break the message into smaller chunks and use the same message key to send it same partition. This way you are sending small messages and these can be re-assembled at the consumer side.
- Broker side:
- message.max.bytes defaults to 1000000 . This indicates the maximum size of message that a kafka broker will accept.
- replica.fetch.max.bytes defaults to 1MB . This has to be bigger than message.max.bytes otherwise brokers will not be able to replicate messages.
- Consumer side:
- fetch.message.max.bytes defaults to 1MB. This indicates maximum size of a message that a consumer can read. This should be equal or larger than message.max.bytes.
- Kafka Heap Size
- By default kafka-broker jvm is set to 1Gb this can be increased using Ambari kafka-env template. When you are sending large messages JVM garbage collection can be an issue. Try to keep the Kafka Heap size below 4GB.
- Example: In kafka-env.sh add following settings.
- export KAFKA_HEAP_OPTS="-Xmx16g -Xms16g"
- export KAFKA_JVM_PERFORMANCE_OPTS="-XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80"
- Example: In kafka-env.sh add following settings.
- By default kafka-broker jvm is set to 1Gb this can be increased using Ambari kafka-env template. When you are sending large messages JVM garbage collection can be an issue. Try to keep the Kafka Heap size below 4GB.
- Dedicated Zookeeper
- Have a separate zookeeper cluster dedicated to Storm/Kafka operations. This will improve Storm/Kafka’s performance for writing offsets to Zookeeper, it will not be competing with HBase or other components for read/write access.
- ZK on separate nodes from Kafka Broker
- Do Not Install zk nodes on the same node as kafka broker if you want optimal Kafka performance. Disk I/O both kafka and zk are disk I/O intensive.
- Disk Tuning sections
- Please review the Kafka documentation on filesystem tuning parameters here.
- Disable THP according to documentation here.
- Either ext4 or xfs filesystems are recommended for performance benefit.
- Minimal replication
- If you are doing replication, start with 2x rather than 3x for Kafka clusters larger than 3 machines. Alternatively, use 2x even if a 3 node cluster if you are able to reprocess upstream from your source.
- Avoid Cross Rack Kafka deployments
- Avoid cross-rack Kafka deployments for now until Kafka 0.8.2 - see: https://issues.apache.org/jira/browse/KAFKA-1215
Kafka Tuning Recommendations的更多相关文章
- 深入了解SQL Tuning Advisor(转载)
1.前言:一直以来SQL调优都是DBA比较费力的技术活,而且很多DBA如果没有从事过开发的工作,那么调优更是一项头疼的工作,即使是SQL调优很厉害的高手,在SQL调优的过程中也要不停的分析执行计划.加 ...
- Kafka性能调优 - Kafka优化的方法
今天,我们将讨论Kafka Performance Tuning.在本文“Kafka性能调优”中,我们将描述在设置集群配置时需要注意的配置.此外,我们将讨论Tuning Kafka Producers ...
- jmeter分布式压测
stop.sh需要跑Jmeter的服务器上安装Jmeteryum install lrzsz 安装rz.sz命令rz jemter的压缩包 拷贝到/usr/local/tools下面unzip apa ...
- jmeter学习记录--03--jmeter负载与监听
jmeter场景主要通过线程组设置完成,有些复杂场景需要与逻辑控制器配合. 一.测试计划设计与执行 场景设计 jmete线程组实际是一个线程池,根据用户设置进行线程池的初始优化,在运行时做各种异常的处 ...
- jmeter对自身性能的优化
测试环境 apache-jmeter-2.13 1. 问题描述 单台机器的下JMeter启动较大线程数时可能会出现运行报错的情况,或者在运行一段时间后,JMeter每秒生成的请求数会逐步下降, ...
- JMeter JMeter自身运行性能优化
JMeter自身运行性能优化 by:授客 QQ:1033553122 测试环境 apache-jmeter-2.13 1. 问题描述 单台机器的下JMeter启动较大线程数时可能会出现运行 ...
- 【翻译自mos文章】私有网络所用的协议 与 Oracle RAC
说的太经典了,不敢翻译.直接上原文. 来源于: Network Protocols and Real Application Clusters (文档 ID 278132.1) PURPOSE --- ...
- JMeter内存溢出:java.lang.OutOfMemoryError: Java heap space解决方法
一.问题原因 用JMeter压测,有时候当模拟并发请求较大或者脚本运行时间较长时,JMeter会停止,报OOM(内存溢出)错误. 原因是JMeter是一个纯Java开发的工具,内存由java虚拟机JV ...
- Jmeter系列(35)- 设置JVM内存
场景 单台机器的下JMeter启动较大线程数时可能会出现运行报错的情况,或者在运行一段时间后,JMeter每秒生成的请求数会逐步下降,直到为0,即JMeter运行变得很"卡",这时 ...
随机推荐
- Windows无人值守文件unattend制作以及自定义系统安装
原文链接:Create media for automated unattended install of Windows 10 我从来没看到过像上面的文章一样这么详细的描述过Windows10的无人 ...
- Agent Job代理 执行SSIS Package
摘要: 在使用Agent Job时, 运行SSIS包的Run as账号,必须有SSIS中connection manager的连接权限. 如果没有连接权限,可以用创建proxy账号,并确保proxy账 ...
- Javascript高级编程学习笔记(95)—— WebGL(1) 类型化数组
WebGL webgl 是针对 canvas 的 3D上下文,与其它Web技术不同,WebGL并非是W3C制定的标准,而是由 Khronos Group 制定的. 类型化数组 WebGL所涉及的复杂运 ...
- 微信公众号支付提示mch_id参数格式错误
背景: .Net MVC微信公众号支付功能 问题: 今天在做网站微信支付的时候,一直提示“微信公众号支付提示mch_id参数格式错误” ! 解决方法: 其实这个问题一般并不是说你配置有错,首先它提示你 ...
- mysql的学习笔记(八)
1.存储引擎(表类型) mysql将数据以不同的技术存储在文件(内存)中,这种技术称为存储引擎.每一种存储引擎使用不同的存储机制,索引技巧,锁定水平,提供广泛且不同的功能. mysql支持的存储引擎 ...
- 微服务实战(三):落地微服务架构到直销系统(构建基于RabbitMq的消息总线)
从前面文章可以看出,消息总线是EDA(事件驱动架构)与微服务架构的核心部件,没有消息总线,就无法很好的实现微服务之间的解耦与通讯.通常我们可以利用现有成熟的消息代理产品或云平台提供的消息服务来构建自己 ...
- 【反编译系列】二、反编译代码(jeb)
版权声明:本文为HaiyuKing原创文章,转载请注明出处! 概述 一般情况下我们都是使用dex2jar + jd-gui的方式反编译代码,在实际使用过程中,有时候发现反编译出来的代码阅读效果不是很好 ...
- LindDotNetCore~ISoftDelete软删除接口
回到目录 概念 ISoftDelete即软删除,数据在进行delete后不会从数据库清除,而只是标记一个状态,在业务范围里都不能获取到这个数据,这在ORM框架里还是比较容易实现的,对传统的ado来说需 ...
- parsing:NLP之chart parser句法分析器
已迁移到我新博客,阅读体验更佳parsing:NLP之chart parser句法分析器 完整代码实现放在我的github上:click me 一.任务要求 实现一个基于简单英语语法的chart句法分 ...
- Spring Cloud Alibaba 新版本发布:众多期待内容整合打包加入!
在Nacos 1.0.0 Release之后,Spring Cloud Alibaba也终于发布了最新的版本.该版本距离上一次发布,过去了整整4个月!下面就随我一起看看,这个大家期待已久的版本都有哪些 ...