Understanding the Internal Message Buffers of Storm

Jun 21st, 2013

When you are optimizing the performance of your Storm topologies it helps to understand how Storm’s internal message queues are configured and put to use. In this short article I will explain and illustrate how Storm version 0.8/0.9 implements the intra-worker communication that happens within a worker process and its associated executor threads.

Internal messaging within Storm worker processes

Terminology: I will use the terms message and (Storm) tuple interchangeably in the following sections.

When I say “internal messaging” I mean the messaging that happens within a worker process in Storm, which is communication that is restricted to happen within the same Storm machine/node. For this communication Storm relies on various message queues backed by LMAX Disruptor, which is a high performance inter-thread messaging library.

Note that this communication within the threads of a worker process is different from Storm’s inter-worker communication, which normally happens across machines and thus over the network. For the latter Storm uses ZeroMQ by default (in Storm 0.9 there is experimental support for Netty as the network messaging backend). That is, ZeroMQ/Netty are used when a task in one worker process wants to send data to a task that runs in a worker process on a different machine in the Storm cluster.
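
For example, here is a hedged sketch of how you could switch the inter-worker transport to Netty in Storm 0.9 via conf/storm.yaml (there are additional storm.messaging.netty.* tuning options, which I do not cover here):

# conf/storm.yaml -- switch the inter-worker transport from ZeroMQ to Netty
storm.messaging.transport: "backtype.storm.messaging.netty.Context"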

So for your reference:

  • Intra-worker communication in Storm (inter-thread on the same Storm node): LMAX Disruptor
  • Inter-worker communication (node-to-node across the network): ZeroMQ or Netty
  • Inter-topology communication: nothing built into Storm, you must take care of this yourself with e.g. a messaging system such as Kafka/RabbitMQ, a database, etc.

If you do not know what the differences are between Storm’s worker processes, executor threads and tasks, please take a look at Understanding the Parallelism of a Storm Topology.

Illustration

Let us start with a picture before we discuss the nitty-gritty details in the next section.

Figure 1: Overview of a worker’s internal message queues in Storm. Queues related to a worker process are colored in red, queues related to the worker’s various executor threads are colored in green. For readability reasons I show only one worker process (though normally a single Storm node runs multiple such processes) and only one executor thread within that worker process (of which, again, there are usually many per worker process).

Detailed description

Now that you have had a first glimpse of Storm’s intra-worker messaging setup, we can discuss the details.

Worker processes

To manage its incoming and outgoing messages each worker process has a single receive thread that listens on the worker’s TCP port (as configured via supervisor.slots.ports). The parameter topology.receiver.buffer.size determines the batch size that the receive thread uses to place incoming messages into the incoming queues of the worker’s executor threads. Similarly, each worker has a single send thread that is responsible for reading messages from the worker’s transfer queue and sending them over the network to downstream consumers. The size of the transfer queue is configured via topology.transfer.buffer.size.

  • The topology.receiver.buffer.size is the maximum number of messages that are batched together at once for appending to an executor’s incoming queue by the worker receive thread (which reads the messages from the network). Setting this parameter too high may cause a lot of problems (“heartbeat thread gets starved, throughput plummets”). The default value is 8 elements, and the value must be a power of 2 (this requirement comes indirectly from LMAX Disruptor).
// Example: configuring via Java API
Config conf = new Config();
conf.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 16); // default is 8
Note that, in contrast to the other buffer-size related parameters described in this article, topology.receiver.buffer.size does not actually configure the size of an LMAX Disruptor queue. Rather, it sets the size of a simple ArrayList that is used to buffer incoming messages, because in this specific case the data structure does not need to be shared with other threads, i.e. it is local to the worker’s receive thread. But because the contents of this buffer are used to fill a Disruptor-backed queue (the executor incoming queues), it must still be a power of 2. See launch-receive-thread! in backtype.storm.messaging.loader for details. (A small sketch of the power-of-2 check follows after this list.)
  • Each element of the transfer queue configured with topology.transfer.buffer.size is actually a list of tuples. The various executor send threads will batch outgoing tuples off their outgoing queues onto the transfer queue. The default value is 1024 elements.
// Example: configuring via Java API
conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32); // default is 1024
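
To illustrate the power-of-2 requirement mentioned above, here is a minimal, hypothetical helper (not part of Storm’s API) that you could use to sanity-check a buffer size before putting it into your topology configuration:

// Hypothetical helper, not part of Storm's API: checks the power-of-2
// constraint that LMAX Disruptor imposes on queue sizes.
static boolean isPowerOfTwo(int n) {
    // A positive power of 2 has exactly one bit set, so n & (n - 1) is 0.
    return n > 0 && (n & (n - 1)) == 0;
}

isPowerOfTwo(1024); // true  -> a valid buffer size
isPowerOfTwo(1000); // false -> Storm/Disruptor would reject this value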

Executors

Each worker process controls one or more executor threads. Each executor thread has its own incoming queue and outgoing queue. As described above, the worker process runs a dedicated worker receive thread that is responsible for moving incoming messages to the appropriate incoming queue of the worker’s various executor threads. Similarly, each executor has its dedicated send thread that moves an executor’s outgoing messages from its outgoing queue to the “parent” worker’s transfer queue. The sizes of the executors’ incoming and outgoing queues are configured via topology.executor.receive.buffer.size and topology.executor.send.buffer.size, respectively.

Each executor has a single thread that handles the user logic for the spout/bolt (i.e. your application code), and a single send thread that moves messages from the executor’s outgoing queue to the worker’s transfer queue.

  • The topology.executor.receive.buffer.size is the size of the incoming queue for an executor. Each element of this queue is a list of tuples. Here, tuples are appended in batch. The default value is 1024 elements, and the value must be a power of 2 (this requirement comes from LMAX Disruptor).
// Example: configuring via Java API
conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384); // batched; default is 1024
  • The topology.executor.send.buffer.size is the size of the outgoing queue for an executor. Each element of this queue will contain a single tuple. The default value is 1024 elements, and the value must be a power of 2 (this requirement comes from LMAX Disruptor).
// Example: configuring via Java API
conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384); // individual tuples; default is 1024

Where to go from here

How to configure Storm’s internal message buffers

The various default values mentioned above are defined in conf/defaults.yaml. You can override these values globally in a Storm cluster’s conf/storm.yaml. You can also configure these parameters per individual Storm topology via backtype.storm.Config in Storm’s Java API.
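
To make the difference concrete, here is a minimal sketch of the per-topology approach; MySpout and MyBolt are hypothetical placeholders for your own components:

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new MySpout());                       // hypothetical spout
builder.setBolt("bolt", new MyBolt()).shuffleGrouping("spout"); // hypothetical bolt

Config conf = new Config();
// These overrides apply only to this topology, unlike conf/storm.yaml.
conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);
conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);

StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());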

How to configure Storm’s parallelism

The correct configuration of Storm’s message buffers is closely tied to the workload pattern of your topology as well as the configured parallelism of your topologies. See Understanding the Parallelism of a Storm Topology for more details about the latter.
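
Since every executor gets its own incoming and outgoing queue, and every worker process its own transfer queue, the configured parallelism directly multiplies the number of buffers in play. A brief sketch (component names are hypothetical):

Config conf = new Config();
conf.setNumWorkers(2); // two worker processes, i.e. two transfer queues

TopologyBuilder builder = new TopologyBuilder();
// A parallelism hint of 4 means four executors for this bolt, i.e. four
// incoming queues and four outgoing queues across the topology.
builder.setBolt("bolt", new MyBolt(), 4).shuffleGrouping("spout");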

Understand what’s going on in your Storm topology

The Storm UI is a good start to inspect key metrics of your running Storm topologies. For instance, it shows you the so-called “capacity” of a spout/bolt. The various metrics will help you decide whether your changes to the buffer-related configuration parameters described in this article had a positive or negative effect on the performance of your Storm topologies. See Running a Multi-Node Storm Cluster for details.

Apart from that you can also generate your own application metrics and track them with a tool like Graphite. See my articles Sending Metrics From Storm to Graphite and Installing and Running Graphite via RPM and Supervisord for details. It might also be worth checking out ooyala’s metrics_storm project on GitHub (I haven’t used it yet).
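
For example, here is a minimal sketch of counting processed tuples with Storm’s built-in metrics API (this bolt and the metric name are illustrative, not taken from any of the articles above); a registered metrics consumer then receives the counts periodically:

import java.util.Map;

import backtype.storm.metric.api.CountMetric;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class CountingBolt extends BaseRichBolt {
    private transient CountMetric tupleCounter;
    private OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        // Publish the count to any registered metrics consumers every 60 seconds.
        this.tupleCounter = context.registerMetric("tuples-processed", new CountMetric(), 60);
    }

    @Override
    public void execute(Tuple tuple) {
        tupleCounter.incr(); // count every tuple this bolt processes
        collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // This bolt only counts and acks; it emits nothing downstream.
    }
}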

Advice on performance tuning

Watch Nathan Marz’s talk on Tuning and Productionization of Storm.

The TL;DR version is: try the following settings as a first start and see whether they improve the performance of your Storm topology.

conf.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE,             8);
conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE,            32);
conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE,    16384);
Interested in more? You can subscribe to this blog, or follow me on Twitter.

Posted by Michael G. Noll Jun 21st, 2013
Filed under Programming, Storm

Original article: http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/
