1. The ISpout interface

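ISpout is the core interface for implementing a spout. A spout is responsible for feeding messages into the topology and for tracking them.
For a spout to track a message it emits, it must attach a message-id to the tuple. The message-id can be of any type, but if it is omitted or set to null, Storm will not track the message.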

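Note that Storm calls ack, fail, and nextTuple from a single thread, so these methods need no mutual exclusion among themselves. For the same reason, none of them should block: a call that blocks delays the other two.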

    package backtype.storm.spout;

    import backtype.storm.task.TopologyContext;
    import java.io.Serializable;
    import java.util.Map;

    /**
     * ISpout is the core interface for implementing spouts. A Spout is responsible
     * for feeding messages into the topology for processing. For every tuple emitted by
     * a spout, Storm will track the (potentially very large) DAG of tuples generated
     * based on a tuple emitted by the spout. When Storm detects that every tuple in
     * that DAG has been successfully processed, it will send an ack message to the Spout.
     *
     * <p>If a tuple fails to be fully processed within the configured timeout for the
     * topology (see {@link backtype.storm.Config}), Storm will send a fail message to the spout
     * for the message.</p>
     *
     * <p> When a Spout emits a tuple, it can tag the tuple with a message id. The message id
     * can be any type. When Storm acks or fails a message, it will pass back to the
     * spout the same message id to identify which tuple it's referring to. If the spout leaves out
     * the message id, or sets it to null, then Storm will not track the message and the spout
     * will not receive any ack or fail callbacks for the message.</p>
     *
     * <p>Storm executes ack, fail, and nextTuple all on the same thread. This means that an implementor
     * of an ISpout does not need to worry about concurrency issues between those methods. However, it
     * also means that an implementor must ensure that nextTuple is non-blocking: otherwise
     * the method could block acks and fails that are pending to be processed.</p>
     */
    public interface ISpout extends Serializable {
        /**
         * Called when a task for this component is initialized within a worker on the cluster.
         * It provides the spout with the environment in which the spout executes.
         *
         * <p>This includes the:</p>
         *
         * @param conf The Storm configuration for this spout. This is the configuration provided to the topology merged in with cluster configuration on this machine.
         * @param context This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
         * @param collector The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
         */
        void open(Map conf, TopologyContext context, SpoutOutputCollector collector);

        /**
         * Called when an ISpout is going to be shutdown. There is no guarantee that close
         * will be called, because the supervisor kill -9's worker processes on the cluster.
         *
         * <p>The one context where close is guaranteed to be called is when a topology is
         * killed when running Storm in local mode.</p>
         */
        void close();

        /**
         * Called when a spout has been activated out of a deactivated mode.
         * nextTuple will be called on this spout soon. A spout can become activated
         * after having been deactivated when the topology is manipulated using the
         * `storm` client.
         */
        void activate();

        /**
         * Called when a spout has been deactivated. nextTuple will not be called while
         * a spout is deactivated. The spout may or may not be reactivated in the future.
         */
        void deactivate();

        /**
         * When this method is called, Storm is requesting that the Spout emit tuples to the
         * output collector. This method should be non-blocking, so if the Spout has no tuples
         * to emit, this method should return. nextTuple, ack, and fail are all called in a tight
         * loop in a single thread in the spout task. When there are no tuples to emit, it is courteous
         * to have nextTuple sleep for a short amount of time (like a single millisecond)
         * so as not to waste too much CPU.
         */
        void nextTuple();

        /**
         * Storm has determined that the tuple emitted by this spout with the msgId identifier
         * has been fully processed. Typically, an implementation of this method will take that
         * message off the queue and prevent it from being replayed.
         */
        void ack(Object msgId);

        /**
         * The tuple emitted by this spout with the msgId identifier has failed to be
         * fully processed. Typically, an implementation of this method will put that
         * message back on the queue to be replayed at a later time.
         */
        void fail(Object msgId);
    }
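
To make the ack/fail contract concrete, here is a minimal sketch of a queue-backed spout. It does not depend on Storm: SketchCollector and QueueSpoutSketch are hypothetical stand-ins that only mirror the shape of nextTuple, ack, and fail, and the counter-based message id is an assumption chosen for illustration, not something Storm prescribes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for SpoutOutputCollector so the sketch compiles without Storm.
interface SketchCollector {
    void emit(List<Object> tuple, Object messageId);
}

// Not a real ISpout implementation: it only mirrors nextTuple/ack/fail.
class QueueSpoutSketch {
    private final Queue<String> queue = new ArrayDeque<>();
    // Messages that were emitted but not yet acked or failed, keyed by message id.
    private final Map<Object, String> inFlight = new HashMap<>();
    private final SketchCollector collector;
    private long nextId = 0;

    QueueSpoutSketch(SketchCollector collector, List<String> messages) {
        this.collector = collector;
        this.queue.addAll(messages);
    }

    // Non-blocking: returns immediately when there is nothing to emit.
    void nextTuple() {
        String msg = queue.poll();
        if (msg == null) {
            return;
        }
        Object msgId = nextId++;       // a non-null id asks for tracking
        inFlight.put(msgId, msg);
        List<Object> tuple = new ArrayList<>();
        tuple.add(msg);
        collector.emit(tuple, msgId);
    }

    // Fully processed: forget the message so it is never replayed.
    void ack(Object msgId) {
        inFlight.remove(msgId);
    }

    // Failed or timed out: put the message back on the queue for replay.
    void fail(Object msgId) {
        String msg = inFlight.remove(msgId);
        if (msg != null) {
            queue.add(msg);
        }
    }

    int inFlightCount() { return inFlight.size(); }
    int queuedCount() { return queue.size(); }
}
```

Because all three methods run on one thread, the plain HashMap and ArrayDeque need no locking here; a source that is filled from another thread would need a concurrent queue instead.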

 

2. SpoutOutputCollector

The interface a spout uses to emit tuples.

Unlike a bolt's output collector, a spout's output collector accepts a message-id, which lets the spout track the message.

 

emit

List<Integer> emit(String streamId, List<Object> tuple, Object messageId)

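emit takes three parameters: the target stream id, the tuple, and the message-id.

        If the stream id is omitted (the overloads without a streamId), the tuple is sent to the default stream, Utils.DEFAULT_STREAM_ID.

        If the message-id is omitted or null, the spout will not track this message.

        It returns one value: the ids of the tasks the tuple was sent to.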

 

emitDirect

void emitDirect(int taskId, String streamId, List<Object> tuple, Object messageId)

Used with direct grouping: the taskId argument names the task that receives the tuple.

 

    package backtype.storm.spout;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.utils.Utils;
    import java.util.List;

    /**
     * This output collector exposes the API for emitting tuples from an {@link backtype.storm.topology.IRichSpout}.
     * The main difference between this output collector and {@link OutputCollector}
     * for {@link backtype.storm.topology.IRichBolt} is that spouts can tag messages with ids so that they can be
     * acked or failed later on. This is the Spout portion of Storm's API to
     * guarantee that each message is fully processed at least once.
     */
    public class SpoutOutputCollector implements ISpoutOutputCollector {
        ISpoutOutputCollector _delegate;

        public SpoutOutputCollector(ISpoutOutputCollector delegate) {
            _delegate = delegate;
        }

        /**
         * Emits a new tuple to the specified output stream with the given message ID.
         * When Storm detects that this tuple has been fully processed, or has failed
         * to be fully processed, the spout will receive an ack or fail callback respectively
         * with the messageId as long as the messageId was not null. If the messageId was null,
         * Storm will not track the tuple and no callback will be received. The emitted values must be
         * immutable.
         *
         * @return the list of task ids that this tuple was sent to
         */
        public List<Integer> emit(String streamId, List<Object> tuple, Object messageId) {
            return _delegate.emit(streamId, tuple, messageId);
        }

        /**
         * Emits a new tuple to the default output stream with the given message ID.
         * When Storm detects that this tuple has been fully processed, or has failed
         * to be fully processed, the spout will receive an ack or fail callback respectively
         * with the messageId as long as the messageId was not null. If the messageId was null,
         * Storm will not track the tuple and no callback will be received. The emitted values must be
         * immutable.
         *
         * @return the list of task ids that this tuple was sent to
         */
        public List<Integer> emit(List<Object> tuple, Object messageId) {
            return emit(Utils.DEFAULT_STREAM_ID, tuple, messageId);
        }

        /**
         * Emits a tuple to the default output stream with a null message id. Storm will
         * not track this message so ack and fail will never be called for this tuple. The
         * emitted values must be immutable.
         */
        public List<Integer> emit(List<Object> tuple) {
            return emit(tuple, null);
        }

        /**
         * Emits a tuple to the specified output stream with a null message id. Storm will
         * not track this message so ack and fail will never be called for this tuple. The
         * emitted values must be immutable.
         */
        public List<Integer> emit(String streamId, List<Object> tuple) {
            return emit(streamId, tuple, null);
        }

        /**
         * Emits a tuple to the specified task on the specified output stream. This output
         * stream must have been declared as a direct stream, and the specified task must
         * use a direct grouping on this stream to receive the message. The emitted values must be
         * immutable.
         */
        public void emitDirect(int taskId, String streamId, List<Object> tuple, Object messageId) {
            _delegate.emitDirect(taskId, streamId, tuple, messageId);
        }

        /**
         * Emits a tuple to the specified task on the default output stream. This output
         * stream must have been declared as a direct stream, and the specified task must
         * use a direct grouping on this stream to receive the message. The emitted values must be
         * immutable.
         */
        public void emitDirect(int taskId, List<Object> tuple, Object messageId) {
            emitDirect(taskId, Utils.DEFAULT_STREAM_ID, tuple, messageId);
        }

        /**
         * Emits a tuple to the specified task on the specified output stream. This output
         * stream must have been declared as a direct stream, and the specified task must
         * use a direct grouping on this stream to receive the message. The emitted values must be
         * immutable.
         *
         * <p> Because no message id is specified, Storm will not track this message
         * so ack and fail will never be called for this tuple.</p>
         */
        public void emitDirect(int taskId, String streamId, List<Object> tuple) {
            emitDirect(taskId, streamId, tuple, null);
        }

        /**
         * Emits a tuple to the specified task on the default output stream. This output
         * stream must have been declared as a direct stream, and the specified task must
         * use a direct grouping on this stream to receive the message. The emitted values must be
         * immutable.
         *
         * <p> Because no message id is specified, Storm will not track this message
         * so ack and fail will never be called for this tuple.</p>
         */
        public void emitDirect(int taskId, List<Object> tuple) {
            emitDirect(taskId, tuple, null);
        }

        @Override
        public void reportError(Throwable error) {
            _delegate.reportError(error);
        }
    }
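
Every overload in SpoutOutputCollector ends up in one of two delegate calls, filling in the default stream and a null message-id where the caller left them out. The sketch below reproduces that chaining without depending on Storm. SketchDelegate, RecordingDelegate, and CollectorSketch are hypothetical names, the returned task ids are made up, and the value "default" for the default stream id is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for ISpoutOutputCollector; real Storm routes tuples to tasks.
interface SketchDelegate {
    List<Integer> emit(String streamId, List<Object> tuple, Object messageId);
    void emitDirect(int taskId, String streamId, List<Object> tuple, Object messageId);
}

// Records each call instead of sending anything, so the routing can be inspected.
class RecordingDelegate implements SketchDelegate {
    final List<String> calls = new ArrayList<>();

    public List<Integer> emit(String streamId, List<Object> tuple, Object messageId) {
        calls.add("emit stream=" + streamId + " tracked=" + (messageId != null));
        return Arrays.asList(1, 2); // made-up task ids
    }

    public void emitDirect(int taskId, String streamId, List<Object> tuple, Object messageId) {
        calls.add("emitDirect task=" + taskId + " stream=" + streamId
                + " tracked=" + (messageId != null));
    }
}

// Mirrors a subset of the overloads: each one forwards to the fullest variant.
class CollectorSketch {
    static final String DEFAULT_STREAM_ID = "default"; // assumed value
    private final SketchDelegate delegate;

    CollectorSketch(SketchDelegate delegate) {
        this.delegate = delegate;
    }

    List<Integer> emit(String streamId, List<Object> tuple, Object messageId) {
        return delegate.emit(streamId, tuple, messageId);
    }

    // No stream id: use the default stream.
    List<Integer> emit(List<Object> tuple, Object messageId) {
        return emit(DEFAULT_STREAM_ID, tuple, messageId);
    }

    // No message id: the tuple is untracked, so no ack or fail will follow.
    List<Integer> emit(List<Object> tuple) {
        return emit(tuple, null);
    }

    void emitDirect(int taskId, String streamId, List<Object> tuple, Object messageId) {
        delegate.emitDirect(taskId, streamId, tuple, messageId);
    }

    // Direct emit on the default stream, untracked.
    void emitDirect(int taskId, List<Object> tuple) {
        emitDirect(taskId, DEFAULT_STREAM_ID, tuple, null);
    }
}
```

Calling emit(tuple, "msg-1") on a CollectorSketch records a tracked emit on the default stream, while emit(tuple) and emitDirect(7, tuple) record untracked ones.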
