The executor thread is where Storm's real core lives: most of the actual work happens in this thread, while supervisor and worker are essentially wrappers and plumbing around it.

The executor's mk-threads is a multimethod, with separate implementations for spouts and bolts.

1. Spout Thread

(defmethod mk-threads :spout [executor-data task-datas]
  (let [{:keys [storm-conf component-id worker-context transfer-fn report-error sampler open-or-prepare-was-called?]} executor-data
        ;; 1.1 define pending
        ^ISpoutWaitStrategy spout-wait-strategy (init-spout-wait-strategy storm-conf)
        max-spout-pending (executor-max-spout-pending storm-conf (count task-datas))
        ^Integer max-spout-pending (if max-spout-pending (int max-spout-pending))
        last-active (atom false)
        spouts (ArrayList. (map :object (vals task-datas)))
        rand (Random. (Utils/secureRandomLong))
        pending (RotatingMap.
                  2 ;; microoptimize for performance of .size method
                  (reify RotatingMap$ExpiredCallback
                    (expire [this msg-id [task-id spout-id tuple-info start-time-ms]]
                      (let [time-delta (if start-time-ms (time-delta-ms start-time-ms))] ;; start-time-ms is only set when sampled and is usually nil; time-delta exists only if start-time-ms does
                        (fail-spout-msg executor-data (get task-datas task-id) spout-id tuple-info time-delta)
                        ))))

        ;; 1.2 define tuple-action-fn
        tuple-action-fn (fn [task-id ^TupleImpl tuple]
                          (let [stream-id (.getSourceStreamId tuple)]
                            (condp = stream-id
                              Constants/SYSTEM_TICK_STREAM_ID (.rotate pending)
                              Constants/METRICS_TICK_STREAM_ID (metrics-tick executor-data task-datas tuple)
                              (let [id (.getValue tuple 0) ;; tuple values; values[0] is the id
                                    [stored-task-id spout-id tuple-finished-info start-time-ms] (.remove pending id)] ;; remove the tuple from pending -- important!
                                (when spout-id
                                  (when-not (= stored-task-id task-id)
                                    (throw-runtime "Fatal error, mismatched task ids: " task-id " " stored-task-id))
                                  (let [time-delta (if start-time-ms (time-delta-ms start-time-ms))]
                                    (condp = stream-id
                                      ACKER-ACK-STREAM-ID (ack-spout-msg executor-data (get task-datas task-id) ;; ack
                                                                         spout-id tuple-finished-info time-delta)
                                      ACKER-FAIL-STREAM-ID (fail-spout-msg executor-data (get task-datas task-id) ;; fail
                                                                           spout-id tuple-finished-info time-delta)
                                      )))
                                ;; TODO: on failure, emit tuple to failure stream
                                ))))
        receive-queue (:receive-queue executor-data) ;; the receive disruptor queue
        event-handler (mk-task-receiver executor-data tuple-action-fn) ;; a disruptor/clojure-handler that processes tuples taken from receive-queue with tuple-action-fn
        has-ackers? (has-ackers? storm-conf)
        emitted-count (MutableLong. 0)
        empty-emit-streak (MutableLong. 0)
        ;; the overflow buffer is used to ensure that spouts never block when emitting
        ;; this ensures that the spout can always clear the incoming buffer (acks and fails), which
        ;; prevents deadlock from occuring across the topology (e.g. Spout -> Bolt -> Acker -> Spout, and all
        ;; buffers filled up)
        ;; when the overflow buffer is full, spouts stop calling nextTuple until it's able to clear the overflow buffer
        ;; this limits the size of the overflow buffer to however many tuples a spout emits in one call of nextTuple,
        ;; preventing memory issues
        overflow-buffer (LinkedList.)]
    ;; 1.3 async-loop thread
    [(async-loop
       (fn []
         ;; If topology was started in inactive state, don't call (.open spout) until it's activated first.
         (while (not @(:storm-active-atom executor-data))
           (Thread/sleep 100))

         (log-message "Opening spout " component-id ":" (keys task-datas))
         (doseq [[task-id task-data] task-datas
                 :let [^ISpout spout-obj (:object task-data)
                       tasks-fn (:tasks-fn task-data)
                       ;; 1.3.1 send-spout-msg
                       send-spout-msg (fn [out-stream-id values message-id out-task-id]
                                        (.increment emitted-count)
                                        (let [out-tasks (if out-task-id
                                                          (tasks-fn out-task-id out-stream-id values) ;; direct grouping
                                                          (tasks-fn out-stream-id values)) ;; let the grouper compute the target tasks
                                              rooted? (and message-id has-ackers?) ;; a message-id was given and ackers exist, so the message must be tracked; rooted? means this tuple is the root of a tracked DAG
                                              root-id (if rooted? (MessageId/generateId rand)) ;; rand.nextLong, a random long used as root-id
                                              out-ids (fast-list-for [t out-tasks] (if rooted? (MessageId/generateId rand)))] ;; one out-id (out edge id) per target task
                                          (fast-list-iter [out-task out-tasks id out-ids]
                                            (let [tuple-id (if rooted?
                                                             (MessageId/makeRootId root-id id) ;; a MessageId holding the hashmap {root-id out-id}
                                                             (MessageId/makeUnanchored)) ;; a MessageId holding an empty hashmap {}
                                                  out-tuple (TupleImpl. worker-context ;; build the tuple object
                                                                        values
                                                                        task-id
                                                                        out-stream-id
                                                                        tuple-id)]
                                              (transfer-fn out-task ;; call executor->transfer-fn to put the tuple on the spout's send queue
                                                           out-tuple
                                                           overflow-buffer)))
                                          (if rooted?
                                            (do ;; the message needs to be tracked
                                              (.put pending root-id [task-id ;; add the tracked tuple's info to the pending map
                                                                     message-id
                                                                     {:stream out-stream-id :values values}
                                                                     (if (sampler) (System/currentTimeMillis))]) ;; the start time is set only when sampler returns true; only then are metrics and stats updated later
                                              (task/send-unanchored task-data ;; send a message on ACKER-INIT-STREAM telling the acker to track this message
                                                                    ACKER-INIT-STREAM-ID
                                                                    [root-id (bit-xor-vals out-ids) task-id]
                                                                    overflow-buffer))
                                            (when message-id ;; rooted? is false but there is a message-id, which means there are no ackers (has-ackers? is false)
                                              (ack-spout-msg executor-data task-data message-id ;; with no acker, ack immediately
                                                             {:stream out-stream-id :values values}
                                                             (if (sampler) 0))))
                                          (or out-tasks []) ;; return value of send-spout-msg: the target task list, or []
                                          ))]]
           (builtin-metrics/register-all (:builtin-metrics task-data) storm-conf (:user-context task-data)) ;; register the builtin-metrics
           ;; 1.3.2 spout.open
           (.open spout-obj
                  storm-conf
                  (:user-context task-data)
                  (SpoutOutputCollector.
                    (reify ISpoutOutputCollector ;; implement ISpoutOutputCollector
                      (^List emit [this ^String stream-id ^List tuple ^Object message-id] ;; implement emit
                        (send-spout-msg stream-id tuple message-id nil)
                        )
                      (^void emitDirect [this ^int out-task-id ^String stream-id
                                         ^List tuple ^Object message-id]
                        (send-spout-msg stream-id tuple message-id out-task-id)
                        )
                      (reportError [this error]
                        (report-error error)
                        )))))
         (reset! open-or-prepare-was-called? true)
         (log-message "Opened spout " component-id ":" (keys task-datas))
         ;; 1.3.3 setup-metrics!
         (setup-metrics! executor-data) ;; use schedule-recurring to periodically send a METRICS_TICK tuple to this executor itself

         (disruptor/consumer-started! (:receive-queue executor-data)) ;; set the queue's consumerStartedFlag to mark the consumer as started
         ;; 1.3.4 fn
         (fn []
           ;; This design requires that spouts be non-blocking
           (disruptor/consume-batch receive-queue event-handler) ;; take a batch of tuples from receive-queue and process them with tuple-action-fn

           ;; try to clear the overflow-buffer, moving its contents onto the send queue
           (try-cause
             (while (not (.isEmpty overflow-buffer))
               (let [[out-task out-tuple] (.peek overflow-buffer)]
                 (transfer-fn out-task out-tuple false nil)
                 (.removeFirst overflow-buffer)))
             (catch InsufficientCapacityException e
               ))

           (let [active? @(:storm-active-atom executor-data)
                 curr-count (.get emitted-count)]
             (if (and (.isEmpty overflow-buffer) ;; the spout may keep emitting only while the overflow-buffer is empty and pending has not hit its limit
                      (or (not max-spout-pending)
                          (< (.size pending) max-spout-pending)))
               (if active? ;; is the storm cluster active?
                 (do ;; storm active
                   (when-not @last-active ;; the spout is currently inactive
                     (reset! last-active true)
                     (log-message "Activating spout " component-id ":" (keys task-datas))
                     (fast-list-iter [^ISpout spout spouts] (.activate spout))) ;; activate the spouts first

                   (fast-list-iter [^ISpout spout spouts] (.nextTuple spout))) ;; call nextTuple to produce new tuples
                 (do ;; storm inactive
                   (when @last-active ;; the spout is currently active
                     (reset! last-active false)
                     (log-message "Deactivating spout " component-id ":" (keys task-datas))
                     (fast-list-iter [^ISpout spout spouts] (.deactivate spout))) ;; deactivate the spouts and sleep
                   ;; TODO: log that it's getting throttled
                   (Time/sleep 100))))
             (if (and (= curr-count (.get emitted-count)) active?) ;; no new tuples were emitted (emitted-count unchanged)
               (do (.increment empty-emit-streak)
                   (.emptyEmit spout-wait-strategy (.get empty-emit-streak))) ;; let the spout-wait-strategy sleep
               (.set empty-emit-streak 0)
               ))
           0)) ;; return 0, i.e. the async-loop sleep time is 0
       :kill-fn (:report-error-and-die executor-data)
       :factory? true
       :thread-name component-id)]))

1.1 Defining pending

After a spout emits a tuple it still has to wait for the ack or fail, so the tuple cannot simply be dropped; it is first put into the pending map and only removed once it is finally acked or failed.

First, the number of pending tuples is bounded by p * num-tasks,
where p is TOPOLOGY-MAX-SPOUT-PENDING and num-tasks is the number of spout tasks.

max-spout-pending (executor-max-spout-pending storm-conf (count task-datas))

(defn executor-max-spout-pending [storm-conf num-tasks]
  (let [p (storm-conf TOPOLOGY-MAX-SPOUT-PENDING)]
    (if p (* p num-tasks))))

Next, a spout needs to wait in two situations: nextTuple emits nothing, or maxSpoutPending has been reached.

/**
* The strategy a spout needs to use when its waiting. Waiting is
* triggered in one of two conditions:
*
* 1. nextTuple emits no tuples
* 2. The spout has hit maxSpoutPending and can't emit any more tuples
*
* The default strategy sleeps for one millisecond.
*/
public interface ISpoutWaitStrategy {
    void prepare(Map conf);
    void emptyEmit(long streak);
}

The default wait strategy sleeps for one millisecond; a custom wait strategy class can be configured via TOPOLOGY-SPOUT-WAIT-STRATEGY.

^ISpoutWaitStrategy spout-wait-strategy (init-spout-wait-strategy storm-conf)
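As a minimal sketch of such a custom strategy (assuming the ISpoutWaitStrategy interface shown above, which lives in backtype.storm.spout in this Storm version; the class name and the backoff cap are made up for illustration):

import java.util.Map;
import backtype.storm.spout.ISpoutWaitStrategy;

// Hypothetical strategy: sleep longer as the empty-emit streak grows.
public class BackoffSpoutWaitStrategy implements ISpoutWaitStrategy {
    @Override
    public void prepare(Map conf) {
        // no configuration needed for this sketch
    }

    @Override
    public void emptyEmit(long streak) {
        try {
            // cap the backoff at 10 ms; the streak resets to 0 as soon as the spout emits again
            Thread.sleep(Math.min(streak, 10L));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}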

Finally, the pending structure itself is defined. Its entries are given a timeout, otherwise a problem in a downstream bolt would leave the spout blocked forever:

pending (RotatingMap.
          2 ;; microoptimize for performance of .size method; the number of buckets is 2
          (reify RotatingMap$ExpiredCallback
            (expire [this msg-id [task-id spout-id tuple-info start-time-ms]]
              (let [time-delta (if start-time-ms (time-delta-ms start-time-ms))]
                (fail-spout-msg executor-data (get task-datas task-id) spout-id tuple-info time-delta)
                ))))

RotatingMap (backtype.storm.utils) is a TimeCacheMap without the cleaner thread (see Storm starter - SingleJoinExample).

Everything else is essentially the same; the main data structure is LinkedList<HashMap<K, V>> _buckets;

The key operation is rotate, which drops the oldest bucket and adds a fresh one:

    public Map<K, V> rotate() {
        Map<K, V> dead = _buckets.removeLast();
        _buckets.addFirst(new HashMap<K, V>());
        if(_callback!=null) {
            for(Entry<K, V> entry: dead.entrySet()) {
                _callback.expire(entry.getKey(), entry.getValue());
            }
        }
        return dead;
    }

RotatingMap, however, relies on an external trigger to call rotate; Storm drives it with SYSTEM_TICK tuples, as we will see below.
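A small usage sketch (assuming RotatingMap's public constructor RotatingMap(int numBuckets, ExpiredCallback) and its put/rotate methods from backtype.storm.utils; the demo key and value are made up):

import backtype.storm.utils.RotatingMap;

// With 2 buckets, an entry expires after it has survived two rotate() calls
// without being removed. put/remove/rotate are not thread-safe, which is fine
// here because the executor loop is single-threaded.
public class RotatingMapDemo {
    public static void main(String[] args) {
        RotatingMap<Long, String> pending = new RotatingMap<>(
            2,
            (key, value) -> System.out.println("expired: " + key + " -> " + value));

        pending.put(1L, "tuple-info");
        pending.rotate();   // the entry moves one bucket closer to expiry
        pending.rotate();   // the oldest bucket is dropped -> the callback fires for key 1
    }
}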

1.2 Defining tuple-action-fn

tuple-action-fn handles tuples from the different streams.

1.2.1 SYSTEM_TICK_STREAM_ID

(.rotate pending) rotates the pending map.

1.2.2 METRICS_TICK_STREAM_ID

Runs (metrics-tick executor-data task-datas tuple).

This makes the component send its builtin-metrics data on METRICS_STREAM, which ultimately reaches the metrics bolt that aggregates how this component is processing tuples.

Concretely, it builds a task-info and the data-points and sends them to METRICS_STREAM:

(defn metrics-tick [executor-data task-datas ^TupleImpl tuple]
  (let [{:keys [interval->task->metric-registry ^WorkerTopologyContext worker-context]} executor-data
        interval (.getInteger tuple 0)] ;; values[0] of the metrics tick tuple is the interval
    (doseq [[task-id task-data] task-datas
            :let [name->imetric (-> interval->task->metric-registry (get interval) (get task-id)) ;; the topology context's _registeredMetrics actually points at interval->task->metric-registry
                  task-info (IMetricsConsumer$TaskInfo.
                              (. (java.net.InetAddress/getLocalHost) getCanonicalHostName)
                              (.getThisWorkerPort worker-context)
                              (:component-id executor-data)
                              task-id
                              (long (/ (System/currentTimeMillis) 1000))
                              interval)
                  data-points (->> name->imetric
                                   (map (fn [[name imetric]]
                                          (let [value (.getValueAndReset ^IMetric imetric)]
                                            (if value
                                              (IMetricsConsumer$DataPoint. name value)))))
                                   (filter identity)
                                   (into []))]]
      (if (seq data-points)
        (task/send-unanchored task-data Constants/METRICS_STREAM_ID [task-info data-points]))))) ;; send [task-info data-points] to METRICS_STREAM

1.2.3 default, ordinary tuples

For a spout, which is the source of the topology, the only tuples it receives come from ACKER-ACK-STREAM or ACKER-FAIL-STREAM.
So on receiving a tuple it takes the msg id and removes the entry from the pending map.

Finally, depending on the stream-id, it calls ack-spout-msg or fail-spout-msg:

(defn- ack-spout-msg [executor-data task-data msg-id tuple-info time-delta]
  (let [storm-conf (:storm-conf executor-data)
        ^ISpout spout (:object task-data)
        task-id (:task-id task-data)]
    (when (= true (storm-conf TOPOLOGY-DEBUG))
      (log-message "Acking message " msg-id))
    (.ack spout msg-id) ;; ack
    (task/apply-hooks (:user-context task-data) .spoutAck (SpoutAckInfo. msg-id task-id time-delta)) ;; run the ack hooks
    (when time-delta ;; the sample condition held, so update builtin-metrics and stats
      (builtin-metrics/spout-acked-tuple! (:builtin-metrics task-data) (:stats executor-data) (:stream tuple-info) time-delta)
      (stats/spout-acked-tuple! (:stats executor-data) (:stream tuple-info) time-delta))))

ack-spout-msg is shown as the example; fail-spout-msg is essentially the same, except that it calls .fail instead.

1.3 async-loop thread

This is the executor's main thread. It is not built on disruptor/consume-loop, because the loop does more than just process received tuples, so async-loop is used directly.

As covered earlier, async-loop starts a new thread that runs afn, treats the return value as a sleep time, sleeps that long, then runs afn again, and so on.

The implementation here is a little unusual:

afn itself only does preparation work, such as defining send-spout-msg and opening the spouts,

and then afn returns a fn; the real work happens inside that returned fn. Because the loop is created with :factory? true, async-loop treats afn as a factory: it calls it once to obtain the loop body, then keeps invoking the returned fn, using its return value as the sleep time.

It looks quirky at first, but it keeps the one-time setup and the per-iteration loop body in a single closure; a minimal sketch of the pattern follows.
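A rough Java sketch of this factory-style loop (purely illustrative; the names are made up, and Storm's real async-loop in backtype.storm.util additionally handles the kill-fn and error reporting):

import java.util.concurrent.Callable;
import java.util.function.LongSupplier;

// The factory runs once (setup), and the loop body it returns is invoked
// repeatedly; the body's return value is used as the sleep time in ms.
public class AsyncLoopSketch {
    public static Thread asyncLoop(Callable<LongSupplier> factory) {
        Thread t = new Thread(() -> {
            try {
                LongSupplier body = factory.call();   // one-time setup, e.g. spout.open(...)
                while (!Thread.currentThread().isInterrupted()) {
                    long sleepMs = body.getAsLong();  // per-iteration work, e.g. consume-batch + nextTuple
                    if (sleepMs > 0) {
                        Thread.sleep(sleepMs);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (Exception e) {
                throw new RuntimeException(e);        // the real code would call the :kill-fn here
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }
}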

1.3.1 send-spout-msg

send-spout-msg is generated first; it is the function that emit and emitDirect ultimately call to send a spout message.

The logic: use message-id to decide whether the tuple needs to be tracked; if so, use MessageId to generate a root-id and the out-ids.

Then build the tuple object (TupleImpl).

Let's first look at the definitions of MessageId and TupleImpl.

Note that this MessageId has little to do with the message-id passed to emit; the naming is rather confusing.

The main operations here are generateId, which produces a random id, and makeRootId, which puts the [root-id, out-id] pair into the anchorsToIds map.

package backtype.storm.tuple;

public class MessageId {
    private Map<Long, Long> _anchorsToIds;

    public static long generateId(Random rand) {
        return rand.nextLong();
    }

    public static MessageId makeUnanchored() {
        return makeId(new HashMap<Long, Long>());
    }

    public static MessageId makeId(Map<Long, Long> anchorsToIds) {
        return new MessageId(anchorsToIds);
    }

    public static MessageId makeRootId(long id, long val) {
        Map<Long, Long> anchorsToIds = new HashMap<Long, Long>();
        anchorsToIds.put(id, val);
        return new MessageId(anchorsToIds);
    }
}

public class TupleImpl extends IndifferentAccessMap implements Seqable, Indexed, IMeta, Tuple {
    private List<Object> values;
    private int taskId;
    private String streamId;
    private GeneralTopologyContext context;
    private MessageId id;
    private IPersistentMap _meta = null;

    Long _processSampleStartTime = null;
    Long _executeSampleStartTime = null;
}

After that, transfer-fn puts the tuple onto the send queue, an entry is added to pending for tracking, and a message is sent to the acker telling it to track this message.

1.3.2 spout.open, initializing the spout

Straightforward; the key part is the ISpoutOutputCollector implementation with emit and emitDirect.

1.3.3 setup-metrics!, the source of METRICS_TICK

Uses schedule-recurring to periodically send a METRICS_TICK tuple to the executor itself, which triggers the periodic emission of the builtin-metrics; a rough sketch of the idea follows.
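Purely as an illustration of the idea (this is not Storm's implementation: the real setup-metrics! schedules the tick on the worker's timer and publishes a real TupleImpl on Constants/METRICS_TICK_STREAM_ID into the executor's receive queue; the queue type and string marker below are made up):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// A recurring task that pushes a "metrics tick" marker into the executor's
// own receive queue, so the tick is handled by the same single-threaded loop
// as every other tuple.
public class MetricsTickSketch {
    public static void scheduleMetricsTick(BlockingQueue<String> receiveQueue,
                                           long intervalSecs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(
            () -> receiveQueue.offer("METRICS_TICK:" + intervalSecs),
            intervalSecs, intervalSecs, TimeUnit.SECONDS);
    }
}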

1.3.4 fn

This fn does the spout thread's most important work and finally returns 0, the sleep time reported back to async-loop:
handle the tuples in the receive-queue,

call nextTuple...

Note that everything happens sequentially inside a single thread, so none of this logic may block.

2. Bolt Thread

(defmethod mk-threads :bolt [executor-data task-datas]
  (let [execute-sampler (mk-stats-sampler (:storm-conf executor-data))
        executor-stats (:stats executor-data)
        {:keys [storm-conf component-id worker-context transfer-fn report-error sampler
                open-or-prepare-was-called?]} executor-data
        rand (Random. (Utils/secureRandomLong))

        ;; 2.1 tuple-action-fn
        tuple-action-fn (fn [task-id ^TupleImpl tuple]
                          (let [stream-id (.getSourceStreamId tuple)]
                            (condp = stream-id
                              Constants/METRICS_TICK_STREAM_ID (metrics-tick executor-data task-datas tuple)
                              (let [task-data (get task-datas task-id)
                                    ^IBolt bolt-obj (:object task-data) ;; fetch the bolt object
                                    user-context (:user-context task-data)
                                    sampler? (sampler)
                                    execute-sampler? (execute-sampler)
                                    now (if (or sampler? execute-sampler?) (System/currentTimeMillis))] ;; record the current time only when sampled
                                (when sampler?
                                  (.setProcessSampleStartTime tuple now))
                                (when execute-sampler?
                                  (.setExecuteSampleStartTime tuple now))
                                (.execute bolt-obj tuple) ;; call the bolt's execute method
                                (let [delta (tuple-execute-time-delta! tuple)] ;; delta is non-nil only if now was set above
                                  (task/apply-hooks user-context .boltExecute (BoltExecuteInfo. tuple task-id delta)) ;; run the boltExecute hooks
                                  (when delta ;; the sample condition held, so update builtin-metrics and stats
                                    (builtin-metrics/bolt-execute-tuple! (:builtin-metrics task-data)
                                                                         executor-stats
                                                                         (.getSourceComponent tuple)
                                                                         (.getSourceStreamId tuple)
                                                                         delta)
                                    (stats/bolt-execute-tuple! executor-stats
                                                               (.getSourceComponent tuple)
                                                               (.getSourceStreamId tuple)
                                                               delta)))))))]
    ;; TODO: can get any SubscribedState objects out of the context now
    ;; 2.2 async-loop
    [(async-loop
       (fn []
         ;; If topology was started in inactive state, don't call prepare bolt until it's activated first.
         (while (not @(:storm-active-atom executor-data))
           (Thread/sleep 100))

         (log-message "Preparing bolt " component-id ":" (keys task-datas))
         (doseq [[task-id task-data] task-datas
                 :let [^IBolt bolt-obj (:object task-data)
                       tasks-fn (:tasks-fn task-data)
                       user-context (:user-context task-data)
                       ;; 2.2.1 bolt-emit
                       bolt-emit (fn [stream anchors values task]
                                   (let [out-tasks (if task
                                                     (tasks-fn task stream values) ;; direct grouping
                                                     (tasks-fn stream values))]
                                     (fast-list-iter [t out-tasks] ;; for every target out-task
                                       (let [anchors-to-ids (HashMap.)] ;; collects the relation between the edges created for this tuple and their roots
                                         (fast-list-iter [^TupleImpl a anchors] ;; for every anchor (source tuple)
                                           (let [root-ids (-> a .getMessageId .getAnchorsToIds .keySet)] ;; all root-ids; an anchor may descend from several roots
                                             (when (pos? (count root-ids))
                                               (let [edge-id (MessageId/generateId rand)] ;; a new edge-id per anchor
                                                 (.updateAckVal a edge-id) ;; XOR the new edge-id into the anchor tuple's _outAckVal, caching it there
                                                 (fast-list-iter [root-id root-ids]
                                                   (put-xor! anchors-to-ids root-id edge-id)) ;; record the new edge against every root-id in anchors-to-ids
                                                 ))))
                                         (transfer-fn t
                                                      (TupleImpl. worker-context
                                                                  values
                                                                  task-id
                                                                  stream
                                                                  (MessageId/makeId anchors-to-ids)))))
                                     (or out-tasks [])))]] ;; return value: the target task ids
           (builtin-metrics/register-all (:builtin-metrics task-data) storm-conf user-context)

           ;; 2.2.2 prepare
           (.prepare bolt-obj
                     storm-conf
                     user-context
                     (OutputCollector.
                       (reify IOutputCollector
                         (emit [this stream anchors values]
                           (bolt-emit stream anchors values nil))
                         (emitDirect [this task stream anchors values]
                           (bolt-emit stream anchors values task))
                         (^void ack [this ^Tuple tuple]
                           (let [^TupleImpl tuple tuple
                                 ack-val (.getAckVal tuple)] ;; the cached new edges
                             (fast-map-iter [[root id] (.. tuple getMessageId getAnchorsToIds)] ;; ack every root recorded in anchors-to-ids
                               (task/send-unanchored task-data
                                                     ACKER-ACK-STREAM-ID
                                                     [root (bit-xor id ack-val)]) ;; the ack message: ack this tuple and sync the new edges in one go
                               ))
                           (let [delta (tuple-time-delta! tuple)] ;; update metrics and stats
                             (task/apply-hooks user-context .boltAck (BoltAckInfo. tuple task-id delta))
                             (when delta
                               (builtin-metrics/bolt-acked-tuple! (:builtin-metrics task-data)
                                                                  executor-stats
                                                                  (.getSourceComponent tuple)
                                                                  (.getSourceStreamId tuple)
                                                                  delta)
                               (stats/bolt-acked-tuple! executor-stats
                                                        (.getSourceComponent tuple)
                                                        (.getSourceStreamId tuple)
                                                        delta))))
                         (^void fail [this ^Tuple tuple]
                           (fast-list-iter [root (.. tuple getMessageId getAnchors)]
                             (task/send-unanchored task-data
                                                   ACKER-FAIL-STREAM-ID
                                                   [root])) ;; fail is simpler: any failed edge fails the whole root
                           (let [delta (tuple-time-delta! tuple)]
                             (task/apply-hooks user-context .boltFail (BoltFailInfo. tuple task-id delta))
                             (when delta
                               (builtin-metrics/bolt-failed-tuple! (:builtin-metrics task-data)
                                                                   executor-stats
                                                                   (.getSourceComponent tuple)
                                                                   (.getSourceStreamId tuple))
                               (stats/bolt-failed-tuple! executor-stats
                                                         (.getSourceComponent tuple)
                                                         (.getSourceStreamId tuple)
                                                         delta))))
                         (reportError [this error]
                           (report-error error)
                           )))))
         (reset! open-or-prepare-was-called? true)
         (log-message "Prepared bolt " component-id ":" (keys task-datas))
         (setup-metrics! executor-data) ;; set up the metrics tick

         (let [receive-queue (:receive-queue executor-data)
               event-handler (mk-task-receiver executor-data tuple-action-fn)] ;; build the receive queue's event-handler from tuple-action-fn
           (disruptor/consumer-started! receive-queue) ;; mark the consumer as started
           (fn []
             (disruptor/consume-batch-when-available receive-queue event-handler) ;; actually consume the receive-queue
             0))) ;; sleep 0s
       :kill-fn (:report-error-and-die executor-data)
       :factory? true
       :thread-name component-id)]))

2.1 tuple-action-fn

First check the tuple's stream-id; METRICS_TICK is handled as described above.

Otherwise it is an ordinary tuple and is handled by the corresponding task.
When one executor thread contains several tasks, this is exactly where the right task-data is picked by task-id,

and the bolt-obj's execute is finally called, which is the user-defined bolt logic:

^IBolt bolt-obj (:object task-data)

(.execute bolt-obj tuple)

2.2 async-loop, starting the thread

2.2.1 bolt-emit

Like send-spout-msg, this is what emit calls to send tuples (Storm's naming here is not very consistent).

Calling tasks-fn to compute the out-tasks and calling transfer-fn to push the tuples onto the send queue are easy to follow.

The tricky part is the handling of anchors-to-ids, which is puzzling at first: what is anchors-to-ids actually for?

It records, within the tuple DAG, the edges created by this tuple and their relation to the roots.

In the code an anchor is a source tuple, although conceptually an anchor feels more like a relationship, which adds to the confusion.
So the logic above creates new edge-ids: even for the same out-task, different anchors produce different edge-ids.

Then, for the root-ids of each anchor, the map [root-id, edge-id] is built (the code XORs the values because different anchors may share the same root).

The end result is the mapping between the edges created by this tuple and all of the roots involved.

Then what is (.updateAckVal a edge-id) for?

It saves one message to the acker. In principle, a message should be sent to the acker when an edge is created, to register it, and another message when the tuple is acked to complete the ack.

Storm optimizes away the message for edge creation.

The optimization works like this:

the newly created edge-id is cached on the parent tuple's _outAckVal. Since the parent tuple is acked right after it has been processed, the new edge information can be synced to the acker together with that ack; see the ack implementation below.

So updateAckVal only XORs the edge-id into the parent tuple's _outAckVal, and no message is sent to the acker at this point.

On how Storm tracks all tuples:

the traditional approach would be to generate a root-id at the spout and an edge-id for every emitted tuple, recording the entire DAG;

on ack, those edge-ids would just be marked or deleted to show they have been processed.

The problem is that for a complex DAG this structure grows large and does not scale well.

Storm's approach is to never record the individual edges. It does not actually care which edges exist, only whether every edge has been acked, so it just keeps XOR-ing: every edge-id enters the checksum twice, and pairs cancel out to 0.
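A minimal sketch of the idea in plain Java (independent of Storm's acker; the edge names are made up): the tracker keeps one 64-bit checksum per root and XORs every edge-id into it twice, once on creation and once on ack, so the checksum returns to 0 exactly when every edge has been acked, no matter how many edges there were.

import java.util.Random;

// Toy demonstration of XOR-based tracking.
public class XorTrackingDemo {
    public static void main(String[] args) {
        Random rand = new Random();
        long checksum = 0L;

        long edgeA = rand.nextLong();       // spout -> bolt1
        long edgeB = rand.nextLong();       // bolt1 -> bolt2

        checksum ^= edgeA;                  // edge A created (reported with ACKER_INIT)
        checksum ^= edgeA ^ edgeB;          // bolt1 acks A and reports the new edge B in one XOR
        System.out.println(checksum == 0);  // false: edge B is still outstanding

        checksum ^= edgeB;                  // bolt2 acks B (no new edges)
        System.out.println(checksum == 0);  // true: the whole tuple tree is fully acked
    }
}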

2.2.2 prepare

The main point is the implementation of OutputCollector.

emit and emitDirect simply delegate to bolt-emit.

The interesting parts are the implementations of ack and fail.

The less obvious detail is that the ack message does not carry the tuple's own edge-id directly, but (bit-xor id ack-val).

This does two things at once: it acks the current tuple and syncs the newly created edges.

Since the acker would XOR both id and ack-val into its recorded value anyway, XOR-ing them here first saves sending two separate values in the message; a sketch of the acker side follows.
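A simplified sketch of that acker-side bookkeeping (not the real acker, which is a system bolt keeping its state in a RotatingMap together with the spout task id; class and method names are made up):

import java.util.HashMap;
import java.util.Map;

// Every ACKER_INIT / ACKER_ACK message carries [rootId, value], and value is
// simply XORed into the stored checksum. Because bolts pre-XOR (id ^ ackVal),
// one message both acks the consumed edge and registers the edges the bolt
// just created.
public class AckerSketch {
    private final Map<Long, Long> pending = new HashMap<>();

    public void onMessage(long rootId, long value) {
        long checksum = pending.getOrDefault(rootId, 0L) ^ value;
        if (checksum == 0L) {
            pending.remove(rootId);
            System.out.println("tuple tree " + rootId + " fully acked -> notify the spout");
        } else {
            pending.put(rootId, checksum);
        }
    }
}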

Summary

If you have had the patience to read this far, two diagrams are included as a bonus...
