本文主要研究一下storm的OpaquePartitionedTridentSpoutExecutor
  
  TridentTopology.newStream
  
  storm-core-1.2.2-sources.jar!/org/apache/storm/trident/TridentTopology.java
  
  public Stream newStream(String txId, IOpaquePartitionedTridentSpout spout) {
  
  return newStream(txId, new OpaquePartitionedTridentSpoutExecutor(spout));
  
  }
  
  TridentTopology.newStream方法,对于IOpaquePartitionedTridentSpout类型的spout会使用OpaquePartitionedTridentSpoutExecutor来包装;而KafkaTridentSpoutOpaque则实现了IOpaquePartitionedTridentSpout接口
  
  TridentTopologyBuilder.buildTopology
  
  storm-core-1.2.2-sources.jar!/org/apache/storm/trident/topology/TridentTopologyBuilder.java
  
  public StormTopology buildTopology(Map<String, Number> masterCoordResources) {
  
  TopologyBuilder builder = new TopologyBuilder();
  
  Map<GlobalStreamId, String> batchIdsForSpouts = fleshOutStreamBatchIds(false);
  
  Map<GlobalStreamId, String> batchIdsForBolts = fleshOutStreamBatchIds(true);
  
  Map<String, List<String>> batchesToCommitIds = new HashMap<>();
  
  Map<String, List<ITridentSpout>> batchesToSpouts = new HashMap<>();
  
  for(String id: _spouts.keySet()) {
  
  TransactionalSpoutComponent c = _spouts.get(id);
  
  if(c.spout instanceof IRichSpout) {
  
  //TODO: wrap this to set the stream name
  
  builder.setSpout(id, (IRichSpout) c.spout, c.parallelism);
  
  } else {
  
  String batchGroup = c.batchGroupId;
  
  if(!batchesToCommitIds.containsKey(batchGroup)) {
  
  batchesToCommitIds.put(batchGroup, new ArrayList<String>());
  
  }
  
  batchesToCommitIds.get(batchGroup).add(c.commitStateId);
  
  if(!batchesToSpouts.containsKey(batchGroup)) {
  
  batchesToSpouts.put(batchGroup, new ArrayList<ITridentSpout>());
  
  }
  
  batchesToSpouts.get(batchGroup).add((ITridentSpout) c.spout);
  
  BoltDeclarer scd =
  
  builder.setBolt(spoutCoordinator(id), new TridentSpoutCoordinator(c.commitStateId, (ITridentSpout) c.spout))
  
  .globalGrouping(masterCoordinator(c.batchGroupId), MasterBatchCoordinator.BATCH_STREAM_ID)
  
  .globalGrouping(masterCoordinator(c.batchGroupId), MasterBatchCoordinator.SUCCESS_STREAM_ID);
  
  for(Map<String, Object> m: c.componentConfs) {
  
  scd.addConfigurations(m);
  
  }
  
  Map<String, TridentBoltExecutor.CoordSpec> specs = new HashMap();
  
  specs.put(c.batchGroupId, new CoordSpec());
  
  BoltDeclarer bd = builder.setBolt(id,
  
  new TridentBoltExecutor(
  
  new TridentSpoutExecutor(
  
  c.commitStateId,
  
  c.streamName,
  
  ((ITridentSpout) c.spout)),
  
  batchIdsForSpouts,
  
  specs),
  
  c.parallelism);
  
  bd.allGrouping(spoutCoordinator(id), MasterBatchCoordinator.BATCH_STREAM_ID);
  
  bd.allGrouping(masterCoordinator(batchGroup), MasterBatchCoordinator.SUCCESS_STREAM_ID);
  
  if(c.spout instanceof ICommitterTridentSpout) {
  
  bd.allGrouping(masterCoordinator(batchGroup), MasterBatchCoordinator.COMMIT_STREAM_ID);
  
  }
  
  for(Map<String, Object> m: c.componentConfs) {
  
  bd.addConfigurations(m);
  
  }
  
  }
  
  }
  
  //......
  
  return builder.createTopology();
  
  }
  
  TridentTopologyBuilder.buildTopology会将IOpaquePartitionedTridentSpout(OpaquePartitionedTridentSpoutExecutor)使用TridentSpoutExecutor包装,然后再使用TridentBoltExecutor包装为bolt
  
  OpaquePartitionedTridentSpoutExecutor
  
  storm-core-1.2.2-sources.jar!/org/apache/storm/trident/spout/OpaquePartitionedTridentSpoutExecutor.java
  
  public class OpaquePartitionedTridentSpoutExecutor implements ICommitterTridentSpout<Object> {
  
  protected final Logger LOG = LoggerFactory.getLogger(OpaquePartitionedTridentSpoutExecutor.class);
  
  IOpaquePartitionedTridentSpout<Object, ISpoutPartition, Object> _spout;
  
  //......
  
  public OpaquePartitionedTridentSpoutExecutor(IOpaquePartitionedTridentSpout<Object, ISpoutPartition, Object> spout) {
  
  _spout = spout;
  
  }
  
  @Override
  
  public ITridentSpout.BatchCoordinator<Object> getCoordinator(String txStateId, Map conf, TopologyContext context) {
  
  return new Coordinator(conf, context);
  
  }
  
  @Override
  
  public ICommitterTridentSpout.Emitter getEmitter(String txStateId, Map conf, TopologyContext context) {
  
  return new Emitter(txStateId, conf, context);
  
  }
  
  @Override
  
  public Fields getOutputFields() {
  
  return _spout.getOutputFields();
  
  }
  
  @Override
  
  public Map<String, Object> getComponentConfiguration() {
  
  return _spout.getComponentConfiguration();
  
  }
  
  }
  
  OpaquePartitionedTridentSpoutExecutor实现了ICommitterTridentSpout,这里getCoordinator返回的是ITridentSpout.BatchCoordinator,getEmitter返回的是ICommitterTridentSpout.Emitter
  
  ITridentSpout.BatchCoordinator
  
  storm-core-1.2.2-sources.jar!/org/apache/storm/trident/spout/OpaquePartitionedTridentSpoutExecutor.java
  
  public class Coordinator implements ITridentSpout.BatchCoordinator<Object> {
  
  IOpaquePartitionedTridentSpout.Coordinator _coordinator;
  
  public Coordinator(Map conf, TopologyContext context) {
  
  _coordinator = _spout.getCoordinator(conf, context);
  
  }
  
  @Override
  
  public Object initializeTransaction(long txid, Object prevMetadata, Object currMetadata) {
  
  LOG.debug("Initialize Transaction. [txid = {}], [prevMetadata = {}], [currMetadata = {}]", txid, prevMetadata, currMetadata);
  
  return _coordinator.getPartitionsForBatch();
  
  }
  
  @Override
  
  public void close() {
  
  LOG.debug("Closing");
  
  _coordinator.close();
  
  LOG.debug("Closed");
  
  }
  
  @Override
  
  public void success(long txid) {
  
  LOG.debug("Success [txid = {}]", txid);
  
  }
  
  @Override
  
  public boolean isReady(long txid) {
  
  boolean ready = _coordinator.isReady(txid);
  
  LOG.debug("[isReady = {}], [txid = {}]", ready, txid);
  
  return ready;
  
  }
  
  }
  
  包装了spout的_coordinator,它的类型IOpaquePartitionedTridentSpout.Coordinator,这里仅仅是多了debug日志
  
  ICommitterTridentSpout.Emitter
  
  storm-core-1.2.2-sources.jar!/org/apache/storm/trident/spout/OpaquePartitionedTridentSpoutExecutor.java
  
  public class Emitter www.gcyl152.com/ implements ICommitterTridentSpout.Emitter {
  
  IOpaquePartitionedTridentSpout.Emitter<Object, ISpoutPartition, Object> _emitter;
  
  TransactionalState _state;
  
  TreeMap<Long, Map<String, Object>> _cachedMetas = new TreeMap<>();
  
  Map<String, EmitterPartitionState> _partitionStates = new HashMap<>();
  
  int _index;
  
  int _numTasks;
  
  public Emitter(String txStateId, Map conf, TopologyContext context) {
  
  _emitter = _spout.getEmitter(conf, context);
  
  _index = context.getThisTaskIndex(www.michenggw.com);
  
  _numTasks = context.getComponentTasks(context.getThisComponentId()).size();
  
  _state = TransactionalState.newUserState(conf, txStateId);
  
  LOG.debug("Created {}", this);
  
  }
  
  Object _savedCoordinatorMeta = null;
  
  boolean _changedMeta = false;
  
  @Override
  
  public void emitBatch(TransactionAttempt tx, Object coordinatorMeta, TridentCollector collector) {
  
  LOG.debug("Emitting Batch. [transaction www.yigouyule2.cn= {}], [coordinatorMeta = {}], [collector = {}], [{}]",
  
  tx, coordinatorMeta, collector, this);
  
  if(_savedCoordinatorMeta==null || !_savedCoordinatorMeta.equals(coordinatorMeta)) {
  
  _partitionStates.clear();
  
  final List<ISpoutPartition> taskPartitions = _emitter.getPartitionsForTask(_index, _numTasks, coordinatorMeta);
  
  for (ISpoutPartition partition : taskPartitions) {
  
  _partitionStates.put(partition.getId(), new EmitterPartitionState(new RotatingTransactionalState(_state, partition.getId()), partition));
  
  }
  
  // refresh all partitions for backwards compatibility with old spout
  
  _emitter.refreshPartitions(_emitter.getOrderedPartitions(coordinatorMeta));
  
  _savedCoordinatorMeta = coordinatorMeta;
  
  _changedMeta = true;
  
  }
  
  Map<String, Object> metas = new HashMap<>();
  
  _cachedMetas.put(tx.getTransactionId(www.mcyllpt.com), metas);
  
  Entry<Long, Map<String, Object>> entry = _cachedMetas.lowerEntry(tx.getTransactionId());
  
  Map<String, Object> prevCached;
  
  if(entry!=null) {
  
  prevCached = entry.getValue();
  
  } else {
  
  prevCached = new HashMap<>();
  
  }
  
  for(Entry<String, EmitterPartitionState> e: _partitionStates.entrySet()) {
  
  String id = e.getKey();
  
  EmitterPartitionState s = e.getValue();
  
  s.rotatingState.removeState(tx.getTransactionId());
  
  Object lastMeta = prevCached.get(id);
  
  if(lastMeta==null) lastMeta = s.rotatingState.getLastState();
  
  Object meta = _emitter.emitPartitionBatch(tx, collector, s.partition, lastMeta);
  
  metas.put(id, meta);
  
  }
  
  LOG.debug("Emitted Batch. [transaction = {}], [coordinatorMeta = {}], [collector = {}], [{}]",
  
  tx, coordinatorMeta, collector, this);
  
  }
  
  @Override
  
  public void success(TransactionAttempt tx) {
  
  for(EmitterPartitionState state: _partitionStates.values()) {
  
  state.rotatingState.cleanupBefore(tx.getTransactionId());
  
  }
  
  LOG.debug("Success transaction {}. [{}]", tx, this);
  
  }
  
  @Override
  
  public void commit(TransactionAttempt attempt) {
  
  LOG.debug("Committing transaction {}. [{}]", attempt, this);
  
  // this code here handles a case where a previous commit failed, and the partitions
  
  // changed since the last commit. This clears out any state for the removed partitions
  
  // for this txid.
  
  // we make sure only a single task ever does this. we're also guaranteed that
  
  // it's impossible for there to be another writer to the directory for that partition
  
  // because only a single commit can be happening at once. this is because in order for
  
  // another attempt of the batch to commit, the batch phase must have succeeded in between.
  
  // hence, all tasks for the prior commit must have finished committing (whether successfully or not)
  
  if(_changedMeta && _index==0) {
  
  Set<String> validIds = new HashSet<>();
  
  for(ISpoutPartition p: _emitter.getOrderedPartitions(_savedCoordinatorMeta)) {
  
  validIds.add(p.getId());
  
  }
  
  for(String existingPartition: _state.list("")) {
  
  if(!validIds.contains(existingPartition)) {
  
  RotatingTransactionalState s = new RotatingTransactionalState(_state, existingPartition);
  
  s.removeState(attempt.getTransactionId());
  
  }
  
  }
  
  _changedMeta = false;
  
  }
  
  Long txid = attempt.getTransactionId();
  
  Map<String, Object> metas = _cachedMetas.remove(txid);
  
  for(Entry<String, Object> entry: metas.entrySet()) {
  
  _partitionStates.get(entry.getKey()).rotatingState.overrideState(txid, entry.getValue());
  
  }
  
  LOG.debug("Exiting commit method for transaction {}. [{}]", attempt, this);
  
  }
  
  @Override
  
  public void close() {
  
  LOG.debug("Closing");
  
  _emitter.close();
  
  LOG.debug("Closed");
  
  }
  
  @Override
  
  public String toString() {
  
  return "Emitter{" +
  
  ", _state=" + _state +
  
  ", _cachedMetas=" + _cachedMetas +
  
  ", _partitionStates=" + _partitionStates +
  
  ", _index=" + _index +
  
  ", _numTasks=" + _numTasks +
  
  ", _savedCoordinatorMeta=" + _savedCoordinatorMeta +
  
  ", _changedMeta=" + _changedMeta +
  
  '}';
  
  }
  
  }
  
  static class EmitterPartitionState {
  
  public RotatingTransactionalState rotatingState;
  
  public ISpoutPartition partition;
  
  public EmitterPartitionState(RotatingTransactionalState s, ISpoutPartition p) {
  
  rotatingState = s;
  
  partition = p;
  
  }
  
  }
  
  这里对spout的IOpaquePartitionedTridentSpout.Emitter进行了封装,_partitionStates使用了EmitterPartitionState
  
  emitBatch方法首先计算_partitionStates,然后计算prevCached,最后调用_emitter.emitPartitionBatch(tx, collector, s.partition, lastMeta)
  
  success方法调用state.rotatingState.cleanupBefore(tx.getTransactionId()),清空该txid之前的状态信息;commit方法主要是更新_partitionStates
  
  KafkaTridentSpoutOpaque
  
  storm-kafka-client-1.2.2-sources.jar!/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutOpaque.java
  
  public class KafkaTridentSpoutOpaque<K,V> implements IOpaquePartitionedTridentSpout<List<Map<String, Object>>,
  
  KafkaTridentSpoutTopicPartition, Map<String, Object>> {
  
  private static final long serialVersionUID = -8003272486566259640L;
  
  private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutOpaque.class);
  
  private final KafkaTridentSpoutManager<K, V> kafkaManager;
  
  public KafkaTridentSpoutOpaque(KafkaSpoutConfig<K, V> conf) {
  
  this(new KafkaTridentSpoutManager<>(conf));
  
  }
  
  public KafkaTridentSpoutOpaque(KafkaTridentSpoutManager<K, V> kafkaManager) {
  
  this.kafkaManager = kafkaManager;
  
  LOG.debug("Created {}", this.toString());
  
  }
  
  @Override
  
  public Emitter<List<Map<String, Object>>, KafkaTridentSpoutTopicPartition, Map<String, Object>> getEmitter(
  
  Map conf, TopologyContext context) {
  
  return new KafkaTridentSpoutEmitter<>(kafkaManager, context);
  
  }
  
  @Override
  
  public Coordinator<List<Map<String, Object>>> getCoordinator(Map conf, TopologyContext context) {
  
  return new KafkaTridentSpoutOpaqueCoordinator<>(kafkaManager);
  
  }
  
  @Override
  
  public Map<String, Object> getComponentConfiguration() {
  
  return null;
  
  }
  
  @Override
  
  public Fields getOutputFields() {
  
  final Fields outputFields = kafkaManager.getFields();
  
  LOG.debug("OutputFields = {}", outputFields);
  
  return outputFields;
  
  }
  
  @Override
  
  public final String toString() {
  
  return super.toString() +
  
  "{kafkaManager=" + kafkaManager + '}';
  
  }
  
  }
  
  KafkaTridentSpoutOpaque的getCoordinator返回的是KafkaTridentSpoutOpaqueCoordinator;getEmitter返回的是KafkaTridentSpoutEmitter
  
  KafkaTridentSpoutOpaqueCoordinator
  
  storm-kafka-client-1.2.2-sources.jar!/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutOpaqueCoordinator.java
  
  public class KafkaTridentSpoutOpaqueCoordinator<K,V> implements IOpaquePartitionedTridentSpout.Coordinator<List<Map<String, Object>>>,
  
  Serializable {
  
  private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutOpaqueCoordinator.class);
  
  private final TopicPartitionSerializer tpSerializer = new TopicPartitionSerializer();
  
  private final KafkaTridentSpoutManager<K,V> kafkaManager;
  
  public KafkaTridentSpoutOpaqueCoordinator(KafkaTridentSpoutManager<K, V> kafkaManager) {
  
  this.kafkaManager = kafkaManager;
  
  LOG.debug("Created {}", this.toString());
  
  }
  
  @Override
  
  public boolean isReady(long txid) {
  
  LOG.debug("isReady = true");
  
  return true; // the "old" trident kafka spout always returns true, like this
  
  }
  
  @Override
  
  public List<Map<String, Object>> getPartitionsForBatch() {
  
  final ArrayList<TopicPartition> topicPartitions = new ArrayList<>(kafkaManager.getTopicPartitions());
  
  LOG.debug("TopicPartitions for batch {}", topicPartitions);
  
  List<Map<String, Object>> tps = new ArrayList<>();
  
  for(TopicPartition tp : topicPartitions) {
  
  tps.add(tpSerializer.toMap(tp));
  
  }
  
  return tps;
  
  }
  
  @Override
  
  public void close() {
  
  LOG.debug("Closed"); // the "old" trident kafka spout is no op like this
  
  }
  
  @Override
  
  public final String toString() {
  
  return super.toString() +
  
  "{kafkaManager=" + kafkaManager +
  
  '}';
  
  }
  
  }
  
  这里的isReady始终返回true,getPartitionsForBatch方法主要是将kafkaManager.getTopicPartitions()信息转换为map结构
  
  KafkaTridentSpoutEmitter
  
  storm-kafka-client-1.2.2-sources.jar!/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutEmitter.java
  
  public class KafkaTridentSpoutEmitter<K, V> implements IOpaquePartitionedTridentSpout.Emitter<
  
  List<Map<String, Object>>,
  
  KafkaTridentSpoutTopicPartition,
  
  Map<String, Object>>,
  
  Serializable {
  
  private static final long serialVersionUID = -7343927794834130435L;
  
  private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutEmitter.class);
  
  // Kafka
  
  private final KafkaConsumer<K, V> kafkaConsumer;
  
  // Bookkeeping
  
  private final KafkaTridentSpoutManager<K, V> kafkaManager;
  
  // set of topic-partitions for which first poll has already occurred, and the first polled txid
  
  private final Map<TopicPartition, Long> firstPollTransaction = new HashMap<>();
  
  // Declare some KafkaTridentSpoutManager references for convenience
  
  private final long pollTimeoutMs;
  
  private final KafkaSpoutConfig.FirstPollOffsetStrategy firstPollOffsetStrategy;
  
  private final RecordTranslator<K, V> translator;
  
  private final Timer refreshSubscriptionTimer;
  
  private final TopicPartitionSerializer tpSerializer = new TopicPartitionSerializer();
  
  private TopologyContext topologyContext;
  
  /**
  
  * Create a new Kafka spout emitter.
  
  * @param kafkaManager The Kafka consumer manager to use
  
  * @param topologyContext The topology context
  
  * @param refreshSubscriptionTimer The timer for deciding when to recheck the subscription
  
  */
  
  public KafkaTridentSpoutEmitter(KafkaTridentSpoutManager<K, V> kafkaManager, TopologyContext topologyContext, Timer refreshSubscriptionTimer) {
  
  this.kafkaConsumer = kafkaManager.createAndSubscribeKafkaConsumer(topologyContext);
  
  this.kafkaManager = kafkaManager;
  
  this.topologyContext = topologyContext;
  
  this.refreshSubscriptionTimer = refreshSubscriptionTimer;
  
  this.translator = kafkaManager.getKafkaSpoutConfig().getTranslator();
  
  final KafkaSpoutConfig<K, V> kafkaSpoutConfig = kafkaManager.getKafkaSpoutConfig();
  
  this.pollTimeoutMs = kafkaSpoutConfig.getPollTimeoutMs();
  
  this.firstPollOffsetStrategy = kafkaSpoutConfig.getFirstPollOffsetStrategy();
  
  LOG.debug("Created {}", this.toString());
  
  }
  
  /**
  
  * Creates instance of this class with default 500 millisecond refresh subscription timer
  
  */
  
  public KafkaTridentSpoutEmitter(KafkaTridentSpoutManager<K, V> kafkaManager, TopologyContext topologyContext) {
  
  this(kafkaManager, topologyContext, new Timer(500,
  
  kafkaManager.getKafkaSpoutConfig().getPartitionRefreshPeriodMs(), TimeUnit.MILLISECONDS));
  
  }
  
  //......
  
  @Override
  
  public Map<String, Object> emitPartitionBatch(TransactionAttempt tx, TridentCollector collector,
  
  KafkaTridentSpoutTopicPartition currBatchPartition, Map<String, Object> lastBatch) {
  
  LOG.debug("Processing batch: [transaction = {}], [currBatchPartition = {}], [lastBatchMetadata = {}], [collector = {}]",
  
  tx, currBatchPartition, lastBatch, collector);
  
  final TopicPartition currBatchTp = currBatchPartition.getTopicPartition();
  
  final Set<TopicPartition> assignments = kafkaConsumer.assignment();
  
  KafkaTridentSpoutBatchMetadata lastBatchMeta = lastBatch == null ? null : KafkaTridentSpoutBatchMetadata.fromMap(lastBatch);
  
  KafkaTridentSpoutBatchMetadata currentBatch = lastBatchMeta;
  
  Collection<TopicPartition> pausedTopicPartitions = Collections.emptySet();
  
  if (assignments == null || !assignments.contains(currBatchPartition.getTopicPartition())) {
  
  LOG.warn("SKIPPING processing batch [transaction = {}], [currBatchPartition = {}], [lastBatchMetadata = {}], " +
  
  "[collector = {}] because it is not part of the assignments {} of consumer instance [{}] " +
  
  "of consumer group [{}]", tx, currBatchPartition, lastBatch, collector, assignments,
  
  kafkaConsumer, kafkaManager.getKafkaSpoutConfig().getConsumerGroupId());
  
  } else {
  
  try {
  
  // pause other topic-partitions to only poll from current topic-partition
  
  pausedTopicPartitions = pauseTopicPartitions(currBatchTp);
  
  seek(currBatchTp, lastBatchMeta, tx.getTransactionId());
  
  // poll
  
  if (refreshSubscriptionTimer.isExpiredResetOnTrue()) {
  
  kafkaManager.getKafkaSpoutConfig().getSubscription().refreshAssignment();
  
  }
  
  final ConsumerRecords<K, V> records = kafkaConsumer.poll(pollTimeoutMs);
  
  LOG.debug("Polled [{}] records from Kafka.", records.count());
  
  if (!records.isEmpty()) {
  
  emitTuples(collector, records);
  
  // build new metadata
  
  currentBatch = new KafkaTridentSpoutBatchMetadata(currBatchTp, records);
  
  }
  
  } finally {
  
  kafkaConsumer.resume(pausedTopicPartitions);
  
  LOG.trace("Resumed topic-partitions {}", pausedTopicPartitions);
  
  }
  
  LOG.debug("Emitted batch: [transaction = {}], [currBatchPartition = {}], [lastBatchMetadata = {}], " +
  
  "[currBatchMetadata = {}], [collector = {}]", tx, currBatchPartition, lastBatch, currentBatch, collector);
  
  }
  
  return currentBatch == null ? null : currentBatch.toMap();
  
  }
  
  private void emitTuples(TridentCollector collector, ConsumerRecords<K, V> records) {
  
  for (ConsumerRecord<K, V> record : records) {
  
  final List<Object> tuple = translator.apply(record);
  
  collector.emit(tuple);
  
  LOG.debug("Emitted tuple {} for record [{}]", tuple, record);
  
  }
  
  }
  
  @Override
  
  public void refreshPartitions(List<KafkaTridentSpoutTopicPartition> partitionResponsibilities) {
  
  LOG.trace("Refreshing of topic-partitions handled by Kafka. " +
  
  "No action taken by this method for topic partitions {}", partitionResponsibilities);
  
  }
  
  /**
  
  * Computes ordered list of topic-partitions for this task taking into consideration that topic-partitions
  
  * for this task must be assigned to the Kafka consumer running on this task.
  
  *
  
  * @param allPartitionInfo list of all partitions as returned by {@link KafkaTridentSpoutOpaqueCoordinator}
  
  * @return ordered list of topic partitions for this task
  
  */
  
  @Override
  
  public List<KafkaTridentSpoutTopicPartition> getOrderedPartitions(final List<Map<String, Object>> allPartitionInfo) {
  
  List<TopicPartition> allTopicPartitions = new ArrayList<>();
  
  for(Map<String, Object> map : allPartitionInfo) {
  
  allTopicPartitions.add(tpSerializer.fromMap(map));
  
  }
  
  final List<KafkaTridentSpoutTopicPartition> allPartitions = newKafkaTridentSpoutTopicPartitions(allTopicPartitions);
  
  LOG.debug("Returning all topic-partitions {} across all tasks. Current task index [{}]. Total tasks [{}] ",
  
  allPartitions, topologyContext.getThisTaskIndex(), getNumTasks());
  
  return allPartitions;
  
  }
  
  @Override
  
  public List<KafkaTridentSpoutTopicPartition> getPartitionsForTask(int taskId, int numTasks,
  
  List<Map<String, Object>> allPartitionInfo) {
  
  final Set<TopicPartition> assignedTps = kafkaConsumer.assignment();
  
  LOG.debug("Consumer [{}], running on task with index [{}], has assigned topic-partitions {}", kafkaConsumer, taskId, assignedTps);
  
  final List<KafkaTridentSpoutTopicPartition> taskTps = newKafkaTridentSpoutTopicPartitions(assignedTps);
  
  LOG.debug("Returning topic-partitions {} for task with index [{}]", taskTps, taskId);
  
  return taskTps;
  
  }
  
  @Override
  
  public void close() {
  
  kafkaConsumer.close();
  
  LOG.debug("Closed");
  
  }
  
  @Override
  
  public final String toString() {
  
  return super.toString() +
  
  "{kafkaManager=" + kafkaManager +
  
  '}';
  
  }
  
  }
  
  这里的refreshSubscriptionTimer的interval取的是kafkaManager.getKafkaSpoutConfig().getPartitionRefreshPeriodMs(),默认是2000
  
  emitPartitionBatch方法没调用一次都会判断refreshSubscriptionTimer.isExpiredResetOnTrue(),如果时间到了,就会调用kafkaManager.getKafkaSpoutConfig().getSubscription().refreshAssignment()刷新assignment
  
  emitPartitionBatch方法主要是找到与该batch关联的partition,停止从其他parition拉取消息,然后根据firstPollOffsetStrategy以及lastBatchMeta信息,调用kafkaConsumer的seek相关方法seek到指定位置
  
  之后就是用kafkaConsumer.poll(pollTimeoutMs)拉取数据,然后emitTuples;emitTuples方法会是用translator转换数据,然后调用collector.emit发射出去
  
  refreshPartitions方法目前仅仅是trace下日志;getOrderedPartitions方法先将allPartitionInfo的数据从map结构反序列化回来,然后转换为KafkaTridentSpoutTopicPartition返回;getPartitionsForTask方法主要是通过kafkaConsumer.assignment()的信息转换为KafkaTridentSpoutTopicPartition返回
  
  小结
  
  storm-kafka-client提供了KafkaTridentSpoutOpaque这个spout作为trident的kafka spout(旧版的为OpaqueTridentKafkaSpout,在storm-kafka类库中),它实现了IOpaquePartitionedTridentSpout接口
  
  TridentTopology.newStream方法,对于IOpaquePartitionedTridentSpout类型的spout会使用OpaquePartitionedTridentSpoutExecutor来包装;TridentTopologyBuilder.buildTopology会将IOpaquePartitionedTridentSpout(OpaquePartitionedTridentSpoutExecutor)先使用TridentSpoutExecutor包装,然后再使用TridentBoltExecutor包装为bolt
  
  OpaquePartitionedTridentSpoutExecutor的getCoordinator返回的是ITridentSpout.BatchCoordinator,getEmitter返回的是ICommitterTridentSpout.Emitter;他们分别对KafkaTridentSpoutOpaque这个原始spout返回的KafkaTridentSpoutOpaqueCoordinator以及KafkaTridentSpoutEmitter进行包装再处理;其中对coordinator加了debug日志,对emitter则主要多了对EmitterPartitionState的存取

认识Linux文件系统的架构的更多相关文章

  1. 干货长文:Linux 文件系统与持久性内存介绍

    关注「开源Linux」,选择"设为星标" 回复「学习」,有我为您特别筛选的学习资料~ 1.Linux 虚拟文件系统介绍 在 Linux 系统中一切皆文件,除了通常所说的狭义的文件以 ...

  2. [转]分布式文件系统FastDFS架构剖析

    [转]分布式文件系统FastDFS架构剖析 http://www.programmer.com.cn/4380/ 文/余庆 FastDFS是一款类Google FS的开源分布式文件系统,它用纯C语言实 ...

  3. linux驱动程序之电源管理之新版linux系统设备架构中关于电源管理方式的变更

    新版linux系统设备架构中关于电源管理方式的变更 based on linux-2.6.32 一.设备模型各数据结构中电源管理的部分 linux的设备模型通过诸多结构体来联合描述,如struct d ...

  4. XFS:大数据环境下Linux文件系统的未来?

    XFS:大数据环境下Linux文件系统的未来?   XFS开发者Dave Chinner近日声称,他认为更多的用户应当考虑XFS.XFS经常被认为是适合拥有海量数据的用户的文件系统,在空间分配方面的可 ...

  5. Linux设备驱动开发详解-Note(11)--- Linux 文件系统与设备文件系统(3)

    Linux 文件系统与设备文件系统(3) 成于坚持,败于止步 sysfs 文件系统与 Linux 设备模型 1.sysfs 文件系统 Linux 2.6 内核引入了 sysfs 文件系统,sysfs ...

  6. Linux命令及架构部署大全

    1.Linux系统基础知识 Linux 基础优化配置 Linux系统根目录结构介绍 linux系统重要子目录介绍 Linux基础命令(之一)详解 Linux基础命令(之二)详解 Linux文件系统 L ...

  7. Linux文件系统1---概述

      1.引言 本文所述关于文件管理的系列文章主要是对陈莉君老师所讲述的文件系统管理知识讲座的整理.Linux可以支持不同的文件系统,它源于unix文件系统,也是unix文件系统的一大特色. 本文主要先 ...

  8. linux文件系统软链接硬链接

    引子 目前,UNIX的文件系统有很多种实现,例如UFS(基于BSD的UNIX文件系统).ext3.ext4.ZFS和Reiserfs等等. 不论哪一种文件系统,总是需要存储数据.硬盘的最小存储单位是扇 ...

  9. 【转帖】Linux 内核系统架构

    Linux 内核系统架构   描述Linux内核的文章已经有上亿字了 但是对于初学者,还是应该多学习多看,毕竟上亿字不能一下子就明白的. 即使看了所有的Linux 内核文章,估计也还不是很明白,这时候 ...

随机推荐

  1. Coursera:一流大学免费在线课程平台

    https://www.coursera.org/ 微软联合创始人 Bill Gates 从公司退隐后,一直和妻子 Melinda 忙于公益事业.但离开 IT 圈并未改变他穿廉价衬衫和保持学习的习惯— ...

  2. P1851 好朋友

    题目背景 小可可和所有其他同学的手腕上都戴有一个射频识别序列号码牌,这样老师就可以方便的计算出他们的人数.很多同学都有一个“好朋友” .如果 A 的序列号的约数之和恰好等于B 的序列号,那么 A的好朋 ...

  3. oracle PL、SQL(二)

    oracle PL.SQL(基础知识点二) --1,参数 in:表示输入类型,可以省略 :out:输出类型不能省略---------- ----案例1:编写一个过程,可以输入雇员的编号,返回该雇员的姓 ...

  4. unity内存管理

    最近一直在研究unity的内存加载,因为它是游戏运行的重中之重,如果不深入理解和合理运用,很可能导致项目因内存太大而崩溃. 详细说一下细节概念:AssetBundle运行时加载:来自文件就用Creat ...

  5. 最优雅退出 Android 应用程序的 6 种方式

    一.容器式 建立一个全局容器,把所有的Activity存储起来,退出时循环遍历finish所有Activity import java.util.ArrayList; import java.util ...

  6. 一个PHP开发APP接口的视频教程

    感觉php做接口方面的教程很少,无意中搜到了这个视频教程,希望能给一些人带来帮助http://www.imooc.com/learn/163

  7. Python100天打卡-Day10-图形用户界面和游戏开发

    基于tkinter模块的GUIPython默认的GUI开发模块是tkinter(在Python 3以前的版本中名为Tkinter)使用tkinter来开发GUI应用需要以下5个步骤: 导入tkinte ...

  8. uva11925 Generating Permutations

    逆序做,逆序输出 紫书上的描述有点问题 感觉很经典 ans.push_back(2); a.insert(a.begin(),a[n-1]); a.erase(a.end()-1); a.push_b ...

  9. shift Alt + up(down) copy current line ! ctrl + j show the control # vscode key

    shift Alt + up(down) copy current line ! ctrl + j show the control # vscode key

  10. CAD交互绘制圆(网页版)

    CAD绘制图像的过程中,画圆的情况是非常常见的,用户可以在控件视区点取任意一点做为圆心,再动态点取半径绘制圆. 主要用到函数说明: _DMxDrawX::DrawCircle 绘制一个圆.详细说明如下 ...