If you have not yet read the previous post in this series, "Flume-ng source code analysis: the startup flow", it is worth reading first.

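1 Interface overview

The components are analyzed in the startup order described in the previous post: first Channel, then Sink, and finally Source. Before reading the component source, let's look at two important interfaces: LifecycleAware and NamedComponent.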

1.1 LifecycleAware

    @InterfaceAudience.Public
    @InterfaceStability.Stable
    public interface LifecycleAware {
      public void start();
      public void stop();
      public LifecycleState getLifecycleState();
    }

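Very simple: just three methods, start(), stop(), and getLifecycleState(). Many Flume classes implement this interface, including the PollingPropertiesFileConfigurationProvider mentioned in the startup-flow post. Anything that has a lifecycle implements it, and the components are no exception.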
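To make the idea concrete, here is a minimal sketch of why a shared lifecycle interface is useful: a caller can start and stop components without knowing their concrete types. Everything in it (LifecycleSketch, State, Lifecycle, Component) is a hypothetical stand-in written for this post, not Flume code.

```java
import java.util.Arrays;
import java.util.List;

// All names below are hypothetical stand-ins, not Flume classes.
class LifecycleSketch {

    // Reduced lifecycle state enum, just enough for this sketch.
    enum State { IDLE, START, STOP }

    // Same shape as the LifecycleAware interface shown above.
    interface Lifecycle {
        void start();
        void stop();
        State getLifecycleState();
    }

    // A made-up component that only records its current state.
    static class Component implements Lifecycle {
        private final String name;
        private volatile State state = State.IDLE;

        Component(String name) { this.name = name; }

        @Override public void start() { state = State.START; }
        @Override public void stop() { state = State.STOP; }
        @Override public State getLifecycleState() { return state; }
        String getName() { return name; }
    }

    // The caller depends only on the interface, not on concrete types.
    static void startAll(List<? extends Lifecycle> components) {
        for (Lifecycle c : components) {
            c.start();
        }
    }

    static void stopAll(List<? extends Lifecycle> components) {
        for (Lifecycle c : components) {
            c.stop();
        }
    }

    public static void main(String[] args) {
        List<Component> components = Arrays.asList(
            new Component("channel"), new Component("sink"), new Component("source"));
        startAll(components);
        for (Component c : components) {
            System.out.println(c.getName() + " -> " + c.getLifecycleState());
        }
        stopAll(components);
    }
}
```

The list order in main() mirrors the startup order used in this series (Channel, Sink, Source), but nothing in the sketch enforces it.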

1.2 NamedComponent

    @InterfaceAudience.Public
    @InterfaceStability.Stable
    public interface NamedComponent {
      public void setName(String name);
      public String getName();
    }

Not much to say here: it sets and returns the component's name.

2 Channel

Channel is one of Flume's three core components, so let's look at how it is defined:

    @InterfaceAudience.Public
    @InterfaceStability.Stable
    public interface Channel extends LifecycleAware, NamedComponent {
      public void put(Event event) throws ChannelException;
      public Event take() throws ChannelException;
      public Transaction getTransaction();
    }

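From the interface above, Channel's main operations are put() and take(), so let's look at a concrete implementation. We'll use MemoryChannel as the example, but since the class is long, here is just a short excerpt: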

    public class MemoryChannel extends BasicChannelSemantics {
      private static Logger LOGGER = LoggerFactory.getLogger(MemoryChannel.class);
      private static final Integer defaultCapacity = Integer.valueOf(100);
      private static final Integer defaultTransCapacity = Integer.valueOf(100);

      public MemoryChannel() {
      }
      ...
    }

It extends BasicChannelSemantics, which, as the name suggests, provides the basic channel behavior. Let's look at its implementation:

    @InterfaceAudience.Public
    @InterfaceStability.Stable
    public abstract class BasicChannelSemantics extends AbstractChannel {
      private ThreadLocal<BasicTransactionSemantics> currentTransaction
          = new ThreadLocal<BasicTransactionSemantics>();
      private boolean initialized = false;

      protected void initialize() {}

      protected abstract BasicTransactionSemantics createTransaction();

      @Override
      public void put(Event event) throws ChannelException {
        BasicTransactionSemantics transaction = currentTransaction.get();
        Preconditions.checkState(transaction != null,
            "No transaction exists for this thread");
        transaction.put(event);
      }

      @Override
      public Event take() throws ChannelException {
        BasicTransactionSemantics transaction = currentTransaction.get();
        Preconditions.checkState(transaction != null,
            "No transaction exists for this thread");
        return transaction.take();
      }

      @Override
      public Transaction getTransaction() {
        if (!initialized) {
          synchronized (this) {
            if (!initialized) {
              initialize();
              initialized = true;
            }
          }
        }
        BasicTransactionSemantics transaction = currentTransaction.get();
        if (transaction == null || transaction.getState().equals(
            BasicTransactionSemantics.State.CLOSED)) {
          transaction = createTransaction();
          currentTransaction.set(transaction);
        }
        return transaction;
      }
    }

After some searching we finally find put() and take(), but on closer inspection they simply delegate to the put() and take() of BasicTransactionSemantics. A bit disappointing; on to BasicTransactionSemantics:

    public abstract class BasicTransactionSemantics implements Transaction {
      private State state;
      private long initialThreadId;

      protected void doBegin() throws InterruptedException {}
      protected abstract void doPut(Event event) throws InterruptedException;
      protected abstract Event doTake() throws InterruptedException;
      protected abstract void doCommit() throws InterruptedException;
      protected abstract void doRollback() throws InterruptedException;
      protected void doClose() {}

      protected BasicTransactionSemantics() {
        state = State.NEW;
        initialThreadId = Thread.currentThread().getId();
      }

      protected void put(Event event) {
        Preconditions.checkState(Thread.currentThread().getId() == initialThreadId,
            "put() called from different thread than getTransaction()!");
        Preconditions.checkState(state.equals(State.OPEN),
            "put() called when transaction is %s!", state);
        Preconditions.checkArgument(event != null,
            "put() called with null event!");
        try {
          doPut(event);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new ChannelException(e.toString(), e);
        }
      }

      protected Event take() {
        Preconditions.checkState(Thread.currentThread().getId() == initialThreadId,
            "take() called from different thread than getTransaction()!");
        Preconditions.checkState(state.equals(State.OPEN),
            "take() called when transaction is %s!", state);
        try {
          return doTake();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return null;
        }
      }

      protected State getState() {
        return state;
      }

      ... // only put and take are discussed here, so unrelated methods are omitted; interested readers can find them in the source

      protected static enum State {
        NEW, OPEN, COMPLETED, CLOSED
      }
    }

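Another abstract class: put() and take() in turn call the abstract methods doPut() and doTake(). Impatient readers may have given up by now, but we are one step away. Since the class is abstract, the Channel must ultimately use a concrete subclass. Going back to MemoryChannel to look for clues, we find an inner class hiding there: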

    private class MemoryTransaction extends BasicTransactionSemantics {
      private LinkedBlockingDeque<Event> takeList;
      private LinkedBlockingDeque<Event> putList;
      private final ChannelCounter channelCounter;
      private int putByteCounter = 0;
      private int takeByteCounter = 0;

      public MemoryTransaction(int transCapacity, ChannelCounter counter) {
        putList = new LinkedBlockingDeque<Event>(transCapacity);
        takeList = new LinkedBlockingDeque<Event>(transCapacity);
        channelCounter = counter;
      }

      @Override
      protected void doPut(Event event) throws InterruptedException {
        channelCounter.incrementEventPutAttemptCount();
        int eventByteSize = (int) Math.ceil(estimateEventSize(event) / byteCapacitySlotSize);
        if (!putList.offer(event)) {
          throw new ChannelException(
              "Put queue for MemoryTransaction of capacity " +
              putList.size() + " full, consider committing more frequently, " +
              "increasing capacity or increasing thread count");
        }
        putByteCounter += eventByteSize;
      }

      @Override
      protected Event doTake() throws InterruptedException {
        channelCounter.incrementEventTakeAttemptCount();
        if (takeList.remainingCapacity() == 0) {
          throw new ChannelException("Take list for MemoryTransaction, capacity " +
              takeList.size() + " full, consider committing more frequently, " +
              "increasing capacity, or increasing thread count");
        }
        if (!queueStored.tryAcquire(keepAlive, TimeUnit.SECONDS)) {
          return null;
        }
        Event event;
        synchronized (queueLock) {
          event = queue.poll();
        }
        Preconditions.checkNotNull(event, "Queue.poll returned NULL despite semaphore " +
            "signalling existence of entry");
        takeList.put(event);
        int eventByteSize = (int) Math.ceil(estimateEventSize(event) / byteCapacitySlotSize);
        takeByteCounter += eventByteSize;
        return event;
      }

      // ... methods not needed for now are again omitted
    }

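This class contains the implementations of doPut() and doTake(), so MemoryChannel's put() and take() ultimately end up in MemoryTransaction's doPut() and doTake().

You might think the analysis ends here, but the best is yet to come: two more important classes are tied to Channel, ChannelProcessor and ChannelSelector. Bear with me.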
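The delegation chain is easier to follow in miniature. The sketch below is a hypothetical, heavily simplified model (MiniChannel and MiniTransaction are made-up names, and events are plain strings): puts and takes are staged in per-transaction lists, and a thread-local holds the current transaction as in getTransaction() above. The commit and rollback methods are omitted from the excerpt, so their behavior here (commit publishes staged puts, rollback returns taken events to the head of the queue) is my assumption about the intent, not a copy of Flume's code. It also skips begin(), the state checks, capacity limits, and counters.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical miniature of the channel/transaction pair; events are plain strings.
class TransactionSketch {

    static class MiniChannel {
        private final Deque<String> queue = new ArrayDeque<>();   // committed events
        private final ThreadLocal<MiniTransaction> current = new ThreadLocal<>();

        // Reuse this thread's transaction unless it is missing or closed.
        MiniTransaction getTransaction() {
            MiniTransaction tx = current.get();
            if (tx == null || tx.closed) {
                tx = new MiniTransaction(this);
                current.set(tx);
            }
            return tx;
        }

        // put() and take() only delegate to the calling thread's transaction.
        void put(String event) { currentOrFail().doPut(event); }
        String take() { return currentOrFail().doTake(); }

        int size() {
            synchronized (queue) { return queue.size(); }
        }

        private MiniTransaction currentOrFail() {
            MiniTransaction tx = current.get();
            if (tx == null || tx.closed) {
                throw new IllegalStateException("No transaction exists for this thread");
            }
            return tx;
        }
    }

    static class MiniTransaction {
        private final MiniChannel channel;
        private final List<String> putList = new ArrayList<>();    // staged puts
        private final Deque<String> takeList = new ArrayDeque<>(); // staged takes
        private boolean closed = false;

        MiniTransaction(MiniChannel channel) { this.channel = channel; }

        void doPut(String event) { putList.add(event); }

        String doTake() {
            String event;
            synchronized (channel.queue) { event = channel.queue.poll(); }
            if (event != null) { takeList.addLast(event); }
            return event;
        }

        // Publish staged puts; staged takes become final.
        void commit() {
            synchronized (channel.queue) { channel.queue.addAll(putList); }
            putList.clear();
            takeList.clear();
        }

        // Drop staged puts and return taken events to the head of the queue, in order.
        void rollback() {
            synchronized (channel.queue) {
                while (!takeList.isEmpty()) { channel.queue.addFirst(takeList.removeLast()); }
            }
            putList.clear();
        }

        void close() { closed = true; }
    }

    public static void main(String[] args) {
        MiniChannel channel = new MiniChannel();
        MiniTransaction tx = channel.getTransaction();
        try {
            channel.put("e1");
            channel.put("e2");
            tx.commit();
        } catch (RuntimeException ex) {
            tx.rollback();
            throw ex;
        } finally {
            tx.close();
        }
        System.out.println("events in channel: " + channel.size());
    }
}
```

The try/commit/rollback/close shape in main() is the usual way a caller drives a transaction; until commit() runs, nothing staged by put() is visible in the channel's queue.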

3 ChannelProcessor

ChannelProcessor performs the put operations that place data into channels. Each ChannelProcessor instance holds a ChannelSelector that decides which channels an event is put into.

    public class ChannelProcessor implements Configurable {
      private static final Logger LOG = LoggerFactory.getLogger(ChannelProcessor.class);
      private final ChannelSelector selector;
      private final InterceptorChain interceptorChain;

      public ChannelProcessor(ChannelSelector selector) {
        this.selector = selector;
        this.interceptorChain = new InterceptorChain();
      }

      public void initialize() {
        this.interceptorChain.initialize();
      }

      public void close() {
        this.interceptorChain.close();
      }

      public void configure(Context context) {
        this.configureInterceptors(context);
      }

      private void configureInterceptors(Context context) {
        // configure the interceptors
      }

      public ChannelSelector getSelector() {
        return this.selector;
      }

      public void processEventBatch(List<Event> events) {
        ...
        while(i$.hasNext()) {
          Event optChannel = (Event)i$.next();
          List tx = this.selector.getRequiredChannels(optChannel);
          ... // add the event to the queue of each required channel
          t1 = this.selector.getOptionalChannels(optChannel);
          Object eventQueue;
          ... // add the event to the queue of each optional channel
        }
        ... // dispatch the events
      }

      public void processEvent(Event event) {
        event = this.interceptorChain.intercept(event);
        if(event != null) {
          List requiredChannels = this.selector.getRequiredChannels(event);
          Iterator optionalChannels = requiredChannels.iterator();
          ... // dispatch the event
          List optionalChannels1 = this.selector.getOptionalChannels(event);
          Iterator i$1 = optionalChannels1.iterator();
          ... // dispatch the event
        }
      }
    }

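I trimmed the code to keep only the parts that need explaining (the odd variable names come from decompiled code). In short, both write methods of ChannelProcessor, processEventBatch() and processEvent(), ask the selector passed to the constructor for the target channels and then put the event into them. Next, let's look at ChannelSelector.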
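Before moving on, here is a rough sketch of the difference between required and optional channels during dispatch. The dispatch code itself is elided above, so this is only my reading of the intent: a failure on a required channel reaches the caller, while a failure on an optional channel is recorded and skipped without retry. DispatchSketch, StubChannel, and process() are hypothetical names, and transactions are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DispatchSketch {

    // Hypothetical stand-in for a channel that rejects events once it is full.
    static class StubChannel {
        final String name;
        final int capacity;
        final List<String> events = new ArrayList<>();

        StubChannel(String name, int capacity) {
            this.name = name;
            this.capacity = capacity;
        }

        void put(String event) {
            if (events.size() >= capacity) {
                throw new IllegalStateException(name + " is full");
            }
            events.add(event);
        }
    }

    // Returns the names of optional channels that failed, so the caller can log them.
    static List<String> process(String event,
                                List<StubChannel> required,
                                List<StubChannel> optional) {
        for (StubChannel ch : required) {
            ch.put(event);                     // a failure here propagates to the caller
        }
        List<String> failedOptional = new ArrayList<>();
        for (StubChannel ch : optional) {
            try {
                ch.put(event);
            } catch (RuntimeException ex) {
                failedOptional.add(ch.name);   // swallowed: no retry, no exception
            }
        }
        return failedOptional;
    }

    public static void main(String[] args) {
        StubChannel required = new StubChannel("mem-channel-1", 10);
        StubChannel optional = new StubChannel("file-channel-2", 0); // always full
        List<String> failed =
            process("event", Arrays.asList(required), Arrays.asList(optional));
        System.out.println("required holds " + required.events.size()
            + " event(s), failed optional channels: " + failed);
    }
}
```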

4 ChannelSelector

ChannelSelector is an interface whose implementations are created through ChannelSelectorFactory. Flume provides two implementations: MultiplexingChannelSelector and ReplicatingChannelSelector.

    public interface ChannelSelector extends NamedComponent, Configurable {
      void setChannels(List<Channel> var1);
      List<Channel> getRequiredChannels(Event var1);
      List<Channel> getOptionalChannels(Event var1);
      List<Channel> getAllChannels();
    }

Selectors are created via ChannelSelectorFactory.create(), which calls getSelectorForType() to instantiate the implementation matching the type given in the configuration file:

    public class ChannelSelectorFactory {
      private static final Logger LOGGER = LoggerFactory.getLogger(
          ChannelSelectorFactory.class);

      public static ChannelSelector create(List<Channel> channels,
          Map<String, String> config) {
        ...
      }

      public static ChannelSelector create(List<Channel> channels,
          ChannelSelectorConfiguration conf) {
        String type = ChannelSelectorType.REPLICATING.toString();
        if (conf != null) {
          type = conf.getType();
        }
        ChannelSelector selector = getSelectorForType(type);
        selector.setChannels(channels);
        Configurables.configure(selector, conf);
        return selector;
      }

      private static ChannelSelector getSelectorForType(String type) {
        if (type == null || type.trim().length() == 0) {
          return new ReplicatingChannelSelector();
        }
        String selectorClassName = type;
        ChannelSelectorType selectorType = ChannelSelectorType.OTHER;
        try {
          selectorType = ChannelSelectorType.valueOf(type.toUpperCase(Locale.ENGLISH));
        } catch (IllegalArgumentException ex) {
          LOGGER.debug("Selector type {} is a custom type", type);
        }
        if (!selectorType.equals(ChannelSelectorType.OTHER)) {
          selectorClassName = selectorType.getChannelSelectorClassName();
        }
        ChannelSelector selector = null;
        try {
          @SuppressWarnings("unchecked")
          Class<? extends ChannelSelector> selectorClass =
              (Class<? extends ChannelSelector>) Class.forName(selectorClassName);
          selector = selectorClass.newInstance();
        } catch (Exception ex) {
          throw new FlumeException("Unable to load selector type: " + type
              + ", class: " + selectorClassName, ex);
        }
        return selector;
      }
    }
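The name-resolution part of getSelectorForType() can be isolated into a small sketch: a known alias is looked up in an enum, and anything else is treated as a custom class name. SelectorTypeSketch and its Type enum are hypothetical stand-ins for ChannelSelectorType, and the two class names are the package names as I recall them, so treat them as assumptions. The sketch stops before the reflection step and only returns the class name that would be loaded.

```java
import java.util.Locale;

// Hypothetical stand-in for the alias lookup; it resolves names but loads nothing.
class SelectorTypeSketch {

    enum Type {
        OTHER(null),
        // Class names below are assumed, not verified against a Flume release.
        REPLICATING("org.apache.flume.channel.ReplicatingChannelSelector"),
        MULTIPLEXING("org.apache.flume.channel.MultiplexingChannelSelector");

        final String className;

        Type(String className) { this.className = className; }
    }

    static String resolveClassName(String type) {
        // No type configured: fall back to replicating.
        if (type == null || type.trim().length() == 0) {
            return Type.REPLICATING.className;
        }
        Type resolved = Type.OTHER;
        try {
            resolved = Type.valueOf(type.toUpperCase(Locale.ENGLISH));
        } catch (IllegalArgumentException ex) {
            // Not a built-in alias: treat the value as a custom class name.
        }
        return resolved == Type.OTHER ? type : resolved.className;
    }

    public static void main(String[] args) {
        System.out.println(resolveClassName("multiplexing"));
        System.out.println(resolveClassName(""));
        System.out.println(resolveClassName("com.example.MySelector"));
    }
}
```

Using Locale.ENGLISH for the upper-casing, as the original does, keeps the lookup independent of the JVM's default locale.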

A few words on the two selectors:

1) MultiplexingChannelSelector

Here is a sample channel selector configuration:

    agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
    agent_foo.sources.avro-AppSrv-source1.selector.header = State
    agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
    agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
    agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
    agent_foo.sources.avro-AppSrv-source1.selector.optional.CA = mem-channel-1 file-channel-2
    agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
    agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1

MultiplexingChannelSelector defines three fields that hold the different kinds of channels:

    private Map<String, List<Channel>> channelMapping;
    private Map<String, List<Channel>> optionalChannels;
    private List<Channel> defaultChannels;

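The routing rules are as follows:

  • If a mapping is set for the event's header value, the event always goes to the mapped channels; if optional channels are also configured, it is sent to them as well.
  • If no mapping applies but a default is set, the event goes to the default channels; if optional channels are also configured, it is sent to them as well.
  • If neither a mapping nor a default is specified but optional channels are, the event goes to the optional channels. Writes to optional channels are not retried on failure.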
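The rules above can be sketched as two lookups keyed by the header value. This is a hypothetical model, not the Flume class: channels are represented by their names, MultiplexingSketch and its helper methods are made up, and the sample data follows the configuration shown earlier. As a simplification of the sketch, only file-channel-2 is listed as optional for CA, so that mem-channel-1 is not both required and optional for the same value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: channels are represented by their names only.
class MultiplexingSketch {
    private final String headerName;
    private final Map<String, List<String>> channelMapping = new HashMap<>();
    private final Map<String, List<String>> optionalChannels = new HashMap<>();
    private final List<String> defaultChannels;

    MultiplexingSketch(String headerName, List<String> defaultChannels) {
        this.headerName = headerName;
        this.defaultChannels = defaultChannels;
    }

    void map(String headerValue, String... channels) {
        channelMapping.put(headerValue, Arrays.asList(channels));
    }

    void optional(String headerValue, String... channels) {
        optionalChannels.put(headerValue, Arrays.asList(channels));
    }

    // Mapped channels if the header value has a mapping, otherwise the defaults.
    List<String> getRequiredChannels(Map<String, String> headers) {
        String value = headers.get(headerName);
        if (value == null || value.trim().length() == 0) {
            return defaultChannels;
        }
        List<String> channels = channelMapping.get(value);
        if (channels == null) {
            return defaultChannels;
        }
        return channels;
    }

    // Optional channels for the header value, or an empty list.
    List<String> getOptionalChannels(Map<String, String> headers) {
        List<String> channels = optionalChannels.get(headers.get(headerName));
        if (channels == null) {
            return Collections.emptyList();
        }
        return channels;
    }

    // Sample data modeled on the configuration shown earlier.
    static MultiplexingSketch example() {
        MultiplexingSketch selector =
            new MultiplexingSketch("State", Arrays.asList("mem-channel-1"));
        selector.map("CA", "mem-channel-1");
        selector.map("AZ", "file-channel-2");
        selector.map("NY", "mem-channel-1", "file-channel-2");
        selector.optional("CA", "file-channel-2");
        return selector;
    }

    public static void main(String[] args) {
        MultiplexingSketch selector = example();
        Map<String, String> headers = Collections.singletonMap("State", "CA");
        System.out.println("required: " + selector.getRequiredChannels(headers));
        System.out.println("optional: " + selector.getOptionalChannels(headers));
    }
}
```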

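2) ReplicatingChannelSelector

The routing rule here is simpler:

  • With replicating, if no optional channels are specified, every channel receives every event. A channel marked as optional is removed from the required channels and is written to only as an optional channel.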
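As a small illustration of that rule, the required list is simply all channels minus the ones marked optional. ReplicatingSketch and requiredChannels() are hypothetical names, and channels are again represented by strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the replicating rule; channels are names only.
class ReplicatingSketch {

    // Every channel is required unless it has been marked optional.
    static List<String> requiredChannels(List<String> all, List<String> optional) {
        List<String> required = new ArrayList<>(all);
        required.removeAll(optional);
        return required;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("c1", "c2", "c3");
        System.out.println(requiredChannels(all, Collections.<String>emptyList()));
        System.out.println(requiredChannels(all, Arrays.asList("c3")));
    }
}
```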

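5 Summary

As the link between the two ends, Channel carries the data coming from a Source on to a Sink. ChannelProcessor puts each event into the channels assigned to it, and the assignment rule is decided by the selector; Flume ships two selectors, multiplexing and replicating. ChannelProcessor is therefore normally invoked from a Source, while Channel's take() is invoked from a Sink.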
