Flume has three components: Source, Channel, and Sink. In the source code they correspond to three interfaces of the same names.

When a Flume source receives an event, it stores it into one or more channels. The channel is a passive store that keeps the event until it’s consumed by a Flume sink.

public interface Source extends LifecycleAware, NamedComponent {

  /**
   * Specifies which channel processor will handle this source's events.
   *
   * @param channelProcessor
   */
  public void setChannelProcessor(ChannelProcessor channelProcessor);

  /**
   * Returns the channel processor that will handle this source's events.
   */
  public ChannelProcessor getChannelProcessor();

}


public interface Sink extends LifecycleAware, NamedComponent {
  public void setChannel(Channel channel);
  public Channel getChannel();
  public Status process() throws EventDeliveryException;
  public static enum Status {
    READY, BACKOFF
  }
}



Methods of ChannelProcessor:

 void close()
 void configure(Context context)
          The Context of the associated Source is passed.
 ChannelSelector getSelector()
 void initialize()
 void processEvent(Event event)
          Attempts to put the given event into each configured channel.
 void processEventBatch(List<Event> events)
          Attempts to put the given events into each configured channel.

Through ChannelProcessor, Flume can implement flows in which the events of one source are put into each of its configured channels.
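The fan-out behavior of processEvent and processEventBatch can be pictured with a small, self-contained model. This is only a sketch: SketchChannel and SketchChannelProcessor are hypothetical stand-ins written for illustration, events are plain strings, and the real ChannelProcessor additionally consults a ChannelSelector and wraps each put in a channel transaction, both of which are omitted here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical, simplified stand-in for a Flume channel: an in-memory FIFO buffer.
class SketchChannel {
    private final Queue<String> events = new ArrayDeque<>();

    void put(String event) {
        events.add(event);
    }

    // Returns the oldest event, or null when the channel is empty.
    String take() {
        return events.poll();
    }

    int size() {
        return events.size();
    }
}

// Hypothetical stand-in for ChannelProcessor: copies every event to each configured channel.
class SketchChannelProcessor {
    private final List<SketchChannel> channels;

    SketchChannelProcessor(List<SketchChannel> channels) {
        this.channels = new ArrayList<>(channels);
    }

    void processEvent(String event) {
        for (SketchChannel channel : channels) {
            channel.put(event);
        }
    }

    void processEventBatch(List<String> events) {
        for (String event : events) {
            processEvent(event);
        }
    }
}
```

In this model a source only ever talks to the processor, so adding or removing a channel does not change the source's code.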

public interface Channel extends LifecycleAware, NamedComponent {
  public void put(Event event) throws ChannelException;
  public Event take() throws ChannelException;
  public Transaction getTransaction();
}

A channel connects a Source to a Sink. The source acts as producer while the sink acts as a consumer of events. The channel itself is the buffer between the two.

A channel exposes a Transaction interface that can be used by its clients to ensure atomic put and take semantics. This is necessary to guarantee single hop reliability between agents. For instance, a source will successfully produce an event if and only if that event can be committed to the source's associated channel. Similarly, a sink will consume an event if and only if its respective endpoint can accept the event. The extent of transaction support varies for different channel implementations ranging from strong to best-effort semantics.
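The atomic put and take semantics described above can be modeled in a few lines. The sketch below is an assumption-laden illustration, not Flume's implementation: ModelChannel and ModelTransaction are hypothetical names, events are plain strings, and put/take live on the transaction object for brevity, whereas in Flume they are called on the Channel while a Transaction is open.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory channel that only models commit/rollback semantics.
class ModelChannel {
    private final Deque<String> committed = new ArrayDeque<>();

    ModelTransaction getTransaction() {
        return new ModelTransaction(this);
    }

    int size() {
        return committed.size();
    }

    // Hooks used by ModelTransaction.
    void append(List<String> events) {
        committed.addAll(events);
    }

    String pollFirst() {
        return committed.pollFirst();
    }

    void pushBack(List<String> events) {
        // Re-insert in reverse so the original order is restored at the head.
        for (int i = events.size() - 1; i >= 0; i--) {
            committed.addFirst(events.get(i));
        }
    }
}

class ModelTransaction {
    private final ModelChannel channel;
    private final List<String> pendingPuts = new ArrayList<>();
    private final List<String> pendingTakes = new ArrayList<>();

    ModelTransaction(ModelChannel channel) {
        this.channel = channel;
    }

    // Staged only; the event is not visible in the channel until commit().
    void put(String event) {
        pendingPuts.add(event);
    }

    // Removes the oldest event but remembers it so rollback() can restore it.
    String take() {
        String event = channel.pollFirst();
        if (event != null) {
            pendingTakes.add(event);
        }
        return event;
    }

    void commit() {
        channel.append(pendingPuts); // staged puts become visible together
        pendingPuts.clear();
        pendingTakes.clear();        // taken events are now gone for good
    }

    void rollback() {
        pendingPuts.clear();            // staged puts are discarded
        channel.pushBack(pendingTakes); // taken events return to the channel
        pendingTakes.clear();
    }
}
```

A source in this model would call put and then commit, rolling back on any failure; a sink would call take, try to deliver, and commit only if delivery succeeded.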

Channels are associated with unique names that can be used for separating configuration and working namespaces.



Flume uses a transactional approach to guarantee reliable delivery of events. Sources and sinks wrap the storage and retrieval of events, respectively, in a transaction provided by the channel. This ensures that a set of events is reliably passed from point to point in the flow. In a multi-hop flow, the sink of the previous hop and the source of the next hop both run their transactions to ensure that the data is safely stored in the channel of the next hop.
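The ordering in the multi-hop case (the next hop stores the event before the previous hop lets go of it) can be sketched as follows. This is a deliberately tiny model under stated assumptions: HopSketch and forward are hypothetical names, each channel is a plain Deque of strings, and a full downstream channel stands in for any failure of the next hop's put.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class HopSketch {
    // Tries to move one event from the upstream channel to the downstream channel.
    // Returns true if the event was handed over, false if nothing moved.
    static boolean forward(Deque<String> upstream,
                           Deque<String> downstream,
                           int downstreamCapacity) {
        // The sink looks at the next event without removing it yet.
        String event = upstream.peekFirst();
        if (event == null) {
            return false; // nothing to deliver
        }
        if (downstream.size() >= downstreamCapacity) {
            // The next hop cannot store the event: the sink "rolls back",
            // so the event simply stays in the upstream channel.
            return false;
        }
        downstream.addLast(event); // next hop's source commits its put first
        upstream.pollFirst();      // only then does the sink commit its take
        return true;
    }
}
```

Because the upstream event is removed only after the downstream put has succeeded, a failure at any point leaves the event in at least one channel.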



