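Flume-NG Startup Process Source Code Analysis (Part 3) (Original)
The previous article analyzed how Flume loads its configuration file; dynamic reloading simply runs getConfiguration() again.
This article looks at how the individual components run once the configuration file has been loaded.
Once the configuration has been loaded, the subscriber, the Application class, receives the event and executes: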
@Subscribe
public synchronized void handleConfigurationEvent(MaterializedConfiguration conf) {
  stopAllComponents();
  startAllComponents(conf);
}
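MaterializedConfiguration conf is the configuration returned by getConfiguration(); it is an instance of SimpleMaterializedConfiguration.
handleConfigurationEvent was outlined in Part 1. It consists of two calls: stopAllComponents() and startAllComponents(conf). The materializedConfiguration field of Application holds the MaterializedConfiguration conf. Inside stopAllComponents() that field still refers to the old configuration, so the old components are stopped first; startAllComponents(conf) then assigns the new configuration to materializedConfiguration and starts the components one after another.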
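The stop-then-start order can be illustrated with a minimal sketch. ReloadSketch and its string "components" are hypothetical stand-ins, not Flume classes; only the ordering and the null check on the first load are modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Application; components are plain names here.
class ReloadSketch {
    private List<String> current;                 // plays the role of materializedConfiguration
    final List<String> log = new ArrayList<>();   // records each action in order

    // Same shape as handleConfigurationEvent: stop the old set, then start the new one.
    synchronized void handleConfigurationEvent(List<String> conf) {
        stopAllComponents();
        startAllComponents(conf);
    }

    private void stopAllComponents() {
        if (current != null) {                    // nothing to stop on the very first load
            for (String name : current) {
                log.add("stop " + name);
            }
        }
    }

    private void startAllComponents(List<String> conf) {
        current = conf;                           // the new configuration replaces the old one
        for (String name : conf) {
            log.add("start " + name);
        }
    }

    public static void main(String[] args) {
        ReloadSketch app = new ReloadSketch();
        app.handleConfigurationEvent(Arrays.asList("c1"));   // first load
        app.handleConfigurationEvent(Arrays.asList("c2"));   // reload
        System.out.println(app.log);   // prints [start c1, stop c1, start c2]
    }
}
```

The first event produces no stop entries because there is no previous configuration yet; every later event stops the previous set before starting the new one.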
1. First, the startAllComponents(conf) method. The code is as follows:
private void startAllComponents(MaterializedConfiguration materializedConfiguration) { // start all components; channels, sinks, and sources are the three basic kinds
  logger.info("Starting new configuration:{}", materializedConfiguration);

  this.materializedConfiguration = materializedConfiguration;

  for (Entry<String, Channel> entry :
      materializedConfiguration.getChannels().entrySet()) {
    try{
      logger.info("Starting Channel " + entry.getKey());
      supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
    } catch (Exception e){
      logger.error("Error while starting {}", entry.getValue(), e);
    }
  }

  /*
   * Wait for all channels to start.
   */
  for(Channel ch: materializedConfiguration.getChannels().values()){
    while(ch.getLifecycleState() != LifecycleState.START
        && !supervisor.isComponentInErrorState(ch)){
      try {
        logger.info("Waiting for channel: " + ch.getName() +
            " to start. Sleeping for 500 ms");
        Thread.sleep(500);
      } catch (InterruptedException e) {
        logger.error("Interrupted while waiting for channel to start.", e);
        Throwables.propagate(e);
      }
    }
  }

  for (Entry<String, SinkRunner> entry : materializedConfiguration.getSinkRunners()
      .entrySet()) { // start all sinks
    try{
      logger.info("Starting Sink " + entry.getKey());
      supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
    } catch (Exception e) {
      logger.error("Error while starting {}", entry.getValue(), e);
    }
  }

  for (Entry<String, SourceRunner> entry : materializedConfiguration
      .getSourceRunners().entrySet()) { // start all sources
    try{
      logger.info("Starting Source " + entry.getKey());
      supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
    } catch (Exception e) {
      logger.error("Error while starting {}", entry.getValue(), e);
    }
  }

  this.loadMonitoring();
}
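All three kinds of component are started through supervisor.supervise(entry.getValue(), new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START). After the channels have been handed to the supervisor, the method waits until every channel has either reached the START state or been flagged as being in an error state, and only then starts the sinks and sources. Starting the other two kinds of component before the channels are ready would cause errors, because a sink or source begins exchanging data with its channel as soon as it is up. The startup of sinks and sources is analyzed separately later.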
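The "channels first, then wait, then sinks and sources" ordering can be sketched with plain JDK classes. Everything in this example (StartOrderSketch, the 100 ms simulated start, the 10 ms polling interval) is an assumption made for illustration; Flume itself polls every 500 ms and also leaves the loop when the channel is in an error state.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

class StartOrderSketch {
    // Starts a simulated channel asynchronously and only then "starts" sink and source.
    static List<String> startAll() {
        List<String> events = new CopyOnWriteArrayList<>();
        AtomicBoolean channelStarted = new AtomicBoolean(false);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The channel start runs on another thread, as it does under the supervisor.
            pool.execute(() -> {
                pause(100);                        // simulated slow channel start
                events.add("channel started");
                channelStarted.set(true);
            });
            // Block until the channel reports that it is started.
            while (!channelStarted.get()) {
                pause(10);
            }
            events.add("sink started");
            events.add("source started");
        } finally {
            pool.shutdown();
        }
        return events;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(startAll());   // prints [channel started, sink started, source started]
    }
}
```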
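supervisor is an instance of LifecycleSupervisor. Its constructor creates a ScheduledThreadPoolExecutor with a core size of 10 threads and a configured maximum of 20 for the components to share. (According to the JDK documentation, ScheduledThreadPoolExecutor acts as a fixed-size pool of corePoolSize threads with an unbounded queue, so adjusting the maximum pool size has no useful effect.) The constructor is as follows: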
public LifecycleSupervisor() {
  lifecycleState = LifecycleState.IDLE;
  supervisedProcesses = new HashMap<LifecycleAware, Supervisoree>(); // maps each supervised component to its monitoring information
  monitorFutures = new HashMap<LifecycleAware, ScheduledFuture<?>>();
  monitorService = new ScheduledThreadPoolExecutor(10,
      new ThreadFactoryBuilder().setNameFormat(
          "lifecycleSupervisor-" + Thread.currentThread().getId() + "-%d")
          .build());
  monitorService.setMaximumPoolSize(20);
  monitorService.setKeepAliveTime(30, TimeUnit.SECONDS);
  purger = new Purger();
  needToPurge = false;
}
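To see what this pool configuration amounts to, here is a small sketch that builds an equivalent executor using only the JDK. PoolConfigSketch and workerName are hypothetical names, and the lambda ThreadFactory is a plain-JDK substitute for Guava's ThreadFactoryBuilder, so the thread numbering may differ from Flume's.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolConfigSketch {
    // Builds a pool with the same numbers as the LifecycleSupervisor constructor.
    static ScheduledThreadPoolExecutor newMonitorService() {
        final String prefix = "lifecycleSupervisor-" + Thread.currentThread().getId() + "-";
        final AtomicInteger counter = new AtomicInteger();
        // Plain-JDK replacement for ThreadFactoryBuilder().setNameFormat(...).build()
        ThreadFactory namedThreads = r -> new Thread(r, prefix + counter.getAndIncrement());
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(10, namedThreads);
        pool.setMaximumPoolSize(20);
        pool.setKeepAliveTime(30, TimeUnit.SECONDS);
        return pool;
    }

    // Runs one task on the pool and returns the name of the worker thread that ran it.
    static String workerName(ScheduledThreadPoolExecutor pool) {
        Callable<String> task = () -> Thread.currentThread().getName();
        try {
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor pool = newMonitorService();
        try {
            System.out.println("core=" + pool.getCorePoolSize()
                + " max=" + pool.getMaximumPoolSize()
                + " keepAliveSeconds=" + pool.getKeepAliveTime(TimeUnit.SECONDS));
            System.out.println("worker=" + workerName(pool));
        } finally {
            pool.shutdown();   // worker threads are non-daemon, so shut the pool down
        }
    }
}
```

The thread name prefix makes it easy to spot supervisor threads in a thread dump or in log output.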
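supervise(LifecycleAware lifecycleAware, SupervisorPolicy policy, LifecycleState desiredState) is the method that actually starts each component. All Flume components implement the LifecycleAware interface, which has only three methods: getLifecycleState (returns the component's running state), start (starts the component), and stop (stops the component). The supervise method is as follows: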
public synchronized void supervise(LifecycleAware lifecycleAware,
    SupervisorPolicy policy, LifecycleState desiredState) {
  // check the state of the thread pool
  if(this.monitorService.isShutdown()
      || this.monitorService.isTerminated()
      || this.monitorService.isTerminating()){
    throw new FlumeException("Supervise called on " + lifecycleAware + " " +
        "after shutdown has been initiated. " + lifecycleAware + " will not" +
        " be started");
  }
  // refuse to supervise a component that is already being supervised
  Preconditions.checkState(!supervisedProcesses.containsKey(lifecycleAware),
      "Refusing to supervise " + lifecycleAware + " more than once");

  if (logger.isDebugEnabled()) {
    logger.debug("Supervising service:{} policy:{} desiredState:{}",
        new Object[] { lifecycleAware, policy, desiredState });
  }
  // bookkeeping for the new component
  Supervisoree process = new Supervisoree();
  process.status = new Status();

  process.policy = policy;
  process.status.desiredState = desiredState;
  process.status.error = false;

  MonitorRunnable monitorRunnable = new MonitorRunnable();
  monitorRunnable.lifecycleAware = lifecycleAware; // the component
  monitorRunnable.supervisoree = process;
  monitorRunnable.monitorService = monitorService;

  supervisedProcesses.put(lifecycleAware, process);
  // Creates and executes a periodic action that becomes enabled first after the given
  // initial delay, and subsequently with the given delay between the termination of one
  // execution and the commencement of the next. If any execution of the task encounters
  // an exception, subsequent executions are suppressed.
  ScheduledFuture<?> future = monitorService.scheduleWithFixedDelay(
      monitorRunnable, 0, 3, TimeUnit.SECONDS); // run MonitorRunnable now, then again 3 seconds after each run ends; this is what allows retries
  monitorFutures.put(lifecycleAware, future);
}
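The method first checks that monitorService is still running normally and refuses to supervise the same component twice. It then creates a Supervisoree (Supervisoree process = new Supervisoree()), fills in its fields, builds a MonitorRunnable monitoring task around it, and hands that task to the thread pool for periodic execution.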
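The core of supervise, a duplicate check followed by scheduleWithFixedDelay, can be sketched as below. SuperviseSketch is a hypothetical, heavily reduced model: components are identified by a string name, the monitor is an arbitrary Runnable, and the 20 ms delay in the demo is chosen only to keep the example fast (Flume uses 3 seconds).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SuperviseSketch {
    private final ScheduledExecutorService monitorService = Executors.newScheduledThreadPool(2);
    private final Map<String, ScheduledFuture<?>> monitorFutures = new HashMap<>();

    // Registers a periodic monitor for the named component; a second registration is refused.
    synchronized void supervise(String name, Runnable monitor, long delayMillis) {
        if (monitorService.isShutdown()) {
            throw new IllegalStateException("Supervise called after shutdown: " + name);
        }
        if (monitorFutures.containsKey(name)) {
            throw new IllegalStateException("Refusing to supervise " + name + " more than once");
        }
        // First run immediately, then delayMillis after each run finishes.
        monitorFutures.put(name, monitorService.scheduleWithFixedDelay(
            monitor, 0, delayMillis, TimeUnit.MILLISECONDS));
    }

    void shutdown() {
        monitorService.shutdownNow();
    }

    // Supervises a counting monitor and returns how often it ran once `runs` runs were seen.
    static int demo(int runs) {
        SuperviseSketch supervisor = new SuperviseSketch();
        AtomicInteger count = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(runs);
        try {
            supervisor.supervise("c1", () -> {
                count.incrementAndGet();
                latch.countDown();
            }, 20);
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            supervisor.shutdown();
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println("monitor ran at least 3 times: " + (demo(3) >= 3));
    }
}
```

Because the delay is measured from the end of one run to the start of the next, a slow monitor run never overlaps with the following one.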
The MonitorRunnable.run() method:
public void run() {
  logger.debug("checking process:{} supervisoree:{}", lifecycleAware,
      supervisoree);

  long now = System.currentTimeMillis(); // current timestamp

  try {
    if (supervisoree.status.firstSeen == null) {
      logger.debug("first time seeing {}", lifecycleAware);
      // this is the first time the component is being monitored
      supervisoree.status.firstSeen = now;
    }
    // refreshed on every check, whether or not the component was seen before
    supervisoree.status.lastSeen = now;
    synchronized (lifecycleAware) { // lock the component
      if (supervisoree.status.discard) { // the component is no longer supervised
        // Unsupervise has already been called on this.
        logger.info("Component has already been stopped {}", lifecycleAware);
        return;
      } else if (supervisoree.status.error) { // the component is in an error state
        logger.info("Component {} is in error state, and Flume will not"
            + "attempt to change its state", lifecycleAware);
        return;
      }

      // latest state of the component; before start() has run it is LifecycleState.IDLE
      supervisoree.status.lastSeenState = lifecycleAware.getLifecycleState();

      if (!lifecycleAware.getLifecycleState().equals(
          supervisoree.status.desiredState)) { // the current state differs from the desired state

        logger.debug("Want to transition {} from {} to {} (failures:{})",
            new Object[] { lifecycleAware, supervisoree.status.lastSeenState,
                supervisoree.status.desiredState,
                supervisoree.status.failures });

        switch (supervisoree.status.desiredState) { // act according to the desired state
          case START:
            try {
              lifecycleAware.start(); // start the component; its state becomes LifecycleState.START
            } catch (Throwable e) {
              logger.error("Unable to start " + lifecycleAware
                  + " - Exception follows.", e);
              if (e instanceof Error) {
                // This component can never recover, shut it down.
                supervisoree.status.desiredState = LifecycleState.STOP;
                try {
                  lifecycleAware.stop();
                  logger.warn("Component {} stopped, since it could not be"
                      + "successfully started due to missing dependencies",
                      lifecycleAware);
                } catch (Throwable e1) {
                  logger.error("Unsuccessful attempt to "
                      + "shutdown component: {} due to missing dependencies."
                      + " Please shutdown the agent"
                      + "or disable this component, or the agent will be"
                      + "in an undefined state.", e1);
                  supervisoree.status.error = true;
                  if (e1 instanceof Error) {
                    throw (Error) e1;
                  }
                  // Set the state to stop, so that the conf poller can
                  // proceed.
                }
              }
              supervisoree.status.failures++; // one more failed start attempt
            }
            break;
          case STOP:
            try {
              lifecycleAware.stop(); // stop the component
            } catch (Throwable e) {
              logger.error("Unable to stop " + lifecycleAware
                  + " - Exception follows.", e);
              if (e instanceof Error) {
                throw (Error) e;
              }
              supervisoree.status.failures++; // one more failed stop attempt
            }
            break;
          default:
            logger.warn("I refuse to acknowledge {} as a desired state",
                supervisoree.status.desiredState);
        }
        // There are two SupervisorPolicy implementations, AlwaysRestartPolicy and
        // OnceOnlyPolicy. The former marks a component that may be restarted; the latter
        // marks a component that may run only once and is not used yet.
        if (!supervisoree.policy.isValid(lifecycleAware, supervisoree.status)) {
          logger.error(
              "Policy {} of {} has been violated - supervisor should exit!",
              supervisoree.policy, lifecycleAware);
        }
      }
    }
  } catch(Throwable t) {
    logger.error("Unexpected error", t);
  }
  logger.debug("Status check complete");
}
The lifecycleAware.stop() and lifecycleAware.start() calls above invoke the corresponding methods of the sinks, sources, channels, and so on.
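Stripped of logging and of the special handling for Error, each monitoring pass compares the current state with the desired state, attempts the transition, and counts a failure if the attempt throws. The sketch below shows that reconcile-and-retry idea. ReconcileSketch, FlakyComponent, and the reduced Status class are hypothetical, and the passes are invoked in a simple loop instead of by a scheduler.

```java
class ReconcileSketch {
    enum State { IDLE, START, STOP }

    // Hypothetical component whose start() fails a fixed number of times before succeeding.
    static class FlakyComponent {
        State state = State.IDLE;
        private int remainingFailures;

        FlakyComponent(int failures) {
            this.remainingFailures = failures;
        }

        void start() {
            if (remainingFailures-- > 0) {
                throw new IllegalStateException("not ready yet");
            }
            state = State.START;
        }

        void stop() {
            state = State.STOP;
        }
    }

    // Reduced version of the supervisor's per-component status.
    static class Status {
        State desiredState;
        int failures;
        boolean discard;
        boolean error;
    }

    // One monitoring pass: do nothing if discarded, in error, or already in the desired state.
    static void check(FlakyComponent component, Status status) {
        if (status.discard || status.error) {
            return;
        }
        if (component.state == status.desiredState) {
            return;
        }
        try {
            switch (status.desiredState) {
                case START: component.start(); break;
                case STOP: component.stop(); break;
                default: break;
            }
        } catch (RuntimeException e) {
            status.failures++;   // the next pass will try again
        }
    }

    // Returns {passes needed, failures recorded} for a component that fails `failures` times.
    static int[] passesUntilStarted(int failures) {
        FlakyComponent component = new FlakyComponent(failures);
        Status status = new Status();
        status.desiredState = State.START;
        int passes = 0;
        while (component.state != State.START) {
            check(component, status);
            passes++;
        }
        return new int[] { passes, status.failures };
    }

    public static void main(String[] args) {
        int[] result = passesUntilStarted(2);
        System.out.println("passes=" + result[0] + " failures=" + result[1]);   // prints passes=3 failures=2
    }
}
```

Once the component reaches the desired state, later passes return early, so a healthy component costs only a state comparison per pass.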
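Note how start behaves for each kind of component. For a channel, start() is simply executed directly. For a sink or a PollableSource implementation, start() (reached through the SinkRunner or SourceRunner that is actually handed to the supervisor) launches a thread that calls process() in a loop, either taking data from the channel (sink) or putting data into the channel (source). An EventDrivenSource implementation has no process() method; its start() is responsible for delivering data to the channel (you can start your own threads there to implement the required logic).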
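The polling style, where start() launches a thread that keeps calling process(), can be sketched as follows. RunnerSketch and PollingRunner are hypothetical and much simpler than Flume's runner classes: the channel is a plain BlockingQueue of strings and there are no transactions or back-off logic.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class RunnerSketch {
    // Hypothetical polling runner: start() launches a thread that calls process() in a loop.
    static class PollingRunner {
        private final BlockingQueue<String> channel;
        final List<String> delivered = new CopyOnWriteArrayList<>();
        private volatile boolean shouldStop;
        private Thread thread;

        PollingRunner(BlockingQueue<String> channel) {
            this.channel = channel;
        }

        void start() {
            thread = new Thread(() -> {
                while (!shouldStop) {
                    try {
                        process();
                    } catch (InterruptedException e) {
                        return;   // stop() interrupts the thread; leave the loop
                    }
                }
            }, "runner-sketch");
            thread.start();
        }

        // One unit of work: take at most one event from the channel and "deliver" it.
        void process() throws InterruptedException {
            String event = channel.poll(50, TimeUnit.MILLISECONDS);
            if (event != null) {
                delivered.add(event);
            }
        }

        void stop() {
            shouldStop = true;
            thread.interrupt();
            try {
                thread.join(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Puts two events on the channel and returns what the runner delivered.
    static List<String> demo() {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        PollingRunner runner = new PollingRunner(channel);
        runner.start();
        channel.add("e1");
        channel.add("e2");
        long deadline = System.currentTimeMillis() + 5000;
        while (runner.delivered.size() < 2 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        runner.stop();
        return runner.delivered;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [e1, e2]
    }
}
```

An event-driven component would skip the loop entirely and push events from its own callbacks or threads after start() returns.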
2. The stopAllComponents() method. As the name suggests, it stops all components. The code is as follows:
private void stopAllComponents() {
  if (this.materializedConfiguration != null) {
    logger.info("Shutting down configuration: {}", this.materializedConfiguration);
    for (Entry<String, SourceRunner> entry : this.materializedConfiguration
        .getSourceRunners().entrySet()) {
      try{
        logger.info("Stopping Source " + entry.getKey());
        supervisor.unsupervise(entry.getValue());
      } catch (Exception e){
        logger.error("Error while stopping {}", entry.getValue(), e);
      }
    }

    for (Entry<String, SinkRunner> entry :
        this.materializedConfiguration.getSinkRunners().entrySet()) {
      try{
        logger.info("Stopping Sink " + entry.getKey());
        supervisor.unsupervise(entry.getValue());
      } catch (Exception e){
        logger.error("Error while stopping {}", entry.getValue(), e);
      }
    }

    for (Entry<String, Channel> entry :
        this.materializedConfiguration.getChannels().entrySet()) {
      try{
        logger.info("Stopping Channel " + entry.getKey());
        supervisor.unsupervise(entry.getValue());
      } catch (Exception e){
        logger.error("Error while stopping {}", entry.getValue(), e);
      }
    }
  }
  if(monitorServer != null) {
    monitorServer.stop();
  }
}
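First, note why stopAllComponents() is called before startAllComponents(MaterializedConfiguration materializedConfiguration): because the configuration file can be reloaded dynamically, the old components must be stopped before every load, and only then can the configuration from the latest file be applied.
Second, on the first call to stopAllComponents() the materializedConfiguration field has not been assigned yet, so no components are stopped, and monitorServer is not stopped either. On later reloads the method stops the supervision of sources, sinks, and channels in that order by calling supervisor.unsupervise(entry.getValue()), and then stops monitorServer. The supervisor.unsupervise method is as follows: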
public synchronized void unsupervise(LifecycleAware lifecycleAware) {

  Preconditions.checkState(supervisedProcesses.containsKey(lifecycleAware),
      "Unaware of " + lifecycleAware + " - can not unsupervise");

  logger.debug("Unsupervising service:{}", lifecycleAware);

  synchronized (lifecycleAware) {
    Supervisoree supervisoree = supervisedProcesses.get(lifecycleAware);
    supervisoree.status.discard = true;
    this.setDesiredState(lifecycleAware, LifecycleState.STOP);
    logger.info("Stopping component: {}", lifecycleAware);
    lifecycleAware.stop();
  }
  supervisedProcesses.remove(lifecycleAware);
  //We need to do this because a reconfiguration simply unsupervises old
  //components and supervises new ones.
  monitorFutures.get(lifecycleAware).cancel(false);
  //purges are expensive, so it is done only once every 2 hours.
  needToPurge = true;
  monitorFutures.remove(lifecycleAware);
}
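The method first checks that the component is among those currently supervised (supervisedProcesses.containsKey(lifecycleAware)); if it is not, Preconditions.checkState throws. Otherwise it marks the component as no longer supervised (supervisoree.status.discard = true), sets the desired state to STOP, and stops the component with lifecycleAware.stop(). It then deletes the monitoring records for the component, both from supervisedProcesses, which tracks the components under supervision, and from monitorFutures, which maps each component to its scheduled monitoring task, after cancelling that task. Finally, needToPurge = true signals the thread-pool purge, which according to the code comment runs only once every two hours, that there are cancelled tasks to clean up.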
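Why a separate purge is needed at all can be shown with the JDK executor alone: by default a cancelled periodic task stays in the executor's queue until it is purged. The sketch below is a hypothetical reduction (UnsuperviseSketch, string component names, a no-op monitor scheduled an hour ahead so that it sits in the queue); how Flume's Purger is scheduled is not shown in this article, so purgeIfNeeded is simply called by hand here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class UnsuperviseSketch {
    final ScheduledThreadPoolExecutor monitorService = new ScheduledThreadPoolExecutor(1);
    final Map<String, ScheduledFuture<?>> monitorFutures = new HashMap<>();
    boolean needToPurge;

    synchronized void supervise(String name) {
        // No-op monitor with a long delay, so the task is still waiting in the queue.
        monitorFutures.put(name, monitorService.scheduleWithFixedDelay(
            () -> { }, 1, 1, TimeUnit.HOURS));
    }

    synchronized void unsupervise(String name) {
        if (!monitorFutures.containsKey(name)) {
            throw new IllegalStateException("Unaware of " + name + " - can not unsupervise");
        }
        monitorFutures.remove(name).cancel(false);   // cancel without interrupting a running check
        needToPurge = true;                          // remember that the queue holds a dead task
    }

    // Stand-in for the periodic purge: drops cancelled tasks from the queue.
    synchronized void purgeIfNeeded() {
        if (needToPurge) {
            monitorService.purge();
            needToPurge = false;
        }
    }

    // Returns the queue size after supervise, after unsupervise, and after the purge.
    static int[] demo() {
        UnsuperviseSketch s = new UnsuperviseSketch();
        try {
            s.supervise("c1");
            int afterSupervise = s.monitorService.getQueue().size();
            s.unsupervise("c1");
            int afterCancel = s.monitorService.getQueue().size();
            s.purgeIfNeeded();
            int afterPurge = s.monitorService.getQueue().size();
            return new int[] { afterSupervise, afterCancel, afterPurge };
        } finally {
            s.monitorService.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);   // prints 1 1 0
    }
}
```

The cancelled task never runs again, but it still occupies the queue until purge() is called, which is why a flag plus an infrequent purge is a reasonable trade-off.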
One remaining question: how do sinks and sources find their channels? This was covered in the earlier parts. In AbstractConfigurationProvider.loadSources, a ChannelSelector is used to configure the channels for each source; inside the source, getChannelProcessor() gives access to those channels, and channelProcessor.processEventBatch(eventList) delivers the events to them. In AbstractConfigurationProvider.loadSinks, sink.setChannel(channelComponent.channel) sets the channel for the sink; the sink implementation then obtains it with getChannel() and calls channel.take() to fetch events for processing.
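The wiring described here can be modelled in a few lines. All classes in this sketch (Channel, ChannelProcessor, Source, Sink inside WiringSketch) are simplified, hypothetical stand-ins that only borrow the method names mentioned in the text: events are strings, the processor replicates every event to every channel, and transactions are left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class WiringSketch {
    // Simplified channel: a FIFO queue of string events.
    static class Channel {
        private final Deque<String> queue = new ArrayDeque<>();
        void put(String event) { queue.addLast(event); }
        String take() { return queue.pollFirst(); }   // null when the channel is empty
    }

    // Simplified processor: replicates each event to all configured channels.
    static class ChannelProcessor {
        private final List<Channel> channels;
        ChannelProcessor(List<Channel> channels) { this.channels = channels; }
        void processEventBatch(List<String> events) {
            for (Channel channel : channels) {
                for (String event : events) {
                    channel.put(event);
                }
            }
        }
    }

    static class Source {
        private ChannelProcessor channelProcessor;
        void setChannelProcessor(ChannelProcessor cp) { this.channelProcessor = cp; }
        ChannelProcessor getChannelProcessor() { return channelProcessor; }
    }

    static class Sink {
        private Channel channel;
        void setChannel(Channel channel) { this.channel = channel; }
        Channel getChannel() { return channel; }
        String process() { return getChannel().take(); }
    }

    // Wires one source and one sink to the same channel and drains what the source sent.
    static List<String> demo() {
        Channel channel = new Channel();
        Source source = new Source();
        source.setChannelProcessor(new ChannelProcessor(Arrays.asList(channel)));   // done at load time
        Sink sink = new Sink();
        sink.setChannel(channel);                                                    // done at load time

        source.getChannelProcessor().processEventBatch(Arrays.asList("e1", "e2"));

        List<String> received = new ArrayList<>();
        String event;
        while ((event = sink.process()) != null) {
            received.add(event);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [e1, e2]
    }
}
```

The point of the design is that neither the source nor the sink looks anything up at run time: the configuration provider injects the channel references once, before the components are started.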
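These three parts cover the whole flow of Flume-NG: startup, loading the configuration file, dynamically reloading it, and running the components. Please point out any omissions or mistakes; I will keep refining this material.
More chapters to come.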