Flink – CEP NFA
Two questions: how does Flink CEP compile a Pattern into an NFA, and when an event arrives, how is it run through that NFA?
The call chain leading up to this point: CEP –> PatternStream –> select –> CEPOperatorUtils.createPatternStream
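For reference, a minimal sketch of the user-facing code that produces this chain (the Event type and its getName() accessor are hypothetical placeholders; the rest is the standard flink-cep 1.3+ API):

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;

// input is a DataStream<Event>; Event and getName() are placeholders
Pattern<Event, ?> pattern = Pattern.<Event>begin("first")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event value) {
            return "start".equals(value.getName());
        }
    })
    .followedBy("second")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event value) {
            return "end".equals(value.getName());
        }
    });

// CEP.pattern wires the pattern onto the stream; PatternStream.select then goes
// through CEPOperatorUtils.createPatternStream, where compileFactory is invoked
PatternStream<Event> patternStream = CEP.pattern(input, pattern);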
1. NFACompiler.compileFactory builds the NFAFactory, converting the pattern into states
// in createPatternStream: obtain the factory
final NFACompiler.NFAFactory<T> nfaFactory = NFACompiler.compileFactory(pattern, inputSerializer, false);

// inside NFACompiler.compileFactory
final NFAFactoryCompiler<T> nfaFactoryCompiler = new NFAFactoryCompiler<>(pattern);
nfaFactoryCompiler.compileFactory();
return new NFAFactoryImpl<>(inputTypeSerializer, nfaFactoryCompiler.getWindowTime(), nfaFactoryCompiler.getStates(), timeoutHandling);
This calls nfaFactoryCompiler.compileFactory:
void compileFactory() {
    // we're traversing the pattern from the end to the beginning --> the first state is the final state
    State<T> sinkState = createEndingState();
    // add all the normal states
    sinkState = createMiddleStates(sinkState);
    // add the beginning state
    createStartState(sinkState);
}
As you can see, the work here is mainly generating states, i.e. converting the pattern into the NFA's State and StateTransition objects.
Because patterns are built by appending stage after stage, each Pattern points back to its predecessor via private final Pattern<T, ? extends T> previous, so the pattern sequence can only be traversed backwards.
First, the final state is created:
private State<T> createEndingState() {
    State<T> endState = createState(ENDING_STATE_NAME, State.StateType.Final);
    windowTime = currentPattern.getWindowTime() != null ? currentPattern.getWindowTime().toMilliseconds() : 0L;
    return endState;
}
This is straightforward: it simply creates a State:
private State<T> createState(String name, State.StateType stateType) {
    String stateName = getUniqueInternalStateName(name);
    usedNames.add(stateName);
    State<T> state = new State<>(stateName, stateType);
    states.add(state);
    return state;
}
Next, the middle states are added:
private State<T> createMiddleStates(final State<T> sinkState) {
    State<T> lastSink = sinkState; // remember the previous state

    // we traverse the pattern graph backwards
    while (currentPattern.getPrevious() != null) {
        checkPatternNameUniqueness(currentPattern.getName());
        lastSink = convertPattern(lastSink); // convert the pattern into a state

        followingPattern = currentPattern;
        currentPattern = currentPattern.getPrevious(); // step back to the preceding pattern

        final Time currentWindowTime = currentPattern.getWindowTime();
        if (currentWindowTime != null && currentWindowTime.toMilliseconds() < windowTime) {
            // the window time is the global minimum of all window times of each state
            windowTime = currentWindowTime.toMilliseconds();
        }
    }
    return lastSink;
}
This calls convertPattern:
private State<T> convertPattern(final State<T> sinkState) {
    final State<T> lastSink;
    lastSink = createSingletonState(sinkState); // only looking at the singleton-state case here
    addStopStates(lastSink);
    return lastSink;
}
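addStopStates is where NOT patterns (notFollowedBy / notNext) come in: each of them compiles to a Stop state, and reaching a Stop state later discards that partial match (see the isStopState branch in NFA.process below). A hedged sketch of a pattern that produces one (Event and the always-true condition are placeholders):

SimpleCondition<Event> cond = new SimpleCondition<Event>() {
    @Override
    public boolean filter(Event value) { return true; } // placeholder condition
};

Pattern<Event, ?> withNot = Pattern.<Event>begin("a").where(cond)
    .notFollowedBy("forbidden").where(cond)   // compiles to a Stop state
    .followedBy("b").where(cond);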
createSingletonState
private State<T> createSingletonState(final State<T> sinkState, final IterativeCondition<T> ignoreCondition, final boolean isOptional) {
    final IterativeCondition<T> currentCondition = (IterativeCondition<T>) currentPattern.getCondition(); // take the condition from the pattern
    final IterativeCondition<T> trueFunction = BooleanConditions.trueFunction();

    final State<T> singletonState = createState(currentPattern.getName(), State.StateType.Normal); // create the singleton state for currentPattern

    // if event is accepted then all notPatterns previous to the optional states are no longer valid
    // ('sink' is derived from sinkState in code elided from this excerpt)
    singletonState.addTake(sink, currentCondition); // add the TAKE StateTransition

    if (isOptional) {
        // if no element accepted the previous nots are still valid.
        singletonState.addProceed(sinkState, trueFunction); // optional patterns also get a PROCEED StateTransition
    }
    return singletonState;
}
addTake simply delegates to addStateTransition:
public void addStateTransition(
        final StateTransitionAction action,
        final State<T> targetState,
        final IterativeCondition<T> condition) {
    stateTransitions.add(new StateTransition<T>(this, action, targetState, condition));
}
createStartState
private State<T> createStartState(State<T> sinkState) {
    checkPatternNameUniqueness(currentPattern.getName());
    final State<T> beginningState = convertPattern(sinkState);
    beginningState.makeStart();
    return beginningState;
}
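Putting the three steps together, here is a rough sketch of what the compiler produces for a simple two-stage pattern such as begin("first").followedBy("second"). This is hand-drawn, not actual compiler output; the exact set of edges also depends on contiguity and quantifiers:

// pattern: begin("first").followedBy("second"), built from the end backwards:
//
//   $endState$ (Final)                      <- createEndingState()
//
//   "second"   (Normal)                     <- createMiddleStates() / convertPattern()
//       TAKE   -> $endState$  [condition of "second"]
//       IGNORE -> "second"    [relaxed contiguity: skip events that do not match "second"]
//
//   "first"    (Start)                      <- createStartState()
//       TAKE   -> "second"    [condition of "first"]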
2. When an event comes in, how is it processed?
AbstractKeyedCEPPatternOperator.processElement
NFA<IN> nfa = getNFA();
processEvent(nfa, element.getValue(), getProcessingTimeService().getCurrentProcessingTime());
updateNFA(nfa);
If the state backend already holds an NFA for this key it is used; otherwise nfaFactory.createNFA() builds a new one:
private NFA<IN> getNFA() throws IOException {
    NFA<IN> nfa = nfaOperatorState.value();
    return nfa != null ? nfa : nfaFactory.createNFA();
}
createNFA
NFA<T> result = new NFA<>(inputTypeSerializer.duplicate(), windowTime, timeoutHandling);
result.addStates(states);
addState
public void addStates(final Collection<State<T>> newStates) {
    for (State<T> state: newStates) {
        addState(state);
    }
}

public void addState(final State<T> state) {
    states.add(state);
    if (state.isStart()) {
        computationStates.add(ComputationState.createStartState(this, state));
    }
}
The states are added to the NFA.
Start states are additionally wrapped as ComputationStates and added to computationStates, because pattern matching always begins from a start state.
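Before reading NFA.process it helps to know what a ComputationState carries. The following is a simplified sketch inferred from the getters used below; it is illustrative only, not the real Flink class, and field names are my own:

// illustrative sketch of what a ComputationState<T> tracks
class ComputationStateSketch<T> {
    State<T> state;           // NFA state this partial match currently sits in
    State<T> previousState;   // state that accepted the most recent event
    T event;                  // most recent event accepted into this partial match
    long timestamp;           // timestamp of that event
    int counter;              // disambiguates duplicate events in the shared buffer
    DeweyNumber version;      // Dewey number identifying this branch in the shared buffer
    long startTimestamp;      // when this partial match started (used for window pruning)
}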
KeyedCEPPatternOperator –> processEvent
protected void processEvent(NFA<IN> nfa, IN event, long timestamp) {
    Tuple2<Collection<Map<String, List<IN>>>, Collection<Tuple2<Map<String, List<IN>>, Long>>> patterns =
        nfa.process(event, timestamp);
    emitMatchedSequences(patterns.f0, timestamp);
}
NFA –> process
public Tuple2<Collection<Map<String, List<T>>>, Collection<Tuple2<Map<String, List<T>>, Long>>> process(final T event, final long timestamp) {
    final int numberComputationStates = computationStates.size();
    final Collection<Map<String, List<T>>> result = new ArrayList<>();
    final Collection<Tuple2<Map<String, List<T>>, Long>> timeoutResult = new ArrayList<>();

    // iterate over all current computations
    for (int i = 0; i < numberComputationStates; i++) {
        ComputationState<T> computationState = computationStates.poll(); // poll one current state

        // let the NFA compute the next batch of states
        final Collection<ComputationState<T>> newComputationStates;
        newComputationStates = computeNextStates(computationState, event, timestamp);

        // delay adding new computation states in case a stop state is reached and we discard the path;
        // newComputationStates may contain a stop state, so not everything ends up in statesToRetain
        final Collection<ComputationState<T>> statesToRetain = new ArrayList<>();
        // if stop state reached in this path
        boolean shouldDiscardPath = false;
        for (final ComputationState<T> newComputationState: newComputationStates) {
            if (newComputationState.isFinalState()) { // a final state means a complete match
                // we've reached a final state and can thus retrieve the matching event sequence
                Map<String, List<T>> matchedPattern = extractCurrentMatches(newComputationState);
                result.add(matchedPattern);

                // remove found patterns because they are no longer needed
                eventSharedBuffer.release(
                    newComputationState.getPreviousState().getName(),
                    newComputationState.getEvent(),
                    newComputationState.getTimestamp(),
                    computationState.getCounter());
            } else if (newComputationState.isStopState()) { // a stop state means this path is discarded
                // reached stop state. release entry for the stop state
                shouldDiscardPath = true;
                eventSharedBuffer.release(
                    newComputationState.getPreviousState().getName(),
                    newComputationState.getEvent(),
                    newComputationState.getTimestamp(),
                    computationState.getCounter());
            } else { // an intermediate state, keep it in statesToRetain
                // add new computation state; it will be processed once the next event arrives
                statesToRetain.add(newComputationState);
            }
        }

        if (shouldDiscardPath) { // release the discarded path
            // a stop state was reached in this branch. release branch which results in removing previous event from
            // the buffer
            for (final ComputationState<T> state : statesToRetain) {
                eventSharedBuffer.release(
                    state.getPreviousState().getName(),
                    state.getEvent(),
                    state.getTimestamp(),
                    state.getCounter());
            }
        } else { // keep the intermediate states as current computation states
            computationStates.addAll(statesToRetain);
        }
    }

    // prune shared buffer based on window length
    if (windowTime > 0L) { // prune partial matches that have exceeded the window
        long pruningTimestamp = timestamp - windowTime;
        if (pruningTimestamp < timestamp) {
            // the check is to guard against underflows
            // remove all elements which are expired with respect to the window length
            eventSharedBuffer.prune(pruningTimestamp);
        }
    }

    return Tuple2.of(result, timeoutResult);
}
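The first element of the returned Tuple2 (result) holds the completed matches, each a Map from pattern name to the events taken for that stage; emitMatchedSequences hands these to the user's PatternSelectFunction. A sketch of the user-facing side (Event is the same placeholder type as above, patternStream is the PatternStream from the sketch near the top, and java.util.Map/List plus PatternSelectFunction are assumed imported):

// sketch: how a completed match (Map<String, List<Event>>) reaches user code
DataStream<String> matches = patternStream.select(new PatternSelectFunction<Event, String>() {
    @Override
    public String select(Map<String, List<Event>> match) {
        Event first = match.get("first").get(0);   // events taken for the "first" stage
        Event second = match.get("second").get(0); // events taken for the "second" stage
        return first + " -> " + second;
    }
});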
computeNextStates
private Collection<ComputationState<T>> computeNextStates(
        final ComputationState<T> computationState,
        final T event,
        final long timestamp) {
    final OutgoingEdges<T> outgoingEdges = createDecisionGraph(computationState, event); // collect all outgoing edges of the state
    final List<StateTransition<T>> edges = outgoingEdges.getEdges();
    final List<ComputationState<T>> resultingComputationStates = new ArrayList<>();

    // (version bookkeeping such as currentVersion / nextVersion / ignoreBranchesToVisit is computed in code elided from this excerpt)
    for (StateTransition<T> edge : edges) {
        switch (edge.getAction()) {
            case IGNORE: {
                if (!computationState.isStartState()) {
                    final DeweyNumber version;
                    if (isEquivalentState(edge.getTargetState(), computationState.getState())) {
                        // Stay in the same state (it can be either looping one or singleton)
                        final int toIncrease = calculateIncreasingSelfState(
                            outgoingEdges.getTotalIgnoreBranches(),
                            outgoingEdges.getTotalTakeBranches());
                        version = computationState.getVersion().increase(toIncrease);
                    } else {
                        // IGNORE after PROCEED
                        version = computationState.getVersion()
                            .increase(totalTakeToSkip + ignoreBranchesToVisit)
                            .addStage();
                        ignoreBranchesToVisit--;
                    }

                    // for IGNORE the event is not taken; just add the target state as a new computation state
                    addComputationState(
                        resultingComputationStates,
                        edge.getTargetState(),
                        computationState.getPreviousState(),
                        computationState.getEvent(),
                        computationState.getCounter(),
                        computationState.getTimestamp(),
                        version,
                        computationState.getStartTimestamp()
                    );
                }
            }
            break;
            case TAKE:
                final State<T> nextState = edge.getTargetState();
                final State<T> currentState = edge.getSourceState();
                final State<T> previousState = computationState.getPreviousState();

                final T previousEvent = computationState.getEvent();

                final int counter;
                final long startTimestamp;
                // for TAKE the current event has to be recorded on the path, i.e. put into eventSharedBuffer
                if (computationState.isStartState()) {
                    startTimestamp = timestamp;
                    counter = eventSharedBuffer.put(
                        currentState.getName(),
                        event,
                        timestamp,
                        currentVersion);
                } else {
                    startTimestamp = computationState.getStartTimestamp();
                    counter = eventSharedBuffer.put(
                        currentState.getName(),
                        event,
                        timestamp,
                        previousState.getName(),
                        previousEvent,
                        computationState.getTimestamp(),
                        computationState.getCounter(),
                        currentVersion);
                }

                addComputationState(
                    resultingComputationStates,
                    nextState,
                    currentState,
                    event,
                    counter,
                    timestamp,
                    nextVersion,
                    startTimestamp);

                // check if newly created state is optional (have a PROCEED path to Final state)
                final State<T> finalState = findFinalStateAfterProceed(nextState, event, computationState);
                if (finalState != null) {
                    addComputationState(
                        resultingComputationStates,
                        finalState,
                        currentState,
                        event,
                        counter,
                        timestamp,
                        nextVersion,
                        startTimestamp);
                }
                break;
        }
    }
    return resultingComputationStates;
}
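Roughly speaking, the three actions line up with the user-facing contiguity options: TAKE consumes the event into the match, IGNORE skips an event without consuming it (what relaxed contiguity compiles to), and PROCEED is an epsilon move used for optional stages. A hedged illustration, with Event and the placeholder condition as before and the exact edges also depending on quantifiers:

SimpleCondition<Event> cond = new SimpleCondition<Event>() {
    @Override
    public boolean filter(Event value) { return true; } // placeholder condition
};

// next("b"): strict contiguity; the compiled "b" state has no IGNORE edge,
// so a non-matching event ends that partial match
Pattern<Event, ?> strict = Pattern.<Event>begin("a").where(cond).next("b").where(cond);

// followedBy("b"): relaxed contiguity; the compiled "b" state gets an IGNORE
// self-loop, so non-matching events are skipped while waiting for "b"
Pattern<Event, ?> relaxed = Pattern.<Event>begin("a").where(cond).followedBy("b").where(cond);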
private OutgoingEdges<T> createDecisionGraph(ComputationState<T> computationState, T event) {
    final OutgoingEdges<T> outgoingEdges = new OutgoingEdges<>(computationState.getState());

    final Stack<State<T>> states = new Stack<>();
    states.push(computationState.getState());

    // First create all outgoing edges, so to be able to reason about the Dewey version
    while (!states.isEmpty()) {
        State<T> currentState = states.pop();
        Collection<StateTransition<T>> stateTransitions = currentState.getStateTransitions(); // all StateTransitions of this state

        // check all state transitions for each state
        for (StateTransition<T> stateTransition : stateTransitions) {
            try {
                if (checkFilterCondition(computationState, stateTransition.getCondition(), event)) {
                    // filter condition is true
                    switch (stateTransition.getAction()) {
                        case PROCEED: // for PROCEED, jump straight to the next state
                            // simply advance the computation state, but apply the current event to it
                            // PROCEED is equivalent to an epsilon transition
                            states.push(stateTransition.getTargetState());
                            break;
                        case IGNORE:
                        case TAKE: // otherwise add the stateTransition as an outgoing edge
                            outgoingEdges.add(stateTransition);
                            break;
                    }
                }
            } catch (Exception e) {
                throw new RuntimeException("Failure happened in filter function.", e);
            }
        }
    }
    return outgoingEdges;
}