Flink – CEP NFA
Two questions: how does Flink CEP compile a Pattern into an NFA, and when an event arrives, how is it run through that NFA?
The call chain leading up to this point: CEP –> PatternStream –> select –> CEPOperatorUtils.createPatternStream
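For reference, a minimal sketch of the user-facing code that produces this chain (the Event type and its getName() accessor are hypothetical placeholders; the rest is the standard flink-cep 1.3+ API):

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;

// input is a DataStream<Event>; Event and getName() are placeholders
Pattern<Event, ?> pattern = Pattern.<Event>begin("first")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event value) {
            return "start".equals(value.getName());
        }
    })
    .followedBy("second")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event value) {
            return "end".equals(value.getName());
        }
    });

// CEP.pattern wires the pattern onto the stream; PatternStream.select then goes
// through CEPOperatorUtils.createPatternStream, where compileFactory is invoked
PatternStream<Event> patternStream = CEP.pattern(input, pattern);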
1. NFACompiler.compileFactory builds the NFAFactory, converting the pattern into states
// in createPatternStream: obtain the factory
final NFACompiler.NFAFactory<T> nfaFactory = NFACompiler.compileFactory(pattern, inputSerializer, false);

// inside NFACompiler.compileFactory
final NFAFactoryCompiler<T> nfaFactoryCompiler = new NFAFactoryCompiler<>(pattern);
nfaFactoryCompiler.compileFactory();
return new NFAFactoryImpl<>(inputTypeSerializer, nfaFactoryCompiler.getWindowTime(), nfaFactoryCompiler.getStates(), timeoutHandling);
This calls nfaFactoryCompiler.compileFactory:
void compileFactory() {
    // we're traversing the pattern from the end to the beginning --> the first state is the final state
    State<T> sinkState = createEndingState();
    // add all the normal states
    sinkState = createMiddleStates(sinkState);
    // add the beginning state
    createStartState(sinkState);
}
As you can see, the work here is mainly generating states, i.e. converting the pattern into the NFA's State and StateTransition objects.
Because patterns are built by appending stage after stage, each Pattern points back to its predecessor via private final Pattern<T, ? extends T> previous, so the pattern sequence can only be traversed backwards.
First, the final state is created:
private State<T> createEndingState() {
    State<T> endState = createState(ENDING_STATE_NAME, State.StateType.Final);
    windowTime = currentPattern.getWindowTime() != null ? currentPattern.getWindowTime().toMilliseconds() : 0L;
    return endState;
}
This is straightforward: it simply creates a State:
private State<T> createState(String name, State.StateType stateType) {
    String stateName = getUniqueInternalStateName(name);
    usedNames.add(stateName);
    State<T> state = new State<>(stateName, stateType);
    states.add(state);
    return state;
}
Next, the middle states are added:
private State<T> createMiddleStates(final State<T> sinkState) {
    State<T> lastSink = sinkState; // remember the previous state

    // we traverse the pattern graph backwards
    while (currentPattern.getPrevious() != null) {
        checkPatternNameUniqueness(currentPattern.getName());
        lastSink = convertPattern(lastSink); // convert the pattern into a state

        followingPattern = currentPattern;
        currentPattern = currentPattern.getPrevious(); // step back to the preceding pattern

        final Time currentWindowTime = currentPattern.getWindowTime();
        if (currentWindowTime != null && currentWindowTime.toMilliseconds() < windowTime) {
            // the window time is the global minimum of all window times of each state
            windowTime = currentWindowTime.toMilliseconds();
        }
    }
    return lastSink;
}
This calls convertPattern:
private State<T> convertPattern(final State<T> sinkState) {
    final State<T> lastSink;
    lastSink = createSingletonState(sinkState); // only looking at the singleton-state case here
    addStopStates(lastSink);
    return lastSink;
}
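addStopStates is where NOT patterns (notFollowedBy / notNext) come in: each of them compiles to a Stop state, and reaching a Stop state later discards that partial match (see the isStopState branch in NFA.process below). A hedged sketch of a pattern that produces one (Event and the always-true condition are placeholders):

SimpleCondition<Event> cond = new SimpleCondition<Event>() {
    @Override
    public boolean filter(Event value) { return true; } // placeholder condition
};

Pattern<Event, ?> withNot = Pattern.<Event>begin("a").where(cond)
    .notFollowedBy("forbidden").where(cond)   // compiles to a Stop state
    .followedBy("b").where(cond);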
createSingletonState
private State<T> createSingletonState(final State<T> sinkState, final IterativeCondition<T> ignoreCondition, final boolean isOptional) {
    final IterativeCondition<T> currentCondition = (IterativeCondition<T>) currentPattern.getCondition(); // take the condition from the pattern
    final IterativeCondition<T> trueFunction = BooleanConditions.trueFunction();

    final State<T> singletonState = createState(currentPattern.getName(), State.StateType.Normal); // create the singleton state for currentPattern

    // if event is accepted then all notPatterns previous to the optional states are no longer valid
    // ('sink' is derived from sinkState in code elided from this excerpt)
    singletonState.addTake(sink, currentCondition); // add the TAKE StateTransition

    if (isOptional) {
        // if no element accepted the previous nots are still valid.
        singletonState.addProceed(sinkState, trueFunction); // optional patterns also get a PROCEED StateTransition
    }
    return singletonState;
}
addTake simply delegates to addStateTransition:
public void addStateTransition(
        final StateTransitionAction action,
        final State<T> targetState,
        final IterativeCondition<T> condition) {
    stateTransitions.add(new StateTransition<T>(this, action, targetState, condition));
}
createStartState
private State<T> createStartState(State<T> sinkState) {
    checkPatternNameUniqueness(currentPattern.getName());
    final State<T> beginningState = convertPattern(sinkState);
    beginningState.makeStart();
    return beginningState;
}
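Putting the three steps together, here is a rough sketch of what the compiler produces for a simple two-stage pattern such as begin("first").followedBy("second"). This is hand-drawn, not actual compiler output; the exact set of edges also depends on contiguity and quantifiers:

// pattern: begin("first").followedBy("second"), built from the end backwards:
//
//   $endState$ (Final)                      <- createEndingState()
//
//   "second"   (Normal)                     <- createMiddleStates() / convertPattern()
//       TAKE   -> $endState$  [condition of "second"]
//       IGNORE -> "second"    [relaxed contiguity: skip events that do not match "second"]
//
//   "first"    (Start)                      <- createStartState()
//       TAKE   -> "second"    [condition of "first"]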
2. When an event comes in, how is it processed?
AbstractKeyedCEPPatternOperator.processElement
NFA<IN> nfa = getNFA();
processEvent(nfa, element.getValue(), getProcessingTimeService().getCurrentProcessingTime());
updateNFA(nfa);
If the state backend already holds an NFA for this key it is used; otherwise nfaFactory.createNFA() builds a new one:
private NFA<IN> getNFA() throws IOException {
    NFA<IN> nfa = nfaOperatorState.value();
    return nfa != null ? nfa : nfaFactory.createNFA();
}
createNFA
NFA<T> result = new NFA<>(inputTypeSerializer.duplicate(), windowTime, timeoutHandling);
result.addStates(states);
addState
public void addStates(final Collection<State<T>> newStates) {
    for (State<T> state: newStates) {
        addState(state);
    }
}

public void addState(final State<T> state) {
    states.add(state);
    if (state.isStart()) {
        computationStates.add(ComputationState.createStartState(this, state));
    }
}
The states are added to the NFA.
Start states are additionally wrapped as ComputationStates and added to computationStates, because pattern matching always begins from a start state.
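Before reading NFA.process it helps to know what a ComputationState carries. The following is a simplified sketch inferred from the getters used below; it is illustrative only, not the real Flink class, and field names are my own:

// illustrative sketch of what a ComputationState<T> tracks
class ComputationStateSketch<T> {
    State<T> state;           // NFA state this partial match currently sits in
    State<T> previousState;   // state that accepted the most recent event
    T event;                  // most recent event accepted into this partial match
    long timestamp;           // timestamp of that event
    int counter;              // disambiguates duplicate events in the shared buffer
    DeweyNumber version;      // Dewey number identifying this branch in the shared buffer
    long startTimestamp;      // when this partial match started (used for window pruning)
}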
KeyedCEPPatternOperator –> processEvent
protected void processEvent(NFA<IN> nfa, IN event, long timestamp) {
    Tuple2<Collection<Map<String, List<IN>>>, Collection<Tuple2<Map<String, List<IN>>, Long>>> patterns =
        nfa.process(event, timestamp);
    emitMatchedSequences(patterns.f0, timestamp);
}
NFA –> process
public Tuple2<Collection<Map<String, List<T>>>, Collection<Tuple2<Map<String, List<T>>, Long>>> process(final T event, final long timestamp) {
    final int numberComputationStates = computationStates.size();
    final Collection<Map<String, List<T>>> result = new ArrayList<>();
    final Collection<Tuple2<Map<String, List<T>>, Long>> timeoutResult = new ArrayList<>();

    // iterate over all current computations
    for (int i = 0; i < numberComputationStates; i++) {
        ComputationState<T> computationState = computationStates.poll(); // poll one current state

        // let the NFA compute the next batch of states
        final Collection<ComputationState<T>> newComputationStates;
        newComputationStates = computeNextStates(computationState, event, timestamp);

        // delay adding new computation states in case a stop state is reached and we discard the path;
        // newComputationStates may contain a stop state, so not everything ends up in statesToRetain
        final Collection<ComputationState<T>> statesToRetain = new ArrayList<>();
        // if stop state reached in this path
        boolean shouldDiscardPath = false;
        for (final ComputationState<T> newComputationState: newComputationStates) {
            if (newComputationState.isFinalState()) { // a final state means a complete match
                // we've reached a final state and can thus retrieve the matching event sequence
                Map<String, List<T>> matchedPattern = extractCurrentMatches(newComputationState);
                result.add(matchedPattern);

                // remove found patterns because they are no longer needed
                eventSharedBuffer.release(
                    newComputationState.getPreviousState().getName(),
                    newComputationState.getEvent(),
                    newComputationState.getTimestamp(),
                    computationState.getCounter());
            } else if (newComputationState.isStopState()) { // a stop state means this path is discarded
                // reached stop state. release entry for the stop state
                shouldDiscardPath = true;
                eventSharedBuffer.release(
                    newComputationState.getPreviousState().getName(),
                    newComputationState.getEvent(),
                    newComputationState.getTimestamp(),
                    computationState.getCounter());
            } else { // an intermediate state, keep it in statesToRetain
                // add new computation state; it will be processed once the next event arrives
                statesToRetain.add(newComputationState);
            }
        }

        if (shouldDiscardPath) { // release the discarded path
            // a stop state was reached in this branch. release branch which results in removing previous event from
            // the buffer
            for (final ComputationState<T> state : statesToRetain) {
                eventSharedBuffer.release(
                    state.getPreviousState().getName(),
                    state.getEvent(),
                    state.getTimestamp(),
                    state.getCounter());
            }
        } else { // keep the intermediate states as current computation states
            computationStates.addAll(statesToRetain);
        }
    }

    // prune shared buffer based on window length
    if (windowTime > 0L) { // prune partial matches that have exceeded the window
        long pruningTimestamp = timestamp - windowTime;
        if (pruningTimestamp < timestamp) {
            // the check is to guard against underflows
            // remove all elements which are expired with respect to the window length
            eventSharedBuffer.prune(pruningTimestamp);
        }
    }

    return Tuple2.of(result, timeoutResult);
}
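The first element of the returned Tuple2 (result) holds the completed matches, each a Map from pattern name to the events taken for that stage; emitMatchedSequences hands these to the user's PatternSelectFunction. A sketch of the user-facing side (Event is the same placeholder type as above, patternStream is the PatternStream from the sketch near the top, and java.util.Map/List plus PatternSelectFunction are assumed imported):

// sketch: how a completed match (Map<String, List<Event>>) reaches user code
DataStream<String> matches = patternStream.select(new PatternSelectFunction<Event, String>() {
    @Override
    public String select(Map<String, List<Event>> match) {
        Event first = match.get("first").get(0);   // events taken for the "first" stage
        Event second = match.get("second").get(0); // events taken for the "second" stage
        return first + " -> " + second;
    }
});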
computeNextStates
private Collection<ComputationState<T>> computeNextStates(
        final ComputationState<T> computationState,
        final T event,
        final long timestamp) {
    final OutgoingEdges<T> outgoingEdges = createDecisionGraph(computationState, event); // collect all outgoing edges of the state
    final List<StateTransition<T>> edges = outgoingEdges.getEdges();
    final List<ComputationState<T>> resultingComputationStates = new ArrayList<>();

    // (version bookkeeping such as currentVersion / nextVersion / ignoreBranchesToVisit is computed in code elided from this excerpt)
    for (StateTransition<T> edge : edges) {
        switch (edge.getAction()) {
            case IGNORE: {
                if (!computationState.isStartState()) {
                    final DeweyNumber version;
                    if (isEquivalentState(edge.getTargetState(), computationState.getState())) {
                        // Stay in the same state (it can be either looping one or singleton)
                        final int toIncrease = calculateIncreasingSelfState(
                            outgoingEdges.getTotalIgnoreBranches(),
                            outgoingEdges.getTotalTakeBranches());
                        version = computationState.getVersion().increase(toIncrease);
                    } else {
                        // IGNORE after PROCEED
                        version = computationState.getVersion()
                            .increase(totalTakeToSkip + ignoreBranchesToVisit)
                            .addStage();
                        ignoreBranchesToVisit--;
                    }

                    // for IGNORE the event is not taken; just add the target state as a new computation state
                    addComputationState(
                        resultingComputationStates,
                        edge.getTargetState(),
                        computationState.getPreviousState(),
                        computationState.getEvent(),
                        computationState.getCounter(),
                        computationState.getTimestamp(),
                        version,
                        computationState.getStartTimestamp()
                    );
                }
            }
            break;
            case TAKE:
                final State<T> nextState = edge.getTargetState();
                final State<T> currentState = edge.getSourceState();
                final State<T> previousState = computationState.getPreviousState();

                final T previousEvent = computationState.getEvent();

                final int counter;
                final long startTimestamp;
                // for TAKE the current event has to be recorded on the path, i.e. put into eventSharedBuffer
                if (computationState.isStartState()) {
                    startTimestamp = timestamp;
                    counter = eventSharedBuffer.put(
                        currentState.getName(),
                        event,
                        timestamp,
                        currentVersion);
                } else {
                    startTimestamp = computationState.getStartTimestamp();
                    counter = eventSharedBuffer.put(
                        currentState.getName(),
                        event,
                        timestamp,
                        previousState.getName(),
                        previousEvent,
                        computationState.getTimestamp(),
                        computationState.getCounter(),
                        currentVersion);
                }

                addComputationState(
                    resultingComputationStates,
                    nextState,
                    currentState,
                    event,
                    counter,
                    timestamp,
                    nextVersion,
                    startTimestamp);

                // check if newly created state is optional (have a PROCEED path to Final state)
                final State<T> finalState = findFinalStateAfterProceed(nextState, event, computationState);
                if (finalState != null) {
                    addComputationState(
                        resultingComputationStates,
                        finalState,
                        currentState,
                        event,
                        counter,
                        timestamp,
                        nextVersion,
                        startTimestamp);
                }
                break;
        }
    }
    return resultingComputationStates;
}
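Roughly speaking, the three actions line up with the user-facing contiguity options: TAKE consumes the event into the match, IGNORE skips an event without consuming it (what relaxed contiguity compiles to), and PROCEED is an epsilon move used for optional stages. A hedged illustration, with Event and the placeholder condition as before and the exact edges also depending on quantifiers:

SimpleCondition<Event> cond = new SimpleCondition<Event>() {
    @Override
    public boolean filter(Event value) { return true; } // placeholder condition
};

// next("b"): strict contiguity; the compiled "b" state has no IGNORE edge,
// so a non-matching event ends that partial match
Pattern<Event, ?> strict = Pattern.<Event>begin("a").where(cond).next("b").where(cond);

// followedBy("b"): relaxed contiguity; the compiled "b" state gets an IGNORE
// self-loop, so non-matching events are skipped while waiting for "b"
Pattern<Event, ?> relaxed = Pattern.<Event>begin("a").where(cond).followedBy("b").where(cond);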
private OutgoingEdges<T> createDecisionGraph(ComputationState<T> computationState, T event) {
    final OutgoingEdges<T> outgoingEdges = new OutgoingEdges<>(computationState.getState());

    final Stack<State<T>> states = new Stack<>();
    states.push(computationState.getState());

    // First create all outgoing edges, so to be able to reason about the Dewey version
    while (!states.isEmpty()) {
        State<T> currentState = states.pop();
        Collection<StateTransition<T>> stateTransitions = currentState.getStateTransitions(); // all StateTransitions of this state

        // check all state transitions for each state
        for (StateTransition<T> stateTransition : stateTransitions) {
            try {
                if (checkFilterCondition(computationState, stateTransition.getCondition(), event)) {
                    // filter condition is true
                    switch (stateTransition.getAction()) {
                        case PROCEED: // for PROCEED, jump straight to the next state
                            // simply advance the computation state, but apply the current event to it
                            // PROCEED is equivalent to an epsilon transition
                            states.push(stateTransition.getTargetState());
                            break;
                        case IGNORE:
                        case TAKE: // otherwise add the stateTransition as an outgoing edge
                            outgoingEdges.add(stateTransition);
                            break;
                    }
                }
            } catch (Exception e) {
                throw new RuntimeException("Failure happened in filter function.", e);
            }
        }
    }
    return outgoingEdges;
}