【翻译】The Broadcast State Pattern（广播状态）

Provided APIs （提供的api）
- BroadcastProcessFunction and KeyedBroadcastProcessFunction
Important Considerations （注意事项）

使用State描述运算符状态，该运算符状态在恢复时均匀分布在运算符的并行任务中，或者联合使用，整个状态用于初始化已恢复的并行任务。

第三种支持的运营商状态是广播状态。引入广播状态是为了支持这样的用例，其中来自一个流的一些数据需要被广播到所有下游任务，其中它被本地存储并用于处理另一个流上的所有传入元素。作为广播状态可以作为自然拟合出现的示例，可以想象包含一组规则的低吞吐量流，我们希望针对来自另一个流的所有元素进行评估。考虑到上述类型的用例，广播状态与其他运营商状态的不同之处在于：

　　1. 它是map格式

　　2. 它只适用于特定的操作：有一个广播流和一个非广播流

　　3. 这样的算子可以具有不同名称的多个广播状态。（ such an operator can have multiple broadcast states with different names. ）

Provided APIs

为了描述提供的API，我们将在展示其完整功能之前先举一个示例。作为我们的运行示例，我们将使用具有不同颜色和形状的对象流，并且我们想要找到跟随特定图案的相同颜色的对象组合，例如矩形后跟三角形。我们假设这组有趣的模式随着时间的推移而发展。

在此示例中，第一个流将包含Item类型的元素，其中包含Color和Shape属性。另一个流将包含规则。

从Items流开始，我们只需要通过Color做keyBy，因为我们需要相同颜色的对。这将确保相同颜色的元素最终在同一台物理机器上。

// key the shapes by color

KeyedStream<Item, Color> colorPartitionedStream = shapeStream

                        .keyBy(new KeySelector<Shape, Color>(){...});

继续执行规则，包含它们的流应该被广播到所有下游任务，并且这些任务应该在本地存储它们，以便它们可以针对所有传入的项目对它们进行评估。下面的片段将 1）广播规则流和 2）使用提供的MapStateDescriptor，它将创建存储规则的广播状态。

// a map descriptor to store the name of the rule (string) and the rule itself.

MapStateDescriptor<String, Rule> ruleStateDescriptor = new MapStateDescriptor<>(

            "RulesBroadcastState",

            BasicTypeInfo.STRING_TYPE_INFO,

            TypeInformation.of(new TypeHint<Rule>() {}));

// broadcast the rules and create the broadcast state

BroadcastStream<Rule> ruleBroadcastStream = ruleStream

                        .broadcast(ruleStateDescriptor);

最后，为了根据Item流中的传入元素评估规则，我们需要：

　　1. connect两个流

　　2. 指定我们的匹配检测逻辑

将流（键控或非键控）与BroadcastStream连接可以通过在非广播流上调用connect（），并将BroadcastStream作为参数来完成。这将返回一个BroadcastConnectedStream，我们可以使用特殊类型的CoProcessFunction调用process（）。该函数将包含我们的匹配逻辑。函数的确切类型取决于非广播流的类型：

　　* 如果它（非广播流）是键控的，方法是 KeyedBroadcastProcessFunction

　　* 如果它（非广播流）是非键控的，方法是BroadcastProcessFunction

鉴于我们的非广播流是键控的，以下代码段包含以上调用：

注意：应在非广播流上调用connect，并将Broadcast Stream作为参数。

DataStream<Match> output = colorPartitionedStream

                 .connect(ruleBroadcastStream)

                 .process(

                     // type arguments in our KeyedBroadcastProcessFunction represent:

                     //   1. the key of the keyed stream

                     //   2. the type of elements in the non-broadcast side

                     //   3. the type of elements in the broadcast side

                     //   4. the type of the result, here a string

                     new KeyedBroadcastProcessFunction<Color, Item, Rule, String>() {

                         // my matching logic

                     }

                 )

BroadcastProcessFunction and KeyedBroadcastProcessFunction

与CoProcessFunction的情况一样，这些函数有两种实现方法; processBroadcastElement（）负责处理广播流中的传入元素和processElement（）用于非广播流。这些方法的完整签名如下：

// non-keyed
public abstract class BroadcastProcessFunction<IN1, IN2, OUT> extends BaseBroadcastProcessFunction {

    public abstract void processElement(IN1 value, ReadOnlyContext ctx, Collector<OUT> out) throws Exception;

    public abstract void processBroadcastElement(IN2 value, Context ctx, Collector<OUT> out) throws Exception;

}
// keyed

public abstract class KeyedBroadcastProcessFunction<KS, IN1, IN2, OUT> {

    public abstract void processElement(IN1 value, ReadOnlyContext ctx, Collector<OUT> out) throws Exception;

    public abstract void processBroadcastElement(IN2 value, Context ctx, Collector<OUT> out) throws Exception;

    public void onTimer(long timestamp, OnTimerContext ctx, Collector<OUT> out) throws Exception;

}

首先要注意的是，两个函数都需要执行processBroadcastElement（）方法来处理广播端的元素，而processElement（）则处理非广播端的元素。

这两种方法在提供的上下文中有所不同。非广播侧具有ReadOnlyContext，而广播侧具有Context。

这两个上下文（ctx在以下枚举中）：

　　1. 允许访问广播状态：ctx.getBroadcastState（MapStateDescriptor <K，V> stateDescriptor）

　　2. 允许查询元素的时间戳：ctx.timestamp（）

　　 3. 获取当前水印：ctx.currentWatermark（）

　　 4. 获取当前处理时间：ctx.currentProcessingTime（）

　　 5. 向侧边流发出元素：ctx.output（OutputTag <X> outputTag，X value）

getBroadcastState（）中的stateDescriptor应该与上面的.broadcast（ruleStateDescriptor）中的stateDescriptor相同。

不同之处在于每个流对广播状态的访问类型。广播方对其(广播状态)具有读写访问权限，而非广播方具有只读访问权（因此命名：thus the names）。原因是在Flink中没有跨任务通信。因此，为了保证广播状态中的内容在我们的运算符的所有并行实例中是相同的，我们只对广播端提供读写访问，广播端在所有任务中看到相同的元素，并且我们需要对每个任务进行计算。这一侧的传入元素在所有任务中都是相同的。忽略此规则会破坏状态的一致性保证，从而导致不一致且通常难以调试的结果。

注意：`processBroadcast（）`中实现的逻辑必须在所有并行实例中具有相同的确定性行为！

最后，由于KeyedBroadcastProcessFunction在键控流上运行，它暴露了一些BroadcastProcessFunction不可用的功能。那是：

　　1. processElement（）方法中的ReadOnlyContext可以访问Flink的底层计时器服务，该服务允许注册事件和/或处理时间计时器。当一个计时器触发时，onTimer（）（如上所示）被一个OnTimerContext调用，它暴露了与ReadOnlyContext相同的功能和

　　　　* 询问触发的计时器是事件还是处理时间的能力

　　　　* 查询与计时器关联的key

　　2. processBroadcastElement（）方法中的Context包含方法applyToKeyedState（StateDescriptor <S，VS> stateDescriptor，KeyedStateFunction <KS，S> function）。这允许注册KeyedStateFunction以应用于与提供的stateDescriptor相关联的所有键的所有状态。

注意：只能在`KeyedBroadcastProcessFunction`的`processElement（）`中注册定时器。 在`processBroadcastElement（）`方法中是不可能的，因为没有与广播元素相关联的键。

回到我们的原始示例，我们的KeyedBroadcastProcessFunction可能如下所示：

new KeyedBroadcastProcessFunction<Color, Item, Rule, String>() {

    // store partial matches, i.e. first elements of the pair waiting for their second element

    // we keep a list as we may have many first elements waiting

    private final MapStateDescriptor<String, List<Item>> mapStateDesc =

        new MapStateDescriptor<>(

            "items",

            BasicTypeInfo.STRING_TYPE_INFO,

            new ListTypeInfo<>(Item.class));

    // identical to our ruleStateDescriptor above

    private final MapStateDescriptor<String, Rule> ruleStateDescriptor =

        new MapStateDescriptor<>(

            "RulesBroadcastState",

            BasicTypeInfo.STRING_TYPE_INFO,

            TypeInformation.of(new TypeHint<Rule>() {}));

    @Override

    public void processBroadcastElement(Rule value,

                                        Context ctx,

                                        Collector<String> out) throws Exception {

        ctx.getBroadcastState(ruleStateDescriptor).put(value.name, value);

    }

    @Override

    public void processElement(Item value,

                               ReadOnlyContext ctx,

                               Collector<String> out) throws Exception {

        final MapState<String, List<Item>> state = getRuntimeContext().getMapState(mapStateDesc);

        final Shape shape = value.getShape();

        for (Map.Entry<String, Rule> entry :

                ctx.getBroadcastState(ruleStateDescriptor).immutableEntries()) {

            final String ruleName = entry.getKey();

            final Rule rule = entry.getValue();

            List<Item> stored = state.get(ruleName);

            if (stored == null) {

                stored = new ArrayList<>();

            }

            if (shape == rule.second && !stored.isEmpty()) {

                for (Item i : stored) {

                    out.collect("MATCH: " + i + " - " + value);

                }

                stored.clear();

            }

            // there is no else{} to cover if rule.first == rule.second

            if (shape.equals(rule.first)) {

                stored.add(value);

            }

            if (stored.isEmpty()) {

                state.remove(ruleName);

            } else {

                state.put(ruleName, stored);

            }

        }

    }

}

Important Considerations

在描述提供的API之后，本节重点介绍使用广播状态时要记住的重要事项。这些是：

* There is no cross-task communication: 如前所述，这就是为什么只有（Keyed）-BroadcastProcessFunction的广播端可以修改广播状态的内容。此外，用户必须确保所有任务以相同的方式为每个传入元素修改广播状态的内容。否则，不同的任务可能具有不同的内容，从而导致不一致的结果。

* Order of events in Broadcast State may differ across tasks: 虽然广播流的元素保证所有元素将（最终）转到所有下游任务，但元素可以以不同的顺序到达每个任务。因此，每个传入元素的状态更新绝不取决于传入事件的顺序。

* All tasks checkpoint their broadcast state: 虽然检查点发生时所有任务在广播状态中都具有相同的元素（检查点barriers 不会覆盖元素），但所有任务都会检查其广播状态，而不仅仅是其中一个。这是一个设计决策，以避免在恢复期间从同一文件中读取所有任务（从而避免热点），尽管它的代价是将检查点状态的大小增加p（=并行度）。 Flink保证在恢复/重新缩放时不会出现重复数据，也不会丢失数据。在具有相同或更小并行度的恢复的情况下，每个任务读取其检查点状态。在按比例放大时，每个任务都读取其自己的状态，其余任务（p_new-p_old）以循环方式读取先前任务的检查点。

* No RocksDB state backend: 广播状态在运行时保留在内存中，并且应该相应地进行内存配置。这适用于所有算子的state。

etl实例：基于Broadcast 状态的Flink Etl Demo