When building a wide table of payment orders, many tables have to be joined and a payment can arrive very late, so Flink's table joins are a poor fit. An alternative is to consume several Kafka topics in one source, keyBy the order number, and then branch on the topic inside the processing logic. For that to work, each message needs to carry a topic field, and JSONKeyValueDeserializationSchema solves exactly this problem.

  def getKafkaConsumer(kafkaAddr: String, topicNames: util.ArrayList[String], groupId: String): FlinkKafkaConsumer[ObjectNode] = {
    val properties = getKafkaProperties(groupId, kafkaAddr)
    val consumer = new FlinkKafkaConsumer[ObjectNode](topicNames, new JSONKeyValueDeserializationSchema(true), properties)
    consumer.setStartFromGroupOffsets() // the default behaviour
    consumer
  }
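
A minimal sketch of the downstream dispatch described above. The topic names, the broker address, and the value.orderId field are assumptions for illustration, not from the original job:

import java.util
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector
import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ObjectNode

val env = StreamExecutionEnvironment.getExecutionEnvironment
// hypothetical topic names; both payloads are assumed to carry value.orderId
val topics = new util.ArrayList[String](util.Arrays.asList("ods_pay_order", "ods_pay_result"))

env.addSource(getKafkaConsumer("broker1:9092", topics, "pay-wide-table"))
  // key every topic's records by the same order number
  .keyBy(node => node.get("value").get("orderId").asText())
  .process(new KeyedProcessFunction[String, ObjectNode, String] {
    override def processElement(node: ObjectNode,
                                ctx: KeyedProcessFunction[String, ObjectNode, String]#Context,
                                out: Collector[String]): Unit = {
      // branch on the topic carried in the metadata written by the schema
      node.get("metadata").get("topic").asText() match {
        case "ods_pay_order"  => // fill the order side of the wide row
        case "ods_pay_result" => // fill the payment side of the wide row
        case _ =>
      }
    }
  })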

Passing true to new JSONKeyValueDeserializationSchema(true) attaches the record metadata; passing false omits it. The source is as follows:

public class JSONKeyValueDeserializationSchema implements KafkaDeserializationSchema<ObjectNode> {
    private static final long serialVersionUID = 1509391548173891955L;
    private final static Logger log = LoggerFactory.getLogger(JSONKeyValueDeserializationSchema.class);

    private final boolean includeMetadata;
    private ObjectMapper mapper;

    public JSONKeyValueDeserializationSchema(boolean includeMetadata) {
        this.includeMetadata = includeMetadata;
    }

    @Override
    public ObjectNode deserialize(ConsumerRecord<byte[], byte[]> record) throws Exception {
        if (this.mapper == null) {
            this.mapper = new ObjectMapper();
        }
        ObjectNode node = this.mapper.createObjectNode();
        if (record.key() != null) {
            node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
        }
        if (record.value() != null) {
            node.set("value", this.mapper.readValue(record.value(), JsonNode.class));
        }
        if (this.includeMetadata) {
            node.putObject("metadata")
                .put("offset", record.offset())
                .put("topic", record.topic())
                .put("partition", record.partition());
        }
        return node;
    }

    @Override
    public boolean isEndOfStream(ObjectNode nextElement) {
        return false;
    }

    @Override
    public TypeInformation<ObjectNode> getProducedType() {
        return TypeExtractor.getForClass(ObjectNode.class);
    }
}
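
Concretely, with includeMetadata = true each record this schema emits has roughly the following shape (the key and value contents below are made up for illustration):

{
  "key": { ... },
  "value": { "orderId": "20191129001", "status": "PAID" },
  "metadata": { "offset": 42, "topic": "ods_pay_order", "partition": 0 }
}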

I thought that was the end of it, but to my surprise it blew up: every single message threw an error during deserialization.

2019-11-29 19:55:15.401 flink [Source: kafkasource (1/1)] ERROR c.y.b.D.JSONKeyValueDeserializationSchema - Unrecognized token 'xxxxx': was expecting ('true', 'false' or 'null')
 at [Source: [B@2e119f0e; line: 1, column: 45]
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'xxxxxxx': was expecting ('true', 'false' or 'null')
 at [Source: [B@2e119f0e; line: 1, column: 45]
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1586)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:521)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3464)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2628)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:854)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:748)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3847)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3792)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2890)
at com.xx.xx.DeserializationSchema.JSONKeyValueDeserializationSchema.deserialize(JSONKeyValueDeserializationSchema.java:33)
at com.xx.xx.DeserializationSchema.JSONKeyValueDeserializationSchema.deserialize(JSONKeyValueDeserializationSchema.java:15)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:140)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:711)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:745)

Since the library source has no try/catch, there is no way to see the exact payload that fails, so the only option is to rewrite the class ourselves.

Create a DeserializationSchema package, add our own JSONKeyValueDeserializationSchema class in it, and point getKafkaConsumer at this class instead; the log will then tell us exactly which records cannot be deserialized.
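
The consumer-side change is then just the import; a minimal sketch, assuming the custom class lives in the com.xx.xx.DeserializationSchema package that appears in the stack trace:

  // swap Flink's built-in schema for our own implementation
  import com.xx.xx.DeserializationSchema.JSONKeyValueDeserializationSchema

  def getKafkaConsumer(kafkaAddr: String, topicNames: util.ArrayList[String], groupId: String): FlinkKafkaConsumer[ObjectNode] = {
    val properties = getKafkaProperties(groupId, kafkaAddr)
    val consumer = new FlinkKafkaConsumer[ObjectNode](topicNames, new JSONKeyValueDeserializationSchema(true), properties)
    consumer.setStartFromGroupOffsets()
    consumer
  }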

@PublicEvolving
public class JSONKeyValueDeserializationSchema implements KafkaDeserializationSchema<ObjectNode> {
    private static final long serialVersionUID = 1509391548173891955L;
    private final static Logger log = LoggerFactory.getLogger(JSONKeyValueDeserializationSchema.class);

    private final boolean includeMetadata;
    private ObjectMapper mapper;

    public JSONKeyValueDeserializationSchema(boolean includeMetadata) {
        this.includeMetadata = includeMetadata;
    }

    @Override
    public ObjectNode deserialize(ConsumerRecord<byte[], byte[]> record) {
        if (this.mapper == null) {
            this.mapper = new ObjectMapper();
        }
        ObjectNode node = this.mapper.createObjectNode();
        try {
            if (record.key() != null) {
                node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
            }
            if (record.value() != null) {
                node.set("value", this.mapper.readValue(record.value(), JsonNode.class));
            }
            if (this.includeMetadata) {
                node.putObject("metadata")
                    .put("offset", record.offset())
                    .put("topic", record.topic())
                    .put("partition", record.partition());
            }
        } catch (Exception e) {
            log.error(e.getMessage(), e);
            log.error("JSONKeyValueDeserializationSchema failed: " + record.toString()
                + " ===== key: " + (record.key() == null ? "null" : new String(record.key()))
                + " ===== value: " + (record.value() == null ? "null" : new String(record.value())));
        }
        return node;
    }

    @Override
    public boolean isEndOfStream(ObjectNode nextElement) {
        return false;
    }

    @Override
    public TypeInformation<ObjectNode> getProducedType() {
        return TypeExtractor.getForClass(ObjectNode.class);
    }
}

The logs showed the key was a bare order number. The topic data is not raw canal JSON but has been transformed upstream, so the producer must be setting the key explicitly.
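
This also explains the exception: a bare order number is not a valid JSON token, so Jackson's readValue rejects it. A tiny reproduction (the sample key is made up):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class KeyParseDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = "PAY20191129001".getBytes("UTF-8");
        // throws JsonParseException: Unrecognized token 'PAY20191129001':
        // was expecting ('true', 'false' or 'null')
        new ObjectMapper().readValue(key, JsonNode.class);
    }
}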

So we change our JSONKeyValueDeserializationSchema once more. Since the key is not needed, we simply comment this block out (alternatively, the key could be kept as a plain String; see the sketch after the snippet below):

            if (record.key() != null) {
                node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
            }
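
If you do want to keep the key, a possible alternative is to store it as raw text instead of parsing it as JSON; the UTF-8 charset here is an assumption about the producer:

            if (record.key() != null) {
                // store the raw key as text instead of parsing it as JSON
                node.put("key", new String(record.key(), java.nio.charset.StandardCharsets.UTF_8));
            }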

We keep the try/catch here and continue logging the offending record. The final code:

@PublicEvolving
public class JSONKeyValueDeserializationSchema implements KafkaDeserializationSchema<ObjectNode> {
    private static final long serialVersionUID = 1509391548173891955L;
    private final static Logger log = LoggerFactory.getLogger(JSONKeyValueDeserializationSchema.class);

    private final boolean includeMetadata;
    private ObjectMapper mapper;

    public JSONKeyValueDeserializationSchema(boolean includeMetadata) {
        this.includeMetadata = includeMetadata;
    }

    @Override
    public ObjectNode deserialize(ConsumerRecord<byte[], byte[]> record) {
        if (this.mapper == null) {
            this.mapper = new ObjectMapper();
        }
        ObjectNode node = this.mapper.createObjectNode();
        try {
            // if (record.key() != null) {
            //     node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
            // }
            if (record.value() != null) {
                node.set("value", this.mapper.readValue(record.value(), JsonNode.class));
            }
            if (this.includeMetadata) {
                node.putObject("metadata")
                    .put("offset", record.offset())
                    .put("topic", record.topic())
                    .put("partition", record.partition());
            }
        } catch (Exception e) {
            log.error(e.getMessage(), e);
            log.error("JSONKeyValueDeserializationSchema failed: " + record.toString()
                + " ===== key: " + (record.key() == null ? "null" : new String(record.key()))
                + " ===== value: " + (record.value() == null ? "null" : new String(record.value())));
        }
        return node;
    }

    @Override
    public boolean isEndOfStream(ObjectNode nextElement) {
        return false;
    }

    @Override
    public TypeInformation<ObjectNode> getProducedType() {
        return TypeExtractor.getForClass(ObjectNode.class);
    }
}

With that, the problem is solved.
