在flink中使用jackson JSONKeyValueDeserializationSchema反序列化Kafka消息报错解决
在做支付订单宽表的场景,需要关联的表比较多而且支付有可能要延迟很久,这种情况下不太适合使用Flink的表Join,想到的另外一种解决方案是消费多个Topic的数据,再根据订单号进行keyBy,再在逻辑中根据不同Topic处理,所以在接收到的消息中最好能够有topic字段,JSONKeyValueDeserializationSchema就完美的解决了这个问题。
def getKafkaConsumer(kafkaAddr: String, topicNames: util.ArrayList[String], groupId: String): FlinkKafkaConsumer[ObjectNode] = {
val properties = getKafkaProperties(groupId, kafkaAddr)
val consumer = new FlinkKafkaConsumer[ObjectNode](topicNames, new JSONKeyValueDeserializationSchema(true), properties)
consumer.setStartFromGroupOffsets() // the default behaviour
consumer
}
在这里new JSONKeyValueDeserializationSchema(true)是需要带上元数据信息,false则不带上,源码如下
public class JSONKeyValueDeserializationSchema implements KafkaDeserializationSchema<ObjectNode> {
private static final long serialVersionUID = 1509391548173891955L;
private final static Logger log = LoggerFactory.getLogger(JSONKeyValueDeserializationSchema.class);
private final boolean includeMetadata;
private ObjectMapper mapper; public JSONKeyValueDeserializationSchema(boolean includeMetadata) {
this.includeMetadata = includeMetadata;
} public ObjectNode deserialize(ConsumerRecord<byte[], byte[]> record) {
if (this.mapper == null) {
this.mapper = new ObjectMapper();
}
ObjectNode node = this.mapper.createObjectNode(); if (record.key() != null) {
node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
} if (record.value() != null) {
node.set("value", this.mapper.readValue(record.value(), JsonNode.class));
} if (this.includeMetadata) {
node.putObject("metadata").put("offset", record.offset()).put("topic", record.topic()).put("partition", record.partition());
}return node;
} public boolean isEndOfStream(ObjectNode nextElement) {
return false;
} public TypeInformation<ObjectNode> getProducedType() {
return TypeExtractor.getForClass(ObjectNode.class);
}
}
本来以为到这里就大功告成了,谁不想居然报错了。。每条消息反序列化的都报错。
2019-11-29 19:55:15.401 flink [Source: kafkasource (1/1)] ERROR c.y.b.D.JSONKeyValueDeserializationSchema - Unrecognized token 'xxxxx': was expecting ('true', 'false' or 'null')
at [Source: [B@2e119f0e; line: 1, column: 45]org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'xxxxxxx': was expecting ('true', 'false' or 'null')
at [Source: [B@2e119f0e; line: 1, column: 45]
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1586)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:521)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3464)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2628)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:854)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:748)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3847)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3792)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2890)
at com.xx.xx.DeserializationSchema.JSONKeyValueDeserializationSchema.deserialize(JSONKeyValueDeserializationSchema.java:33)
at com.xx.xx.DeserializationSchema.JSONKeyValueDeserializationSchema.deserialize(JSONKeyValueDeserializationSchema.java:15)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:140)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:711)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:745)
因为源码是没有try catch的,无法获取到报错的具体数据,只能直接重写这个方法了
新建一个DeserializationSchema包,再创建JSONKeyValueDeserializationSchema类,然后在getKafkaConsumer重新引用我们自己的JSONKeyValueDeserializationSchema类,再在日志中我们就可以知道是哪些数据无法反序列化
@PublicEvolving
public class JSONKeyValueDeserializationSchema implements KafkaDeserializationSchema<ObjectNode> {
private static final long serialVersionUID = 1509391548173891955L;
private final static Logger log = LoggerFactory.getLogger(JSONKeyValueDeserializationSchema.class);
private final boolean includeMetadata;
private ObjectMapper mapper; public JSONKeyValueDeserializationSchema(boolean includeMetadata) {
this.includeMetadata = includeMetadata;
} public ObjectNode deserialize(ConsumerRecord<byte[], byte[]> record) {
if (this.mapper == null) {
this.mapper = new ObjectMapper();
}
ObjectNode node = this.mapper.createObjectNode();
try {
if (record.key() != null) {
node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
} if (record.value() != null) {
node.set("value", this.mapper.readValue(record.value(), JsonNode.class));
} if (this.includeMetadata) {
node.putObject("metadata").put("offset", record.offset()).put("topic", record.topic()).put("partition", record.partition());
}
} catch (Exception e) {
log.error(e.getMessage(), e);
log.error("JSONKeyValueDeserializationSchema 出错:" + record.toString() + "=====key为" + new String(record.key()) + "=====数据为" + new String(record.value()));
}
return node;
} public boolean isEndOfStream(ObjectNode nextElement) {
return false;
} public TypeInformation<ObjectNode> getProducedType() {
return TypeExtractor.getForClass(ObjectNode.class);
}
}
发现key为一串订单号,因为topic数据不是原生canal json数据,是被加工过的,那应该是上游生产的时候指定的key
那继续修改我们的JSONKeyValueDeserializationSchema代码,因为key用不到,所以直接注释掉,当然也可以将class指定为String
if (record.key() != null) {
node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
}
try catch在这里我们还是保留并将出错的数据打到日志,修改后的代码如下
@PublicEvolving
public class JSONKeyValueDeserializationSchema implements KafkaDeserializationSchema<ObjectNode> {
private static final long serialVersionUID = 1509391548173891955L;
private final static Logger log = LoggerFactory.getLogger(JSONKeyValueDeserializationSchema.class);
private final boolean includeMetadata;
private ObjectMapper mapper; public JSONKeyValueDeserializationSchema(boolean includeMetadata) {
this.includeMetadata = includeMetadata;
} public ObjectNode deserialize(ConsumerRecord<byte[], byte[]> record) {
if (this.mapper == null) {
this.mapper = new ObjectMapper();
}
ObjectNode node = this.mapper.createObjectNode();
try {
// if (record.key() != null) {
// node.set("key", this.mapper.readValue(record.key(), JsonNode.class));
// } if (record.value() != null) {
node.set("value", this.mapper.readValue(record.value(), JsonNode.class));
} if (this.includeMetadata) {
node.putObject("metadata").put("offset", record.offset()).put("topic", record.topic()).put("partition", record.partition());
}
} catch (Exception e) {
log.error(e.getMessage(), e);
log.error("JSONKeyValueDeserializationSchema 出错:" + record.toString() + "=====key为" + new String(record.key()) + "=====数据为" + new String(record.value()));
}
return node;
} public boolean isEndOfStream(ObjectNode nextElement) {
return false;
} public TypeInformation<ObjectNode> getProducedType() {
return TypeExtractor.getForClass(ObjectNode.class);
}
}
至此,问题解决。
在flink中使用jackson JSONKeyValueDeserializationSchema反序列化Kafka消息报错解决的更多相关文章
- cmd命令中运行pytest命令导入模块报错解决方法
报错截图 ImportError while loading conftest 'E:\python\HuaFansApi\test_case\conftest.py'. test_case\conf ...
- 【报错】IntelliJ IDEA中绿色注释扫描飘红报错解决
几天开机,突然发现自己昨天的项目可以运行,今天就因为绿色注释飘红而不能运行,很是尴尬: 解决办法如下: 1.在IDEA中的setting中搜索:"javadoc" 2.把Javad ...
- laravel 迁移文件中修改含有enum字段的表报错解决方法
解决方法: 在迁移文件中up方法最上方加上下面这一行代码即可: Schema::getConnection()->getDoctrineSchemaManager()->getDataba ...
- 在eclipse中使用git的pull功能时报错解决办法
打开项目的 .git/config文件,参照以下进行编辑 [core] symlinks = false repositoryformatversion = 0 filemode = false lo ...
- vuex中的babel编译mapGetters/mapActions报错解决方法
vex使用...mapActions报错解决办法 vuex2增加了mapGetters和mapActions的方法,借助stage2的Object Rest Operator 所在通过 methods ...
- Spring Boot在反序列化过程中:jackson.databind.exc.InvalidDefinitionException cannot deserialize from Object value
错误场景 用Spring boot写了一个简单的RESTful API,在测试POST请求的时候,request body是一个符合对应实体类要求的json串,post的时候报错. 先贴一段error ...
- Flink 使用(一)——从kafka中读取数据写入到HBASE中
1.前言 本文是在<如何计算实时热门商品>[1]一文上做的扩展,仅在功能上验证了利用Flink消费Kafka数据,把处理后的数据写入到HBase的流程,其具体性能未做调优.此外,文中并未就 ...
- Flink学习笔记:Connectors之kafka
本文为<Flink大数据项目实战>学习笔记,想通过视频系统学习Flink这个最火爆的大数据计算框架的同学,推荐学习课程: Flink大数据项目实战:http://t.cn/EJtKhaz ...
- Flink中的Time
戳更多文章: 1-Flink入门 2-本地环境搭建&构建第一个Flink应用 3-DataSet API 4-DataSteam API 5-集群部署 6-分布式缓存 7-重启策略 8-Fli ...
随机推荐
- 精心整理(含图版)|你要的全拿走!(R数据分析,可视化,生信实战)
本文首发于“生信补给站”公众号,https://mp.weixin.qq.com/s/ZEjaxDifNATeV8fO4krOIQ更多关于R语言,ggplot2绘图,生信分析的内容,敬请关注小号. 为 ...
- 如何把链表以k个结点为一组进行翻转
[MT笔试题] 题目描述: K 链表翻转是指把每K个相邻的结点看成一组进行翻转,如果剩余结点不足 K 个,则保持不变.假设给定链表 1 -> 2 -> 3 -> 4 -> 5 ...
- Java ------ 工厂模式、单例模式
工厂模式 简单工厂模式: 1.创建Car接口 public interface Car { public void drive(); } 2.创建两个实体类,分别实现Car接口 public clas ...
- 手把手教你如何在阿里云ECS搭建Python TensorFlow Jupyter
前段时间在阿里云买了一台服务器,准备部署网站,近期想玩一些深度学习项目,正好拿来用.TensorFlow官网的安装仅提及Ubuntu,但我的ECS操作系统是 CentOS 7.6 64位,搭建Pyth ...
- [TCP] TCP协议族的学习 and TCP协议
1.TCP协议族这个大家庭,每个协议在OSI5层模型中所处的位子 其中,网络层里的 ICMP = Internet Control Message Protocol,即因特网控制报文协议, IGMP ...
- .NET Core 获取数据库上下文实例的方法和配置连接字符串
目录 .NET Core 获取数据库上下文实例的方法和配置连接字符串 ASP.NET Core 注入 .NET Core 注入 无签名上下文 OnConfigure 配置 有签名上下文构造函数和自己n ...
- bash6——循环
for fruit in apple orange pear #写死 do each ${fruit}s done fruits="apple orange pear" #输入变量 ...
- Nginx 匹配流程一览
在 nginx server 模块中,location 的定义长被用来匹配一个标准的 URI, 并根据 URI 的不同做出相应的服务方案. nginx location 匹配的优先级 在 locati ...
- 页面加载和图片加载loading
准备放假了!也是闲着了 ,就来整理之前学到或用到的一下知识点和使用内容,这次记录的是关于加载的友好性loading!!!这里记录一下两种加载方法 1.页面加载的方法,它需要用到js里面两个方法 doc ...
- nginx支持wss配置
nginx证书 nginx.conf配置