A friend mentioned in a chat that the output time looked wrong. Earlier testing had not paid attention to this, so I took a look under processing-time mode and found that the time was indeed incorrect.

I then debugged it to locate the problem and finally worked out the cause, recorded as follows:

  The reproduction steps and root-cause analysis are given at the bottom.

Let me show how to reproduce the wrong result.

Background: a processing-time tumbling window, Flink 1.5.0.
  
The invocation stack is as follows:
  
  [1] org.apache.calcite.runtime.SqlFunctions.internalToTimestamp (SqlFunctions.java:1,747)
  
  [2] org.apache.flink.table.runtime.aggregate.TimeWindowPropertyCollector.collect (TimeWindowPropertyCollector.scala:53)
  
  [3] org.apache.flink.table.runtime.aggregate.IncrementalAggregateWindowFunction.apply (IncrementalAggregateWindowFunction.scala:74)
  
  [4] org.apache.flink.table.runtime.aggregate.IncrementalAggregateTimeWindowFunction.apply (IncrementalAggregateTimeWindowFunction.scala:72)
  
  [5] org.apache.flink.table.runtime.aggregate.IncrementalAggregateTimeWindowFunction.apply (IncrementalAggregateTimeWindowFunction.scala:39)
  
  [6] org.apache.flink.streaming.runtime.operators.windowing.functions.InternalSingleValueWindowFunction.process (InternalSingleValueWindowFunction.java:46)
  
[7] org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.emitWindowContents (WindowOperator.java:550)
  
[8] org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime (WindowOperator.java:505)
  
[9] org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime (HeapInternalTimerService.java:266)
  
  [10] org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run (SystemProcessingTimeService.java:281)
  
  [11] java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
  
  [12] java.util.concurrent.FutureTask.run (FutureTask.java:266)
  
  [13] java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201 (ScheduledThreadPoolExecutor.java:180)
  
  [14] java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (ScheduledThreadPoolExecutor.java:293)
  
  [15] java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1,142)
  
[16] java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
  
[17] java.lang.Thread.run (Thread.java:748)
  
Now we are at [1] org.apache.calcite.runtime.SqlFunctions.internalToTimestamp (SqlFunctions.java:1,747),

and the code is as follows:
  
  public static Timestamp internalToTimestamp(long v) { return new Timestamp(v - LOCAL_TZ.getOffset(v)); }
  
Let us print the value of v for windowStart:

print v

v = 1544074830000

Let us print the value of v for windowEnd:

print v

v = 1544074833000
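
As a quick sanity check outside the debugger, here is a minimal sketch using plain java.time (nothing Flink-specific; the class name is just for illustration) of what these two raw epoch values represent in UTC and in Asia/Shanghai:

```java
import java.time.Instant;
import java.time.ZoneId;

public class EpochSanityCheck {
    public static void main(String[] args) {
        // windowStart and windowEnd exactly as seen in the debugger
        long[] values = {1544074830000L, 1544074833000L};
        for (long v : values) {
            System.out.println(v
                    + "  UTC="           + Instant.ofEpochMilli(v).atZone(ZoneId.of("UTC")).toLocalDateTime()
                    + "  Asia/Shanghai=" + Instant.ofEpochMilli(v).atZone(ZoneId.of("Asia/Shanghai")).toLocalDateTime());
        }
        // 1544074830000  UTC=2018-12-06T05:40:30  Asia/Shanghai=2018-12-06T13:40:30
        // 1544074833000  UTC=2018-12-06T05:40:33  Asia/Shanghai=2018-12-06T13:40:33
    }
}
```

So the raw values are correct epoch milliseconds; keep this in mind for what follows.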
  
After this, we come back to

[1] org.apache.flink.table.runtime.aggregate.TimeWindowPropertyCollector.collect (TimeWindowPropertyCollector.scala:51)

Then we will execute:
  
```scala
if (windowStartOffset.isDefined) {
  output.setField(
    lastFieldPos + windowStartOffset.get,
    SqlFunctions.internalToTimestamp(windowStart))
}

if (windowEndOffset.isDefined) {
  output.setField(
    lastFieldPos + windowEndOffset.get,
    SqlFunctions.internalToTimestamp(windowEnd))
}
```
  
Before executing this, the output is

output = "pro0,throwable0,ERROR,ip0,1,ymm-appmetric-dev-self1_5_924367729,null,null,null"

After executing it, the output is

output = "pro0,throwable0,ERROR,ip0,1,ymm-appmetric-dev-self1_5_924367729,2018-12-06 05:40:30.0,2018-12-06 05:40:33.0,null"
  
So, do you think that translating

the long value 1544074830000 to 2018-12-06 05:40:30.0

and the long value 1544074833000 to 2018-12-06 05:40:33.0

is correct?

I am in China (UTC+8), so I think the timestamps should be 2018-12-06 13:40:30.0 and 2018-12-06 13:40:33.0.
  
Okay, let us continue.

Now the data will be written to Kafka; before being written, it will be serialized.

Let us see what happens!

The call stack is as follows:
  
[1] org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ser.std.DateSerializer._timestamp (DateSerializer.java:41)
[2] org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ser.std.DateSerializer.serialize (DateSerializer.java:48)
[3] org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ser.std.DateSerializer.serialize (DateSerializer.java:15)
[4] org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue (DefaultSerializerProvider.java:130)
[5] org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.writeValue (ObjectMapper.java:2,444)
[6] org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.valueToTree (ObjectMapper.java:2,586)
[7] org.apache.flink.formats.json.JsonRowSerializationSchema.convert (JsonRowSerializationSchema.java:189)
[8] org.apache.flink.formats.json.JsonRowSerializationSchema.convertRow (JsonRowSerializationSchema.java:128)
[9] org.apache.flink.formats.json.JsonRowSerializationSchema.serialize (JsonRowSerializationSchema.java:102)
[10] org.apache.flink.formats.json.JsonRowSerializationSchema.serialize (JsonRowSerializationSchema.java:51)
[11] org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper.serializeValue (KeyedSerializationSchemaWrapper.java:46)
[12] org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010.invoke (FlinkKafkaProducer010.java:355)
[13] org.apache.flink.streaming.api.operators.StreamSink.processElement (StreamSink.java:56)
[14] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator (OperatorChain.java:560)
[15] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect (OperatorChain.java:535)
[16] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect (OperatorChain.java:515)
[17] org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect (AbstractStreamOperator.java:679)
[18] org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect (AbstractStreamOperator.java:657)
[19] org.apache.flink.streaming.api.operators.StreamMap.processElement (StreamMap.java:41)
[20] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator (OperatorChain.java:560)
[21] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect (OperatorChain.java:535)
[22] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect (OperatorChain.java:515)
[23] org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect (AbstractStreamOperator.java:679)
[24] org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect (AbstractStreamOperator.java:657)
[25] org.apache.flink.streaming.api.operators.TimestampedCollector.collect (TimestampedCollector.java:51)
[26] org.apache.flink.table.runtime.CRowWrappingCollector.collect (CRowWrappingCollector.scala:37)
[27] org.apache.flink.table.runtime.CRowWrappingCollector.collect (CRowWrappingCollector.scala:28)
[28] DataStreamCalcRule$88.processElement (null)
[29] org.apache.flink.table.runtime.CRowProcessRunner.processElement (CRowProcessRunner.scala:66)
[30] org.apache.flink.table.runtime.CRowProcessRunner.processElement (CRowProcessRunner.scala:35)
[31] org.apache.flink.streaming.api.operators.ProcessOperator.processElement (ProcessOperator.java:66)
[32] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator (OperatorChain.java:560)
[33] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect (OperatorChain.java:535)
[34] org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect (OperatorChain.java:515)
[35] org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect (AbstractStreamOperator.java:679)
[36] org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect (AbstractStreamOperator.java:657)
[37] org.apache.flink.streaming.api.operators.TimestampedCollector.collect (TimestampedCollector.java:51)
[38] org.apache.flink.table.runtime.aggregate.TimeWindowPropertyCollector.collect (TimeWindowPropertyCollector.scala:65)
[39] org.apache.flink.table.runtime.aggregate.IncrementalAggregateWindowFunction.apply (IncrementalAggregateWindowFunction.scala:74)
[40] org.apache.flink.table.runtime.aggregate.IncrementalAggregateTimeWindowFunction.apply (IncrementalAggregateTimeWindowFunction.scala:72)
[41] org.apache.flink.table.runtime.aggregate.IncrementalAggregateTimeWindowFunction.apply (IncrementalAggregateTimeWindowFunction.scala:39)
[42] org.apache.flink.streaming.runtime.operators.windowing.functions.InternalSingleValueWindowFunction.process (InternalSingleValueWindowFunction.java:46)
[43] org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.emitWindowContents (WindowOperator.java:550)
[44] org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime (WindowOperator.java:505)
[45] org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime (HeapInternalTimerService.java:266)
[46] org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run (SystemProcessingTimeService.java:281)
[47] java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
[48] java.util.concurrent.FutureTask.run (FutureTask.java:266)
[49] java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201 (ScheduledThreadPoolExecutor.java:180)
[50] java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (ScheduledThreadPoolExecutor.java:293)
[51] java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1,142)
[52] java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
[53] java.lang.Thread.run (Thread.java:748)
  
And the code is as follows:
  
  protected long _timestamp(Date value) { return value == null ? 0L : value.getTime(); }
  
Here, take windowEnd as an example. The value is:

value = "2018-12-06 05:40:33.0"

value.getTime() = 1544046033000

See: the initial value was 1544074833000 and the final value is 1544046033000.

The difference is 28800000 ms, i.e. 8 hours, because I am in China (UTC+8).
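
The same result can be reproduced outside Flink with a minimal sketch (assuming the JVM default time zone is Asia/Shanghai); it yields exactly the value that Jackson's DateSerializer obtains through getTime():

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class GetTimeDemo {
    public static void main(String[] args) {
        // Assumption: the JVM runs with Asia/Shanghai as its default zone.
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));

        // The wall-clock value carried by the Timestamp that internalToTimestamp produced.
        Timestamp value = Timestamp.valueOf("2018-12-06 05:40:33.0");

        // DateSerializer._timestamp() simply calls getTime(), which interprets
        // that wall clock in the local zone again.
        System.out.println(value.getTime()); // 1544046033000, i.e. 8 hours less than 1544074833000
    }
}
```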
  
Why? The key reason is SqlFunctions.internalToTimestamp:

public static Timestamp internalToTimestamp(long v) {
  return new Timestamp(v - LOCAL_TZ.getOffset(v));
}

In this code it subtracts the LOCAL_TZ offset, which I think is redundant!
  
I took another look just now. The real root cause is that the time gets converted back and forth without a single consistent convention: the two directions go through methods of two different classes,

so the result ends up scrambled. If anything were to be changed, it would be that SqlFunctions class.
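
To illustrate the point about using one consistent convention: if the value produced by internalToTimestamp were converted back with a symmetric inverse (the hypothetical helper timestampToInternal below, which adds back the same offset that was subtracted) instead of plain getTime(), the original epoch value would survive the round trip. This is only a sketch of the idea, assuming Asia/Shanghai as the default zone, not a proposed patch:

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class SymmetricRoundTrip {
    static final TimeZone LOCAL_TZ = TimeZone.getTimeZone("Asia/Shanghai");

    // Calcite-style forward conversion, re-created here for illustration.
    static Timestamp internalToTimestamp(long v) {
        return new Timestamp(v - LOCAL_TZ.getOffset(v));
    }

    // Hypothetical symmetric inverse: add the offset back instead of using getTime() directly.
    static long timestampToInternal(Timestamp ts) {
        long t = ts.getTime();
        return t + LOCAL_TZ.getOffset(t);
    }

    public static void main(String[] args) {
        long windowEnd = 1544074833000L;             // 2018-12-06 13:40:33 in China
        Timestamp ts = internalToTimestamp(windowEnd);

        System.out.println(ts.getTime());            // 1544046033000 -- what the JSON sink sees, 8 hours off
        System.out.println(timestampToInternal(ts)); // 1544074833000 -- original value recovered
    }
}
```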
