• direct memory size

  • netty or oplog

  • 5.5kw * 20

  • 60G worker/ 26G MaxDirectMemorySize

  • with 1 or 2 tasks per worker, both configurations fail

  • some tasks still run fine

  • likely cause: multiple threads competing for the same memory (resource contention under the multithreaded pattern)

  • gc-log:

2018-11-09T14:10:47.973+0800: 7393.560: [CMS-concurrent-sweep: 0.241/0.241 secs] [Times: user=0.48 sys=0.00, real=0.24 secs]
2018-11-09T14:10:47.973+0800: 7393.560: [CMS-concurrent-reset-start]
2018-11-09T14:10:48.038+0800: 7393.625: [CMS-concurrent-reset: 0.065/0.065 secs] [Times: user=0.13 sys=0.00, real=0.07 secs]
2018-11-09T14:10:50.038+0800: 7395.625: [GC (CMS Initial Mark) [1 CMS-initial-mark: 25626226K(26204160K)] 39382689K(40762048K), 0.0139416 secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
2018-11-09T14:10:50.052+0800: 7395.639: [CMS-concurrent-mark-start]
2018-11-09T14:10:50.427+0800: 7396.014: [CMS-concurrent-mark: 0.375/0.375 secs] [Times: user=2.59 sys=0.02, real=0.37 secs]
2018-11-09T14:10:50.427+0800: 7396.014: [CMS-concurrent-preclean-start]
2018-11-09T14:10:50.457+0800: 7396.044: [CMS-concurrent-preclean: 0.030/0.030 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
2018-11-09T14:10:50.457+0800: 7396.044: [CMS-concurrent-abortable-preclean-start]
2018-11-09T14:10:50.457+0800: 7396.044: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2018-11-09T14:10:50.458+0800: 7396.045: [GC (CMS Final Remark) [YG occupancy: 13756466 K (14557888 K)]2018-11-09T14:10:50.458+0800: 7396.045: [GC (CMS Final Remark) 2018-11-09T14:10:50.458+0800: 7396.045: [ParNew: 13756466K->13756466K(14557888K), 0.0000233 secs] 39382693K->39382693K(40762048K), 0.0000914 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2018-11-09T14:10:50.458+0800: 7396.045: [Rescan (parallel) , 0.0138796 secs]2018-11-09T14:10:50.472+0800: 7396.059: [weak refs processing, 0.0000406 secs]2018-11-09T14:10:50.472+0800: 7396.059: [class unloading, 0.0087389 secs]2018-11-09T14:10:50.481+0800: 7396.068: [scrub symbol table, 0.0055956 secs]2018-11-09T14:10:50.487+0800: 7396.074: [scrub string table, 0.0005615 secs][1 CMS-remark: 25626226K(26204160K)] 39382693K(40762048K), 0.0290641 secs] [Times: user=0.30 sys=0.00, real=0.02 secs]
2018-11-09T14:10:50.488+0800: 7396.075: [CMS-concurrent-sweep-start]
2018-11-09T14:10:50.729+0800: 7396.316: [CMS-concurrent-sweep: 0.241/0.241 secs] [Times: user=0.48 sys=0.00, real=0.24 secs]
2018-11-09T14:10:50.729+0800: 7396.316: [CMS-concurrent-reset-start]
2018-11-09T14:10:50.794+0800: 7396.381: [CMS-concurrent-reset: 0.065/0.065 secs] [Times: user=0.13 sys=0.00, real=0.06 secs]
2018-11-09T14:10:51.734+0800: 7397.321: [GC (Allocation Failure) 2018-11-09T14:10:51.734+0800: 7397.321: [ParNew: 14280769K->14280769K(14557888K), 0.0000297 secs]2018-11-09T14:10:51.734+0800: 7397.321: [CMS: 25626226K->25626226K(26204160K), 8.7144181 secs] 39906995K->39782608K(40762048K), [Metaspace: 37753K->37753K(38912K)], 8.7146944 secs] [Times: user=8.72 sys=0.00, real=8.72 secs]
2018-11-09T14:11:00.449+0800: 7406.036: [Full GC (Allocation Failure) 2018-11-09T14:11:00.449+0800: 7406.036: [CMS: 25626226K->25626196K(26204160K), 6.1291271 secs] 39782608K->39782578K(40762048K), [Metaspace: 37753K->37753K(38912K)], 6.1292957 secs] [Times: user=6.13 sys=0.00, real=6.13 secs]
2018-11-09T14:11:06.579+0800: 7412.166: [GC (CMS Initial Mark) [1 CMS-initial-mark: 25626196K(26204160K)] 39782578K(40762048K), 0.0017634 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2018-11-09T14:11:06.581+0800: 7412.168: [CMS-concurrent-mark-start]
2018-11-09T14:11:06.840+0800: 7412.427: [Full GC (Allocation Failure) 2018-11-09T14:11:06.840+0800: 7412.427: [CMS2018-11-09T14:11:07.867+0800: 7413.454: [CMS-concurrent-mark: 1.033/1.286 secs] [Times: user=5.11 sys=0.61, real=1.28 secs]
(concurrent mode failure): 26150484K->26150474K(26204160K), 7.8489326 secs] 40314100K->39782414K(40762048K), [Metaspace: 37784K->37784K(38912K)], 7.8491778 secs] [Times: user=11.81 sys=0.39, real=7.85 secs]
2018-11-09T14:11:14.690+0800: 7420.277: [Full GC (Allocation Failure) 2018-11-09T14:11:14.690+0800: 7420.277: [CMS: 26150474K->26150474K(26204160K), 1.2736921 secs] 39782414K->39782404K(40762048K), [Metaspace: 37784K->37784K(38912K)], 1.2738487 secs] [Times: user=1.28 sys=0.00, real=1.27 secs]
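
To read the gc-log above more easily, the old-generation and whole-heap occupancy can be pulled out of a `CMS-initial-mark` line. The following is only a minimal sketch: the class name `CmsOccupancy` and its methods are hypothetical helpers written for this note, and the regular expression assumes the exact log layout shown above. For the 14:10:50.038 line it reports the old generation as roughly 98% full, and the later Full GC lines show that almost nothing is reclaimed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CmsOccupancy {
    // Matches "CMS-initial-mark: <oldUsed>K(<oldCap>K)] <heapUsed>K(<heapCap>K)"
    // as printed in the gc-log above (layout assumed, not a general GC log parser).
    private static final Pattern INITIAL_MARK = Pattern.compile(
        "CMS-initial-mark: (\\d+)K\\((\\d+)K\\)\\] (\\d+)K\\((\\d+)K\\)");

    /** Returns {oldUsedK, oldCapK, heapUsedK, heapCapK}, or null if the line does not match. */
    public static long[] parse(String line) {
        Matcher m = INITIAL_MARK.matcher(line);
        if (!m.find()) {
            return null;
        }
        long[] out = new long[4];
        for (int i = 0; i < 4; i++) {
            out[i] = Long.parseLong(m.group(i + 1));
        }
        return out;
    }

    /** Occupancy as a percentage of capacity. */
    public static double percent(long used, long capacity) {
        return 100.0 * used / capacity;
    }

    public static void main(String[] args) {
        String line = "2018-11-09T14:10:50.038+0800: 7395.625: [GC (CMS Initial Mark) "
            + "[1 CMS-initial-mark: 25626226K(26204160K)] 39382689K(40762048K), 0.0139416 secs]";
        long[] v = parse(line);
        System.out.printf("old gen: %.1f%% of %d K%n", percent(v[0], v[1]), v[1]);
        System.out.printf("heap:    %.1f%% of %d K%n", percent(v[2], v[3]), v[3]);
    }
}
```
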
  • stdout:
2018-11-09 14:09:01,703 INFO [pool-6-thread-1] com.tencent.angel.ml.factorizationmachinesWAIC.FMLearnerWAIC: for one batch with 400036 in 67002 ms..
2018-11-09 14:09:01,703 INFO [pool-6-thread-1] com.tencent.angel.ml.factorizationmachinesWAIC.FMLearnerWAIC: for one batch with stepRead is 0 ..
2018-11-09 14:09:05,694 INFO [pool-6-thread-1] com.tencent.angel.ml.factorizationmachinesWAIC.FMLearnerWAIC: dataBlock read finished with 41660237 ..
2018-11-09 14:09:07,408 INFO [pool-6-thread-1] com.tencent.angel.ml.factorizationmachinesWAIC.FMLearnerWAIC: Calculate Delta for update ...
2018-11-09 14:09:11,398 INFO [pool-6-thread-1] com.tencent.angel.ml.factorizationmachinesWAIC.FMLearnerWAIC: Begin to update parameter in PS ...
2018-11-09 14:09:11,406 INFO [pool-6-thread-1] com.tencent.angel.ml.factorizationmachinesWAIC.FMModel: Start to push w0 from PS ...
2018-11-09 14:11:06,588 FATAL [pool-5-thread-1] com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLogCache: merge OpLogMergeMessage [update=com.tencent.angel.ml.math.vector.DenseDoubleVector@77d861dc, toString()=OpLogMessage [matrixId=1, type=MERGE, context=com.tencent.angel.psagent.task.TaskContext@16aa8654TaskContext [index=0, matrix clocks=(matrixId=0,clock=2)(matrixId=1,clock=1)(matrixId=2,clock=1)], seqId=17]] falied,
java.lang.OutOfMemoryError: Java heap space
at it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap.<init>(Int2DoubleOpenHashMap.java:158)
at it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap.<init>(Int2DoubleOpenHashMap.java:169)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.resize(SparseDoubleVector.java:495)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.plusBy(SparseDoubleVector.java:564)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.plusBy(SparseDoubleVector.java:555)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.plusBy(SparseDoubleVector.java:35)
at com.tencent.angel.ml.math.matrix.RowbaseMatrix.plusBy(RowbaseMatrix.java:126)
at com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLog.merge(MatrixOpLog.java:160)
at com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLogCache$Merger.run(MatrixOpLogCache.java:444)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2018-11-09 14:11:15,964 FATAL [pool-5-thread-2] com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLogCache: merge OpLogMergeMessage [update=com.tencent.angel.ml.math.vector.DenseDoubleVector@316dd3c4, toString()=OpLogMessage [matrixId=1, type=MERGE, context=com.tencent.angel.psagent.task.TaskContext@16aa8654TaskContext [index=0, matrix clocks=(matrixId=0,clock=2)(matrixId=1,clock=1)(matrixId=2,clock=1)], seqId=18]] falied,
java.lang.OutOfMemoryError: Java heap space
at it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap.<init>(Int2DoubleOpenHashMap.java:158)
at it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap.<init>(Int2DoubleOpenHashMap.java:169)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.resize(SparseDoubleVector.java:495)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.plusBy(SparseDoubleVector.java:564)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.plusBy(SparseDoubleVector.java:555)
at com.tencent.angel.ml.math.vector.SparseDoubleVector.plusBy(SparseDoubleVector.java:35)
at com.tencent.angel.ml.math.matrix.RowbaseMatrix.plusBy(RowbaseMatrix.java:126)
at com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLog.merge(MatrixOpLog.java:160)
at com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLogCache$Merger.run(MatrixOpLogCache.java:444)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2018-11-09 14:11:15,969 INFO [pool-5-thread-1] com.tencent.angel.psagent.PSAgent: psagent falied
2018-11-09 14:11:15,985 INFO [pool-5-thread-1] com.tencent.angel.worker.Worker: worker failed message : merge OpLogMergeMessage [update=com.tencent.angel.ml.math.vector.DenseDoubleVector@77d861dc, toString()=OpLogMessage [matrixId=1, type=MERGE, context=com.tencent.angel.psagent.task.TaskContext@16aa8654TaskContext [index=0, matrix clocks=(matrixId=0,clock=2)(matrixId=1,clock=1)(matrixId=2,clock=1)], seqId=17]] falied, Java heap space, send it to appmaster success
2018-11-09 14:11:15,985 INFO [pool-5-thread-1] com.tencent.angel.worker.Worker: start to close all modules in worker
2018-11-09 14:11:15,985 INFO [pool-5-thread-1] com.tencent.angel.worker.Worker: stop workerService
2018-11-09 14:11:15,985 INFO [pool-5-thread-1] com.tencent.angel.worker.WorkerService: stop rpc server
2018-11-09 14:11:15,985 INFO [pool-5-thread-1] com.tencent.angel.ipc.NettyServer: Stopping server on 20586
2018-11-09 14:11:15,985 ERROR [Worker Heartbeat] com.tencent.angel.worker.Worker: report to appmaster failed, err:
java.lang.NullPointerException
at com.tencent.angel.worker.Worker.heartbeat(Worker.java:341)
at com.tencent.angel.worker.Worker.access$200(Worker.java:65)
at com.tencent.angel.worker.Worker$1.run(Worker.java:303)
at java.lang.Thread.run(Thread.java:745)
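
Both stack traces above fail while allocating a new `Int2DoubleOpenHashMap` inside `SparseDoubleVector.resize`, on two different merger threads (`pool-5-thread-1` and `pool-5-thread-2`). A rough way to see why this hurts when the old generation is already almost full: a resize needs a fresh table while the old one is still reachable, so the peak demand is old plus new. The sketch below estimates that footprint. It is an illustration under stated assumptions, not Angel's or fastutil's actual code: the class `OpenHashFootprint` is hypothetical, the 0.75 load factor and power-of-two table sizing are assumed, the layout is simplified to one `int` key and one `double` value per slot, and the entry counts are made up.

```java
public class OpenHashFootprint {
    /** Smallest power of two >= x, for x >= 1. */
    public static long nextPowerOfTwo(long x) {
        long p = 1;
        while (p < x) {
            p <<= 1;
        }
        return p;
    }

    /** Slots needed to hold `entries` keys at the given load factor (assumed sizing rule). */
    public static long tableSlots(long entries, double loadFactor) {
        long needed = (long) Math.ceil(entries / loadFactor);
        return Math.max(2, nextPowerOfTwo(needed));
    }

    /** Approximate table bytes: one 4-byte int key plus one 8-byte double value per slot. */
    public static long tableBytes(long entries, double loadFactor) {
        return tableSlots(entries, loadFactor) * (Integer.BYTES + Double.BYTES);
    }

    public static void main(String[] args) {
        double loadFactor = 0.75;                        // assumed, not read from the job
        long[] entryCounts = {10_000_000L, 50_000_000L}; // hypothetical sizes
        for (long entries : entryCounts) {
            long oldBytes = tableBytes(entries, loadFactor);
            long newBytes = tableBytes(entries * 2, loadFactor);
            // During a resize the old and the new table are both live.
            System.out.printf("entries=%d old=%d MB new=%d MB peak=%d MB%n",
                entries, oldBytes >> 20, newBytes >> 20, (oldBytes + newBytes) >> 20);
        }
    }
}
```

With several merger threads resizing at the same time, these peaks add up, which fits the note above that memory contention between threads is the likely cause.
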
