Cause Analysis

The CDH cluster does not allocate enough memory to the YARN containers for the workload being run.

Solution

Adjust the following configuration items so that they match the resources actually available in the cluster:

yarn-site.xml
- yarn.nodemanager.resource.memory-mb: amount of physical memory that can be allocated to containers on a NodeManager. Reference value: 52 * 2 = 104 GB
- yarn.scheduler.minimum-allocation-mb: minimum physical memory a container may request (in MiB). Reference value: 2 GB
- yarn.scheduler.maximum-allocation-mb: maximum physical memory a container may request (in MiB). Reference value: 52 * 2 = 104 GB
- yarn.nodemanager.vmem-pmem-ratio: ratio of virtual memory to physical memory used when enforcing container memory limits. The default is 2.1; adjust it to the actual situation.

mapred-site.xml
- yarn.app.mapreduce.am.resource.mb: physical memory required by the MapReduce ApplicationMaster (MiB). Reference value: 2 * 2 = 4 GB
- yarn.app.mapreduce.am.command-opts: Java command-line options (heap size) passed to the MapReduce ApplicationMaster. Reference value: 0.8 * 2 * 2 = 3.2 GB
- mapreduce.map.memory.mb: physical memory allocated to each Map task of a job (MiB). Reference value: 2 GB
- mapreduce.reduce.memory.mb: physical memory allocated to each Reduce task of a job (MiB). Reference value: 2 * 2 = 4 GB
- mapreduce.map.java.opts: Java options (heap size) for each Map task JVM. Reference value: 0.8 * 2 = 1.6 GB
- mapreduce.reduce.java.opts: Java options (heap size) for each Reduce task JVM. Reference value: 0.8 * 2 * 2 = 3.2 GB
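For reference, a minimal sketch of what these changes might look like in the site files follows. The values are the reference figures from the list above (e.g. 104 GB = 106496 MiB) and must be adjusted to the real memory of your nodes; on a CDH cluster these properties are normally set through Cloudera Manager rather than by editing the XML by hand. Each snippet goes inside the file's <configuration> element.

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>106496</value> <!-- 104 GB of physical memory per NodeManager -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>2048</value> <!-- smallest container: 2 GB -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>106496</value> <!-- largest container: 104 GB -->
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value> <!-- default; raise it if the virtual-memory check keeps killing containers -->
</property>

<!-- mapred-site.xml -->
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>4096</value> <!-- 4 GB for the MapReduce ApplicationMaster container -->
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx3276m</value> <!-- roughly 0.8 * 4096 MiB of heap -->
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value> <!-- 2 GB per Map task container -->
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1638m</value> <!-- roughly 0.8 * 2048 MiB of heap -->
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value> <!-- 4 GB per Reduce task container -->
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx3276m</value> <!-- roughly 0.8 * 4096 MiB of heap -->
</property>
```

Keeping each -Xmx at roughly 80% of the corresponding container size leaves headroom for the JVM's non-heap memory, which is where the 0.8 factor in the reference values comes from.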

Exception Log

Container is running beyond the 'PHYSICAL' memory limit. Current usage: 2.1 GB of 2 GB physical memory used; 21.2 GB of 4.2 GB virtual memory used. Killing container.
 
Application application_1543392650432_0855 failed 2 times due to AM Container for appattempt_1543392650432_0855_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: [2018-12-01 14:57:17.762]Container [pid=31682,containerID=container_1543392650432_0855_02_000001] is running 120156160B beyond the 'PHYSICAL' memory limit. Current usage: 2.1 GB of 2 GB physical memory used; 21.2 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_1543392650432_0855_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 1080 31768 31682 31682 (java) 2769 194 3968139264 299128 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties -Dyarn.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/lib/native -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop -Dhadoop.id.str=chenweidong -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hive-exec-2.1.1-cdh6.0.1.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -libjars file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-client.jar,file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/auxlib/hive-exec-2.1.1-cdh6.0.1-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-server.jar,file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/auxlib/hive-exec-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.0.1.jar -localtask -plan file:/tmp/yarn/cfb1d927-a086-4b93-af4b-9816f2dc9f49/hive_2018-12-01_14-56-08_101_7346185201514786308-1/-local-10010/plan.xml -jobconffile file:/tmp/yarn/cfb1d927-a086-4b93-af4b-9816f2dc9f49/hive_2018-12-01_14-56-08_101_7346185201514786308-1/-local-10011/jobconf.xml
|- 31682 31680 31682 31682 (bash) 0 0 11960320 344 /bin/bash -c /usr/java/jdk1.8.0_141-cloudera/bin/java -Dlog4j.configuration=container-log4j.properties -Dlog4j.debug=true -Dyarn.app.container.log.dir=/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001 -Dyarn.app.container.log.filesize=1048576 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dsubmitter.user=chenweidong org.apache.oozie.action.hadoop.LauncherAM 1>/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001/stdout 2>/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001/stderr
|- 31689 31682 31682 31682 (java) 355 28 14790037504 76787 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dlog4j.configuration=container-log4j.properties -Dlog4j.debug=true -Dyarn.app.container.log.dir=/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001 -Dyarn.app.container.log.filesize=1048576 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dsubmitter.user=chenweidong org.apache.oozie.action.hadoop.LauncherAM
|- 31768 31756 31682 31682 (java) 1750 114 4003151872 176993 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties -Dyarn.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/lib/native -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop -Dhadoop.id.str=chenweidong -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/lib/hive-cli-2.1.1-cdh6.0.1.jar org.apache.hadoop.hive.cli.CliDriver --hiveconf hive.query.redaction.rules=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/conf/redaction-rules.json --hiveconf hive.exec.query.redactor.hooks=org.cloudera.hadoop.hive.ql.hooks.QueryRedactor --hiveconf hive.aux.jars.path=file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.0.1.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/auxlib/hive-exec-2.1.1-cdh6.0.1-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/auxlib/hive-exec-core.jar -S -v -e
|- 31756 31689 31682 31682 (initialization_) 0 0 11960320 371 /bin/bash ./initialization_data_step2.sh 20181123 20181129 dwp_order_log_process
 
[2018-12-01 14:57:17.770]Container killed on request. Exit code is 143
[2018-12-01 14:57:17.778]Container exited with a non-zero exit code 143.
For more detailed output, check the application tracking page: https://master.prodcdh.com:8090/cluster/app/application_1543392650432_0855 Then click on links to logs of each attempt.
. Failing the application.
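
Reading the log against the settings above: the ApplicationMaster container was limited to 2 GB of physical memory, and with the default yarn.nodemanager.vmem-pmem-ratio of 2.1 its virtual-memory limit was 2.1 * 2 GB = 4.2 GB, which matches the "4.2 GB virtual memory" figure in the process-tree dump. Actual usage reached 2.1 GB of physical memory, so the NodeManager killed the container: exit code -104 indicates the physical-memory limit was exceeded, and 143 is the SIGTERM exit code (128 + 15) from the kill.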

Further References

https://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits

https://yq.aliyun.com/articles/25470
