问题:

用  spark-submit --master yarn --deploy-mode cluster --driver-memory 2G --num-executors 6 --executor-memory 2G ~~~

提交任务时,最后一个executor 执行时间 超过了 160s 导致 timeout而退出,造成任务重新执行造成用时过长。具体请看下面介绍:

// :: WARN spark.HeartbeatReceiver: Removing executor  with no recent heartbeats:  ms exceeds timeout  ms
// :: ERROR cluster.YarnClusterScheduler: Lost executor on slave10: Executor heartbeat timed out after ms
// :: WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID , slave10): ExecutorLostFailure (executor exited caused by one of the running tasks) Reason: Executor heartbeat timed out after ms
// :: INFO scheduler.DAGScheduler: Executor lost: (epoch )
// :: INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s)
// :: INFO scheduler.TaskSetManager: Starting task 0.1 in stage 0.0 (TID , slave06, partition ,RACK_LOCAL, bytes)
// :: INFO storage.BlockManagerMasterEndpoint: Trying to remove executor from BlockManagerMaster.
// :: INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(, slave10, )
// :: INFO storage.BlockManagerMaster: Removed successfully in removeExecutor
// :: INFO scheduler.DAGScheduler: Host added was in lost list earlier: slave10
// :: INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) .
// :: INFO scheduler.TaskSetManager: Finished task 0.1 in stage 0.0 (TID ) in ms on slave06 (/)
// :: INFO scheduler.DAGScheduler: ResultStage (saveAsNewAPIHadoopFile at DataFrameFunctions.scala:) finished in 162.495 s
初步估计是  因为最后一步用到的计算多,但是 spark的堆外内存配置低 如下所示
spark.yarn.executor.memoryOverhead executorMemory * 0.10, with minimum of 384
故加大配置,如下:
spark-submit --master yarn --deploy-mode cluster --driver-memory 2G --num-executors 6 --executor-memory 2G --conf spark.yarn.executor.memoryOverhead=512 --conf spark.yarn.driver.memoryOverhead=512 经测试上述问题不复存在!
 

spark yarn任务的executor 无故 timeout之原因分析的更多相关文章

  1. spark异常篇-Removing executor 5 with no recent heartbeats: 120504 ms exceeds timeout 120000 ms 可能的解决方案

    问题描述与分析 题目中的问题大致可以描述为: 由于某个 Executor 没有按时向 Driver 发送心跳,而被 Driver 判断该 Executor 已挂掉,此时 Driver 要把 该 Exe ...

  2. Hive-Container killed by YARN for exceeding memory limits. 9.2 GB of 9 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

    Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task times, most recen ...

  3. Spark On Yarn中spark.yarn.jar属性的使用

    今天在测试spark-sql运行在yarn上的过程中,无意间从日志中发现了一个问题: spark-sql --master yarn // :: INFO Client: Requesting a n ...

  4. Spark On Yarn报警告信息 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.

    1 贴出完整日志信息 // :: INFO client.RMProxy: Connecting to ResourceManager at hdp1/ // :: INFO yarn.Client: ...

  5. spark.yarn.jar和spark.yarn.archive的使用

    启动Spark任务时,在没有配置spark.yarn.archive或者spark.yarn.jars时, 会看到不停地上传jar非常耗时:使用spark.yarn.archive可以大大地减少任务的 ...

  6. spark 与 Hadoop 融合后 Neither spark.yarn.jars nor spark.yarn.archive is set

    参考文献: http://blog.csdn.net/lxhandlbb/article/details/54410644 每次提交Spark任务到yarn的时候,总会出现uploading reso ...

  7. 一文读懂spark yarn集群搭建

    文是超简单的spark yarn配置教程: yarn是hadoop的一个子项目,目的是用于管理分布式计算资源,在yarn上面搭建spark集群需要配置好hadoop和spark.我在搭建集群的时候有3 ...

  8. spark运行时出现Neither spark.yarn.jars nor spark.yarn.archive is set错误的解决办法(图文详解)

    不多说,直接上干货! 福利 => 每天都推送 欢迎大家,关注微信扫码并加入我的4个微信公众号:   大数据躺过的坑      Java从入门到架构师      人工智能躺过的坑          ...

  9. spark:neither spark.yarn.jars not spark.yarn.archive is set

    1.Spark启动警告:neither spark.yarn.jars not spark.yarn.archive is set,falling back to uploading librarie ...

随机推荐

  1. Effective C++(12) 复制对象时要复制每一个成员

    问题聚焦: 负责拷贝的两个操作:拷贝构造函数和重载赋值操作符. 一句话总结,确保被拷贝对象的所有成员变量都做一份拷贝. Demo   void logCall(const std::string&am ...

  2. c#中运行时编译时 多态

    c#中运行时编译时 多态   public class aa { } public class bb:aa { } public class cc { public static void Main( ...

  3. C#里CheckListBox的全选

    for (int i = 0; i < chkLSelect.Items.Count; i++)            {                if (chkCheck.Checked ...

  4. 线程:ThreadLocal实现线程范围内共享变量

    在web应用中,一个请求(带有请求参数)就是一个线程,那么如何区分哪些参数属于哪个线程呢?比如struts中,A用户登录,B用户也登录,那么在Action中怎么区分哪个是A用户的数据,哪个是B用户的数 ...

  5. [Usaco2008 Dec]Patting Heads 轻拍牛头[筛法]

    Description   今天是贝茜的生日,为了庆祝自己的生日,贝茜邀你来玩一个游戏.     贝茜让N(1≤N≤100000)头奶牛坐成一个圈.除了1号与N号奶牛外,i号奶牛与i-l号和i+l号奶 ...

  6. [转]Mac OS X local privilege escalation (IOBluetoothFamily)

    Source: http://joystick.artificialstudios.org/2014/10/mac-os-x-local-privilege-escalation.html Nowad ...

  7. iOS 7 beta4 体验

    iOS 7 beta4终于来了,安装后感觉稳定了不少.下面列几点我个人感受比较深得地方. 1.锁屏界面有滑动方向箭头了,而且“滑动来解锁”几个字也有动态颜色变化,让人不再迷惑该往那边滑动了. 2.通知 ...

  8. javascript语言精粹mindmap

    javascript语言精粹mindmap 最近刚刚读完<javascript语言精粹>,感觉其中的内容确实给用js作开发语言的童鞋们提了个醒——js里面坑很多啊 不过,我也并不完全认同书 ...

  9. enode框架step by step之消息队列的设计思路

    enode框架step by step之消息队列的设计思路 enode框架系列step by step文章系列索引: enode框架step by step之开篇 enode框架step by ste ...

  10. about flashback_transaction_query

    详见原文博客链接地址:about flashback_transaction_query