hive on tez 任务失败
最近再hue 集群查询任务经常失败,经过几天的观察,终于找到原因,报错如下
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1514128895713_0770_1_00, diagnostics=[Task failed, taskId=task_1514128895713_0770_1_00_000006, diagnostics=[TaskAttempt 0 failed, info=[Container container_1514128895713_0770_01_000008 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 1 failed, info=[Container container_1514128895713_0770_01_000026 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 2 failed, info=[Container container_1514128895713_0770_01_000036 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 3 failed, info=[Container container_1514128895713_0770_01_000042 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1514128895713_0770_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1514128895713_0770_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:6, Vertex vertex_1514128895713_0770_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1514128895713_0770_1_00, diagnostics=[Task failed, taskId=task_1514128895713_0770_1_00_000006, diagnostics=[TaskAttempt 0 failed, info=[Container container_1514128895713_0770_01_000008 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 1 failed, info=[Container container_1514128895713_0770_01_000026 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 2 failed, info=[Container container_1514128895713_0770_01_000036 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 3 failed, info=[Container container_1514128895713_0770_01_000042 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1514128895713_0770_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1514128895713_0770_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:6, Vertex vertex_1514128895713_0770_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
分析:
taskId=task_1514128895713_0770_1_00_000006 失败了几次,失败的原因是container被高优先级的任务抢占了。而task最大的失败次数默认是4.当集群上的任务比较多时,比较容易出现这个问题。还有一个原因我认为container设置的内存太小,我这是1.5G 不过没有修改,我设置了以下参数就解决了:
解决方案:
问题解决了,希望帮到你
hive on tez 任务失败的更多相关文章
- hive on tez配置
1.Tez简介 Tez是Hontonworks开源的支持DAG作业的计算框架,它可以将多个有依赖的作业转换为一个作业从而大幅提升MapReduce作业的性能.Tez并不直接面向最终用户--事实上它允许 ...
- hive on tez 错误记录
1.执行过程失败,报 Container killed on request. Exit code is 143 如下图: 分析:造成这种原因是由于总内存不多,而容器在jvm中占比过高,修改tez-s ...
- 配置 Hive On Tez
配置 Hive On Tez 标签(空格分隔): hive Tez 部署底层应用 简单介绍 介绍:tez 是基于hive 之上,可以将sql翻译解析成DAG计算的引擎.基于DAG 与mr 架构本身的优 ...
- hive on spark VS SparkSQL VS hive on tez
http://blog.csdn.net/wtq1993/article/details/52435563 http://blog.csdn.net/yeruby/article/details/51 ...
- Hive on Tez 中 Map 任务的数量计算
Hive on Tez Mapper 数量计算 在Hive 中执行一个query时,我们可以发现Hive 的执行引擎在使用 Tez 与 MR时,两者生成mapper数量差异较大.主要原因在于 Tez ...
- hive on tez
hive运行模式 hive on mapreduce 离线计算(默认) hive on tez YARN之上支持DAG作业的计算框架 hive on spark 内存计算 hive on tez T ...
- Hive执行count函数失败,Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException)
Hive执行count函数失败 1.现象: 0: jdbc:hive2://192.168.137.12:10000> select count(*) from emp; INFO : Numb ...
- hive中创建表失败
使用create table命令创建表失败,如下错误信息: hive> create table test(id int,name string,age int,sex string); FAI ...
- Hive配置Tez引擎踩坑
框架版本 Hadoop 2.7.7 Hive 2.3.7 Tez 0.9.2 保证hadoop集群启动,hive元数据服务启动 上传tez到HDFS tar -zxvf apache-tez-0.9. ...
随机推荐
- Git 实习一个月恍然大悟合集
从开始实习到现在大概有一个月了,这个月时间接触了很多新东西,其中就包括了git版本控制.分支管理等等.我在这段时间里,深深地感受到了git对公司项目代码管理和控制.团队合作带来的益处和其重要性.其实在 ...
- mac 安装 php7 及扩展
mac 版本号:10.12.3 (16D30) 安装内容 php7.0.18(配置apache),composer,phpunit,xdebug扩展,docopts,mongo和redis扩展 php ...
- Centos7:dubbo监控中心安装,配置和使用
制作dubbo-admin.war文件 下载dubbo-admin https://github.com/alibaba/dubbo 注:2.6版本后源码中不包含dubbo-admin工程 在dubb ...
- TKmybatis和mybatisplus哪个好用
文档连接 :http://baomidou.oschina.io/mybatis-plus-doc/#/?id=%E7%AE%80%E4%BB%8B https://gitee.com/hengboy ...
- Swift(二)基础部分
数据类型 Swift 包含了 C 和 Objective-C 上所有基础数据类型.它还增加了 Objective-C 中没有的高阶数据类型比如元组(Tuple) 1.基础类型 Int整形和UInt无符 ...
- PYTHON的程序在LINUX后台运行
1.nohup 命令 nohup nohup 命令 用途:LINUX命令用法,不挂断地运行命令. 语法:nohup Command [ Arg ... ] [ & ] 描述:nohup 命令运 ...
- 什么是DDoS攻击?
本文转载自知道创宇云安全的知乎回答:DDoS 的肉鸡都是哪来的? 说到DDoS攻击,我们就不得不说“肉鸡”. “肉鸡”可谓是DDoS攻击的核心大杀器,作为一个要发起DDoS攻击的黑客来说,没有肉鸡就是 ...
- 关于select的取值
这篇博客,主要是记录我我所犯的错误,或者自己的写法不规范导致了错误,大家可以引以引以为鉴. 我要获取select当前的值,在IE9上我可以直接写document.getElementById(&quo ...
- 点击切换JS
$(function(){ var tabnav = $("#tab-nav ul li"); tabnav.click(function(){ $(this).addClass( ...
- Java面试01
一.谈谈你对java的理解 1.平台无关性,一次编译到处运行 2.GC 3.语言特性 4.面向对象 5.类库 6.异常处理 二.Java如何做到一次编译到处运行?(如何做到平台无关性) 首先我们先来编 ...