hive on tez 任务失败
最近再hue 集群查询任务经常失败,经过几天的观察,终于找到原因,报错如下
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1514128895713_0770_1_00, diagnostics=[Task failed, taskId=task_1514128895713_0770_1_00_000006, diagnostics=[TaskAttempt 0 failed, info=[Container container_1514128895713_0770_01_000008 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 1 failed, info=[Container container_1514128895713_0770_01_000026 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 2 failed, info=[Container container_1514128895713_0770_01_000036 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 3 failed, info=[Container container_1514128895713_0770_01_000042 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1514128895713_0770_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1514128895713_0770_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:6, Vertex vertex_1514128895713_0770_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1514128895713_0770_1_00, diagnostics=[Task failed, taskId=task_1514128895713_0770_1_00_000006, diagnostics=[TaskAttempt 0 failed, info=[Container container_1514128895713_0770_01_000008 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 1 failed, info=[Container container_1514128895713_0770_01_000026 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 2 failed, info=[Container container_1514128895713_0770_01_000036 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]], TaskAttempt 3 failed, info=[Container container_1514128895713_0770_01_000042 finished with diagnostics set to [Container failed, exitCode=-100. Container released on a *lost* node]]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1514128895713_0770_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1514128895713_0770_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:6, Vertex vertex_1514128895713_0770_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
分析:
taskId=task_1514128895713_0770_1_00_000006 失败了几次,失败的原因是container被高优先级的任务抢占了。而task最大的失败次数默认是4.当集群上的任务比较多时,比较容易出现这个问题。还有一个原因我认为container设置的内存太小,我这是1.5G 不过没有修改,我设置了以下参数就解决了:
解决方案:
问题解决了,希望帮到你
hive on tez 任务失败的更多相关文章
- hive on tez配置
1.Tez简介 Tez是Hontonworks开源的支持DAG作业的计算框架,它可以将多个有依赖的作业转换为一个作业从而大幅提升MapReduce作业的性能.Tez并不直接面向最终用户--事实上它允许 ...
- hive on tez 错误记录
1.执行过程失败,报 Container killed on request. Exit code is 143 如下图: 分析:造成这种原因是由于总内存不多,而容器在jvm中占比过高,修改tez-s ...
- 配置 Hive On Tez
配置 Hive On Tez 标签(空格分隔): hive Tez 部署底层应用 简单介绍 介绍:tez 是基于hive 之上,可以将sql翻译解析成DAG计算的引擎.基于DAG 与mr 架构本身的优 ...
- hive on spark VS SparkSQL VS hive on tez
http://blog.csdn.net/wtq1993/article/details/52435563 http://blog.csdn.net/yeruby/article/details/51 ...
- Hive on Tez 中 Map 任务的数量计算
Hive on Tez Mapper 数量计算 在Hive 中执行一个query时,我们可以发现Hive 的执行引擎在使用 Tez 与 MR时,两者生成mapper数量差异较大.主要原因在于 Tez ...
- hive on tez
hive运行模式 hive on mapreduce 离线计算(默认) hive on tez YARN之上支持DAG作业的计算框架 hive on spark 内存计算 hive on tez T ...
- Hive执行count函数失败,Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException)
Hive执行count函数失败 1.现象: 0: jdbc:hive2://192.168.137.12:10000> select count(*) from emp; INFO : Numb ...
- hive中创建表失败
使用create table命令创建表失败,如下错误信息: hive> create table test(id int,name string,age int,sex string); FAI ...
- Hive配置Tez引擎踩坑
框架版本 Hadoop 2.7.7 Hive 2.3.7 Tez 0.9.2 保证hadoop集群启动,hive元数据服务启动 上传tez到HDFS tar -zxvf apache-tez-0.9. ...
随机推荐
- 封装CURD
<?php include ('ft.php'); $db=Danli::show(); //查询 //$re=$db->table('stree')->where(['name'= ...
- 如何决定使用 HashMap 还是 TreeMap? (转)
问:如何决定使用 HashMap 还是 TreeMap? 介绍 TreeMap<K,V>的Key值是要求实现java.lang.Comparable,所以迭代的时候TreeMap默认是按照 ...
- 附录3:RMA算法原理
RMA算法分三步: 一.背景校正(没精力写了) 二.归一化(没精力写了) 三.计算表达值 假设有5张芯片,这些芯片的某个探针组包含5个探针,它们的表达值如下: GeneChip 4 8 6 9 7 3 ...
- 关于redis的几件小事(七)redis缓存雪崩与穿透
1.缓存雪崩 (1)什么是缓存雪崩 缓存雪崩指的是在同一时刻,缓存大量失效,导致大量的请求直接到了数据库,数据库压力剧增,引起系统崩溃.可能出现的情况有: ①大量的key设置了相同的过期时间,导致在缓 ...
- Python内置函数清单
作者:Vamei 出处:http://www.cnblogs.com/vamei Python内置(built-in)函数随着python解释器的运行而创建.在Python的程序中,你可以随时调用这些 ...
- java网络编程+通讯协议的理解
参考: http://blog.csdn.net/sunyc1990/article/details/50773014 网络编程对于很多的初学者来说,都是很向往的一种编程技能,但是很多的初学者却因为很 ...
- sftp及两种连接模式简介
sftp是ssh内含的协议,只要sshd服务器启动了,它就可用,它本身不需要ftp服务器启动. FTP服务器和客户端要进行文件传输,就需要通过端口来进行.FTP协议需要的端口一般包括两种: 控制链路- ...
- 简单了解 TCP TCP/IP HTTP HTTPS
一. 什么是TCP连接的三次握手 第一次握手:客户端发送syn包(syn=j)到服务器,并进入SYN_SEND状态,等待服务器确认: 第二次握手:服务器收到syn包,必须确认客户的SYN(ack=j+ ...
- composer入门教程
初始化项目 使用composer初始化工作目录,在项目的根目录命令行输入 composer init 安装项目 在composer.json文件所在目录命令行下执行如下命令 php composer. ...
- PAT Basic 1011 A+B 和 C (15 分)
给定区间 [−] 内的 3 个整数 A.B 和 C,请判断 A+B 是否大于 C. 输入格式: 输入第 1 行给出正整数 T (≤),是测试用例的个数.随后给出 T 组测试用例,每组占一行,顺序给出 ...