hive执行结果moveTask操作失败

Apache Hive 2.1.0 ,在执行"INSERT OVERWRITE TABLE ...... select "或者 "insert overwrite directory /tmp/data/hive-test "操作,如果生成的结果文件是多个时,执行结果文件moveTask操作会失败。最新的Apache Hive 2.1.1版本同样有该问题;Apache Hive 1.2.1版本的hive没有该问题。

具体执行的sql如下:

insert overwrite directory '/tmp/fuxin.zhao/hive-test'
select
shippingorderid
,logisticsplatformid
,stockoutorderid
,logisticstypeid
,externalshippingorderno
,packageweight
,freight
,freightstatus
,entertime
,shippingorderstatus
,shippinglog
,shippinglogupdatetime
,shippingstatustime
,confirmreceivetime
,remarks
,createtype
,lastmodifytime
,enteruser
,updatetime
,updateuser
from
(
select *,row_number() over(partition by shippingorderid order by LastModifyTime desc) as rn
from
(select * from ods.m1_shippingorder where dt = '2014-01-01'
union all
select * from fds.m1_shippingorder where dt = '2099-12-31'
) a )b where b.rn = 1

产生的异常如下:

Failed with exception org.apache.hadoop.hdfs.protocol.AclException: Invalid ACL: multiple entries with same scope, type and name.
at org.apache.hadoop.hdfs.server.namenode.AclTransformation.buildAndValidateAcl(AclTransformation.java:285)
at org.apache.hadoop.hdfs.server.namenode.AclTransformation.replaceAclEntries(AclTransformation.java:230)
at org.apache.hadoop.hdfs.server.namenode.FSDirAclOp.unprotectedSetAcl(FSDirAclOp.java:206)
at org.apache.hadoop.hdfs.server.namenode.FSDirAclOp.setAcl(FSDirAclOp.java:146)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setAcl(FSNamesystem.java:7938)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setAcl(NameNodeRpcServer.java:1813)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setAcl(ClientNamenodeProtocolServerSideTranslatorPB.java:1330)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

异常:org.apache.hadoop.hive.ql.exec.MoveTask执行失败

##insert overwrite diretory:
2016-12-13 18:27:15,630 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 81.52 sec
MapReduce Total cumulative CPU time: 1 minutes 21 seconds 520 msec
Ended Job = job_1480497945656_0288
Moving data to directory /tmp/t_FDS/m1_shippingorder/dt=2099-12-31
Failed with exception Unable to move source hdfs://dbmtimehadoop/tmp/t_FDS/m1_shippingorder/dt=2099-12-31/.hive-staging_hive_2016-12-13_18-26-28_695_9094454822676037473-1/-ext-10000 to destination /tmp/t_FDS/m1_shippingorder/dt=2099-12-31
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://dbmtimehadoop/tmp/t_FDS/m1_shippingorder/dt=2099-12-31/.hive-staging_hive_2016-12-13_18-26-28_695_9094454822676037473-1/-ext-10000 to destination /tmp/t_FDS/m1_shippingorder/dt=2099-12-31
MapReduce Jobs Launched:
Stage-Stage-1: Map: 5 Reduce: 4 Cumulative CPU: 81.52 sec HDFS Read: 778870925 HDFS Write: 778698546 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 21 seconds 520 msec

异常:java.util.ConcurrentModificationException

Failed with exception Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
16/12/22 11:45:59 [main]: ERROR exec.Task: Failed with exception Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:103)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:255)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.ConcurrentModificationException
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2984)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:119)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:96)
... 20 more
Caused by: java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
at java.util.ArrayList$Itr.next(ArrayList.java:851)
at java.util.AbstractCollection.toString(AbstractCollection.java:461)
at java.lang.String.valueOf(String.java:2982)
at java.lang.StringBuilder.append(StringBuilder.java:131)
at org.apache.hadoop.fs.permission.AclStatus.toString(AclStatus.java:108)
at org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:75)
at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2961)
at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2953)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
16/12/22 11:45:59 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

异常:java.lang.ArrayIndexOutOfBoundsException

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2984)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:119)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:96)
... 20 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.removeRange(ArrayList.java:634)
at java.util.ArrayList$SubList.removeRange(ArrayList.java:1063)
at java.util.AbstractList.clear(AbstractList.java:234)
at com.google.common.collect.Iterables.removeIfFromRandomAccessList(Iterables.java:209)
at com.google.common.collect.Iterables.removeIf(Iterables.java:180)
at org.apache.hadoop.hive.io.HdfsUtils.removeBaseAclEntries(HdfsUtils.java:155)
at org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:77)
at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2961)
at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2953)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

下面是源码中关于文件权限继承的开关代码:

HiveConf.ConfVars.HIVE_WAREHOUSE_SUBDIR_INHERIT_PERMS);

import org.apache.hadoop.hive.conf.HiveConf;

import org.apache.hadoop.hive.conf.HiveConf.ConfVars;

产生问题的原因:

hive的查询结果在在进行move操作时,需要进行文件权限的授权,多个文件的授权是并发进行的,hive中该源码是在一个线程池中

执行的,该操作在多线程时线程同步有问题的该异常,这是hive的一个bug,目前截止目前的最新版本Apache Hive 2.1.1还没有修复该问题;

可以通过关闭hive的文件权限继承 hive.warehouse.subdir.inherit.perms=false 来规避该问题。

解决方法:

hive.warehouse.subdir.inherit.perms

  <property>
<name>hive.warehouse.subdir.inherit.perms</name>
<value>true</value>
<description>
Set this to false if the table directories should be created
with the permissions derived from dfs umask instead of
inheriting the permission of the warehouse or database directory.
</description>
</property>

hive执行结果moveTask操作失败的更多相关文章

  1. Error: 实例 "ddd" 执行所请求操作失败,实例处于错误状态。: 请稍后再试 [错误: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 6f60bc06-fcb6-4758-a46f-22120ca35a71.].

    Error: 实例 "ddd" 执行所请求操作失败,实例处于错误状态.: 请稍后再试 [错误: Exceeded maximum number of retries. Exhaus ...

  2. 错误: 实例 "ahwater-linux-core" 执行所请求操作失败,实例处于错误状态。: 请稍后再试 [错误: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 7c1609c9-9d0f-4836-85b3-cefd45f942a7. Last exception: [u

    错误: 实例 "ahwater-linux-core" 执行所请求操作失败,实例处于错误状态.: 请稍后再试 [错误: Exceeded maximum number of ret ...

  3. 错误: 实例 "ruiy" 执行所请求操作失败,实例处于错误状态。: 请稍后再试 [错误: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)].

    错误: 实例 "ruiy" 执行所请求操作失败,实例处于错误状态.: 请稍后再试 [错误: 'ascii' codec can't decode byte 0xe6 in posi ...

  4. Hive执行count函数失败,Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException)

    Hive执行count函数失败 1.现象: 0: jdbc:hive2://192.168.137.12:10000> select count(*) from emp; INFO : Numb ...

  5. hive从本地导入数据时出现「Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask」错误

    现象 通过load data local导入本地文件时报无法导入的错误 hive> load data local inpath '/home/hadoop/out/mid_test.txt' ...

  6. ADO.NET笔记——使用Command执行增删改操作,通过判断ExecuteNonQuery()返回值检查是否操作成功

    相关知识: ExecuteNonQuery()方法:执行CommandText属性所制定的操作,返回受影响的记录条数.该方法一般用来执行SQL中的UPDATE.INSERT和DELETE等操作 对于U ...

  7. 不借助工具在浏览器中通过Web API执行Dynamics 365操作(Action)实例

    摘要: 本人微信和易信公众号: 微软动态CRM专家罗勇 ,回复262或者20170727可方便获取本文,同时可以在第一间得到我发布的最新的博文信息,follow me!我的网站是 www.luoyon ...

  8. Hive中数据加载失败:root:supergroup:drwxr-xr-x

    Hive中数据加载失败:inode=:root:supergroup:drwxr-xr-x 在执行hive,数据加载的时候,遇到了一个错误,如下图: 在执行程序的过程中,遇到权限问题很正常,背后原理也 ...

  9. Apache Hive 执行HQL语句报错 ( 10G )

    # 故障描述: hive > , ) as uuid, count(distinct(request_body["uuid"])) as count from log_bft ...

随机推荐

  1. MySQL:Can't connect to mysql server 10038

    1.防火墙高级设置 2.入站规则,新建规则 3.选择端口 4.输入MySQL端口例如'3306' 5.允许连接 6.下一步 7.自定义规则名称和描述,完成之后重新连接即可.

  2. 高斯白噪声(white Gaussian noise,WGN)

    本文科普一下高斯白噪声(white Gaussian noise,WGN). 百度百科上解释为“高斯白噪声,幅度分布服从高斯分布,功率谱密度服从均匀分布”,听起来有些晦涩难懂,下面结合例子通俗而详细地 ...

  3. 图像特征与描述子(直方图, 聚类, 边缘检测, 兴趣点/关键点, Harris角点, 斑点(Blob), SIFI, 纹理特征)

    1.直方图 用于计算图片特征,表达, 使得数据具有总结性, 颜色直方图对数据空间进行量化,好比10个bin 2. 聚类 类内对象的相关性高 类间对象的相关性差 常用算法:kmeans, EM算法, m ...

  4. 跟我学算法-吴恩达老师(超参数调试, batch归一化, softmax使用,tensorflow框架举例)

    1. 在我们学习中,调试超参数是非常重要的. 超参数的调试可以是a学习率,(β1和β2,ε)在Adam梯度下降中使用, layers层数, hidden units 隐藏层的数目, learning_ ...

  5. 跟我学算法- tensorflow 实现RNN操作

    对一张图片实现rnn操作,主要是通过先得到一个整体,然后进行切分,得到的最后input结果输出*_w[‘out’] + _b['out']  = 最终输出结果 第一步: 数据载入 import ten ...

  6. Angular: Can't bind to 'ngModel' since it isn't a known property of 'input'问题解决

    https://blog.csdn.net/h363659487/article/details/78619225 最初使用 [(ngModel)] 做双向绑定时,如果遇见Angular: Can't ...

  7. 解决SQL将varchar值转换为数据类型为int的列时发生语法错误

    今天遇到一个这样的错误,具体的报错情况如下 解决的方案如下. 数据库MSSQL在比较大小时,出错提示:“将 varchar 值 '24.5' 转换为数据类型为 int 的列时发生语法错!”分析数据库设 ...

  8. zookeeper的ZAB协议

    ZAB协议概述 ZooKeeper并没有完全采用Paxos算法,而是使用了一种称为ZooKeeper Atomic Broadcast(ZAB,zookeeper原子消息广播协议)的协议作为其数据一致 ...

  9. Luogu 3702 [SDOI2017]序列计数

    BZOJ 4818 感觉不难. 首先转化一下题目,“至少有一个质数”$=$“全部方案”$ - $“一个质数也没有”. 注意到$m \leq 2e7$,$[1, m]$内的质数可以直接筛出来. 设$f_ ...

  10. yarn 完美替代 npm

    众所周知,npm是nodejs默认的包管理工具,我们通过npm可以下载安装或者发布包,但是npm其实存在着很多小问题,比如安装速度慢.每次都要在线重新安装等,而yarn也正是为了解决npm当前存在的问 ...