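Hive MoveTask fails when moving query results

In Apache Hive 2.1.0, running "INSERT OVERWRITE TABLE ... SELECT" or "INSERT OVERWRITE DIRECTORY '/tmp/data/hive-test'" fails in the MoveTask step that moves the result files whenever the query produces more than one result file. The latest release, Apache Hive 2.1.1, has the same problem; Apache Hive 1.2.1 does not.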

The SQL that was executed:

    insert overwrite directory '/tmp/fuxin.zhao/hive-test'
    select
    shippingorderid
    ,logisticsplatformid
    ,stockoutorderid
    ,logisticstypeid
    ,externalshippingorderno
    ,packageweight
    ,freight
    ,freightstatus
    ,entertime
    ,shippingorderstatus
    ,shippinglog
    ,shippinglogupdatetime
    ,shippingstatustime
    ,confirmreceivetime
    ,remarks
    ,createtype
    ,lastmodifytime
    ,enteruser
    ,updatetime
    ,updateuser
    from
    (
    select *,row_number() over(partition by shippingorderid order by LastModifyTime desc) as rn
    from
    (select * from ods.m1_shippingorder where dt = '2014-01-01'
    union all
    select * from fds.m1_shippingorder where dt = '2099-12-31'
    ) a )b where b.rn = 1

The exceptions it produces are as follows:

    Failed with exception org.apache.hadoop.hdfs.protocol.AclException: Invalid ACL: multiple entries with same scope, type and name.
        at org.apache.hadoop.hdfs.server.namenode.AclTransformation.buildAndValidateAcl(AclTransformation.java:285)
        at org.apache.hadoop.hdfs.server.namenode.AclTransformation.replaceAclEntries(AclTransformation.java:230)
        at org.apache.hadoop.hdfs.server.namenode.FSDirAclOp.unprotectedSetAcl(FSDirAclOp.java:206)
        at org.apache.hadoop.hdfs.server.namenode.FSDirAclOp.setAcl(FSDirAclOp.java:146)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setAcl(FSNamesystem.java:7938)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setAcl(NameNodeRpcServer.java:1813)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setAcl(ClientNamenodeProtocolServerSideTranslatorPB.java:1330)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
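The message "multiple entries with same scope, type and name" means the NameNode rejected an ACL list in which two entries share the same key. The following is a minimal sketch of that kind of check, assuming a simplified entry type. The class and method names are hypothetical, and this is not Hadoop's actual AclTransformation code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AclDuplicateSketch {

    /** Hypothetical stand-in for an ACL entry; this is not Hadoop's AclEntry class. */
    public static final class Entry {
        final String scope; // e.g. "access" or "default"
        final String type;  // e.g. "user", "group", "mask", "other"
        final String name;  // empty string for unnamed (base) entries

        public Entry(String scope, String type, String name) {
            this.scope = scope;
            this.type = type;
            this.name = name;
        }
    }

    /** Returns true if any two entries share the same scope, type and name. */
    public static boolean hasDuplicate(List<Entry> entries) {
        Set<List<String>> seen = new HashSet<>();
        for (Entry e : entries) {
            // A List of the three fields has value-based equals/hashCode,
            // so it can serve as a composite key.
            if (!seen.add(Arrays.asList(e.scope, e.type, e.name))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("access", "user", "alice"),
                new Entry("access", "group", "etl"),
                new Entry("access", "user", "alice")); // repeated key
        System.out.println(hasDuplicate(entries)); // prints "true"
    }
}
```

A list that ends up with a repeated key, for whatever reason on the client side, would be rejected by a check of this shape before the ACL is applied.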

Exception: org.apache.hadoop.hive.ql.exec.MoveTask fails

    ## insert overwrite directory:
    2016-12-13 18:27:15,630 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 81.52 sec
    MapReduce Total cumulative CPU time: 1 minutes 21 seconds 520 msec
    Ended Job = job_1480497945656_0288
    Moving data to directory /tmp/t_FDS/m1_shippingorder/dt=2099-12-31
    Failed with exception Unable to move source hdfs://dbmtimehadoop/tmp/t_FDS/m1_shippingorder/dt=2099-12-31/.hive-staging_hive_2016-12-13_18-26-28_695_9094454822676037473-1/-ext-10000 to destination /tmp/t_FDS/m1_shippingorder/dt=2099-12-31
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://dbmtimehadoop/tmp/t_FDS/m1_shippingorder/dt=2099-12-31/.hive-staging_hive_2016-12-13_18-26-28_695_9094454822676037473-1/-ext-10000 to destination /tmp/t_FDS/m1_shippingorder/dt=2099-12-31
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 5 Reduce: 4 Cumulative CPU: 81.52 sec HDFS Read: 778870925 HDFS Write: 778698546 SUCCESS
    Total MapReduce CPU Time Spent: 1 minutes 21 seconds 520 msec

Exception: java.util.ConcurrentModificationException

    Failed with exception Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
    16/12/22 11:45:59 [main]: ERROR exec.Task: Failed with exception Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
    org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
        at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:103)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:255)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.ConcurrentModificationException
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2984)
        at org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:119)
        at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:96)
        ... 20 more
    Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
        at java.util.ArrayList$Itr.next(ArrayList.java:851)
        at java.util.AbstractCollection.toString(AbstractCollection.java:461)
        at java.lang.String.valueOf(String.java:2982)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.hadoop.fs.permission.AclStatus.toString(AclStatus.java:108)
        at org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:75)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2961)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2953)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://dbmtimehadoop/tmp/fuxin.zhao/hive-test/.hive-staging_hive_2016-12-22_11-45-12_256_5450334497172511865-1/-ext-10000 to destination /tmp/fuxin.zhao/hive-test
    16/12/22 11:45:59 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Exception: java.lang.ArrayIndexOutOfBoundsException

    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2984)
        at org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:119)
        at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:96)
        ... 20 more
    Caused by: java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at java.util.ArrayList.removeRange(ArrayList.java:634)
        at java.util.ArrayList$SubList.removeRange(ArrayList.java:1063)
        at java.util.AbstractList.clear(AbstractList.java:234)
        at com.google.common.collect.Iterables.removeIfFromRandomAccessList(Iterables.java:209)
        at com.google.common.collect.Iterables.removeIf(Iterables.java:180)
        at org.apache.hadoop.hive.io.HdfsUtils.removeBaseAclEntries(HdfsUtils.java:155)
        at org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:77)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2961)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2953)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

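The switch in the Hive source code that controls file permission inheritance appears in the following fragments (the call is shown truncated, as in the original notes):

    import org.apache.hadoop.hive.conf.HiveConf;
    import org.apache.hadoop.hive.conf.HiveConf.ConfVars;

    HiveConf.ConfVars.HIVE_WAREHOUSE_SUBDIR_INHERIT_PERMS);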

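Cause:

When Hive moves query results to their destination, it has to set the file permissions on each moved file. Permissions for multiple files are set concurrently; in the Hive source this work runs in a thread pool. The operation is not correctly synchronized across threads, which produces the exceptions shown above. This is a Hive bug, and as of the latest release at the time of writing, Apache Hive 2.1.1, it has not been fixed.

The problem can be worked around by disabling Hive's file permission inheritance: hive.warehouse.subdir.inherit.perms=false.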
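The race described above can in principle be avoided by never letting two pool tasks mutate the same list. The following is a minimal sketch of that idea, not Hive's actual code and not the upstream fix. The class, the method names, and the string form of the ACL entries are all hypothetical, chosen only to mirror the shape of the stack traces (a list of entries filtered with a remove-if inside a thread-pool task).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerTaskAclCopySketch {

    // Hypothetical helper: drops unnamed ("base") entries such as "user::rwx",
    // mutating the list it is given.
    static void removeBaseEntries(List<String> entries) {
        entries.removeIf(e -> e.contains("::"));
    }

    // Runs one task per file in a thread pool. Each task filters its own copy
    // of the shared entries, so no task mutates a list another task is reading.
    public static List<List<String>> processFiles(List<String> sharedEntries, int fileCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < fileCount; i++) {
                Callable<List<String>> task = () -> {
                    List<String> local = new ArrayList<>(sharedEntries); // defensive copy
                    removeBaseEntries(local);
                    return local;
                };
                futures.add(pool.submit(task));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> shared = Arrays.asList(
                "user::rwx", "user:alice:r-x", "group::r-x", "group:etl:rwx", "other::---");
        for (List<String> perFile : processFiles(shared, 3)) {
            System.out.println(perFile);
        }
    }
}
```

If `removeBaseEntries` were called on `sharedEntries` directly from every task, several threads would modify one `ArrayList` at the same time, which is the kind of unsynchronized access that can surface as ConcurrentModificationException or ArrayIndexOutOfBoundsException.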

Solution:

Set hive.warehouse.subdir.inherit.perms to false in hive-site.xml:

    <property>
      <name>hive.warehouse.subdir.inherit.perms</name>
      <value>false</value>
      <description>
        Set this to false if the table directories should be created
        with the permissions derived from dfs umask instead of
        inheriting the permission of the warehouse or database directory.
      </description>
    </property>
