sparksql读取hive数据报错:java.lang.RuntimeException: serious problem
问题:
Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)
... 65 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0
at java.util.Collections$EmptyList.get(Collections.java:4454)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
java.lang.RuntimeException: serious problem
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
原因:
sparksql生成的hive表有空文件,但是sparksql读取空文件的时候,因为表示orc格式的,导致sparksql解析orc文件出错。但是用hive却可以正常读取。
解决办法:
设置set spark.sql.hive.convertMetastoreOrc=true
单纯的设置以上参数还是会报错:
java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:540)
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:374)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:316)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:187)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$getFileReader$2.apply(OrcFileOperator.scala:75)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$getFileReader$2.apply(OrcFileOperator.scala:73)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.TraversableOnce$class.collectFirst(TraversableOnce.scala:145)
at scala.collection.AbstractIterator.collectFirst(Iterator.scala:1336)
at org.apache.spark.sql.hive.orc.OrcFileOperator$.getFileReader(OrcFileOperator.scala:86)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$readSchema$1.apply(OrcFileOperator.scala:95)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$readSchema$1.apply(OrcFileOperator.scala:95)
at scala.collection.immutable.Stream.flatMap(Stream.scala:489)
需要再设置set spark.sql.orc.impl=native
参考https://issues.apache.org/jira/browse/SPARK-19809
sparksql读取hive数据报错:java.lang.RuntimeException: serious problem的更多相关文章
- hive 报错 java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable ...
- storm supervisor启动报错java.lang.RuntimeException: java.io.EOFException
storm因机器断电或其他异常导致的supervisor意外终止,再次启动时报错: 1. 2013-09-24 09:15:44,361 INFO [main] daemon.supervisor ( ...
- Android Studio 首次安装报错 Java.lang.RuntimeException:java.lang.NullPointerException...错
下次安装报:Java.lang.RuntimeException: java.lang.NullPointerException......错 只需在文件..\Android Studio\bin\i ...
- hive Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata. ...
- 我的Android进阶之旅------>Android中MediaRecorder.stop()报错 java.lang.RuntimeException: stop failed.【转】
本文转载自:http://blog.csdn.net/ouyang_peng/article/details/48048975 今天在调用MediaRecorder.stop(),报错了,java.l ...
- maven报错 java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors
2 errors java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, ...
- java.lang.RuntimeException: Class "org.apache.maven.cli.MavenCli$CliRequest" not found
IDEA版本:14.1 maven版本:apache-maven-3.3.9-bin IDEA的maven项目,在pom文件执行Maven--Reimport,引入jar包依赖,报错java.lang ...
- hive 报错FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execu
使用hive一段时间以后,今天在使用的时候突然报错,如下: hive> show databases;FAILED: Error in metadata: java.lang.RuntimeEx ...
- Hive 报错:java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
在配置好hive后启动报错信息如下: [walloce@bigdata-study- hive--cdh5.3.6]$ bin/hive Logging initialized using confi ...
随机推荐
- 11月1号开学! 《jmeter性能测试实战》崭新亮相!
课程介绍 第10期<jmeter性能测试实战>课程,11月2号开学!全新改版,和之前的课程框架完全不同 主讲老师:飞天小子 上课方式:每周六周日晚8点到10点,QQ群视频在线直播教学 本期 ...
- Octopus501工作站 安装记录
cmake libreadline-dev 没有运行程序,nvidia-smi查看GPU-Util 达到100% 解决方案:需要把驱动模式设置为常驻内存才可以,设置命令:nvidia-smi -pm ...
- ChIP-seq | ATAC-seq | 数据分析流程
思来想去,还是觉得ENCODE的流程靠谱,所以又花了快一周来调试,终于排除万难,跑成功了.[2019年12月08日] 以下是ATAC生成的结果目录: call-align call-call_peak ...
- SNPsnap | 筛选最佳匹配的SNP | 富集分析 | CP loci
一个矛盾: GWAS得到的SNP做富集分析的话,通常都会有强的偏向性. co-localization of GWAS signals to gene-dense and high linkage d ...
- 关于最新版本的flutter在安卓打包的问题解决方法
1.集成友盟push提示androidx版本号不一致,需在gradle文件中手动选择即可,如下 buildscript { repositories { google() jcenter() mave ...
- Centos7下安装ORACLE 11g,弹窗不显示
Centos7下安装ORACLE 11gR2,弹窗不显示,安装界面显示为灰色. 解决方法:执行安装时带上一下参数 ./runInstaller -jreLoc /etc/alternatives/jr ...
- tp使用ajaxReturn返回二维数组格式的字符串,前台如何获取非乱码
参考: https://www.cnblogs.com/jiqing9006/p/5000849.html https://blog.csdn.net/zengxiangxuan123456/arti ...
- Spring @RequestMapping 参数说明
@RequestMapping 参数说明: value: 指定请求的实际地址, 比如 /action/info之类.method: 指定请求的method类型, GET.POST.PUT.DELE ...
- Android之WebRTC介绍(二)
WebRTC提供了点对点之间的通信,但并不意味着WebRTC不需要服务器.暂且不说基于服务器的一些扩展业务,WebRTC至少有两件事必须要用到服务器: 1. 浏览器之间交换建立通信的元数据(信令)必须 ...
- fastreport 条形码 宽度问题
fastreport 的barcode 如果不设置AutoSize 确实可以控制宽度 但是生成后 基本没办法扫 所以换个思路 直接等比缩小 设置里面的zoom 比例为0.8 针对20位左右的条形码就 ...