hive跑mapreduce报java.lang.RuntimeException: Error in configuring object
写于2016.7月
最近项目需要在hbase上做统计分析,在本机上装了hive,结果跑小批量数据sum时报错:
hive> select count(*) from page_view;
Total jobs =
Launching Job out of
Number of reduce tasks determined at compile time:
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1468481753104_0014, Tracking URL = http://wangkai8:8088/proxy/application_1468481753104_0014/
Kill Command = /Users/wangkai8/app/hadoop-2.3.-cdh5/bin/hadoop job -kill job_1468481753104_0014
Hadoop job information for Stage-: number of mappers: ; number of reducers:
-- ::, Stage- map = %, reduce = %
Ended Job = job_1468481753104_0014 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1468481753104_0014_m_000000 (and more) from job job_1468481753104_0014 Task with the most failures():
-----
Task ID:
task_1468481753104_0014_r_000000 URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1468481753104_0014&tipid=task_1468481753104_0014_r_000000
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:)
at java.lang.Thread.run(Thread.java:)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:)
at java.lang.reflect.Method.invoke(Method.java:)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:)
... more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:)
... more FAILED: Execution Error, return code from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job : Map: Reduce: HDFS Read: HDFS Write: FAIL
Total MapReduce CPU Time Spent: msec
hive>
查看yarn日志,一样也是空指针异常,还有个提示是No plan file found: hdfs://...
2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: __REDUCE_PLAN__
2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://wangkai8/tmp/hive-wangkai8/hive_2016-07-14_15-37-06_635_5512340937878595139-1/-mr-10004/1e60df26-2b73-4384-adbc-651342dd48f8/reduce.xml
2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: __REDUCE_PLAN__
2016-07-14 15:37:15,212 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:132)
... 12 more 2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1468481753104_0002_r_000000_0
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1468481753104_0002_r_000000_0 is : 0.0
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2016-07-14 15:37:15,212 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1468481753104_0002_r_000000_0: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:132)
... 12 more 2016-07-14 15:37:15,213 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1468481753104_0002_01_000001 taskAttempt attempt_1468481753104_0002_m_000000_0
2016-07-14 15:37:15,213 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1468481753104_0002_r_000000_0: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:351)
at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:232)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:132)
... 12 more
查看hive源码,发现gWork为null,导致空指针异常
//src: org.apache.hadoop.hive.ql.exec.mr.ExecReducer#configure ObjectCache cache = ObjectCacheFactory.getCache(jc);
ReduceWork gWork = (ReduceWork) cache.retrieve(PLAN_KEY);
if (gWork == null) {
gWork = Utilities.getReduceWork(job);
cache.cache(PLAN_KEY, gWork);
} else {
Utilities.setReduceWork(job, gWork);
}
reducer = gWork.getReducer(); //gWork为null,导致空指针
接着查看Utilities.getReduceWork(job)方法,最后跟到org.apache.hadoop.hive.ql.exec.Utilities#getBaseWork:
/**
* Returns the Map or Reduce plan
* Side effect: the BaseWork returned is also placed in the gWorkMap
* @param conf
* @param name
* @return BaseWork based on the name supplied will return null if name is null
* @throws RuntimeException if the configuration files are not proper or if plan can not be loaded
*/
private static BaseWork getBaseWork(Configuration conf, String name) {
BaseWork gWork = null;
Path path = null;
InputStream in = null;
try {
path = getPlanPath(conf, name);
assert path != null;
if (!gWorkMap.containsKey(path)) {
Path localPath;
if (ShimLoader.getHadoopShims().isLocalMode(conf)) {
localPath = path;
} else {
localPath = new Path(name);
} if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) {
LOG.debug("Loading plan from string: "+path.toUri().getPath());
String planString = conf.get(path.toUri().getPath());
if (planString == null) {
LOG.info("Could not find plan string in conf");
return null;
}
byte[] planBytes = Base64.decodeBase64(planString);
in = new ByteArrayInputStream(planBytes);
in = new InflaterInputStream(in);
} else {
in = new FileInputStream(localPath.toUri().getPath());
} if(MAP_PLAN_NAME.equals(name)){
if (ExecMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))){
gWork = deserializePlan(in, MapWork.class, conf);
} else if(RCFileMergeMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))) {
gWork = deserializePlan(in, MergeWork.class, conf);
} else if(ColumnTruncateMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))) {
gWork = deserializePlan(in, ColumnTruncateWork.class, conf);
} else if(PartialScanMapper.class.getName().equals(conf.get(MAPRED_MAPPER_CLASS))) {
gWork = deserializePlan(in, PartialScanWork.class,conf);
} else {
throw new RuntimeException("unable to determine work from configuration ."
+ MAPRED_MAPPER_CLASS + " was "+ conf.get(MAPRED_MAPPER_CLASS)) ;
}
} else if (REDUCE_PLAN_NAME.equals(name)) {
if(ExecReducer.class.getName().equals(conf.get(MAPRED_REDUCER_CLASS))) {
gWork = deserializePlan(in, ReduceWork.class, conf);
} else {
throw new RuntimeException("unable to determine work from configuration ."
+ MAPRED_REDUCER_CLASS +" was "+ conf.get(MAPRED_REDUCER_CLASS)) ;
}
}
gWorkMap.put(path, gWork);
} else {
LOG.debug("Found plan in cache.");
gWork = gWorkMap.get(path);
}
return gWork;
} catch (FileNotFoundException fnf) {
// happens. e.g.: no reduce work.
LOG.info("No plan file found: "+path);
return null;
} catch (Exception e) {
LOG.error("Failed to load plan: "+path, e);
throw new RuntimeException(e);
} finally {
if (in != null) {
try {
in.close();
} catch (IOException cantBlameMeForTrying) { }
}
}
}
再结合yarn报错信息:2016-07-14 15:37:15,211 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://wangkai8/tmp/hive-wangkai8/hive_2016-07-14_15-37-06_635_5512340937878595139-1/-mr-10004/1e60df26-2b73-4384-adbc-651342dd48f8/reduce.xml
很明显reduce.xml找不到,报FileNotFoundException:No plan file found,但通过dfs fs -ls hdfs://wangkai8/tmp/hive-wangkai8/hive_2016-07-14_15-37-06_635_5512340937878595139-1/-mr-10004/1e60df26-2b73-4384-adbc-651342dd48f8/reduce.xml查看文件,文件的确是存在的,文件存在还报这个错,那这个问题就很诡异了。
接着在hive中映射一张hbase表,此表记录较多,大概50万条,同样执行select count操作:
hive> select count(*) from qry_ins_user;
Total jobs =
Launching Job out of
Number of reduce tasks determined at compile time:
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1468481753104_0016, Tracking URL = http://wangkai8:8088/proxy/application_1468481753104_0016/
Kill Command = /Users/wangkai8/app/hadoop-2.3.-cdh5/bin/hadoop job -kill job_1468481753104_0016
Hadoop job information for Stage-: number of mappers: ; number of reducers:
-- ::, Stage- map = %, reduce = %
-- ::, Stage- map = %, reduce = %
-- ::, Stage- map = %, reduce = %
-- ::, Stage- map = %, reduce = %
Ended Job = job_1468481753104_0016
MapReduce Jobs Launched:
Job : Map: Reduce: HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: msec
OK Time taken: 34.023 seconds, Fetched: row(s)
hive>
这个居然能执行成功,3条记录执行就报错?这个问题不能不说更诡异了。后面各种调试,最后从AppMaster日志中得到关键信息:
Log Type: syslog
Log Length: 155414
2016-07-14 18:40:17,962 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1468481753104_0015_000001
2016-07-14 18:40:18,211 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-07-14 18:40:18,230 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-07-14 18:40:18,327 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-07-14 18:40:18,340 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-07-14 18:40:18,340 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@6b7a5e49)
2016-07-14 18:40:18,363 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: The specific max attempts: 2 for application: 15. Attempt num: 1 is last retry: false
2016-07-14 18:40:18,452 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.datanode.data.dir; Ignoring.
2016-07-14 18:40:18,454 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-07-14 18:40:18,460 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.namenode.name.dir; Ignoring.
2016-07-14 18:40:18,464 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-07-14 18:40:18,837 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter
2016-07-14 18:40:18,839 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter
2016-07-14 18:40:18,876 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2016-07-14 18:40:18,877 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2016-07-14 18:40:18,877 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2016-07-14 18:40:18,878 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2016-07-14 18:40:18,878 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2016-07-14 18:40:18,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2016-07-14 18:40:18,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2016-07-14 18:40:18,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2016-07-14 18:40:18,940 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2016-07-14 18:40:19,105 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-07-14 18:40:19,147 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-07-14 18:40:19,147 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2016-07-14 18:40:19,153 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1468481753104_0015 to jobTokenSecretManager
2016-07-14 18:40:19,239 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Uberizing job job_1468481753104_0015: 1m+1r tasks (132 input bytes) will run sequentially on single node.
2016-07-14 18:40:19,257 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1468481753104_0015 = 132. Number of splits = 1
2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1468481753104_0015 = 1
2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1468481753104_0015Job Transitioned from NEW to INITED
2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster uberizing job job_1468481753104_0015 in local container ("uber-AM") on node wangkai8:59734.
2016-07-14 18:40:19,280 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-07-14 18:40:19,286 INFO [Socket Reader #1 for port 57486] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 57486
2016-07-14 18:40:19,301 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2016-07-14 18:40:19,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster uberizing job job_1468481753104_0015 in local container ("uber-AM") on node wangkai8:59734.
reduce跑在local模式,难道50万记录的表跑在集群模式?查看50万条记录对应AppMaster日志:
Log Type: syslog
Log Length: 47618
2016-07-14 19:42:13,007 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1468481753104_0016_000001
2016-07-14 19:42:13,254 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-07-14 19:42:13,277 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-07-14 19:42:13,366 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-07-14 19:42:13,380 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-07-14 19:42:13,380 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@5f4829f5)
2016-07-14 19:42:13,407 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: The specific max attempts: 2 for application: 16. Attempt num: 1 is last retry: false
2016-07-14 19:42:13,499 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.datanode.data.dir; Ignoring.
2016-07-14 19:42:13,502 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-07-14 19:42:13,508 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.namenode.name.dir; Ignoring.
2016-07-14 19:42:13,511 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-07-14 19:42:14,025 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter
2016-07-14 19:42:14,027 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.hive.shims.HadoopShimsSecure$NullOutputCommitter
2016-07-14 19:42:14,071 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2016-07-14 19:42:14,072 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2016-07-14 19:42:14,073 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2016-07-14 19:42:14,073 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2016-07-14 19:42:14,074 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2016-07-14 19:42:14,080 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2016-07-14 19:42:14,081 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2016-07-14 19:42:14,082 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2016-07-14 19:42:14,159 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2016-07-14 19:42:14,409 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-07-14 19:42:14,480 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-07-14 19:42:14,480 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2016-07-14 19:42:14,487 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1468481753104_0016 to jobTokenSecretManager
2016-07-14 19:42:14,603 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1468481753104_0016 because: too much input;
2016-07-14 19:42:14,628 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1468481753104_0016 = 446693376. Number of splits = 2
2016-07-14 19:42:14,630 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1468481753104_0016 = 1
2016-07-14 19:42:14,630 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1468481753104_0016Job Transitioned from NEW to INITED
2016-07-14 19:42:14,631 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1468481753104_0016.
2016-07-14 19:42:14,666 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-07-14 19:42:14,676 INFO [Socket Reader #1 for port 59418] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 59418
2016-07-14 19:42:14,698 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2016-07-14 19:42:14,603 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1468481753104_0016 because: too much input;
好吧,果然是集群模式,那就查看org.apache.hadoop.mapreduce.v2.app.job.impl.JobImp源码,看看local、集群模式是怎么决定的:
/**
* Decide whether job can be run in uber mode based on various criteria.
* @param dataInputLength Total length for all splits
*/
private void makeUberDecision(long dataInputLength) {
//FIXME: need new memory criterion for uber-decision (oops, too late here;
// until AM-resizing supported,
// must depend on job client to pass fat-slot needs)
// these are no longer "system" settings, necessarily; user may override
int sysMaxMaps = conf.getInt(MRJobConfig.JOB_UBERTASK_MAXMAPS, 9); int sysMaxReduces = conf.getInt(MRJobConfig.JOB_UBERTASK_MAXREDUCES, 1); long sysMaxBytes = conf.getLong(MRJobConfig.JOB_UBERTASK_MAXBYTES,
fs.getDefaultBlockSize(this.remoteJobSubmitDir)); // FIXME: this is wrong; get FS from
// [File?]InputFormat and default block size
// from that long sysMemSizeForUberSlot =
conf.getInt(MRJobConfig.MR_AM_VMEM_MB,
MRJobConfig.DEFAULT_MR_AM_VMEM_MB); long sysCPUSizeForUberSlot =
conf.getInt(MRJobConfig.MR_AM_CPU_VCORES,
MRJobConfig.DEFAULT_MR_AM_CPU_VCORES); boolean uberEnabled =
conf.getBoolean(MRJobConfig.JOB_UBERTASK_ENABLE, false);
boolean smallNumMapTasks = (numMapTasks <= sysMaxMaps);
boolean smallNumReduceTasks = (numReduceTasks <= sysMaxReduces);
boolean smallInput = (dataInputLength <= sysMaxBytes);
// ignoring overhead due to UberAM and statics as negligible here:
boolean smallMemory =
( (Math.max(conf.getLong(MRJobConfig.MAP_MEMORY_MB, 0),
conf.getLong(MRJobConfig.REDUCE_MEMORY_MB, 0))
<= sysMemSizeForUberSlot)
|| (sysMemSizeForUberSlot == JobConf.DISABLED_MEMORY_LIMIT));
boolean smallCpu =
(
Math.max(
conf.getInt(
MRJobConfig.MAP_CPU_VCORES,
MRJobConfig.DEFAULT_MAP_CPU_VCORES),
conf.getInt(
MRJobConfig.REDUCE_CPU_VCORES,
MRJobConfig.DEFAULT_REDUCE_CPU_VCORES))
<= sysCPUSizeForUberSlot
);
boolean notChainJob = !isChainJob(conf); // User has overall veto power over uberization, or user can modify
// limits (overriding system settings and potentially shooting
// themselves in the head). Note that ChainMapper/Reducer are
// fundamentally incompatible with MR-1220; they employ a blocking
// queue between the maps/reduces and thus require parallel execution,
// while "uber-AM" (MR AM + LocalContainerLauncher) loops over tasks
// and thus requires sequential execution.
isUber = uberEnabled && smallNumMapTasks && smallNumReduceTasks
&& smallInput && smallMemory && smallCpu
&& notChainJob; if (isUber) {
LOG.info("Uberizing job " + jobId + ": " + numMapTasks + "m+"
+ numReduceTasks + "r tasks (" + dataInputLength
+ " input bytes) will run sequentially on single node."); // make sure reduces are scheduled only after all map are completed
conf.setFloat(MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART,
1.0f);
// uber-subtask attempts all get launched on same node; if one fails,
// probably should retry elsewhere, i.e., move entire uber-AM: ergo,
// limit attempts to 1 (or at most 2? probably not...)
conf.setInt(MRJobConfig.MAP_MAX_ATTEMPTS, 1);
conf.setInt(MRJobConfig.REDUCE_MAX_ATTEMPTS, 1); // disable speculation
conf.setBoolean(MRJobConfig.MAP_SPECULATIVE, false);
conf.setBoolean(MRJobConfig.REDUCE_SPECULATIVE, false);
} else {
StringBuilder msg = new StringBuilder();
msg.append("Not uberizing ").append(jobId).append(" because:");
if (!uberEnabled)
msg.append(" not enabled;");
if (!smallNumMapTasks)
msg.append(" too many maps;");
if (!smallNumReduceTasks)
msg.append(" too many reduces;");
if (!smallInput)
msg.append(" too much input;");
if (!smallMemory)
msg.append(" too much RAM;");
if (!notChainJob)
msg.append(" chainjob;");
LOG.info(msg.toString());
}
}
mapreduce是否跑在local模式由7个变量决定:uberEnabled && smallNumMapTasks && smallNumReduceTasks&& smallInput && smallMemory && smallCpu && notChainJob,只有这7个变量同时满足才能跑在local模式。
首先要满足uberEnabled为true,看看uberEnable定义:uberEnabled = conf.getBoolean(MRJobConfig.JOB_UBERTASK_ENABLE, false),MRJobConfig.JOB_UBERTASK_ENABLE对应配置项为mapreduce.job.ubertask.enable(mapred-site.xml),说明mapreduce.job.ubertask.enable配置为true,查看mapred-site.xml配置文件,果然是true
<property>
<name>mapreduce.job.ubertask.enable</name>
<value>true</value>
<description>Whether to enable the small-jobs "ubertask" optimization,
which runs "sufficiently small" jobs sequentially within a single JVM.
"Small" is defined by the following maxmaps, maxreduces, and maxbytes
settings. Users may override this value.
</description>
</property>
问题找到了就好办,第一种办法直接把mapreduce.job.ubertask.enable改为false,都走集群模式;
第二种办法是mapreduce.job.ubertask.enable还是为true,继续分析reduce.xml文件找不到的原因,还是查看读取reduce.xml那块:org.apache.hadoop.hive.ql.exec.Utilities#getBaseWork
if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) {
LOG.debug("Loading plan from string: "+path.toUri().getPath());
String planString = conf.get(path.toUri().getPath());
if (planString == null) {
LOG.info("Could not find plan string in conf");
return null;
}
byte[] planBytes = Base64.decodeBase64(planString);
in = new ByteArrayInputStream(planBytes);
in = new InflaterInputStream(in);
} else {
in = new FileInputStream(localPath.toUri().getPath());
}
可见hive执行计划文件由ConfVars.HIVE_RPC_QUERY_PLAN决定是通过rpc读集群上的文件还是读本地文件,默认为本地文件。
hive跑mapreduce报java.lang.RuntimeException: Error in configuring object的更多相关文章
- hive脚本出现Error: java.lang.RuntimeException: Error in configuring object和Caused by: java.lang.IndexOutOfBoundsException: Index: 9, Size: 9
是在reduce阶段报的错误,详细错误信息是 朱传豪 19:04:48 Diagnostic Messages for this Task: Error: java.lang.RuntimeExcep ...
- nested exception is java.lang.RuntimeException: Error parsing Mapper XML. Cause: java.lang.IllegalArgumentException: Result Maps collection already contains value for
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'daoSupport': ...
- sparksql读取hive数据报错:java.lang.RuntimeException: serious problem
问题: Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: ...
- jdk1.8+SpringAOP注解报java.lang.IllegalArgumentException: error at ::0 can't find referenced pointcut select错误的不知原因的解决办法[仅供参考]
先说办法:如果Aspectweaver-1.*.*jar这三个包版本比较低, 比如1.5.0这一层次的,可以找版本高一点的包替换低版本的包,问题可以得到解决 jar包的下载地址:https://mvn ...
- grade配置添加java库导致报 java.lang.RuntimeException: java.lang.RuntimeException: com.android.builder.dexing.DexArchiveMerger
原因是导入的第三方库中也引入了项目中存在的相同名称的库,导致产生冲突
- hive Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata. ...
- 使用CXF 2.7.5出现的java.lang.RuntimeException: Cannot create a secure XMLInputFactory错误解决
昨天启动工程测试webservice服务,结果发现服务一调用就报java.lang.RuntimeException: Cannot create a secure XMLInputFactory j ...
- 类型转换异常处理java.lang.RuntimeException
前几天在做一个安卓项目的时候一直报java.lang.RuntimeException错,一直调试不出来,今天突然又灵感是不是文件配置出错了,果然在清单文件中少了一句 android:name=&qu ...
- java.lang.RuntimeException: Cannot create a secure XMLInputFactory 解决
客户端调用服务端cxf,服务端报 java.lang.RuntimeException: Cannot create a secure XMLInputFactory 我的cxf 版本 为 3.0. ...
随机推荐
- 记一次生产环境presto删表失败的问题
场景,开发用java程序连接presto创建一个表,这个表在hdfs的权限为: 然后用presto去删除这个表 报错,没有权限删除,查看上一级目录权限,发现权限正常 直连hive删表 发现正常. 然后 ...
- CentOS7 配置NFS(Network File System)及其使用
1. 服务端配置 1.1. 安装NFS yum -y install nfs* 1.2. 查看是否安装了NFS与RPCBIND rpm -qa | grep nfs rpm - ...
- 架构师成长之路5.5-Saltstack配置管理(状态间关系)
点击架构师成长之路 架构师成长之路5.5-Saltstack配置管理(状态间关系) 配置管理工具: Pupper:1. 采用ruby编程语言:2. 安装环境相对较复杂:3.不支持远程执行,需要FUNC ...
- Spring笔记之配置数据源
任何DAO访问数据库,最开始都需要配置数据源,数据源中定义了访问数据库的常用配置.有了数据源才能创建数据模板,然后把数据模板注入到DAO中,通过调用数据访问模板中的相应方法来对数据库进行相关操作. 常 ...
- Java异常try里面有return,finally代码会执行吗
try{}里有一个return语句,那么紧跟在这个try后的finally{}里的code会不会被执行,什么时候被执行,在return前还是后? 肯定会执行.finally{}块的代码只有在try{} ...
- js 字符串格式化为时间格式
首先介绍一下我遇到的坑,找了几个关于字符串转时间的,他们都可以就我用的时候不行. 我的原因,我的字符串是MYSQL拿出来的不是标准的时间格式,是不会转成功的. 解决思路:先将字符串转为标准时间格式的字 ...
- java 获取bean的方式
我们知道可以通过ApplicationContext的getBean方法来获取Spring容器中已初始化的bean.getBean一共有以下四种方法原型: l getBean(String name) ...
- mysqli在php7中的使用
mysqli这个库还是比较繁杂的,这其中又分mysqli ,mysqli_stmt,mysqli_result......一堆类,特别乱 这里奉上thinkphp5.1中使用mysqli扩展的查询用法 ...
- 【转载】Java项目中常用的异常处理情况总结
一,JDK中与异常相关的类 分析: Java中的异常分类: Throwable类有两个直接子类: Exception:出现的问题是可以被捕获的: Error:系统错误,通常由JVM处理. 可捕获的异常 ...
- TCP下的套接字服务端实现并发 作业题
# 服务端 import socket from threading import Thread """ 服务端 1.要有固定的IP和PORT 2.24小时不间断提供服务 ...