The Hadoop speculative-execution problem
Problem description: HDFS reports an error when writing with MultipleOutputs
19/01/25 00:04:20 INFO mapreduce.Job: Task Id : attempt_1525336138932_1106_m_000000_1, Status : FAILED
Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/result/rshadoop/hjb/tmp2/2019-01-22/23-0/2395-m-00000.gz.parquet] for [DFSClient_attempt_1525336138932_1106_m_000000_1_1358354177_1] for client [120.210.209.141], because this file is already being created by [DFSClient_attempt_1525336138932_1106_m_000000_0_-1048713570_1] on [120.210.209.137]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3035)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2737)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2632)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2519)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:566)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

    at org.apache.hadoop.ipc.Client.call(Client.java:1468)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy12.create(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy13.create(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1725)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1668)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1593)
    at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
    at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
    at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:176)
    at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:160)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:289)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:262)
    at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
    at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
    at com.rs.java.mapreduce.dnsSave.DnsSaveMR$DSMapper2.map(DnsSaveMR.java:643)
    at com.rs.java.mapreduce.dnsSave.DnsSaveMR$DSMapper2.map(DnsSaveMR.java:595)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Solutions:
(1) File-stream conflict.
Creating a file normally also opens a stream for writing to it. When what we actually want is to append, using the wrong API can cause exactly the error above: with the FileSystem class, for example, calling create() and then append() on the same path throws this exception. It is safer to use createNewFile(), which only creates the file without opening a stream.
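The same pitfall can be reproduced locally with java.nio.file, which is a convenient stand-in for HDFS here: create the file without holding a write stream open, then open it separately for appending. This is a sketch of the principle only; on HDFS the analogous calls would be FileSystem.createNewFile() followed by FileSystem.append().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CreateThenAppend {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("demo", ".txt");
        Files.delete(p); // start from a clean slate

        // Safe pattern: create the file WITHOUT holding a write stream open...
        Files.createFile(p); // local analog of FileSystem.createNewFile(path)

        // ...then open it separately, in append mode, when data arrives.
        try (OutputStream out = Files.newOutputStream(p, StandardOpenOption.APPEND)) {
            out.write("appended line\n".getBytes());
        }
        System.out.println(Files.readAllLines(p).get(0)); // prints "appended line"
        Files.delete(p);
    }
}
```

The point is that no writer lease is held between creation and the append, so a second process (or a second task attempt) does not collide with a stream that was left open by create().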
(2) The MapReduce speculative-execution mechanism.
To improve efficiency, MapReduce may launch several identical attempts of a task after it starts; as soon as one attempt succeeds, the whole task counts as finished, that attempt's output becomes the final result, and the slower attempts are killed. Clusters usually enable this option as a space-for-time optimization. In our scenario, however, it is a poor fit: we want one task to handle one file, and with speculation enabled several attempts may try to operate on the same file at once, which triggers the exception above. The simplest fix is to turn it off: set mapred.reduce.max.attempts to 1, or set mapred.reduce.tasks.speculative.execution (and its map-side counterpart) to false.
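A sketch of the corresponding configuration, in mapred-site.xml or the per-job configuration. The mapred.* names used in the text are the deprecated equivalents of the current mapreduce.* names; exact names can vary by Hadoop version.

```xml
<!-- Disable speculative execution for both phases -->
<property>
  <name>mapreduce.map.speculative</name>    <!-- deprecated name: mapred.map.tasks.speculative.execution -->
  <value>false</value>
</property>
<property>
  <name>mapreduce.reduce.speculative</name> <!-- deprecated name: mapred.reduce.tasks.speculative.execution -->
  <value>false</value>
</property>
```

The same switches are exposed programmatically on org.apache.hadoop.mapreduce.Job via setMapSpeculativeExecution(false) and setReduceSpeculativeExecution(false).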
Even then, problems can still occur. If a task's only attempt fails and is killed, the framework starts another attempt, and the aborted first attempt may still interfere with the new attempt's file operations (for instance, the dead attempt's HDFS lease on the file may not yet have been released, which is what recoverLeaseInternal in the trace above is about). So the safest approach borrows the idea behind speculative execution itself (each attempt produces its own output, and one is chosen as the final result): append the attempt's ID as a suffix to every file the attempt operates on, and catch and handle every file-operation exception. This removes read/write conflicts between attempts. The Context object exposes runtime information, so the attempt ID is easy to obtain. With this scheme speculative execution could even stay enabled, but then many identical files are produced (one per attempt), so it is still not the best solution.
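The suffix scheme can be sketched in plain Java. The attempt ID below is taken from the log above; in a real mapper it would come from context.getTaskAttemptID(), and the base path is illustrative only.

```java
public class AttemptSuffix {
    // Append the task-attempt ID to a base file name so that concurrent
    // attempts of the same task never open the same HDFS path.
    static String suffixed(String base, String attemptId) {
        return base + "-" + attemptId;
    }

    public static void main(String[] args) {
        String attemptId = "attempt_1525336138932_1106_m_000000_1"; // from the log above
        System.out.println(suffixed("part-m-00000.gz.parquet", attemptId));
        // -> part-m-00000.gz.parquet-attempt_1525336138932_1106_m_000000_1
    }
}
```

After the job finishes, a cleanup pass picks one surviving suffixed file per task as the final result and deletes the rest.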
In addition, the reduce output can be used to record the "abnormal" keys. In most of these cases attempt_0 was killed and an attempt_1 was started, so the affected files usually exist in two copies. Emit the keys for these cases (a file exception, or attempt ID > 0) and post-process them, for example by renaming files or rewriting just those keys. Such keys are only a tiny minority, so overall efficiency is unaffected.
2. File exception handling
Ideally, every file operation in the MapReduce job is wrapped in exception handling; otherwise a single file error can fail the entire job. Efficiency-wise, the best option is to emit the key as reduce output when its file operation throws, so that it gets recorded. Since MapReduce will restart a task attempt to redo the read/write, the final data is still produced; all that remains is a simple file rename for the exceptional keys.
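The pattern can be sketched in pure Java; the Writer interface below is a hypothetical stand-in for MultipleOutputs.write, not Hadoop's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SafeWrite {
    interface Writer { void write(String key, String value) throws Exception; }

    // Try to write each record; instead of letting one failure kill the job,
    // collect the keys whose file operations failed so they can be emitted
    // as reduce output and repaired (e.g. renamed) afterwards.
    static List<String> writeAll(Map<String, String> records, Writer w) {
        List<String> failedKeys = new ArrayList<>();
        for (Map.Entry<String, String> e : records.entrySet()) {
            try {
                w.write(e.getKey(), e.getValue());
            } catch (Exception ex) {
                failedKeys.add(e.getKey()); // record the abnormal key, keep going
            }
        }
        return failedKeys;
    }

    public static void main(String[] args) {
        Writer flaky = (k, v) -> { if (k.equals("bad")) throw new Exception("io error"); };
        System.out.println(writeAll(new TreeMap<>(Map.of("ok", "1", "bad", "2")), flaky));
        // -> [bad]
    }
}
```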
3. Multiple directories and file concatenation
If there are on the order of 10 million distinct keys, the approach above generates far too many small files, which hurts HDFS performance; worse, putting them all in one directory makes that directory so large that access slows down.
So create subdirectories along with the files. A useful trick is to name each subdirectory after the reduce task ID: there are as many subdirectories as reducers, so no file conflicts arise, and all keys handled by one reducer land in the same directory.
File concatenation also has to consider indexing. To keep the index as simple as possible, all data for a given key should end up in the same large file, which the key's hashCode makes easy: for 1000 files per directory, just take the hashCode modulo 1000.
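Both rules combine into a simple path scheme, sketched below with illustrative names: the subdirectory comes from the reduce task ID, the file number from the key's hashCode modulo the bucket count. Masking with Integer.MAX_VALUE keeps the bucket non-negative even for keys whose hashCode is negative.

```java
public class OutputPath {
    // One subdirectory per reducer, `buckets` files per directory:
    // records for the same key always land in the same file.
    static String pathFor(String key, int reduceTaskId, int buckets) {
        int bucket = (key.hashCode() & Integer.MAX_VALUE) % buckets;
        return String.format("r%05d/part-%04d", reduceTaskId, bucket);
    }

    public static void main(String[] args) {
        // the same key always maps to the same file within its reducer's directory
        System.out.println(pathFor("www.example.com", 7, 1000));
    }
}
```

With 1000 buckets per reducer directory, the total file count is bounded by reducers x 1000 regardless of how many distinct keys there are, and the index only needs to apply the same hash to find a key's file.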