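Developing a MapReduce program on Windows and submitting it remotely to a Hadoop cluster: YARN scheduler error
Why I'm sharing this: devoting a whole post to a single problem may seem extravagant, but a Baidu search turned up very few relevant articles, and I only found the solution after digging through the logs.
The problem: a MapReduce program developed on Windows was submitted to the cluster, but it kept running without ever producing a result.
MapReduce program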
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Test {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://master:9000/");
        // Local path of the job jar that is uploaded to the cluster on submission
        conf.set("mapreduce.job.jar", "D:/intelij-workspace/aaron-bigdata/aaorn-mapreduce/target/aaorn-mapreduce-1.0-SNAPSHOT.jar");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "master");
        // Needed when a Windows client submits to a Linux cluster
        conf.set("mapreduce.app-submission.cross-platform", "true");
        Job job = Job.getInstance(conf);
        // WordCountMapper and WordCountReducer are the author's own classes (not shown)
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, "hdfs://master:9000/input/");
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/output3/"));
        job.waitForCompletion(true);
    }
}
Run output
[QC] INFO [main] org.apache.hadoop.yarn.client.RMProxy.createRMProxy(98) | Connecting to ResourceManager at master/192.168.56.100:8032
[QC] WARN [main] org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(64) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
[QC] INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(283) | Total input paths to process : 2
[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(198) | number of splits:2
[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.printTokens(287) | Submitting tokens for job: job_1496627557122_0004
[QC] INFO [main] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(273) | Submitted application application_1496627557122_0004
[QC] INFO [main] org.apache.hadoop.mapreduce.Job.submit(1294) | The url to track the job: http://master:8088/proxy/application_1496627557122_0004/
[QC] INFO [main] org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(1339) | Running job: job_1496627557122_0004
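The client output stops at "Running job", so the submission to master:8032 went through and the stall is somewhere on the cluster side. One quick check, sketched below, is a plain TCP probe of the ResourceManager ports that show up in the logs of this post (8032 for clients, 8031 for the resource tracker). `PortProbe` and `canConnect` are hypothetical names for illustration, not Hadoop APIs, and a successful connect only shows that something is listening, not that YARN is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hadoop): checks whether a TCP port accepts connections.
class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // "master" is the host name used throughout this post; pass another one as args[0].
        String host = args.length > 0 ? args[0] : "master";
        for (int port : new int[] {8032, 8031}) {
            System.out.println(host + ":" + port + " reachable=" + canConnect(host, port, 2000));
        }
    }
}
```

Running it on a worker node rather than on the Windows client is the more telling test, since the worker is the machine that has to reach the ResourceManager.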
Master (NameNode) log
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.hadoop.ipc.Server.channelRead(Server.java:2603)
at org.apache.hadoop.ipc.Server.access$2800(Server.java:136)
at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1481)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)
2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:41,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:42,465 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:43,467 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:44,468 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:45,470 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:46,472 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:47,474 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
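The target address in the retry lines is the real clue: the connection is attempted against the wildcard address 0.0.0.0 rather than against master. As a minimal sketch, the snippet below scans log lines for "Retrying connect to server" entries and reports those whose target IP is 0.0.0.0. `RetryTargetScanner` and `wildcardTargets` are hypothetical names for illustration, and the regular expression assumes the log format shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop): finds IPC retries aimed at 0.0.0.0.
class RetryTargetScanner {
    // Captures host, IP and port from "Retrying connect to server: host/ip:port".
    private static final Pattern RETRY =
        Pattern.compile("Retrying connect to server: ([^/\\s]+)/([0-9.]+):(\\d+)");

    /** Returns "ip:port" for every retry line whose target IP is the wildcard 0.0.0.0. */
    static List<String> wildcardTargets(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = RETRY.matcher(line);
            if (m.find() && "0.0.0.0".equals(m.group(2))) {
                hits.add(m.group(2) + ":" + m.group(3));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two lines copied from the logs in this post.
        List<String> sample = Arrays.asList(
            "2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)",
            "[QC] INFO [main] org.apache.hadoop.yarn.client.RMProxy.createRMProxy(98) | Connecting to ResourceManager at master/192.168.56.100:8032");
        System.out.println(wildcardTargets(sample)); // prints [0.0.0.0:8031]
    }
}
```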
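Solution
The retries against 0.0.0.0:8031 show that the cluster side was still using the default ResourceManager address: when yarn.resourcemanager.hostname is not set it defaults to 0.0.0.0, and 8031 is the default resource-tracker port. The fix is to set the hostname explicitly in yarn-site.xml on the cluster nodes and restart YARN:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>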
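To confirm that a node's configuration file actually carries this setting, the value can be read back with the JDK's XML parser. The following is a rough sketch only: `YarnSiteCheck`, `propertyValue` and `hostnameIsUsable` are hypothetical names, the sample XML is inlined for demonstration, and in practice the input would be the node's own yarn-site.xml (its location depends on the installation).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): reads one property from a Hadoop-style XML config.
class YarnSiteCheck {
    /** Returns the <value> of the <property> with the given <name>, or null if it is absent. */
    static String propertyValue(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                if (name.equals(childText(prop, "name"))) {
                    return childText(prop, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** A hostname is usable for remote nodes only if it is set and is not the 0.0.0.0 wildcard. */
    static boolean hostnameIsUsable(String host) {
        return host != null && !host.isEmpty() && !"0.0.0.0".equals(host);
    }

    public static void main(String[] args) {
        String sample = "<configuration><property>"
            + "<name>yarn.resourcemanager.hostname</name><value>master</value>"
            + "</property></configuration>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        String host = propertyValue(in, "yarn.resourcemanager.hostname");
        System.out.println(host + " usable=" + hostnameIsUsable(host)); // prints master usable=true
    }
}
```

A null result means the property is missing from the file, which is the situation that leaves the nodes retrying against 0.0.0.0.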