序:本以为今天花点时间将WordCount例子完全理解到,但高估自己了,更别说我只是在大学选修一学期的java,之后再也没碰过java语言了

总的来说,从宏观上能理解具体的程序思路,但具体到每个代码有什么作用,什么原理,那还需要花点时间,毕竟需要一点java基础和hadoop的运行机制的知识

首先启动hadoop;

[hadoop@hadoop01 eclipse]$ cd ~/hadoop-3.2.0
[hadoop@hadoop01 hadoop-3.2.0]$ sbin/start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoop01]
Starting datanodes
Starting secondary namenodes [hadoop01]
Starting resourcemanager
Starting nodemanagers
[hadoop@hadoop01 hadoop-3.2.0]$ jps
8497 NameNode
9121 ResourceManager
8868 SecondaryNameNode
9268 NodeManager
9630 Jps

然后,进入root权限打开eclipse;

[hadoop@hadoop01 hadoop-3.2.0]$ su root
Password:
[root@hadoop01 hadoop-3.2.0]# cd ..
[root@hadoop01 hadoop]# cd eclipse
[root@hadoop01 eclipse]# ./eclipse

在eclipse的window里面show view打开terminal;

在eclipse中点击打开open a terminal,在终端中输入命令:gedit input.txt;

在文档中任意输入内容;

在终端中输入命令:hadoop fs -put /home/hadoop/input.txt /test/;

最后,file--new--project--MapReduce project并取项目名“Wordcount”,再从创建的文件下src中new--package并为包取名“com.hadoop”,又在src下new--class并为类取名“Wordcount”,然后将下面的代码粘贴进去。

然后可以run as hadoop,成功运行得到计算结果

注:若package下无log4j.properties,会报错,需在该文件下手动添加该文件。
内容 如下:

# Configure logging for testing: optionally with log file

#log4j.rootLogger=debug,appender
log4j.rootLogger=info,appender
#log4j.rootLogger=error,appender #\u8F93\u51FA\u5230\u63A7\u5236\u53F0
log4j.appender.appender=org.apache.log4j.ConsoleAppender
#\u6837\u5F0F\u4E3ATTCCLayout
log4j.appender.appender.layout=org.apache.log4j.TTCCLayout

附代码:

/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.examples; import java.io.IOException;
import java.util.StringTokenizer; import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser; public class WordCount { public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{ private final static IntWritable one = new IntWritable(1);
private Text word = new Text(); public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
} public static class IntSumReducer
extends Reducer<Text,IntWritable,Text,IntWritable> {
private IntWritable result = new IntWritable(); public void reduce(Text key, Iterable<IntWritable> values,
Context context
) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
context.write(key, result);
}
} public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length < 2) {
System.err.println("Usage: wordcount <in> [<in>...] <out>");
System.exit(2);
}
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
for (int i = 0; i < otherArgs.length - 1; ++i) {
FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
}
FileOutputFormat.setOutputPath(job,
new Path(otherArgs[otherArgs.length - 1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

【hadoop】在eclipse上运行WordCount的操作过程的更多相关文章

  1. linux下在eclipse上运行hadoop自带例子wordcount

    启动eclipse:打开windows->open perspective->other->map/reduce 可以看到map/reduce开发视图.设置Hadoop locati ...

  2. MapReduce编程入门实例之WordCount:分别在Eclipse和Hadoop集群上运行

    上一篇博文如何在Eclipse下搭建Hadoop开发环境,今天给大家介绍一下如何分别分别在Eclipse和Hadoop集群上运行我们的MapReduce程序! 1. 在Eclipse环境下运行MapR ...

  3. 在Eclipse上运行Spark(Standalone,Yarn-Client)

    欢迎转载,且请注明出处,在文章页面明显位置给出原文连接. 原文链接:http://www.cnblogs.com/zdfjf/p/5175566.html 我们知道有eclipse的Hadoop插件, ...

  4. Spark源码编译并在YARN上运行WordCount实例

    在学习一门新语言时,想必我们都是"Hello World"程序开始,类似地,分布式计算框架的一个典型实例就是WordCount程序,接触过Hadoop的人肯定都知道用MapRedu ...

  5. mac上eclipse上运行word count

    1.打开eclipse之后,建立wordcount项目 package wordcount; import java.io.IOException; import java.util.StringTo ...

  6. 解决在windows的eclipse上面运行WordCount程序出现的一系列问题详解

    一.简介 要在Windows下的 Eclipse上调试Hadoop2代码,所以我们在windows下的Eclipse配置hadoop-eclipse-plugin- 2.6.0.jar插件,并在运行H ...

  7. 在Hadoop 2.3上运行C++程序各种疑难杂症(Hadoop Pipes选择、错误集锦、Hadoop2.3编译等)

    首记 感觉Hadoop是一个坑,打着大数据最佳解决方案的旗帜到处坑害良民.记得以前看过一篇文章,说1TB以下的数据就不要用Hadoop了,体现不 出太大的优势,有时候反而会成为累赘.因此Hadoop的 ...

  8. hadoop 把mapreduce任务从本地提交到hadoop集群上运行

    MapReduce任务有三种运行方式: 1.windows(linux)本地调试运行,需要本地hadoop环境支持 2.本地编译成jar包,手动发送到hadoop集群上用hadoop jar或者yar ...

  9. Hadoop在window上运行 user=Administrator, access=WRITE, inode="hadoop"

    win7下eclipse中错误的详细描述如下: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.securit ...

随机推荐

  1. SQLServer ROW_NUMBER()函数使用方法 分区排序

    #ROW_NUMBER() over()能干什么? 既可满足分区的需求,也可以根据一定的顺序来排序. #细细说 select ROW_NUMBER() over(partition by xm Ord ...

  2. Linux_CentOS中Mongodb4.x 安装调试、远程管理、配置 mongodb 管理员密码

    Mongodb4.x 安装 官方文档:https://docs.mongodb.com/manual/tutorial/install-mongodb-on-red-hat/ 1.配置 yum 源 1 ...

  3. linux系统上传下载命令rz和sz的教程

    (一)安装方法汇总(注意:一下命令如果没有权限的需要在每个命令前面加一个sudo) 1.安装方法(推荐) sudo yum install lrzsz 2.在安装Linux系统时选中“DialupNe ...

  4. sysfile20191122

    ass_s_ccp_ft:-108; ass_s_ccp_all:-108; ass_tag_ft:-105; ass_tag_all:-105; rept_port:9000; Q_value:0. ...

  5. 泡泡一分钟:Learning Motion Planning Policies in Uncertain Environments through Repeated Task Executions

    张宁  Learning Motion Planning Policies in Uncertain Environments through Repeated Task Executions 通过重 ...

  6. 离线安装pycharm数据库驱动

    这个数据库驱动,不是python的链接包 而是打开pycharm pro版后的数据库浏览器驱动. 也就是专业版比社区版方便的一个地方,可以直接边写代码,边看数据库结构,还可以拖动一些变量. 在线安装挺 ...

  7. iOS:Xcode代码块,提升敲代码的效率

    一.代码块在哪里? 看下图 或者 快捷键:command+shift+L 长这样: 二.如何创建代码块: 1.先选中要创建的代码片段,然后点击右键,选中 Create Code Snippet 然后会 ...

  8. 细数那些Java程序员最容易犯那些错

    java作为最受欢迎程度榜榜首语言,自然是广大开发者使用最多的语言.正因为有如此广泛的使用性,java开发中发生异常也比比皆是,接下来我们就来看看那些java开发中最容易出现的那些错误. 1.重复造轮 ...

  9. Mac打开原生NTFS功能

    一.在 terminal 里输入 diskutil list 查看 U 盘的 NAME diskutil list 二.执行以下命令,输入密码 sudo nano /etc/fstab 三.把U盘信息 ...

  10. [转帖]Socat 入门教程

    https://www.hi-linux.com/posts/61543.html 现在安装k8s 必须带 socat 今天看一下socat 到底是啥东西呢. Socat 是 Linux 下的一个多功 ...