Running wordcount on Hadoop 2.8.2

1 Location of the examples jar

[hadoop@hadoop02 mapreduce]$ pwd

/hadoop/hadoop-2.8.2/share/hadoop/mapreduce

[hadoop@hadoop02 mapreduce]$ ls -lrt

total 5084

drwxr-xr-x 2 hadoop hadoop    4096 Oct 20 05:11 lib

drwxr-xr-x 2 hadoop hadoop    4096 Oct 20 05:11 jdiff

-rw-r--r-- 1 hadoop hadoop  301936 Oct 20 05:11 hadoop-mapreduce-examples-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop   77142 Oct 20 05:11 hadoop-mapreduce-client-shuffle-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop 1588114 Oct 20 05:11 hadoop-mapreduce-client-jobclient-2.8.2-tests.jar

-rw-r--r-- 1 hadoop hadoop   67003 Oct 20 05:11 hadoop-mapreduce-client-jobclient-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop   31535 Oct 20 05:11 hadoop-mapreduce-client-hs-plugins-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop  195052 Oct 20 05:11 hadoop-mapreduce-client-hs-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop 1571759 Oct 20 05:11 hadoop-mapreduce-client-core-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop  782757 Oct 20 05:11 hadoop-mapreduce-client-common-2.8.2.jar

-rw-r--r-- 1 hadoop hadoop  563771 Oct 20 05:11 hadoop-mapreduce-client-app-2.8.2.jar

drwxr-xr-x 2 hadoop hadoop    4096 Oct 20 05:11 sources

drwxr-xr-x 2 hadoop hadoop      29 Oct 20 05:11 lib-examples
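The jar file name carries the Hadoop version, so on another release the name in the listing above will differ. A minimal sketch for locating it by pattern, assuming the same directory layout as this install; the helper name `find_examples_jar` and the use of `HADOOP_HOME` are illustrative assumptions, not something the listing above defines.

```shell
# Hypothetical helper: print the examples jar found under a Hadoop install directory.
find_examples_jar() {
  find "$1/share/hadoop/mapreduce" -maxdepth 1 \
    -name 'hadoop-mapreduce-examples-*.jar' 2>/dev/null | head -n 1
}

# Falls back to the install path used in this post when HADOOP_HOME is unset.
find_examples_jar "${HADOOP_HOME:-/hadoop/hadoop-2.8.2}"
```

If nothing is printed, the directory does not exist or holds no matching jar.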

2 Generate the data file

[hadoop@hadoop01 ~]$ echo "Hello World">>word.txt

[hadoop@hadoop01 ~]$ echo "Hello Hadoop">>word.txt

[hadoop@hadoop01 ~]$ echo "Hello Hive">>word.txt
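Because `>>` appends, running the three echo commands a second time would leave six lines in word.txt and double every count. A minimal sketch that recreates the file from scratch instead, assuming the current directory is writable:

```shell
# '>' truncates word.txt first, so re-running this does not duplicate lines.
printf '%s\n' "Hello World" "Hello Hadoop" "Hello Hive" > word.txt

# Sanity check: the file should contain exactly three lines.
wc -l < word.txt
```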

3 Create the HDFS directory

[hadoop@hadoop01 ~]$ hadoop dfs -mkdir /work/data/input

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

[hadoop@hadoop01 ~]$ hadoop dfs -lsr /work/data

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.

drwxr-xr-x   - hadoop supergroup          0 2017-11-12 09:00 /work/data/input

[hadoop@hadoop01 ~]$

4 Upload the data file word.txt to the HDFS directory /work/data/input

[hadoop@hadoop01 ~]$ hadoop dfs -copyFromLocal word.txt /work/data/input

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

[hadoop@hadoop01 ~]$ hadoop dfs -text /work/data/input/word.txt

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

Hello World

Hello Hadoop

Hello Hive

[hadoop@hadoop01 ~]$
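The DEPRECATED warnings in steps 3 and 4 come from the `hadoop dfs` entry point, and the messages themselves point to `hdfs` and `ls -R`. A sketch of the same steps with the non-deprecated forms, assuming the same paths as above. The `HDFS` variable is a hypothetical dry-run switch added for illustration: by default the commands are only printed, and running with `HDFS=hdfs` executes them against the cluster.

```shell
# Hypothetical dry-run switch: prints the commands unless HDFS=hdfs is set.
HDFS="${HDFS:-echo hdfs}"

# -p also creates /work and /work/data if they do not exist yet.
$HDFS dfs -mkdir -p /work/data/input

# 'ls -R' replaces the deprecated 'lsr'.
$HDFS dfs -ls -R /work/data

# Upload the local file, then print it back from HDFS to confirm the copy.
$HDFS dfs -copyFromLocal word.txt /work/data/input
$HDFS dfs -text /work/data/input/word.txt
```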

5 Run the wordcount example

[hadoop@hadoop01 hadoop-2.8.2]$ pwd

/hadoop/hadoop-2.8.2

[hadoop@hadoop01 hadoop-2.8.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.2.jar wordcount /work/data/input /work/data/output

17/11/12 09:05:14 INFO client.RMProxy: Connecting to ResourceManager at hadoop02/192.168.169.102:8032

17/11/12 09:05:15 INFO input.FileInputFormat: Total input files to process : 1

17/11/12 09:05:15 INFO mapreduce.JobSubmitter: number of splits:1

17/11/12 09:05:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1510447239720_0001

17/11/12 09:05:16 INFO impl.YarnClientImpl: Submitted application application_1510447239720_0001

17/11/12 09:05:16 INFO mapreduce.Job: The url to track the job: http://hadoop02:8088/proxy/application_1510447239720_0001/

17/11/12 09:05:16 INFO mapreduce.Job: Running job: job_1510447239720_0001

17/11/12 09:05:25 INFO mapreduce.Job: Job job_1510447239720_0001 running in uber mode : false

17/11/12 09:05:25 INFO mapreduce.Job:  map 0% reduce 0%

17/11/12 09:05:35 INFO mapreduce.Job:  map 100% reduce 0%

17/11/12 09:05:40 INFO mapreduce.Job:  map 100% reduce 100%

17/11/12 09:05:41 INFO mapreduce.Job: Job job_1510447239720_0001 completed successfully

17/11/12 09:05:41 INFO mapreduce.Job: Counters: 49

	File System Counters

		FILE: Number of bytes read=53

		FILE: Number of bytes written=276955

		FILE: Number of read operations=0

		FILE: Number of large read operations=0

		FILE: Number of write operations=0

		HDFS: Number of bytes read=152

		HDFS: Number of bytes written=31

		HDFS: Number of read operations=6

		HDFS: Number of large read operations=0

		HDFS: Number of write operations=2

	Job Counters

		Launched map tasks=1

		Launched reduce tasks=1

		Data-local map tasks=1

		Total time spent by all maps in occupied slots (ms)=5860

		Total time spent by all reduces in occupied slots (ms)=3296

		Total time spent by all map tasks (ms)=5860

		Total time spent by all reduce tasks (ms)=3296

		Total vcore-milliseconds taken by all map tasks=5860

		Total vcore-milliseconds taken by all reduce tasks=3296

		Total megabyte-milliseconds taken by all map tasks=6000640

		Total megabyte-milliseconds taken by all reduce tasks=3375104

	Map-Reduce Framework

		Map input records=3

		Map output records=6

		Map output bytes=59

		Map output materialized bytes=53

		Input split bytes=117

		Combine input records=6

		Combine output records=4

		Reduce input groups=4

		Reduce shuffle bytes=53

		Reduce input records=4

		Reduce output records=4

		Spilled Records=8

		Shuffled Maps =1

		Failed Shuffles=0

		Merged Map outputs=1

		GC time elapsed (ms)=224

		CPU time spent (ms)=2190

		Physical memory (bytes) snapshot=443719680

		Virtual memory (bytes) snapshot=4207517696

		Total committed heap usage (bytes)=293076992

	Shuffle Errors

		BAD_ID=0

		CONNECTION=0

		IO_ERROR=0

		WRONG_LENGTH=0

		WRONG_MAP=0

		WRONG_REDUCE=0

	File Input Format Counters

		Bytes Read=35

	File Output Format Counters

		Bytes Written=31

[hadoop@hadoop01 hadoop-2.8.2]$
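The record counters above can be cross-checked without a cluster. The sketch below imitates the job with ordinary shell tools, assuming the input is exactly the three lines written in step 2 and that words are separated by single spaces (the bundled example splits on whitespace). It is a local approximation of the map, combine and reduce phases, not how Hadoop runs them.

```shell
# Recreate the three input lines in a temporary file (assumed to match word.txt).
input="$(mktemp)"
printf '%s\n' "Hello World" "Hello Hadoop" "Hello Hive" > "$input"

# "Map": emit one token per line (compare with "Map output records=6").
tokens="$(tr -s ' ' '\n' < "$input")"
printf '%s\n' "$tokens" | wc -l

# "Combine/Reduce": group identical tokens and count them
# (compare with "Reduce input groups=4"). LC_ALL=C sorts by byte value.
counts="$(printf '%s\n' "$tokens" | LC_ALL=C sort | uniq -c |
  awk '{ printf "%s\t%s\n", $2, $1 }')"
printf '%s\n' "$counts"

rm -f "$input"
```

Six tokens collapse into four distinct words, which is why the combiner reports 6 input records and 4 output records.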

6 View the results

[hadoop@hadoop01 hadoop-2.8.2]$ hadoop dfs -lsr /work/data/output

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.

-rw-r--r--   2 hadoop supergroup          0 2017-11-12 09:05 /work/data/output/_SUCCESS

-rw-r--r--   2 hadoop supergroup         31 2017-11-12 09:05 /work/data/output/part-r-00000

[hadoop@hadoop01 hadoop-2.8.2]$ hadoop dfs -text /work/data/output/part-r-00000

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

Hadoop	1

Hello	3

Hive	1

World	1

[hadoop@hadoop01 hadoop-2.8.2]$
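A MapReduce job refuses to start when its output directory already exists, so running step 5 again requires removing /work/data/output first. A sketch using the non-deprecated commands, assuming the same paths as above. The `HDFS` variable is a hypothetical dry-run switch added for illustration: by default the commands are only printed, and running with `HDFS=hdfs` executes them, including the delete.

```shell
# Hypothetical dry-run switch: prints the commands unless HDFS=hdfs is set.
HDFS="${HDFS:-echo hdfs}"

# List and print the results with the non-deprecated commands.
$HDFS dfs -ls -R /work/data/output
$HDFS dfs -cat /work/data/output/part-r-00000

# Remove the old output directory before re-running the job.
$HDFS dfs -rm -r /work/data/output
```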
