Hadoop 2.4.1 Getting-Started Example: MaxTemperature

Note: the following applies to both the 2.x and 1.x versions; it has been tested on 2.4.1 and 1.2.0.

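I. Preparation

1. Set up a pseudo-distributed Hadoop environment. See the official documentation, or http://blog.csdn.net/jediael_lu/article/details/38637277

2. Prepare a data file named sample.txt with the following content (one record per line, with no blank lines between records):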

123456798676231190101234567986762311901012345679867623119010123456798676231190101234561+00121534567890356
123456798676231190101234567986762311901012345679867623119010123456798676231190101234562+01122934567890456
123456798676231190201234567986762311901012345679867623119010123456798676231190101234562+02120234567893456
123456798676231190401234567986762311901012345679867623119010123456798676231190101234561+00321234567803456
123456798676231190101234567986762311902012345679867623119010123456798676231190101234561+00429234567903456
123456798676231190501234567986762311902012345679867623119010123456798676231190101234561+01021134568903456
123456798676231190201234567986762311902012345679867623119010123456798676231190101234561+01124234578903456
123456798676231190301234567986762311905012345679867623119010123456798676231190101234561+04121234678903456
123456798676231190301234567986762311905012345679867623119010123456798676231190101234561+00821235678903456

II. Writing the code

1. Create the Mapper

package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends
		Mapper<LongWritable, Text, Text, IntWritable> {

	// Sentinel value that marks a missing temperature reading.
	private static final int MISSING = 9999;

	@Override
	public void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {
		String line = value.toString();
		// Characters 15-18 (0-based) hold the four-digit year.
		String year = line.substring(15, 19);
		int airTemperature;
		// Character 87 is the sign. Skip a leading '+' before parsing
		// (Integer.parseInt rejects it before Java 7).
		if (line.charAt(87) == '+') {
			airTemperature = Integer.parseInt(line.substring(88, 92));
		} else {
			airTemperature = Integer.parseInt(line.substring(87, 92));
		}
		// Character 92 is the quality code; only 0, 1, 4, 5 and 9 are accepted.
		String quality = line.substring(92, 93);
		if (airTemperature != MISSING && quality.matches("[01459]")) {
			// Emit (year, temperature) for every valid reading.
			context.write(new Text(year), new IntWritable(airTemperature));
		}
	}
}
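The column offsets used by the mapper can be checked without a cluster. The following is a minimal sketch in plain Java, not part of the original job: the class name RecordLayoutDemo and its helper methods are hypothetical, and the record in main is the first line of sample.txt.

```java
public class RecordLayoutDemo {

    // Same sentinel the mapper uses for a missing reading.
    static final int MISSING = 9999;

    // The year occupies characters 15-18 (0-based).
    static String parseYear(String line) {
        return line.substring(15, 19);
    }

    // The sign sits at character 87 and the four digits at 88-91.
    static int parseTemperature(String line) {
        int start = line.charAt(87) == '+' ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    // A reading is kept when it is not MISSING and its quality code
    // (character 92) is one of 0, 1, 4, 5, 9.
    static boolean isValid(String line) {
        String quality = line.substring(92, 93);
        return parseTemperature(line) != MISSING && quality.matches("[01459]");
    }

    public static void main(String[] args) {
        // First record of sample.txt, split only to keep the source line short.
        String line = "1234567986762311901012345679867623119010"
                + "1234567986762311901012345679867623119010"
                + "1234561+00121534567890356";
        System.out.println(parseYear(line) + " " + parseTemperature(line)
                + " " + isValid(line));
    }
}
```

For this record the sketch prints `1901 12 true`. The second record of sample.txt has quality code 2, so isValid returns false for it and the mapper emits nothing for that line.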

2. Create the Reducer

package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer extends
		Reducer<Text, IntWritable, Text, IntWritable> {

	@Override
	public void reduce(Text key, Iterable<IntWritable> values, Context context)
			throws IOException, InterruptedException {
		// Start below any possible reading so the first value always wins.
		int maxValue = Integer.MIN_VALUE;
		// All temperatures for one year arrive together; keep the largest.
		for (IntWritable value : values) {
			maxValue = Math.max(maxValue, value.get());
		}
		// Emit (year, maximum temperature).
		context.write(key, new IntWritable(maxValue));
	}
}
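The reducer's behaviour can also be sketched in plain Java. The example below is an illustration under stated assumptions rather than how Hadoop actually runs the job: the class ReduceSimulation is hypothetical, a TreeMap stands in for the framework's grouping and sorting by key, and the (year, temperature) pairs are the eight records the mapper emits for sample.txt.

```java
import java.util.Map;
import java.util.TreeMap;

public class ReduceSimulation {

    // Groups the pairs by year and keeps the maximum temperature per year.
    // The TreeMap keeps the years sorted, similar to sorted reducer input.
    static Map<String, Integer> maxByYear(String[] years, int[] temps) {
        Map<String, Integer> result = new TreeMap<>();
        for (int i = 0; i < years.length; i++) {
            result.merge(years[i], temps[i], Integer::max);
        }
        return result;
    }

    public static void main(String[] args) {
        // Mapper output for sample.txt, in input order (the record with
        // quality code 2 has already been filtered out).
        String[] years = {"1901", "1902", "1904", "1901", "1905", "1902", "1903", "1903"};
        int[] temps = {12, 212, 32, 42, 102, 112, 412, 82};
        for (Map.Entry<String, Integer> e : maxByYear(years, temps).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The sketch yields 42 for 1901, 212 for 1902, 412 for 1903, 32 for 1904 and 102 for 1905, which agrees with the job output listed in section III.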

3. Create the driver class with the main method

package org.jediael.hadoopDemo.maxtemperature;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

	public static void main(String[] args) throws Exception {
		if (args.length != 2) {
			System.err.println("Usage: MaxTemperature <input path> <output path>");
			System.exit(-1);
		}

		// new Job() is deprecated in Hadoop 2.x; it is kept here because the
		// example was tested on both 1.2.0 and 2.4.1.
		Job job = new Job();
		// Lets Hadoop locate the jar that contains this class.
		job.setJarByClass(MaxTemperature.class);
		job.setJobName("Max temperature");

		// Input file (or directory) and output directory on HDFS.
		FileInputFormat.addInputPath(job, new Path(args[0]));
		FileOutputFormat.setOutputPath(job, new Path(args[1]));

		job.setMapperClass(MaxTemperatureMapper.class);
		job.setReducerClass(MaxTemperatureReducer.class);

		// Types of the final (reducer) output key and value.
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);

		// Block until the job finishes; exit code 0 on success, 1 on failure.
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}

4. Export the project as MaxTemp.jar and upload it to the server that will run the program.

III. Running the program

1. Upload sample.txt to the HDFS root directory

hadoop fs -put sample.txt /

2. Run the program

export HADOOP_CLASSPATH=MaxTemp.jar

hadoop org.jediael.hadoopDemo.maxtemperature.MaxTemperature /sample.txt output10

Note that the output directory must not already exist; otherwise the job will fail.

3. View the results

(1) Job output

[jediael@jediael44 code]$ hadoop fs -cat output10/*

14/07/09 14:51:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

1901 42

1902 212

1903 412

1904 32

1905 102

(2) Console output while the job runs

[jediael@jediael44 code]$ hadoop org.jediael.hadoopDemo.maxtemperature.MaxTemperature /sample.txt output10

14/07/09 14:50:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/07/09 14:50:41 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

14/07/09 14:50:42 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.

14/07/09 14:50:43 INFO input.FileInputFormat: Total input paths to process : 1

14/07/09 14:50:43 INFO mapreduce.JobSubmitter: number of splits:1

14/07/09 14:50:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1404888618764_0001

14/07/09 14:50:44 INFO impl.YarnClientImpl: Submitted application application_1404888618764_0001

14/07/09 14:50:44 INFO mapreduce.Job: The url to track the job: http://jediael44:8088/proxy/application_1404888618764_0001/

14/07/09 14:50:44 INFO mapreduce.Job: Running job: job_1404888618764_0001

14/07/09 14:50:57 INFO mapreduce.Job: Job job_1404888618764_0001 running in uber mode : false

14/07/09 14:50:57 INFO mapreduce.Job: map 0% reduce 0%

14/07/09 14:51:05 INFO mapreduce.Job: map 100% reduce 0%

14/07/09 14:51:15 INFO mapreduce.Job: map 100% reduce 100%

14/07/09 14:51:15 INFO mapreduce.Job: Job job_1404888618764_0001 completed successfully

14/07/09 14:51:16 INFO mapreduce.Job: Counters: 49

File System Counters

FILE: Number of bytes read=94

FILE: Number of bytes written=185387

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=1051

HDFS: Number of bytes written=43

HDFS: Number of read operations=6

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=1

Launched reduce tasks=1

Data-local map tasks=1

Total time spent by all maps in occupied slots (ms)=5812

Total time spent by all reduces in occupied slots (ms)=7023

Total time spent by all map tasks (ms)=5812

Total time spent by all reduce tasks (ms)=7023

Total vcore-seconds taken by all map tasks=5812

Total vcore-seconds taken by all reduce tasks=7023

Total megabyte-seconds taken by all map tasks=5951488

Total megabyte-seconds taken by all reduce tasks=7191552

Map-Reduce Framework

Map input records=9

Map output records=8

Map output bytes=72

Map output materialized bytes=94

Input split bytes=97

Combine input records=0

Combine output records=0

Reduce input groups=5

Reduce shuffle bytes=94

Reduce input records=8

Reduce output records=5

Spilled Records=16

Shuffled Maps =1

Failed Shuffles=0

Merged Map outputs=1

GC time elapsed (ms)=154

CPU time spent (ms)=1450

Physical memory (bytes) snapshot=303112192

Virtual memory (bytes) snapshot=1685733376

Total committed heap usage (bytes)=136515584

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=954

File Output Format Counters

Bytes Written=43
