Hadoop 2.2 Programming: MRUnit — Testing MaxTemperatureMapper
Inheritance hierarchy 1: the MapReduce Context classes
1. org.apache.hadoop.mapreduce.Mapper.Context

   java.lang.Object
     |__ org.apache.hadoop.mapreduce.JobContext
           |__ org.apache.hadoop.mapreduce.TaskAttemptContext
                 |__ org.apache.hadoop.mapreduce.TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                       |__ org.apache.hadoop.mapreduce.MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                             |__ org.apache.hadoop.mapreduce.Mapper.Context

   Description:
   public class Mapper.Context extends MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

   Constructor Summary:
   Mapper.Context(Configuration conf, TaskAttemptID taskid, RecordReader<KEYIN,VALUEIN> reader,
                  RecordWriter<KEYOUT,VALUEOUT> writer, OutputCommitter committer,
                  StatusReporter reporter, InputSplit split)

   Method Summary:
   - Inherited from org.apache.hadoop.mapreduce.MapContext: getCurrentKey, getCurrentValue, getInputSplit, nextKeyValue
   - Inherited from org.apache.hadoop.mapreduce.TaskInputOutputContext: getCounter, getCounter, getOutputCommitter, progress, setStatus, write
   - Inherited from org.apache.hadoop.mapreduce.TaskAttemptContext: getStatus, getTaskAttemptID
   - Inherited from org.apache.hadoop.mapreduce.JobContext: getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
   - Inherited from java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

2. org.apache.hadoop.mapreduce.Reducer.Context

   java.lang.Object
     |__ org.apache.hadoop.mapreduce.JobContext
           |__ org.apache.hadoop.mapreduce.TaskAttemptContext
                 |__ org.apache.hadoop.mapreduce.TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                       |__ org.apache.hadoop.mapreduce.ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                             |__ org.apache.hadoop.mapreduce.Reducer.Context

   Description:
   public class Reducer.Context extends ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

   Constructor Summary:
   Reducer.Context(Configuration conf, TaskAttemptID taskid, RawKeyValueIterator input,
                   Counter inputKeyCounter, Counter inputValueCounter,
                   RecordWriter<KEYOUT,VALUEOUT> output, OutputCommitter committer,
                   StatusReporter reporter, RawComparator<KEYIN> comparator,
                   Class<KEYIN> keyClass, Class<VALUEIN> valueClass)

   Method Summary:
   - Inherited from org.apache.hadoop.mapreduce.ReduceContext: getCurrentKey, getCurrentValue, getValues, nextKey, nextKeyValue
   - Inherited from org.apache.hadoop.mapreduce.TaskInputOutputContext: getCounter, getCounter, getOutputCommitter, progress, setStatus, write
   - Inherited from org.apache.hadoop.mapreduce.TaskAttemptContext: getStatus, getTaskAttemptID
   - Inherited from org.apache.hadoop.mapreduce.JobContext: getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
   - Inherited from java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
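To make the hierarchy concrete, here is a minimal sketch of how methods inherited through these Context classes are typically used inside a new-API reducer. MaxTemperatureReducer does not appear in this post, and the max.temp.verbose property is made up; both are assumed purely for illustration.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative only: a reducer that touches methods inherited through the
// Context hierarchy above (getConfiguration from JobContext, setStatus and
// write from TaskInputOutputContext).
public class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    // "max.temp.verbose" is a made-up property name, used only to show
    // Configuration access via the Context.
    boolean verbose = context.getConfiguration().getBoolean("max.temp.verbose", false);

    int maxValue = Integer.MIN_VALUE;
    for (IntWritable value : values) {
      maxValue = Math.max(maxValue, value.get());
    }
    if (verbose) {
      context.setStatus("max for " + key + " = " + maxValue);
    }
    context.write(key, new IntWritable(maxValue));
  }
}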
Inheritance hierarchy 2: the JUnit annotation types
- org.junit.Test (implements java.lang.annotation.Annotation)
- org.junit.Rule (implements java.lang.annotation.Annotation)
- org.junit.Ignore (implements java.lang.annotation.Annotation)
- org.junit.ClassRule (implements java.lang.annotation.Annotation)
- org.junit.BeforeClass (implements java.lang.annotation.Annotation)
- org.junit.Before (implements java.lang.annotation.Annotation)
- org.junit.AfterClass (implements java.lang.annotation.Annotation)
- org.junit.After (implements java.lang.annotation.Annotation)
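For reference, a minimal sketch of how these JUnit 4 annotation types are applied to a test class; the class below is illustrative only and is not part of the MRUnit example.

import static org.junit.Assert.assertEquals;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

// Illustrative only: how the JUnit 4 annotation types listed above are applied.
public class AnnotationUsageSketch {

  private StringBuilder buffer;

  @Before   // runs before every @Test method
  public void setUp() {
    buffer = new StringBuilder();
  }

  @Test     // marks a method as a test; JUnitCore discovers tests by this annotation
  public void appendsYear() {
    buffer.append("1950");
    assertEquals("1950", buffer.toString());
  }

  @After    // runs after every @Test method
  public void tearDown() {
    buffer = null;
  }
}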
Code
1. MaxTemperatureMapper.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    String year = line.substring(15, 19);
    int airTemperature = Integer.parseInt(line.substring(87, 92));
    context.write(new Text(year), new IntWritable(airTemperature));
  }
}
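The parsing above is deliberately minimal. As an aside (a hypothetical sketch, not the class exercised by the test below): Integer.parseInt rejects a leading '+' sign on pre-Java-7 JVMs, and the NCDC format uses 9999 for a missing reading plus a one-digit quality code at offset 92, so a more defensive variant could look like this:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical, more defensive variant (not the class under test below):
// handles the leading sign and skips missing or low-quality NCDC readings.
public class MaxTemperatureMapperV2 extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final int MISSING = 9999;

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    String year = line.substring(15, 19);

    int airTemperature;
    if (line.charAt(87) == '+') {
      // Integer.parseInt on pre-Java-7 JVMs rejects a leading '+'.
      airTemperature = Integer.parseInt(line.substring(88, 92));
    } else {
      airTemperature = Integer.parseInt(line.substring(87, 92));
    }

    String quality = line.substring(92, 93);
    if (airTemperature != MISSING && quality.matches("[01459]")) {
      context.write(new Text(year), new IntWritable(airTemperature));
    }
  }
}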
2. MaxTemperatureMapperTest.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class MaxTemperatureMapperTest {

  @Test
  public void processesValidRecord() throws IOException {
    Text value = new Text("0043011990999991950051518004+68750+023550FM-12+0382" +
                                  // Year ^^^^
        "99999V0203201N00261220001CN9999999N9-00111+99999999999");
                              // Temperature ^^^^^
    new MapDriver<LongWritable, Text, Text, IntWritable>()
        .withMapper(new MaxTemperatureMapper())
        .withInput(new LongWritable(1), value)
        .withOutput(new Text("1950"), new IntWritable(-11))
        .runTest();
  }
}
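The same style works on the reduce side through org.apache.hadoop.mrunit.mapreduce.ReduceDriver. The test below is a sketch only; it exercises the hypothetical MaxTemperatureReducer sketched earlier rather than code from the original post.

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Test;

// Sketch only: tests the hypothetical MaxTemperatureReducer shown earlier.
public class MaxTemperatureReducerTest {

  @Test
  public void returnsMaximumIntegerInValues() throws IOException {
    new ReduceDriver<Text, IntWritable, Text, IntWritable>()
        .withReducer(new MaxTemperatureReducer())
        .withInput(new Text("1950"),
            Arrays.asList(new IntWritable(10), new IntWritable(5)))
        .withOutput(new Text("1950"), new IntWritable(10))
        .runTest();
  }
}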
Note that some classes and methods are deprecated:

It is understandable that org.apache.hadoop.mrunit.MapDriver<K1,V1,K2,V2> is deprecated: that class was written against the old MapReduce API (org.apache.hadoop.mapred). For example, one of its methods is:

MapDriver<K1,V1,K2,V2> withMapper(org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2> m)
The new MapReduce API lives under org.apache.hadoop.mapreduce.*; the corresponding MRUnit MapDriver (and likewise ReduceDriver) is org.apache.hadoop.mrunit.mapreduce.MapDriver<K1,V1,K2,V2>, and the method above correspondingly takes the new-API mapper:

MapDriver<K1,V1,K2,V2> withMapper(org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2> m)
In the MapDriverBase class, T withInputValue(V1 val) is deprecated in favor of T withInput(K1 key, V1 val). There are many more such changes, not listed here.
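In test code the migration is mostly mechanical. Below is a rough sketch of the deprecated fluent style versus the preferred one, assuming MRUnit 1.x and reusing the record from the test above:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

// Sketch only: contrasts the deprecated MapDriverBase setters with the
// preferred two-argument withInput(key, value).
public class ApiMigrationSketch {

  @Test
  public void usesNewStyleInput() throws IOException {
    Text value = new Text("0043011990999991950051518004+68750+023550FM-12+0382"
        + "99999V0203201N00261220001CN9999999N9-00111+99999999999");

    MapDriver<LongWritable, Text, Text, IntWritable> driver =
        new MapDriver<LongWritable, Text, Text, IntWritable>()
            .withMapper(new MaxTemperatureMapper());

    // Deprecated style (MapDriverBase): the value set on its own,
    // with the key supplied separately.
    // driver.withInputValue(value);

    // Preferred style: key and value supplied together.
    driver.withInput(new LongWritable(1), value)
        .withOutput(new Text("1950"), new IntWritable(-11))
        .runTest();
  }
}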
Execution steps:
Note: you need to download and build MRUnit, set an MRUnit_HOME variable in /home/user/.bashrc, add $MRUnit_HOME/lib/*.jar to the classpath in $HADOOP_HOME/libexec/hadoop-config.sh, then source $HADOOP_HOME/libexec/hadoop-config.sh before running the commands below:
javac -d class/ MaxTemperatureMapper.java MaxTemperatureMapperTest.java
jar -cvf test.jar -C class ./
java -cp test.jar:$CLASSPATH org.junit.runner.JUnitCore MaxTemperatureMapperTest
# or
yarn -cp test.jar:$CLASSPATH org.junit.runner.JUnitCore MaxTemperatureMapperTest