Custom comparable key classes and comparator classes in MapReduce (Part 2): a beginner's walk through the source code
/**
 * Define the comparator that controls
 * how the keys are sorted before they
 * are passed to the {@link Reducer}.
 * @param cls the raw comparator
 * @see #setCombinerKeyGroupingComparatorClass(Class)
 */
public void setSortComparatorClass(Class<? extends RawComparator> cls
                                   ) throws IllegalStateException {
  ensureState(JobState.DEFINE);
  conf.setOutputKeyComparatorClass(cls);
}
/**
 * Set the {@link RawComparator} comparator used to compare keys.
 * @param theClass the {@link RawComparator} comparator used to
 * compare keys.
 * @see #setOutputValueGroupingComparator(Class)
 */
// Sets the comparator used to compare keys; the theClass parameter is that comparator.
public void setOutputKeyComparatorClass(Class<? extends RawComparator> theClass) {
  setClass(JobContext.KEY_COMPARATOR,
           theClass, RawComparator.class);
}
/**
 * Get the {@link RawComparator} comparator used to compare keys.
 *
 * @return the {@link RawComparator} comparator used to compare keys.
 */
// Looks up the comparator used to compare keys and returns it as a RawComparator.
public RawComparator getOutputKeyComparator() {
  // getClass returns null when the KEY_COMPARATOR property has no value.
  Class<? extends RawComparator> theClass = getClass(
    JobContext.KEY_COMPARATOR, null, RawComparator.class);
  // If a comparator class was configured, instantiate it by reflection.
  if (theClass != null)
    return ReflectionUtils.newInstance(theClass, this);
  // Otherwise fall back to the default comparator for the map output key class.
  return WritableComparator.get(getMapOutputKeyClass().
    asSubclass(WritableComparable.class), this);
}
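The chain above (the job stores a comparator class in the configuration, and the lookup later instantiates it by reflection or falls back to a default) can be sketched without Hadoop. Everything in this block is a hypothetical stand-in written for illustration: MiniConf, MiniJob, ReverseKeyComparator and the property name are not Hadoop classes or keys, and java.util.Comparator is used in place of RawComparator.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the job configuration; not a Hadoop class. */
class MiniConf {
    // Made-up property name, playing the role of JobContext.KEY_COMPARATOR.
    static final String KEY_COMPARATOR = "demo.key.comparator.class";

    private final Map<String, Class<? extends Comparator<String>>> classes = new HashMap<>();

    void setOutputKeyComparatorClass(Class<? extends Comparator<String>> theClass) {
        classes.put(KEY_COMPARATOR, theClass);
    }

    Comparator<String> getOutputKeyComparator() {
        Class<? extends Comparator<String>> theClass = classes.get(KEY_COMPARATOR);
        if (theClass != null) {
            // A comparator class was configured: create it by reflection.
            try {
                return theClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot instantiate " + theClass, e);
            }
        }
        // Nothing configured: fall back to the natural ordering of the key type.
        return Comparator.naturalOrder();
    }
}

/** Hypothetical stand-in for the job; it only delegates to the configuration. */
class MiniJob {
    final MiniConf conf = new MiniConf();

    void setSortComparatorClass(Class<? extends Comparator<String>> cls) {
        conf.setOutputKeyComparatorClass(cls);
    }
}

/** Example custom comparator: sorts String keys in descending order. */
class ReverseKeyComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        return b.compareTo(a);
    }
}
```

With nothing configured, the lookup returns the natural ordering; after setSortComparatorClass(ReverseKeyComparator.class) the same lookup returns a freshly created ReverseKeyComparator.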
/**
 * Compare logical range, st i, j MOD offset capacity.
 * Compare by partition, then by key.
 * @see IndexedSortable#compare
 */
public int compare(final int mi, final int mj) {
  final int kvi = offsetFor(mi % maxRec);
  final int kvj = offsetFor(mj % maxRec);
  final int kvip = kvmeta.get(kvi + PARTITION);
  final int kvjp = kvmeta.get(kvj + PARTITION);
  // sort by partition
  if (kvip != kvjp) {
    return kvip - kvjp;
  }
  // sort by key
  return comparator.compare(kvbuffer,
      kvmeta.get(kvi + KEYSTART),
      kvmeta.get(kvi + VALSTART) - kvmeta.get(kvi + KEYSTART),
      kvbuffer,
      kvmeta.get(kvj + KEYSTART),
      kvmeta.get(kvj + VALSTART) - kvmeta.get(kvj + KEYSTART));
}
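To see what "compare by partition, then by key" means for the resulting order, here is a small sketch in plain Java. It is only an illustration under simplifying assumptions: the Rec class and the helper names are hypothetical, records are ordinary objects rather than offsets into kvbuffer, and a java.util.Comparator plays the role of the configured key comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PartitionThenKeyDemo {
    /** Hypothetical record: a partition number plus a key. */
    static final class Rec {
        final int partition;
        final String key;

        Rec(int partition, String key) {
            this.partition = partition;
            this.key = key;
        }

        @Override
        public String toString() {
            return partition + ":" + key;
        }
    }

    /** Same two-step rule as above: partition first, key comparator second. */
    static int compare(Rec a, Rec b, Comparator<String> keyComparator) {
        if (a.partition != b.partition) {
            // Partition numbers are small non-negative ints, so subtraction cannot overflow.
            return a.partition - b.partition;
        }
        return keyComparator.compare(a.key, b.key);
    }

    /** Sorts a copy of the records and returns them rendered as "partition:key". */
    static List<String> sorted(List<Rec> recs, Comparator<String> keyComparator) {
        List<Rec> copy = new ArrayList<>(recs);
        copy.sort((a, b) -> compare(a, b, keyComparator));
        List<String> out = new ArrayList<>();
        for (Rec r : copy) {
            out.add(r.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> recs = Arrays.asList(
            new Rec(1, "b"), new Rec(0, "z"), new Rec(1, "a"), new Rec(0, "c"));
        System.out.println(sorted(recs, Comparator.naturalOrder()));  // [0:c, 0:z, 1:a, 1:b]
        System.out.println(sorted(recs, Comparator.reverseOrder()));  // [0:z, 0:c, 1:b, 1:a]
    }
}
```

Swapping the key comparator changes the order inside each partition only; the partitions themselves always stay in ascending order.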
/** Optimization hook. Override this to make SequenceFile.Sorter's scream.
 *
 * <p>The default implementation reads the data into two {@link
 * WritableComparable}s (using {@link
 * Writable#readFields(DataInput)}, then calls {@link
 * #compare(WritableComparable,WritableComparable)}.
 */
@Override
public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
  try {
    buffer.reset(b1, s1, l1);                   // parse key1
    key1.readFields(buffer);

    buffer.reset(b2, s2, l2);                   // parse key2
    key2.readFields(buffer);

  } catch (IOException e) {
    throw new RuntimeException(e);
  }

  return compare(key1, key2);                   // compare them
}
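The difference between the default path (deserialize both keys, then compare objects) and an overridden raw comparison can be sketched with java.io alone. This is an illustration, not Hadoop code: RawCompareDemo and its methods are hypothetical names, and the key is assumed to be a single int written with DataOutput.writeInt (4 bytes, big-endian).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RawCompareDemo {
    /** Serializes an int key with DataOutput.writeInt (4 bytes, big-endian). */
    static byte[] serialize(int key) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Default style: rebuild both keys from their bytes, then compare the values. */
    static int compareByDeserializing(byte[] b1, int s1, int l1,
                                      byte[] b2, int s2, int l2) {
        try (DataInputStream in1 = new DataInputStream(new ByteArrayInputStream(b1, s1, l1));
             DataInputStream in2 = new DataInputStream(new ByteArrayInputStream(b2, s2, l2))) {
            return Integer.compare(in1.readInt(), in2.readInt());
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    /** Optimized style: decode the int straight from the buffer, with no streams or objects. */
    static int compareRaw(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    /** Reads a big-endian int starting at offset s. */
    static int readInt(byte[] b, int s) {
        return ((b[s] & 0xff) << 24)
             | ((b[s + 1] & 0xff) << 16)
             | ((b[s + 2] & 0xff) << 8)
             | (b[s + 3] & 0xff);
    }
}
```

Both methods give the same ordering; the raw version simply skips the stream and object creation that the deserializing version pays for on every comparison.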
/** Compare two WritableComparables.
 *
 * <p> The default implementation uses the natural ordering, calling {@link
 * Comparable#compareTo(Object)}. */
@SuppressWarnings("unchecked")
public int compare(WritableComparable a, WritableComparable b) {
  return a.compareTo(b);
}
/**
 * Get the key class for the map output data. If it is not set, use the
 * (final) output key class. This allows the map output key class to be
 * different than the final output key class.
 *
 * @return the map output key class.
 */
public Class<?> getMapOutputKeyClass() {
  Class<?> retv = getClass(JobContext.MAP_OUTPUT_KEY_CLASS, null, Object.class);
  if (retv == null) {
    retv = getOutputKeyClass();
  }
  return retv;
}
public interface WritableComparable<T> extends Writable,Comparable<T>
/**
 * A serializable object which implements a simple, efficient, serialization
 * protocol, based on {@link DataInput} and {@link DataOutput}.
 *
 * <p>Any <code>key</code> or <code>value</code> type in the Hadoop Map-Reduce
 * framework implements this interface.</p>
In other words, the class of every key and every value should implement this interface, for example Text and IntWritable. The source of the Text class confirms it:
public class Text extends BinaryComparable
    implements WritableComparable<BinaryComparable> {}
 * <p>Implementations typically implement a static <code>read(DataInput)</code>
 * method which constructs a new instance, calls {@link #readFields(DataInput)}
 * and returns the instance.</p>
 * <p>Example:</p>
 * <p><blockquote><pre>
 *     public class MyWritableComparable implements WritableComparable<MyWritableComparable> {
 *       // Some data
 *       private int counter;
 *       private long timestamp;
 *
 *       public void write(DataOutput out) throws IOException {
 *         out.writeInt(counter);
 *         out.writeLong(timestamp);
 *       }
 *
 *       public void readFields(DataInput in) throws IOException {
 *         counter = in.readInt();
 *         timestamp = in.readLong();
 *       }
 *
 *       public int compareTo(MyWritableComparable o) {
 *         int thisValue = this.counter;
 *         int thatValue = o.counter;
 *         return (thisValue < thatValue ? -1 : (thisValue == thatValue ? 0 : 1));
 *       }
 *
 *       public int hashCode() {
 *         final int prime = 31;
 *         int result = 1;
 *         result = prime * result + counter;
 *         result = prime * result + (int) (timestamp ^ (timestamp >>> 32));
 *         return result;
 *       }
 *     }
 * </pre></blockquote></p>
 */
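The write/readFields/compareTo pattern can be tried without Hadoop on the classpath. The sketch below is a hypothetical DemoKey class that uses only java.io; it does not implement the real WritableComparable interface, and ordering by counter first and timestamp second is my own assumption for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical key class following the Writable pattern with plain java.io types. */
class DemoKey implements Comparable<DemoKey> {
    private int counter;
    private long timestamp;

    DemoKey() {}                        // no-arg constructor, needed before readFields

    DemoKey(int counter, long timestamp) {
        this.counter = counter;
        this.timestamp = timestamp;
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(counter);
        out.writeLong(timestamp);
    }

    void readFields(DataInput in) throws IOException {
        counter = in.readInt();         // fields are read in the order they were written
        timestamp = in.readLong();
    }

    /** Static factory: build a new instance, call readFields, return it. */
    static DemoKey read(DataInput in) throws IOException {
        DemoKey key = new DemoKey();
        key.readFields(in);
        return key;
    }

    @Override
    public int compareTo(DemoKey o) {
        int byCounter = Integer.compare(counter, o.counter);
        return byCounter != 0 ? byCounter : Long.compare(timestamp, o.timestamp);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof DemoKey && compareTo((DemoKey) other) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * (31 + counter) + Long.hashCode(timestamp);
    }

    /** Serializes the key to bytes and reads it back, to check that both sides agree. */
    static DemoKey roundTrip(DemoKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A key that survives roundTrip unchanged shows that write and readFields agree on field order, which is the property the framework relies on when it rebuilds keys from bytes before comparing them.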
class WritableComparator implements RawComparator, Configurable
/**
 * A Comparator for {@link WritableComparable}s.
 *
 * <p>This base implementation uses the natural ordering. To define alternate
 * orderings, override {@link #compare(WritableComparable,WritableComparable)}.
 *
 * <p>One may optimize compare-intensive operations by overriding
 * {@link #compare(byte[],int,int,byte[],int,int)}. Static utility methods are
 * provided to assist in optimized implementations of this method.
 */
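The "override compare to define an alternate ordering" idea can be shown in miniature. The two classes below are hypothetical stand-ins, not Hadoop's WritableComparator: the base class compares by natural ordering and the subclass overrides that one method to get a descending order.

```java
/** Hypothetical base comparator: natural ordering, like the default described above. */
class NaturalOrderComparator<T extends Comparable<T>> {
    public int compare(T a, T b) {
        return a.compareTo(b);
    }
}

/** Alternate ordering obtained by overriding compare: descending instead of ascending. */
class DescendingComparator<T extends Comparable<T>> extends NaturalOrderComparator<T> {
    @Override
    public int compare(T a, T b) {
        // Swapping the arguments reverses the order without negating the result,
        // which avoids the Integer.MIN_VALUE pitfall of "-compare(a, b)".
        return super.compare(b, a);
    }
}
```

For example, `list.sort(new DescendingComparator<Integer>()::compare)` sorts a list of integers from largest to smallest, while the base class keeps the ascending order.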