1. WordCount

  Job class:

package com.simope.mr.wcFor;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Counts the occurrences of each word in a text file.
 * @author JimLy
 * @see 20150127
 */
public class WcForJob {

    public static void main(String[] args) {

        Configuration conf = new Configuration();

        try {
            Job job = new Job(conf);

            job.setJobName("myWC");
            job.setJarByClass(WcForJob.class);
            job.setMapperClass(WcForMapper.class);
            job.setReducerClass(WcForReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path("/usr/input/myWc"));
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/myWc"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (Exception e) {
            System.out.println("Error: " + e);
        }
    }
}
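  Because word counts are plain sums, the reducer could also serve as a combiner and merge each map task's partial counts before the shuffle. A one-line addition along these lines (my suggestion; the original job does not set a combiner) would sit next to the other job.set calls:

// Hypothetical addition to WcForJob.main(): reuse the reducer as a combiner so
// each map task pre-aggregates its <word, 1> pairs before they are shuffled.
job.setCombinerClass(WcForReducer.class);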

  Mapper class:

package com.simope.mr.wcFor;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WcForMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();
        StringTokenizer st = new StringTokenizer(line);

        // Emit <word, 1> for every token in the line.
        while (st.hasMoreElements()) {
            context.write(new Text(st.nextToken()), new IntWritable(1));
        }
    }
}

  Reducer class:

package com.simope.mr.wcFor;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WcForReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> value, Context context)
            throws IOException, InterruptedException {

        // Sum the counts emitted for this word.
        int sum = 0;

        for (IntWritable i : value) {
            sum += i.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

  Input text: (screenshot omitted)

  Output: (screenshot omitted)


2. Single-Column Sort

  Job class:

package com.simope.mr.sort;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Sorts a list of numbers.
 * @author JimLy
 * @see 20160127
 */
public class SortJob {

    public static void main(String[] args) {

        Configuration conf = new Configuration();

        try {
            Job job = new Job(conf);

            job.setJobName("sortJob");
            job.setJarByClass(SortJob.class);
            job.setMapperClass(SortMapper.class);
            job.setReducerClass(SortReducer.class);
            job.setMapOutputKeyClass(IntWritable.class);
            job.setMapOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path("/usr/input/sort"));
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/sort"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (Exception e) {
            System.out.println("Error: " + e);
        }
    }
}

  Mapper class:

package com.simope.mr.sort;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SortMapper extends Mapper<LongWritable, Text, IntWritable, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();

        // Emit the number itself as the key so the framework sorts it during the shuffle.
        context.write(new IntWritable(Integer.parseInt(line)), new IntWritable(1));
    }
}

  Reducer class:

package com.simope.mr.sort;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class SortReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

    private static IntWritable lineNum = new IntWritable(1);

    @SuppressWarnings("unused")
    @Override
    protected void reduce(IntWritable key, Iterable<IntWritable> value, Context context)
            throws IOException, InterruptedException {

        // The same value may occur more than once, so write one output line per occurrence.
        for (IntWritable val : value) {
            context.write(lineNum, key);
            lineNum = new IntWritable(lineNum.get() + 1);
        }
    }
}
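  The framework already sorts the IntWritable keys before they reach the reducer, so the reducer only attaches a running rank. That static lineNum counter only yields one global ranking if every key lands in the same reduce task; a minimal safeguard (my own addition, not in the original SortJob) is to pin the job to a single reducer:

// Hypothetical addition to SortJob.main(): one reduce task means one ordered
// output file and a lineNum counter that never restarts.
job.setNumReduceTasks(1);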

  Input text: file1, file2, file3 (screenshots omitted)

  Output: (screenshot omitted)


3. Computing the Average Score per Subject

  Job class:

package com.simope.mr.average;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Computes the students' average scores.
 * @author JimLy
 * @see 20160127
 */
public class AveJob {

    public static void main(String[] args) {

        Configuration conf = new Configuration();

        try {
            Job job = new Job(conf);

            job.setJobName("AveJob");
            job.setJarByClass(AveJob.class);
            job.setMapperClass(AveMapper.class);
            job.setReducerClass(AveReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path("/usr/input/average"));
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/average"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (Exception e) {
            System.out.println("Error: " + e);
        }
    }
}

  Mapper class:

package com.simope.mr.average;

import java.io.IOException;
import java.io.UnsupportedEncodingException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AveMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    String line;

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        // The input files are GBK-encoded, so re-decode the raw bytes before splitting.
        line = changeTextToUTF8(value, "GBK").toString();

        String[] stuArr = line.split("\t");

        context.write(new Text(stuArr[0]), new IntWritable(Integer.parseInt(stuArr[1])));
    }

    /** Re-decodes the Text's raw bytes using the given source encoding. */
    public static Text changeTextToUTF8(Text text, String encoding) {
        String value = null;
        try {
            value = new String(text.getBytes(), 0, text.getLength(), encoding);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return new Text(value);
    }
}

  Reducer class:

package com.simope.mr.average;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class AveReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    int count, sum;

    @Override
    protected void reduce(Text key, Iterable<IntWritable> value, Context context)
            throws IOException, InterruptedException {

        sum = 0;
        count = 0;

        for (IntWritable i : value) {
            count++;
            sum += i.get();
        }
        // Integer division: the fractional part of the average is truncated.
        context.write(key, new IntWritable(sum / count));
    }
}
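  Note that sum / count is integer division, so an average such as 89.5 is written as 89. A small variant (a sketch of mine, keeping the original IntWritable output type) rounds to the nearest integer instead:

// Hypothetical replacement for the final write in AveReducer.reduce():
// round the average instead of truncating it.
context.write(key, new IntWritable(Math.round((float) sum / count)));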

  Input text: china, english, math (screenshots omitted)

  Output: (screenshot omitted)

  Note: the garbled characters were caused by Hadoop treating text as UTF-8 while my input files were GBK-encoded and had not been converted.


4. Family Tree

  Job class:

package com.simope.mr.grand;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Family tree: finds grandparent-grandchild pairs.
 * @author JimLy
 * @see 20160128
 */
public class GrandJob {

    public static void main(String[] args) {
        Configuration conf = new Configuration();

        try {
            Job job = new Job(conf);

            job.setJobName("GrandJob");
            job.setJarByClass(GrandJob.class);
            job.setMapperClass(GrandMapper.class);
            job.setReducerClass(GrandReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);

            FileInputFormat.addInputPath(job, new Path("/usr/input/grand"));
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/grand"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (Exception e) {
            System.out.println("Error: " + e);
        }
    }
}

  Mapper class:

package com.simope.mr.grand;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class GrandMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();
        String[] faArr = line.split("\t");

        if (faArr.length == 2) {
            // Skip the header row and emit each "parent_child" relation under both
            // names, so the reducer sees every relation a person takes part in.
            if (!faArr[0].equals("parent")) {
                context.write(new Text(faArr[0]), new Text(faArr[0] + "_" + faArr[1]));
                context.write(new Text(faArr[1]), new Text(faArr[0] + "_" + faArr[1]));
            }
        }
    }
}

  Reducer class:

package com.simope.mr.grand;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class GrandReducer extends Reducer<Text, Text, Text, Text> {

    private static int time = 0;

    @Override
    protected void reduce(Text key, Iterable<Text> value, Context context)
            throws IOException, InterruptedException {

        List<String> paList = new ArrayList<String>();
        List<String> chList = new ArrayList<String>();

        // Split every "parent_child" record for this person into its two sides.
        String info;
        String[] arr;
        for (Text i : value) {
            info = i.toString();
            arr = info.split("_");
            if (arr.length == 2) {
                paList.add(arr[0]);
                chList.add(arr[1]);
            }
        }

        // Write the header row once.
        if (time == 0) {
            context.write(new Text("grandParent"), new Text("grandChild"));
            time++;
        }

        // A match means the key person is the parent in record i and the child in
        // record j, so paList.get(j) is the grandparent and chList.get(i) the grandchild.
        for (int i = 0; i < paList.size(); i++) {
            for (int j = 0; j < chList.size(); j++) {
                if (paList.get(i).equals(chList.get(j))) {
                    context.write(new Text(paList.get(j)), new Text(chList.get(i)));
                    time++;
                }
            }
        }
    }
}

  Input text: file1, file2 (screenshots omitted)

  Output: (screenshot omitted)


5. Secondary Sort

  Job class:

package com.simope.mr.secOrder;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Secondary sort.
 * @author JimLy
 * @see 20160129
 */
public class SecOrderJob {

    public static void main(String[] args) {
        Configuration conf = new Configuration();

        try {
            Job job = new Job(conf);

            job.setJobName("SecOrderJob");
            job.setJarByClass(SecOrderJob.class);
            job.setMapperClass(SecOrderMapper.class);
            job.setReducerClass(SecOrderReducer.class);
            job.setMapOutputKeyClass(IntWritable.class);
            job.setMapOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path("/usr/input/secOrder"));
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/secOrder"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (Exception e) {
            System.out.println("Error: " + e);
        }
    }
}

  Mapper class:

package com.simope.mr.secOrder;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SecOrderMapper extends Mapper<LongWritable, Text, IntWritable, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();

        String[] numArr = line.split("\t");

        if (numArr.length == 2) {
            context.write(new IntWritable(Integer.parseInt(numArr[0])), new IntWritable(Integer.parseInt(numArr[1])));
        }
    }
}

  Reducer class:

package com.simope.mr.secOrder;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SecOrderReducer extends Reducer<IntWritable, IntWritable, IntWritable, Text> {

    @Override
    protected void reduce(IntWritable key, Iterable<IntWritable> value, Context context)
            throws IOException, InterruptedException {

        // Collect the second-column values for this key into a "#"-separated string.
        String str = "";

        for (IntWritable i : value) {
            str = str + "#" + i.get();
        }

        str = str.substring(1);

        String[] numArr = str.split("#");

        // Sort the values in ascending order with an exchange sort.
        String temp;

        for (int i = 0; i < numArr.length; i++) {
            for (int j = 0; j < numArr.length; j++) {
                if (Integer.parseInt(numArr[j]) > Integer.parseInt(numArr[i])) {
                    temp = numArr[i];
                    numArr[i] = numArr[j];
                    numArr[j] = temp;
                }
            }
        }

        for (int i = 0; i < numArr.length; i++) {
            context.write(key, new Text(numArr[i]));
        }
    }
}
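  Concatenating the values into a "#"-separated string only to split and sort it again works, but the reduce body can be written more directly. The sketch below (my own rewrite with the same per-key ascending output; it assumes java.util.Collections is imported alongside the List/ArrayList imports already present) collects the values into a list and sorts that:

// Hypothetical alternative body for SecOrderReducer.reduce(): same ascending
// order per key, without the intermediate "#"-separated string.
List<Integer> nums = new ArrayList<Integer>();
for (IntWritable i : value) {
    nums.add(i.get()); // copy the int, since Hadoop reuses the IntWritable instance
}
Collections.sort(nums); // ascending
for (Integer n : nums) {
    context.write(key, new Text(String.valueOf(n)));
}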

  Input text: (screenshot omitted)

  Output: (screenshot omitted)


6. Top 10 Hottest Days of Each Year, 1949-1955

  RunJob class:

package com.simope.mr;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RunJob {

    public static SimpleDateFormat SDF = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    static class HotMapper extends Mapper<LongWritable, Text, KeyPari, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();

            String[] ss = line.split("\t");

            if (ss.length == 2) {
                try {
                    // Parse the date column and the temperature column (digits before "C").
                    Date date = SDF.parse(ss[0]);
                    Calendar c = Calendar.getInstance();
                    c.setTime(date);
                    int year = c.get(Calendar.YEAR);
                    String hot = ss[1].substring(0, ss[1].indexOf("C"));
                    KeyPari kp = new KeyPari();
                    kp.setYear(year);
                    kp.setHot(Integer.parseInt(hot));
                    context.write(kp, value);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }

    static class HotReduce extends Reducer<KeyPari, Text, KeyPari, Text> {
        @Override
        protected void reduce(KeyPari kp, Iterable<Text> value, Context context)
                throws IOException, InterruptedException {

            // Values arrive hottest-first within each year (see SortHot and GroupHot),
            // so emitting only the first 10 gives the 10 hottest days of that year.
            int count = 0;
            for (Text v : value) {
                if (count++ == 10) {
                    break;
                }
                context.write(kp, v);
            }
        }
    }

    public static void main(String[] args) {

        Configuration conf = new Configuration();
        try {
            Job job = new Job(conf);
            job.setJobName("hot");
            job.setJarByClass(RunJob.class);
            job.setMapperClass(HotMapper.class);
            job.setReducerClass(HotReduce.class);
            job.setMapOutputKeyClass(KeyPari.class);
            job.setMapOutputValueClass(Text.class);

            job.setNumReduceTasks(2);
            job.setPartitionerClass(FirstPartition.class);
            job.setSortComparatorClass(SortHot.class);
            job.setGroupingComparatorClass(GroupHot.class);

            // Input directory (or file) for the MapReduce job.
            FileInputFormat.addInputPath(job, new Path("/usr/input/hot"));
            // Output directory for the job's results.
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/hot"));
            System.exit(job.waitForCompletion(true) ? 0 : 1);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

  FirstPartition class:

package com.simope.mr;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

/**
 * Custom partitioner.
 */
public class FirstPartition extends Partitioner<KeyPari, Text> {

    /**
     * num: number of reducers.
     * getPartition() takes the <key, value> pair and the reducer count num,
     * and returns the index of the reducer this record is assigned to.
     */
    public int getPartition(KeyPari key, Text value, int num) {

        return (key.getYear()) * 127 % num; // partition by year
    }
}

  SortHot class:

package com.simope.mr;

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

/**
 * Sort comparator.
 * Hadoop sorts by key in ascending order by default, so a custom comparator is needed
 * to order each year's records by temperature in descending order.
 * For a plain key, records with the same key value all reach the same reduce call;
 * for a composite key of the form <key1, key2>, partitioning on key1 sends records with
 * the same key1 to the same partition, but different key2 values would still fall into
 * different groups, which is why GroupHot below compares only the year.
 */
public class SortHot extends WritableComparator {

    public SortHot() {
        super(KeyPari.class, true);
    }

    @SuppressWarnings("rawtypes")
    public int compare(WritableComparable a, WritableComparable b) {
        KeyPari o1 = (KeyPari) a;
        KeyPari o2 = (KeyPari) b;
        int res = Integer.compare(o1.getYear(), o2.getYear()); // year ascending
        if (res != 0) {
            return res;
        }
        return -Integer.compare(o1.getHot(), o2.getHot()); // temperature descending
    }
}

  KeyPari class:

package com.simope.mr;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

/**
 * Composite key wrapping the year and the temperature.
 */
public class KeyPari implements WritableComparable<KeyPari> {

    private int year;
    private int hot;

    public int getYear() {
        return year;
    }
    public void setYear(int year) {
        this.year = year;
    }
    public int getHot() {
        return hot;
    }
    public void setHot(int hot) {
        this.hot = hot;
    }

    public void readFields(DataInput in) throws IOException {
        this.year = in.readInt();
        this.hot = in.readInt();
    }

    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeInt(hot);
    }

    public int compareTo(KeyPari keyPari) {
        int res = Integer.compare(year, keyPari.getYear());
        if (res != 0) {
            return res;
        }
        return Integer.compare(hot, keyPari.getHot());
    }

    public String toString() {
        return year + "\t" + hot;
    }

    public int hashCode() {
        return new Integer(year + hot).hashCode();
    }
}

  GroupHot class:

package com.simope.mr;

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

/**
 * Grouping comparator: compares only the year, so all records of a year
 * reach the same reduce call regardless of temperature.
 */
public class GroupHot extends WritableComparator {

    public GroupHot() {
        super(KeyPari.class, true);
    }

    @SuppressWarnings("rawtypes")
    public int compare(WritableComparable a, WritableComparable b) {
        KeyPari o1 = (KeyPari) a;
        KeyPari o2 = (KeyPari) b;
        return Integer.compare(o1.getYear(), o2.getYear()); // year ascending
    }
}

This is my first contact with Hadoop, so the code is probably not the most concise and there is surely room for optimization; pointers are welcome. After a dull first week of environment setup, I finally got to write some code this week...

If you repost this article, please credit the source: http://www.cnblogs.com/JimLy-BUG/
