HBase -> MapReduce -> HBase

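HBase extends the MapReduce API so that MapReduce jobs can easily read from and write to HTable data. The job below scans the blog table, emits one ("rowkey&title", 1) pair per row in the mapper, sums the pairs per key in the reducer, and writes the totals to the blog2 table.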

package taglib.customer;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class MrHbase {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = HBaseConfiguration.create();
        // ZooKeeper quorum of the target HBase cluster
        conf.set("hbase.zookeeper.quorum", "192.168.58.101");

        // new Job(conf, name) is deprecated in Hadoop 2.x and later,
        // where Job.getInstance(conf, name) is preferred
        Job job = new Job(conf, "ExampleSummary");
        job.setJarByClass(MrHbase.class);     // class that contains mapper and reducer

        Scan scan = new Scan();
        scan.setCaching(500);        // rows fetched per RPC; a low default (1 in older HBase releases) is bad for MapReduce jobs
        scan.setCacheBlocks(false);  // don't set to true for MR jobs
        // set other scan attrs
        // scan.addColumn(family, qualifier);

        TableMapReduceUtil.initTableMapperJob(
                "blog",             // input table
                scan,               // Scan instance to control CF and attribute selection
                MyMapper.class,     // mapper class
                Text.class,         // mapper output key
                IntWritable.class,  // mapper output value
                job);

        TableMapReduceUtil.initTableReducerJob(
                "blog2",                 // output table
                MyTableReducer.class,    // reducer class
                job);

        job.setNumReduceTasks(1);   // at least one, adjust as required

        boolean b = job.waitForCompletion(true);
        if (!b) {
            throw new IOException("error with job!");
        }
    }

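    public static class MyMapper extends TableMapper<Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text text = new Text();

        @Override
        public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
            // Row key of the current blog row
            String ip = Bytes.toString(row.get());
            // getValue returns null when the row has no article:title cell
            byte[] title = value.getValue(Bytes.toBytes("article"), Bytes.toBytes("title"));
            if (title == null) {
                return;  // skip the row instead of failing with a NullPointerException
            }
            String url = Bytes.toString(title);
            text.set(ip + "&" + url);
            context.write(text, ONE);
        }
    }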

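    public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            // Text.getBytes() returns the backing array, which can be longer than the valid data,
            // so the row key is built from the decoded string instead
            Put put = new Put(Bytes.toBytes(key.toString()));
            // In HBase 1.0 and later, put.addColumn(family, qualifier, value) replaces put.add(...)
            put.add(Bytes.toBytes("article"), Bytes.toBytes("title"), Bytes.toBytes(String.valueOf(sum)));
            // TableOutputFormat ignores the output key, so null is acceptable here
            context.write(null, put);
        }
    }

}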
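
To make the data flow of the job easier to follow, here is a rough plain-Java sketch of what the mapper and reducer compute, with no Hadoop or HBase involved. The class name SummarySketch, the method names, and the sample rows are all hypothetical and only for illustration; the in-memory map stands in for the blog table (row key -> article:title).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the MapReduce job above; not part of any HBase API.
class SummarySketch {

    // Map step: build the same "rowKey&title" key that MyMapper emits.
    static String mapKey(String rowKey, String title) {
        return rowKey + "&" + title;
    }

    // Shuffle and reduce steps: sum the 1s emitted for each distinct key.
    static Map<String, Integer> summarize(Map<String, String> titlesByRowKey) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, String> row : titlesByRowKey.entrySet()) {
            counts.merge(mapKey(row.getKey(), row.getValue()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for the blog table
        Map<String, String> blog = new LinkedHashMap<>();
        blog.put("1", "Head First HBase");
        blog.put("2", "Hadoop Basics");

        // Each entry would become one row in blog2, with the count stored in article:title
        for (Map.Entry<String, Integer> e : summarize(blog).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because the row key is part of the output key and row keys are unique within a table, every key in this sketch ends up with a count of 1. To get real aggregation, the mapper would need to emit a key that repeats across rows, for example the title alone.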
