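HBase (7): HBase Filters
I. Filters (Filter)
The query operations in the basic API are weak when they face large volumes of data, so HBase provides a more advanced query mechanism: Filter. A Filter can select data by column family, column, version, and other conditions. Because HBase keeps data sorted in three dimensions (by row key, by column, and by version), these Filters can do their work efficiently. An RPC query that carries a Filter has that Filter distributed to every RegionServer, so it is a server-side filter, which also reduces the amount of data sent over the network.
A filter operation needs at least two parameters. The first is an abstract operator; HBase provides an enum for these operators: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, and so on. The second is a concrete comparator (Comparator) that carries the actual comparison logic, for example byte-level comparison or string-level comparison. With these two parameters we can define the filter condition precisely and filter the data.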
1. Abstract operators (comparison operators)
LESS <
LESS_OR_EQUAL <=
EQUAL =
NOT_EQUAL <>
GREATER_OR_EQUAL >=
GREATER >
NO_OP excludes everything
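The operator table can be read as a rule over the sign of a comparison result. The sketch below is a simplified model in plain Java, not HBase's own enum or filter code; the names `CompareOpModel`, `Op`, and `keeps` are made up for illustration. It assumes `cmp` is the result of comparing the cell's data against the reference value.

```java
/** Simplified model of the comparison operators; these are not HBase's own classes. */
final class CompareOpModel {
    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

    /**
     * Returns true when a cell is kept. cmp is the result of comparing the
     * cell's data with the reference value: negative, zero, or positive.
     */
    static boolean keeps(Op op, int cmp) {
        switch (op) {
            case LESS:             return cmp < 0;
            case LESS_OR_EQUAL:    return cmp <= 0;
            case EQUAL:            return cmp == 0;
            case NOT_EQUAL:        return cmp != 0;
            case GREATER_OR_EQUAL: return cmp >= 0;
            case GREATER:          return cmp > 0;
            case NO_OP:            return false; // excludes everything
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        int cmp = "95008".compareTo("95007"); // positive: "95008" sorts after "95007"
        System.out.println(keeps(Op.GREATER, cmp)); // true
        System.out.println(keeps(Op.LESS, cmp));    // false
        System.out.println(keeps(Op.NO_OP, cmp));   // false
    }
}
```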
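2. Comparators (specify the comparison mechanism)
BinaryComparator compares the given byte array byte by byte in lexicographic order, using Bytes.compareTo(byte[], byte[])
BinaryPrefixComparator same as above, but only checks whether the leading (left-most) bytes match
NullComparator checks whether the given value is null
BitComparator compares bit by bit
RegexStringComparator provides a regular-expression comparator; supports only EQUAL and NOT_EQUAL
SubstringComparator checks whether the given substring occurs in the value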
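To make the comparator descriptions concrete, here is a minimal plain-Java sketch of four of them. It is a model under stated assumptions, not HBase's implementation: the class and method names are hypothetical, the substring test is assumed to be case-insensitive, and the regex test is assumed to look for a match anywhere in the value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Pattern;

/** Simplified models of four comparators; these are not HBase's own classes. */
final class ComparatorModel {
    /** Unsigned, byte-by-byte lexicographic comparison (the BinaryComparator idea). */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // the shorter array sorts first
    }

    /** Compares only the leading prefix.length bytes of data (the BinaryPrefixComparator idea). */
    static int comparePrefix(byte[] data, byte[] prefix) {
        int n = Math.min(data.length, prefix.length);
        return compareBytes(Arrays.copyOf(data, n), prefix);
    }

    /** Case-insensitive substring test (the SubstringComparator idea; case handling is an assumption). */
    static boolean containsSubstring(byte[] data, String substring) {
        String text = new String(data, StandardCharsets.UTF_8).toLowerCase(Locale.ROOT);
        return text.contains(substring.toLowerCase(Locale.ROOT));
    }

    /** Regex test that accepts a match anywhere in the value (the RegexStringComparator idea). */
    static boolean matchesRegex(byte[] data, String regex) {
        return Pattern.compile(regex).matcher(new String(data, StandardCharsets.UTF_8)).find();
    }

    public static void main(String[] args) {
        byte[] row = "95017".getBytes(StandardCharsets.UTF_8);
        System.out.println(compareBytes(row, "95007".getBytes(StandardCharsets.UTF_8)) > 0);  // true
        System.out.println(comparePrefix(row, "9501".getBytes(StandardCharsets.UTF_8)) == 0); // true
        System.out.println(containsSubstring(row, "501"));                                    // true
        System.out.println(matchesRegex(row, "^95\\d+7$"));                                   // true
    }
}
```

The `& 0xFF` mask matters: Java bytes are signed, so without it a byte such as 0x80 would sort before 0x7F.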
II. Categories of HBase filters
1. Comparison filters
(1) Row key filter: RowFilter
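import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only rows whose row key sorts after "95007"
            Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator(Bytes.toBytes("95007")));
            scan.setFilter(rowFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}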
(2) Column family filter: FamilyFilter
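import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells that belong to the column family "info"
            Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("info")));
            scan.setFilter(familyFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}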
(3) Column (qualifier) filter: QualifierFilter
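import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose column qualifier equals "name"
            Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("name")));
            scan.setFilter(qualifierFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}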
(4) Value filter: ValueFilter
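import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.filter.ValueFilter;

public class HbaseFilterTest {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose value contains the substring "男"
            Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男"));
            scan.setFilter(valueFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}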
(5) Timestamp filter: TimestampsFilter
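import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.TimestampsFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose timestamp equals one of the listed values
            List<Long> list = new ArrayList<>();
            list.add(1522469029503L);
            TimestampsFilter timestampsFilter = new TimestampsFilter(list);
            scan.setFilter(timestampsFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneFamily(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneValue(cell))
                                + "\t" + cell.getTimestamp());
                    }
                }
            }
        }
    }
}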
2. Dedicated filters
(1) Single column value filter: SingleColumnValueFilter
Returns the entire row when the condition on the tested column is met.
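import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Match rows whose info:name value contains "刘晨"
            SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
                    Bytes.toBytes("info"),
                    Bytes.toBytes("name"),
                    CompareOp.EQUAL,
                    new SubstringComparator("刘晨"));
            // Unless this is set to true, rows that lack the specified column are returned as well
            singleColumnValueFilter.setFilterIfMissing(true);
            scan.setFilter(singleColumnValueFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneFamily(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneValue(cell))
                                + "\t" + cell.getTimestamp());
                    }
                }
            }
        }
    }
}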
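The effect of setFilterIfMissing is easy to get backwards, so here is a small plain-Java model of the row-level decision. It is only a sketch: the class `SingleColumnValueModel`, the method `keepsRow`, the "family:qualifier" map keys, and the sample values are all hypothetical, and it covers only the EQUAL-with-substring case used in the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Simplified model of the row-level decision; not HBase's own classes. */
final class SingleColumnValueModel {
    /**
     * row maps "family:qualifier" to a value. Returns true when the whole row
     * is returned for an EQUAL test with a substring comparator.
     */
    static boolean keepsRow(Map<String, String> row, String column, String substring, boolean filterIfMissing) {
        String value = row.get(column);
        if (value == null) {
            return !filterIfMissing; // column absent: kept unless filterIfMissing is true
        }
        return value.toLowerCase(Locale.ROOT).contains(substring.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> withName = new HashMap<>();
        withName.put("info:name", "Liu Chen"); // hypothetical sample data
        withName.put("info:age", "19");
        Map<String, String> withoutName = new HashMap<>();
        withoutName.put("info:age", "20");

        System.out.println(keepsRow(withName, "info:name", "Liu", true));     // true
        System.out.println(keepsRow(withoutName, "info:name", "Liu", false)); // true
        System.out.println(keepsRow(withoutName, "info:name", "Liu", true));  // false
    }
}
```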
(2) Single column value exclude filter: SingleColumnValueExcludeFilter
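import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Match rows whose info:name value contains "刘晨"; the tested column itself is not returned
            SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
                    Bytes.toBytes("info"),
                    Bytes.toBytes("name"),
                    CompareOp.EQUAL,
                    new SubstringComparator("刘晨"));
            // Drop rows that lack the info:name column
            singleColumnValueExcludeFilter.setFilterIfMissing(true);
            scan.setFilter(singleColumnValueExcludeFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneFamily(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneValue(cell))
                                + "\t" + cell.getTimestamp());
                    }
                }
            }
        }
    }
}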
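The difference from SingleColumnValueFilter is that the tested column is used for the decision but left out of the returned row. A minimal plain-Java model of that behavior follows; the class name, the `apply` method, the "family:qualifier" map keys, and the sample values are hypothetical, and only the EQUAL-with-substring case is modeled.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Simplified model of the exclude variant; not HBase's own classes. */
final class SingleColumnValueExcludeModel {
    /** Returns the row without the tested column, or empty when the row is filtered out. */
    static Optional<Map<String, String>> apply(Map<String, String> row, String column,
                                               String substring, boolean filterIfMissing) {
        String value = row.get(column);
        boolean keep = (value == null)
                ? !filterIfMissing
                : value.toLowerCase(Locale.ROOT).contains(substring.toLowerCase(Locale.ROOT));
        if (!keep) {
            return Optional.empty();
        }
        Map<String, String> out = new LinkedHashMap<>(row);
        out.remove(column); // the tested column is not emitted
        return Optional.of(out);
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("info:name", "Liu Chen"); // hypothetical sample data
        row.put("info:age", "19");
        System.out.println(apply(row, "info:name", "Liu", true));  // Optional[{info:age=19}]
        System.out.println(apply(row, "info:name", "Wang", true)); // Optional.empty
    }
}
```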
(3) Prefix filter: PrefixFilter (applies to row keys)
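import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only rows whose row key starts with "9501"
            PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes("9501"));
            scan.setFilter(prefixFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneFamily(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneValue(cell))
                                + "\t" + cell.getTimestamp());
                    }
                }
            }
        }
    }
}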
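Because row keys are sorted, a prefix match corresponds to one contiguous key range. As far as I understand, a scan that relies on the filter alone still starts reading at the beginning of the table, so a common companion technique is to bound the scan with a start row (the prefix itself) and a stop row computed from the prefix. The sketch below shows only that stop-row computation in plain Java; `PrefixRange` and `stopRowForPrefix` are hypothetical names, not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Computes the exclusive upper bound of the key range covered by a prefix. */
final class PrefixRange {
    /**
     * Returns the smallest key that sorts after every key starting with prefix,
     * or an empty array when no such key exists (the prefix is empty or all 0xFF).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented, so skip over them
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0]; // no upper bound: scan to the end of the table
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++; // increment the last byte that is not 0xFF
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("9501".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // 9502
    }
}
```

With the prefix "9501", the range from "9501" (inclusive) to "9502" (exclusive) contains exactly the row keys that begin with "9501".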
(4) Column prefix filter: ColumnPrefixFilter
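import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {
    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "node21:2181,node22:2181,node23:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);
        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose column qualifier starts with "name"
            ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name"));
            scan.setFilter(columnPrefixFilter);
            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneFamily(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                + "\t" + Bytes.toString(CellUtil.cloneValue(cell))
                                + "\t" + cell.getTimestamp());
                    }
                }
            }
        }
    }
}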
(5) Page filter: PageFilter
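The source gives no example for this filter, so what follows is only a hedged sketch of the paging idea in plain Java, not HBase code. PageFilter limits the number of rows returned per page; the usual way to fetch the next page is to start the next scan just after the last row key already seen, which can be done by appending a zero byte to that key. Note that the filter is applied on each region server separately, so a real client may receive more rows than the page size and may need to trim the result. The names `PagingModel`, `page`, and `nextStartRow` are hypothetical, and row keys are modeled as ASCII strings in a sorted set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Simplified model of row-key paging over a sorted key set; not HBase's own classes. */
final class PagingModel {
    /** Returns up to pageSize row keys at or after startRow (inclusive). */
    static List<String> page(NavigableSet<String> rowKeys, String startRow, int pageSize) {
        List<String> out = new ArrayList<>();
        for (String key : rowKeys.tailSet(startRow, true)) {
            if (out.size() == pageSize) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    /** Smallest key that sorts strictly after lastRow: lastRow plus a zero character. */
    static String nextStartRow(String lastRow) {
        return lastRow + '\u0000';
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>(
                Arrays.asList("95001", "95002", "95003", "95004", "95005"));
        String start = ""; // the empty key sorts before every row key
        while (true) {
            List<String> page = page(keys, start, 2);
            if (page.isEmpty()) {
                break;
            }
            System.out.println(page);
            start = nextStartRow(page.get(page.size() - 1));
        }
    }
}
```

Run as written, the loop prints three pages: [95001, 95002], [95003, 95004], and [95005].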
PHP开启异步多线程执行脚本 装载自:http://www.cnblogs.com/clphp/p/4913214.html 场景要求 客户端调用服务器a.php接口,需要执行一个长达5s-20s不 ...