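HBase uses Hadoop's HDFS as its storage layer, so its structure closely resembles Hadoop's master-slave model. The HBase Master Server manages all HRegion Servers but does not itself store any HBase data. A logical HBase table is stored as one or more Regions, each hosted on a single HRegion Server; the relationship between an HRegion Server and Regions is one-to-many. Each HRegion is physically divided into three parts: HMemcache, HLog, and HStore, which serve as the cache, the log, and the persistence layer respectively. Walking through a single update shows what each part does: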

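As the flow shows, a committed update is written to two places: the HMemcache and the HLog. The HMemcache is an in-memory cache that exists for efficiency, so that recently touched data can be read and modified quickly. The HLog is the transaction log that keeps the HMemcache and the HStore in sync. When the HRegion Server periodically issues a Flush Cache command, the data in the HMemcache is persisted to the HStore and the HMemcache is cleared. This is a fairly simple strategy for caching and synchronizing data; a more sophisticated one could borrow ideas from Java's garbage collection mechanism.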

When reading from a Region, the HMemcache is checked first; if the data is not found there, the HStore is read.
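The update and read flow described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not HBase code: the class `ToyRegion` and its attribute names are hypothetical, and plain Python dicts stand in for the log, the cache, and the store files.

```python
class ToyRegion:
    """Toy model of one region: a log, an in-memory cache, and flushed store files."""

    def __init__(self):
        self.hlog = []          # append-only list of (row, value) updates
        self.memcache = {}      # recent updates held in memory
        self.store_files = []   # one immutable dict per flush, newest last

    def put(self, row, value):
        # Every update goes to both the log and the in-memory cache.
        self.hlog.append((row, value))
        self.memcache[row] = value

    def flush(self):
        # Persist the cache as a new store file, then clear the cache.
        if self.memcache:
            self.store_files.append(dict(self.memcache))
            self.memcache.clear()

    def get(self, row):
        # Check the cache first, then the store files from newest to oldest.
        if row in self.memcache:
            return self.memcache[row]
        for store_file in reversed(self.store_files):
            if row in store_file:
                return store_file[row]
        return None


region = ToyRegion()
region.put("row1", "a")
region.flush()
region.put("row1", "b")
print(region.get("row1"))  # b (served from the cache, not the older store file)
```

Keeping the newest store file last means a read that misses the cache still sees the most recent flushed value first.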

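A few details:

1. Every Flush Cache produces a new HStore file, so the number of files in the HStore keeps growing, which affects performance. Once the file count reaches a configured threshold, the files are merged into one large file.

2. The cache size and the flush interval should be chosen with both memory consumption and performance in mind.

3. Each time an HRegion Server restarts, it reloads into the HMemcache any data in the HLog that has not yet been flushed to the HStore, so an oversized HMemcache also directly slows startup.

4. HStore files store their data using a B-tree algorithm, which is what supports the fast lookup of data within the same column family mentioned earlier.

5. An HRegion can be merged or split, depending on its size. While either operation is running, the HRegion is locked and unavailable.

6. The HBase Master Server obtains information about HRegion Servers and Regions from the Meta-info table. The topmost Region of Meta is a virtual one called the Root Region, through which the actual Regions below it can be found.

7. The client learns from the HBase Master Server which Region Server holds a Region and then talks to that Region Server directly. Region Servers do not communicate with one another; they interact only with the HBase Master Server, which monitors and manages them.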

http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

Architecture and Implementation

There are three major components of the HBase architecture:

The HBaseMaster (analogous to the Bigtable master server)
The HRegionServer (analogous to the Bigtable tablet server)
The HBase client, defined by org.apache.hadoop.hbase.client.HTable

Each will be discussed in the following sections.

HBaseMaster

The HBaseMaster is responsible for assigning regions to HRegionServers. The first region to be assigned is the ROOT region, which locates all the META regions to be assigned. Each META region maps a number of user regions, which comprise the multiple tables that a particular HBase instance serves. Once all the META regions have been assigned, the master will then assign user regions to the HRegionServers, attempting to balance the number of regions served by each HRegionServer.

It also holds a pointer to the HRegionServer that is hosting the ROOT region.
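The text says only that the master attempts to balance the number of regions per server; it does not say how. One simple way to picture that goal is a least-loaded assignment, sketched below. The function `assign_regions`, the server names, and the region names are all hypothetical, and this is not a description of the master's actual policy.

```python
import heapq


def assign_regions(regions, servers):
    """Assign each region to whichever server currently holds the fewest regions."""
    heap = [(0, server) for server in servers]   # (region count, server name)
    heapq.heapify(heap)
    assignment = {}
    for region in regions:
        load, server = heapq.heappop(heap)       # least-loaded server so far
        assignment[region] = server
        heapq.heappush(heap, (load + 1, server))
    return assignment


# ROOT first, then META, then user regions, in the order given by the text.
regions = ["ROOT", "META-1", "user-1", "user-2", "user-3"]
assignment = assign_regions(regions, ["rs1", "rs2"])
counts = {}
for server in assignment.values():
    counts[server] = counts.get(server, 0) + 1
print(counts)
```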

The HBaseMaster also monitors the health of each HRegionServer, and if it detects that an HRegionServer is no longer reachable, it will split the HRegionServer's write-ahead log so that there is now one write-ahead log for each region that the HRegionServer was serving. After it has accomplished this, it will reassign the regions that were being served by the unreachable HRegionServer.
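Splitting a server-wide log into one log per region amounts to grouping the edits by region while preserving their order. A minimal sketch, assuming each log entry is a `(region, row, value)` tuple (a made-up layout, as are the function and region names):

```python
from collections import defaultdict


def split_log(shared_log):
    """Group edits from one server-wide log into one ordered log per region."""
    per_region = defaultdict(list)
    for region_name, row, value in shared_log:
        per_region[region_name].append((row, value))
    return dict(per_region)


shared_log = [
    ("regionA", "r1", "v1"),
    ("regionB", "r9", "v2"),
    ("regionA", "r2", "v3"),
]
logs = split_log(shared_log)
print(logs["regionA"])  # [('r1', 'v1'), ('r2', 'v3')]
```

Once each region has its own log, whichever server receives the region can replay just that log.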

In addition, the HBaseMaster is also responsible for handling table administrative functions such as on/off-lining of tables, changes to the table schema (adding and removing column families), etc.

Unlike Bigtable, currently, when the HBaseMaster dies, the cluster will shut down. In Bigtable, a Tabletserver can still serve Tablets after its connection to the Master has died. We tie them together, because we do not currently use an external lock-management system like Bigtable. The Bigtable Master allocates tablets and a lock manager (Chubby) guarantees atomic access by Tabletservers to tablets. HBase uses just a single central point for all HRegionServers to access: the HBaseMaster.

The META Table

The META table stores information about every user region in HBase. This includes an HRegionInfo object, containing information such as the start and end row keys and whether the region is on-line or off-line, and the address of the HRegionServer that is currently serving the region. The META table can grow as the number of user regions grows.

The ROOT Table

The ROOT table is confined to a single region and maps all the regions in the META table. Like the META table, it contains an HRegionInfo object for each META region and the location of the HRegionServer that is serving that META region.

Each row in the ROOT and META tables is approximately 1KB in size. At the default region size of 256MB, this means that the ROOT region can map 2.6 x 10⁵ META regions, which in turn map a total of 6.9 x 10¹⁰ user regions, or approximately 1.8 x 10¹⁹ (2⁶⁴) bytes of user data.
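The arithmetic behind these figures can be checked directly. The sketch below assumes exactly 1KB (1024 bytes) per catalog row and exactly 256MB per region, as the text approximates; the variable names are illustrative.

```python
ROW_SIZE = 1024                # ~1KB per ROOT/META row (assumed exact here)
REGION_SIZE = 256 * 2**20      # default region size of 256MB

rows_per_region = REGION_SIZE // ROW_SIZE        # catalog rows that fit in one region
meta_regions = rows_per_region                   # the single ROOT region maps this many META regions
user_regions = meta_regions * rows_per_region    # each META region maps this many user regions
user_bytes = user_regions * REGION_SIZE          # total addressable user data

print(f"{meta_regions:.1e}")   # 2.6e+05
print(f"{user_regions:.1e}")   # 6.9e+10
print(f"{user_bytes:.1e}")     # 1.8e+19
print(user_bytes == 2**64)     # True
```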

HRegionServer

The HRegionServer is responsible for handling client read and write requests. It communicates with the HBaseMaster to get a list of regions to serve and to tell the master that it is alive. Region assignments and other instructions from the master "piggy back" on the heartbeat messages.

Write Requests

When a write request is received, it is first written to a write-ahead log called an HLog. All write requests for every region the region server is serving are written to the same log. Once the request has been written to the HLog, it is stored in an in-memory cache called the Memcache. There is one Memcache for each HStore.

Read Requests

Reads are handled by first checking the Memcache and if the requested data is not found, the MapFiles are searched for results.

Cache Flushes

When the Memcache reaches a configurable size, it is flushed to disk, creating a new MapFile, and a marker is written to the HLog so that, when the log is replayed, entries before the last flush can be skipped. A flush may also be triggered to relieve memory pressure on the region server.

Cache flushes happen concurrently with the region server processing read and write requests. Just before the new MapFile is moved into place, reads and writes are suspended until the MapFile has been added to the list of active MapFiles for the HStore.
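The role of the flush marker during log replay can be illustrated as follows. This is a sketch under assumptions: the entry layout `(kind, row, value)`, the `FLUSH` constant, and the `replay` function are hypothetical and do not reflect the real HLog format.

```python
FLUSH = "FLUSH"   # hypothetical marker kind written to the log after a flush
PUT = "PUT"       # hypothetical kind for an ordinary update


def replay(log):
    """Rebuild the in-memory cache, skipping edits that precede the last flush marker."""
    last_flush = -1
    for index, entry in enumerate(log):
        if entry[0] == FLUSH:
            last_flush = index
    memcache = {}
    # Only edits after the last marker were never persisted to a MapFile.
    for kind, row, value in log[last_flush + 1:]:
        memcache[row] = value
    return memcache


log = [
    (PUT, "r1", "a"),
    (PUT, "r2", "b"),
    (FLUSH, None, None),   # r1=a and r2=b are already on disk
    (PUT, "r1", "c"),
]
recovered = replay(log)
print(recovered)  # {'r1': 'c'}
```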

Compactions

When the number of MapFiles exceeds a configurable threshold, a minor compaction is performed, which consolidates the most recently written MapFiles. A major compaction is performed periodically, which consolidates all the MapFiles into a single MapFile. The reason for not always performing a major compaction is that the oldest MapFile can be quite large, and merging it with the latest MapFiles, which are much smaller, can be very time consuming due to the amount of I/O involved in reading, merging, and writing the contents of the largest MapFile.

Compactions happen concurrently with the region server processing read and write requests. Just before the new MapFile is moved into place, reads and writes are suspended until the MapFile has been added to the list of active MapFiles for the HStore and the MapFiles that were merged to create the new MapFile have been removed.
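The difference between the two kinds of compaction can be shown with a toy model in which each MapFile is a Python dict and the list is ordered oldest first. The function names and the `count` parameter are illustrative assumptions; how many recent files a minor compaction actually picks is not stated here.

```python
def merge(files):
    """Merge files (oldest first) into one sorted file; newer values win."""
    merged = {}
    for map_file in files:
        merged.update(map_file)
    return dict(sorted(merged.items()))


def minor_compaction(files, count):
    """Consolidate only the `count` most recently written files (count >= 1)."""
    return files[:-count] + [merge(files[-count:])]


def major_compaction(files):
    """Consolidate every file into a single file."""
    return [merge(files)]


files = [{"a": 1, "b": 1, "c": 1, "d": 1}, {"b": 2}, {"c": 3}]
after_minor = minor_compaction(files, 2)
after_major = major_compaction(files)
print(after_minor)  # [{'a': 1, 'b': 1, 'c': 1, 'd': 1}, {'b': 2, 'c': 3}]
print(after_major)  # [{'a': 1, 'b': 2, 'c': 3, 'd': 1}]
```

The minor compaction leaves the large, oldest file untouched, which is exactly the I/O it is meant to avoid.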

Region Splits

When the aggregate size of the MapFiles for an HStore reaches a configurable size (currently 256MB), a region split is requested. Region splits divide the row range of the parent region in half and happen very quickly because the child regions read from the parent's MapFile.

The parent region is taken off-line, the region server records the new child regions in the META region and the master is informed that a split has taken place so that it can assign the children to region servers. Should the split message be lost, the master will discover the split has occurred since it periodically scans the META regions for unassigned regions.

Once the parent region is closed, read and write requests for the region are suspended. The client has a mechanism for detecting a region split and will wait and retry the request when the new children are on-line.

When a compaction is triggered in a child, the data from the parent is copied to the child. When both children have performed a compaction, the parent region is garbage collected.
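A split that divides the parent's row range in half can be sketched as below. The text does not say how the midpoint is chosen, so taking the middle row key is an assumption, as are the function names and the convention that `""` marks the first start key and `None` an open end key. The children filter the parent's rows rather than copying them, mirroring how they initially read from the parent's MapFile.

```python
def split_region(start_key, end_key, sorted_rows):
    """Split [start_key, end_key) into two child ranges at the middle row key."""
    mid_key = sorted_rows[len(sorted_rows) // 2]
    return (start_key, mid_key), (mid_key, end_key)


def rows_in(child_range, parent_rows):
    """Rows of the parent visible through a child range (no data is copied)."""
    start, end = child_range
    return [row for row in parent_rows if start <= row and (end is None or row < end)]


parent_rows = ["a", "b", "c", "d"]
lower, upper = split_region("", None, parent_rows)
print(lower, rows_in(lower, parent_rows))  # ('', 'c') ['a', 'b']
print(upper, rows_in(upper, parent_rows))  # ('c', None) ['c', 'd']
```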

HBase Client

The HBase client is responsible for finding HRegionServers that are serving the particular row range of interest. On instantiation, the HBase client communicates with the HBaseMaster to find the location of the ROOT region. This is the only communication between the client and the master.

Once the ROOT region is located, the client contacts that region server and scans the ROOT region to find the META region that will contain the location of the user region that contains the desired row range. It then contacts the region server that is serving that META region and scans that META region to determine the location of the user region.

After locating the user region, the client contacts the region server serving that region and issues the read or write request.

This information is cached in the client so that subsequent requests need not go through this process.

Should a region be reassigned either by the master for load balancing or because a region server has died, the client will rescan the META table to determine the new location of the user region. If the META region has been reassigned, the client will rescan the ROOT region to determine the new location of the META region. If the ROOT region has been reassigned, the client will contact the master to determine the new ROOT region location and will locate the user region by repeating the original process described above.
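The ROOT to META to user-region lookup, together with client-side caching, can be sketched as a toy model. Everything here is hypothetical: `ToyClient`, the `tables` dict standing in for remote scans, and the server names. For brevity the cache is keyed by row, whereas a real client would more plausibly cache by region row range.

```python
import bisect


def find_region(catalog, row):
    """catalog: list of (start_key, location) sorted by start_key."""
    starts = [start for start, _ in catalog]
    index = bisect.bisect_right(starts, row) - 1   # last region starting at or before row
    return catalog[index][1]


class ToyClient:
    def __init__(self, root_location, tables):
        self.root_location = root_location   # the one fact obtained from the master
        self.tables = tables                 # location -> catalog rows hosted there
        self.cache = {}
        self.lookups = 0

    def locate(self, row):
        if row in self.cache:
            return self.cache[row]
        self.lookups += 1
        meta_location = find_region(self.tables[self.root_location], row)   # scan ROOT
        user_location = find_region(self.tables[meta_location], row)        # scan META
        self.cache[row] = user_location
        return user_location

    def invalidate(self, row):
        """Forget a cached location, e.g. after the region was reassigned."""
        self.cache.pop(row, None)


tables = {
    "rs1": [("", "rs2")],                  # ROOT region: one META region, served by rs2
    "rs2": [("", "rs3"), ("m", "rs4")],    # META region: two user regions
}
client = ToyClient("rs1", tables)
print(client.locate("apple"))   # rs3
print(client.locate("zebra"))   # rs4
print(client.locate("apple"))   # rs3, answered from the cache
print(client.lookups)           # 2
```

After `invalidate`, the next `locate` for that row walks the catalog again, which corresponds to the rescan described above.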

HBase Introduction (3): Architecture and Workflow