A Region Split request is triggered after a Region's MemStore flush completes:

boolean shouldCompact = region.flushcache();

// We just want to check the size
boolean shouldSplit = region.checkSplit() != null;
if (shouldSplit) {
  // The Region is large enough: the split request is issued here
} else if (shouldCompact) {
  server.compactSplitThread.requestCompaction(region, getName());
}

server.getMetrics().addFlush(region.getRecentFlushInfo());

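Once the Region flush has finished, checkSplit is evaluated. A non-null return value (the value is the Region's split point) means the Region meets the conditions for a split, and the corresponding split request is issued.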
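The decision taken after a flush can be sketched in isolation. This is only an illustration: the class `PostFlushDecisionSketch`, the enum `Action` and the method `decide` are hypothetical names and not part of HBase. It assumes, as described above, that a non-null split point takes precedence over a pending compaction.

```java
/** Illustrative sketch of the post-flush decision; not HBase code. */
final class PostFlushDecisionSketch {

    /** Hypothetical outcome of the check that follows a flush. */
    enum Action { SPLIT, COMPACT, NONE }

    /**
     * @param splitPoint    result of a checkSplit-like call; null means "do not split"
     * @param shouldCompact flag returned by the flush
     */
    static Action decide(byte[] splitPoint, boolean shouldCompact) {
        if (splitPoint != null) {
            // A split is requested first; compaction is only considered otherwise.
            return Action.SPLIT;
        }
        return shouldCompact ? Action.COMPACT : Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(new byte[] {42}, true));
        System.out.println(decide(null, true));
        System.out.println(decide(null, false));
    }
}
```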


/**
 * Return the splitpoint. null indicates the region isn't splittable. If the
 * splitpoint isn't explicitly specified, it will go over the stores to find
 * the best splitpoint. Currently the criteria of best splitpoint is based
 * on the size of the store.
 */
public byte[] checkSplit() {
  // Can't split ROOT/META
  if (this.regionInfo.isMetaTable()) {
    if (shouldForceSplit()) {
      LOG.warn("Cannot split root/meta regions in HBase 0.20 and above");
    }
    return null;
  }

  if (!splitPolicy.shouldSplit()) {
    return null;
  }

  byte[] ret = splitPolicy.getSplitPoint();
  if (ret != null) {
    try {
      checkRow(ret, "calculated split");
    } catch (IOException e) {
      LOG.error("Ignoring invalid split", e);
      return null;
    }
  }
  return ret;
}




RegionSplitPolicy shouldSplit



protected boolean shouldSplit() {
  if (region.shouldForceSplit()) {
    return true;
  }
  boolean foundABigStore = false;
  // Get count of regions that have the same common table as this.region
  int tableRegionsCount = getCountOfCommonTableRegions();
  // Get size to check
  long sizeToCheck = getSizeToCheck(tableRegionsCount);

  for (Store store : region.getStores().values()) {
    // If any of the stores is unable to split (eg they contain
    // reference files) then don't split
    if ((!store.canSplit())) {
      return false;
    }

    // Mark if any store is big enough
    long size = store.getSize();
    if (size > sizeToCheck) {
      LOG.debug("ShouldSplit because " + store.getColumnFamilyName()
          + " size=" + size + ", sizeToCheck=" + sizeToCheck
          + ", regionsWithCommonTable=" + tableRegionsCount);
      foundABigStore = true;
      break;
    }
  }

  return foundABigStore;
}





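Every Store whose size is checked must itself be splittable, that is, it must not contain any Reference files. A Reference file in a Store means the Region was itself produced by a split, so it cannot be split again, and shouldSplit returns false right away.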
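The size check and the Reference-file veto can be tried out with a small stand-alone sketch. `ShouldSplitSketch` and `StoreInfo` are hypothetical stand-ins for the policy and for a Store (they only carry a size and a canSplit flag); the loop follows the same order of checks as the method discussed here, under the assumption that the threshold has already been computed.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of the shouldSplit loop; not HBase code. */
final class ShouldSplitSketch {

    /** Hypothetical stand-in for a Store: only size and splittability matter here. */
    static final class StoreInfo {
        final long size;
        final boolean canSplit;

        StoreInfo(long size, boolean canSplit) {
            this.size = size;
            this.canSplit = canSplit;
        }
    }

    static boolean shouldSplit(List<StoreInfo> stores, long sizeToCheck) {
        boolean foundABigStore = false;
        for (StoreInfo store : stores) {
            if (!store.canSplit) {
                // A store that still holds Reference files vetoes the split.
                return false;
            }
            if (store.size > sizeToCheck) {
                foundABigStore = true;
                break;
            }
        }
        return foundABigStore;
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo(10L, true), new StoreInfo(200L, true));
        System.out.println(shouldSplit(stores, 100L));
    }
}
```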


(1) Let t be the table the current Region belongs to. Count the online Regions of table t on the RegionServer that hosts this Region, and store the result in the variable tableRegionsCount;

// Get count of regions that have the same common table as this.region
int tableRegionsCount = getCountOfCommonTableRegions();


/**
 * @return Count of regions on this server that share the table this.region
 * belongs to
 */
private int getCountOfCommonTableRegions() {
  RegionServerServices rss = this.region.getRegionServerServices();
  // Can be null in tests
  if (rss == null) {
    return 0;
  }
  byte[] tablename = this.region.getTableDesc().getName();
  int tableRegionsCount = 0;
  try {
    List<HRegion> hri = rss.getOnlineRegions(tablename);
    tableRegionsCount = hri == null || hri.isEmpty() ? 0 : hri.size();
  } catch (IOException e) {
    LOG.debug("Failed getOnlineRegions " + Bytes.toString(tablename), e);
  }
  return tableRegionsCount;
}


RegionServerServices rss = this.region.getRegionServerServices();


byte[] tablename = this.region.getTableDesc().getName();

Finally, get the number of online Regions of table tablename on rss:

List<HRegion> hri = rss.getOnlineRegions(tablename);
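As a rough illustration of what this count means, the sketch below counts how many online Regions on one server belong to a given table. It is not how HBase stores this information: `CommonTableRegionCountSketch` and `countRegionsOfTable` are hypothetical, and the server's online Regions are modelled simply as a list of table names, one entry per Region.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of counting a table's online regions on one server; not HBase code. */
final class CommonTableRegionCountSketch {

    /**
     * @param onlineRegionTables one table name per online region on the server (assumed model)
     * @param table              the table whose regions are counted
     * @return number of online regions of that table, 0 if the list is unavailable
     */
    static int countRegionsOfTable(List<String> onlineRegionTables, String table) {
        if (onlineRegionTables == null) {
            return 0; // mirrors the "no server services available" case
        }
        int count = 0;
        for (String t : onlineRegionTables) {
            if (t.equals(table)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRegionsOfTable(Arrays.asList("t", "u", "t"), "t"));
    }
}
```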


// Get size to check
long sizeToCheck = getSizeToCheck(tableRegionsCount);


/**
 * @return Region max size or
 * <code>count of regions squared * flushsize</code>, whichever is
 * smaller; guard against there being zero regions on this server.
 */
long getSizeToCheck(final int tableRegionsCount) {
  return tableRegionsCount == 0 ? getDesiredMaxFileSize() : Math.min(
      getDesiredMaxFileSize(), this.flushSize
          * (tableRegionsCount * tableRegionsCount));
}



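(2) When tableRegionsCount is not 0, the result is the smaller of getDesiredMaxFileSize() and this.flushSize * (tableRegionsCount * tableRegionsCount). flushSize is specified when the table is created; if it was not set explicitly, it is taken from the configuration item hbase.hregion.memstore.flush.size, whose default value is 134217728 bytes, i.e. 128 MB.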
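To see how the threshold grows with the number of Regions, the formula can be evaluated for a few counts. The sketch below is a stand-alone rewrite of the calculation, not the HBase class itself: `SizeToCheckSketch` is a hypothetical name, the flush size is the 128 MB default mentioned above, and the 10 GB maximum file size is an assumed value chosen for the example.

```java
/** Illustrative evaluation of the split-size threshold; not HBase code. */
final class SizeToCheckSketch {

    static long sizeToCheck(int tableRegionsCount, long flushSize, long desiredMaxFileSize) {
        if (tableRegionsCount == 0) {
            return desiredMaxFileSize; // guard against zero regions on this server
        }
        // Widen to long before squaring so large counts do not overflow int.
        long squared = (long) tableRegionsCount * tableRegionsCount;
        return Math.min(desiredMaxFileSize, flushSize * squared);
    }

    public static void main(String[] args) {
        long flushSize = 134217728L;                  // 128 MB default flush size
        long maxFileSize = 10L * 1024 * 1024 * 1024;  // assumed 10 GB upper bound
        for (int n = 0; n <= 10; n++) {
            long mb = sizeToCheck(n, flushSize, maxFileSize) / (1024 * 1024);
            System.out.println("regions=" + n + " sizeToCheck=" + mb + " MB");
        }
    }
}
```

With these assumed values the threshold is 128 MB for one Region, 512 MB for two and 1152 MB for three, and it is capped at the 10 GB maximum from nine Regions onward, because 81 * 128 MB already exceeds 10 GB.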

RegionSplitPolicy getSplitPoint



/**
 * @return the key at which the region should be split, or null if it cannot
 * be split. This will only be called if shouldSplit previously
 * returned true.
 */
protected byte[] getSplitPoint() {
  byte[] explicitSplitPoint = this.region.getExplicitSplitPoint();
  if (explicitSplitPoint != null) {
    return explicitSplitPoint;
  }
  Map<byte[], Store> stores = region.getStores();

  byte[] splitPointFromLargestStore = null;
  long largestStoreSize = 0;
  for (Store s : stores.values()) {
    byte[] splitPoint = s.getSplitPoint();
    long storeSize = s.getSize();
    if (splitPoint != null && largestStoreSize < storeSize) {
      splitPointFromLargestStore = splitPoint;
      largestStoreSize = storeSize;
    }
  }

  return splitPointFromLargestStore;
}
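The selection rule in this method (an explicit split point wins, otherwise the split point of the largest Store that offers one) can be sketched on its own. `LargestStoreSplitPointSketch`, `StoreInfo` and `pick` are hypothetical names; each Store is reduced to its size and its candidate split point, which may be null.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of choosing a region split point from its stores; not HBase code. */
final class LargestStoreSplitPointSketch {

    /** Hypothetical stand-in for a Store: its size and its own split point (may be null). */
    static final class StoreInfo {
        final long size;
        final byte[] splitPoint;

        StoreInfo(long size, byte[] splitPoint) {
            this.size = size;
            this.splitPoint = splitPoint;
        }
    }

    static byte[] pick(byte[] explicitSplitPoint, List<StoreInfo> stores) {
        if (explicitSplitPoint != null) {
            return explicitSplitPoint; // an explicitly requested split point always wins
        }
        byte[] best = null;
        long largestStoreSize = 0;
        for (StoreInfo s : stores) {
            // Stores without a usable split point are skipped, however large they are.
            if (s.splitPoint != null && largestStoreSize < s.size) {
                best = s.splitPoint;
                largestStoreSize = s.size;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo(5L, new byte[] {1}),
            new StoreInfo(9L, new byte[] {2}),
            new StoreInfo(20L, null));
        System.out.println(Arrays.toString(pick(null, stores)));
    }
}
```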





Store getSplitPoint

/**
 * Determines if Store should be split
 * @return byte[] if store should be split, null otherwise.
 */
public byte[] getSplitPoint() {
  this.lock.readLock().lock();
  try {
    // sanity checks
    if (this.storefiles.isEmpty()) {
      return null;
    }
    // Should already be enforced by the split policy!
    assert !this.region.getRegionInfo().isMetaRegion();

    // Not splitable if we find a reference store file present in the store.
    long maxSize = 0L;
    StoreFile largestSf = null;
    for (StoreFile sf : storefiles) {
      if (sf.isReference()) {
        // Should already be enforced since we return false in this case
        assert false : "getSplitPoint() called on a region that can't split!";
        return null;
      }

      StoreFile.Reader r = sf.getReader();
      if (r == null) {
        LOG.warn("Storefile " + sf + " Reader is null");
        continue;
      }

      long size = r.length();
      if (size > maxSize) {
        // This is the largest one so far
        maxSize = size;
        largestSf = sf;
      }
    }

    StoreFile.Reader r = largestSf.getReader();
    if (r == null) {
      LOG.warn("Storefile " + largestSf + " Reader is null");
      return null;
    }
    // Get first, last, and mid keys. Midkey is the key that starts block
    // in middle of hfile. Has column and timestamp. Need to return just
    // the row we want to split on as midkey.
    byte[] midkey = r.midkey();
    if (midkey != null) {
      KeyValue mk = KeyValue.createKeyValueFromKey(midkey, 0, midkey.length);
      byte[] fk = r.getFirstKey();
      KeyValue firstKey = KeyValue.createKeyValueFromKey(fk, 0, fk.length);
      byte[] lk = r.getLastKey();
      KeyValue lastKey = KeyValue.createKeyValueFromKey(lk, 0, lk.length);
      // if the midkey is the same as the first or last keys, then we cannot
      // (ever) split this region.
      if (this.comparator.compareRows(mk, firstKey) == 0
          || this.comparator.compareRows(mk, lastKey) == 0) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("cannot split because midkey is the same as first or "
              + "last row");
        }
        return null;
      }
      return mk.getRow();
    }
  } catch (IOException e) {
    LOG.warn("Failed getting store size for " + this, e);
  } finally {
    // Release the read lock taken at the top of the method
    this.lock.readLock().unlock();
  }
  return null;
}


(1) Select the largest StoreFile among the Store's StoreFiles and keep it in largestSf;

long maxSize = 0L;
StoreFile largestSf = null;
for (StoreFile sf : storefiles) {
  if (sf.isReference()) {
    // Should already be enforced since we return false in this case
    assert false : "getSplitPoint() called on a region that can't split!";
    return null;
  }

  StoreFile.Reader r = sf.getReader();
  if (r == null) {
    LOG.warn("Storefile " + sf + " Reader is null");
    continue;
  }

  long size = r.length();
  if (size > maxSize) {
    // This is the largest one so far
    maxSize = size;
    largestSf = sf;
  }
}
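The scan for the largest StoreFile can be reproduced with plain data. In the sketch below, `LargestStoreFileSketch` and `FileInfo` are hypothetical: a file is described by a name, a length (null models a file whose reader is unavailable) and a flag saying whether it is a Reference file. Unlike the excerpt, the sketch simply returns null when no file has a usable length.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of picking the largest store file; not HBase code. */
final class LargestStoreFileSketch {

    /** Hypothetical stand-in for a StoreFile. */
    static final class FileInfo {
        final String name;
        final Long length;        // null models "reader is null"
        final boolean reference;

        FileInfo(String name, Long length, boolean reference) {
            this.name = name;
            this.length = length;
            this.reference = reference;
        }
    }

    static FileInfo largest(List<FileInfo> files) {
        long maxSize = 0L;
        FileInfo largestSf = null;
        for (FileInfo f : files) {
            if (f.reference) {
                return null;      // a Reference file means the store cannot be split
            }
            if (f.length == null) {
                continue;         // skip files whose size cannot be read
            }
            if (f.length > maxSize) {
                maxSize = f.length;
                largestSf = f;
            }
        }
        return largestSf;
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
            new FileInfo("a", 10L, false),
            new FileInfo("b", null, false),
            new FileInfo("c", 30L, false));
        System.out.println(largest(files).name);
    }
}
```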


// Get first, last, and mid keys. Midkey is the key that starts block
// in middle of hfile. Has column and timestamp. Need to return just
// the row we want to split on as midkey.
byte[] midkey = r.midkey();
if (midkey != null) {
  KeyValue mk = KeyValue.createKeyValueFromKey(midkey, 0, midkey.length);
  byte[] fk = r.getFirstKey();
  KeyValue firstKey = KeyValue.createKeyValueFromKey(fk, 0, fk.length);
  byte[] lk = r.getLastKey();
  KeyValue lastKey = KeyValue.createKeyValueFromKey(lk, 0, lk.length);
  // if the midkey is the same as the first or last keys, then we cannot
  // (ever) split this region.
  if (this.comparator.compareRows(mk, firstKey) == 0
      || this.comparator.compareRows(mk, lastKey) == 0) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("cannot split because midkey is the same as first or "
          + "last row");
    }
    return null;
  }
  return mk.getRow();
}


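A Region split uses the row as its smallest unit: all data of one row always ends up in a single Region. If MidKey has the same row as FirstKey or as LastKey, splitting at that point would produce a Region in which every RowKey is identical, that is, a Region holding only a single row. Such a layout makes no sense in HBase, so a split is not allowed when MidKey equals FirstKey or MidKey equals LastKey.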
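The row-level check can be illustrated without any HBase types. The sketch below works on raw row byte arrays only; `MidKeySplitCheckSketch` and `splitRow` are hypothetical names, and the row is assumed to have been extracted from each key already (the real code compares the row parts of KeyValue objects through a comparator).

```java
import java.util.Arrays;

/** Illustrative sketch of the "mid row must differ from first and last row" rule; not HBase code. */
final class MidKeySplitCheckSketch {

    /**
     * @return the row to split on, or null when no mid row exists or when it
     *         equals the first or last row of the file
     */
    static byte[] splitRow(byte[] firstRow, byte[] midRow, byte[] lastRow) {
        if (midRow == null) {
            return null;
        }
        if (Arrays.equals(midRow, firstRow) || Arrays.equals(midRow, lastRow)) {
            // Splitting here could not separate the rows into two useful halves.
            return null;
        }
        return midRow;
    }

    public static void main(String[] args) {
        byte[] first = {1};
        byte[] mid = {2};
        byte[] last = {3};
        System.out.println(Arrays.toString(splitRow(first, mid, last)));
        System.out.println(Arrays.toString(splitRow(first, first, last)));
    }
}
```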



