HBase MemStore flushing is implemented by the class org.apache.hadoop.hbase.regionserver.MemStoreFlusher, which HRegionServer holds in the instance variable cacheFlusher. The class is declared as follows:

class MemStoreFlusher extends HasThread implements FlushRequester {
......
}

MemStoreFlusher is essentially a thread class.

HasThread can be understood as a proxy (delegate) class for Thread:

/**
* Abstract class which contains a Thread and delegates the common Thread
* methods to that instance.
*
* The purpose of this class is to workaround Sun JVM bug #6915621, in which
* something internal to the JDK uses Thread.currentThread() as a monitor lock.
* This can produce deadlocks like HBASE-4367, HBASE-4101, etc.
*/
public abstract class HasThread implements Runnable {
  private final Thread thread;

  public HasThread() {
    this.thread = new Thread(this);
  }

  public HasThread(String name) {
    this.thread = new Thread(this, name);
  }

  public Thread getThread() {
    return thread;
  }

  public abstract void run();

  //// Begin delegation to Thread

  public final String getName() {
    return thread.getName();
  }

  public void interrupt() {
    thread.interrupt();
  }

  public final boolean isAlive() {
    return thread.isAlive();
  }

  public boolean isInterrupted() {
    return thread.isInterrupted();
  }

  public final void setDaemon(boolean on) {
    thread.setDaemon(on);
  }

  public final void setName(String name) {
    thread.setName(name);
  }

  public final void setPriority(int newPriority) {
    thread.setPriority(newPriority);
  }

  public void setUncaughtExceptionHandler(UncaughtExceptionHandler eh) {
    thread.setUncaughtExceptionHandler(eh);
  }

  public void start() {
    thread.start();
  }

  public final void join() throws InterruptedException {
    thread.join();
  }

  public final void join(long millis, int nanos) throws InterruptedException {
    thread.join(millis, nanos);
  }

  public final void join(long millis) throws InterruptedException {
    thread.join(millis);
  }

  //// End delegation to Thread
}
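
The delegation idea can be tried out in miniature. The following is only a sketch, not HBase code: ThreadOwnerSketch, CountingWorker and joinQuietly are hypothetical names introduced here, and only a few of the delegated methods are reproduced.

```java
// Hypothetical miniature of the HasThread idea: own a Thread instead of extending it.
abstract class ThreadOwnerSketch implements Runnable {
  private final Thread thread;

  ThreadOwnerSketch(String name) {
    // The wrapper itself is the Runnable executed by the owned thread.
    this.thread = new Thread(this, name);
  }

  public final String getName() {
    return thread.getName();
  }

  public void start() {
    thread.start();
  }

  // Waits for the owned thread to finish; returns false if the caller was interrupted.
  public final boolean joinQuietly() {
    try {
      thread.join();
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}

// Hypothetical worker that behaves like a thread without being a Thread subclass.
class CountingWorker extends ThreadOwnerSketch {
  volatile int runs = 0;

  CountingWorker() {
    super("counting-worker");
  }

  @Override
  public void run() {
    runs++;
  }
}
```

Because the worker is not itself a Thread, code that locks on Thread.currentThread() never locks on the worker object, which is the situation the class comment above describes.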

FlushRequester is an interface with a single method:

/**
* Request a flush.
*/
public interface FlushRequester {
  /**
   * Tell the listener the cache needs to be flushed.
   *
   * @param region
   *          the HRegion requesting the cache flush
   */
  void requestFlush(HRegion region);
}

Core variables

// These two data members go together. Any entry in the one must have
// a corresponding entry in the other.
private final BlockingQueue<FlushQueueEntry> flushQueue = new DelayQueue<FlushQueueEntry>();
private final Map<HRegion, FlushRegionEntry> regionsInQueue = new HashMap<HRegion, FlushRegionEntry>();

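flushQueue: a DelayQueue whose elements are of type FlushQueueEntry, each representing a flush request for some region. The Flusher thread keeps taking requests from this queue and carries out the corresponding region flushes;

regionsInQueue: maintains the mapping between an HRegion instance and its FlushRegionEntry request;

As the comment says, any FlushRegionEntry present in flushQueue must also be present in regionsInQueue. The latter may look redundant, but it is not: it makes it possible, for example, to check whether a flush request has already been issued for a given region.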

private AtomicBoolean wakeupPending = new AtomicBoolean();

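wakeupPending: used together with flushQueue. flushQueue is a blocking queue, so when it is empty a poll call blocks the thread for a while. In some situations an "empty" element has to be added to flushQueue to wake the thread up. However, if an "empty" element has already been added during the thread's current iteration (as shown later, the Flusher thread's work is essentially a loop), there is no need to add another one.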
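
The "at most one pending wake-up token" behaviour can be sketched with AtomicBoolean.compareAndSet. This is a simplified illustration rather than the HBase source: WakeupSketch is a hypothetical class, and a plain string stands in for the WakeupFlushThread token.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: at most one wake-up token is queued per loop iteration.
class WakeupSketch {
  static final String WAKEUP_TOKEN = "WAKEUP"; // stand-in for WakeupFlushThread

  final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  final AtomicBoolean wakeupPending = new AtomicBoolean();

  // Called by writers: enqueue a token only if none is pending.
  void wakeup() {
    if (wakeupPending.compareAndSet(false, true)) {
      queue.add(WAKEUP_TOKEN);
    }
  }

  // Called by the flusher at the top of each iteration.
  // Returns null on timeout or if the thread was interrupted.
  String pollOnce(long timeoutMs) {
    wakeupPending.set(false); // allow someone to wake us up again
    try {
      return queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }
}
```

Calling wakeup() twice before the flusher polls leaves a single token in the queue; after the next pollOnce, a new token can be queued again.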

private final long threadWakeFrequency;

threadWakeFrequency: the maximum time a poll on flushQueue waits, configured by hbase.server.thread.wakefrequency;

private final HRegionServer server;

server: the current HRegionServer instance;

private final ReentrantLock lock = new ReentrantLock();

private final Condition flushOccurred = lock.newCondition();

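lock and flushOccurred: used for thread synchronization, in the same way that synchronized is used with wait, notify, and notifyAll (Condition offers await, signal, and signalAll instead);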

protected final long globalMemStoreLimit;

protected final long globalMemStoreLimitLowMark;

private static final float DEFAULT_UPPER = 0.4f;

private static final float DEFAULT_LOWER = 0.35f;

private static final String UPPER_KEY = "hbase.regionserver.global.memstore.upperLimit";

private static final String LOWER_KEY = "hbase.regionserver.global.memstore.lowerLimit";

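globalMemStoreLimit and globalMemStoreLimitLowMark: the upper and lower limits on the total MemStore memory of the HRegionServer. Once total MemStore usage reaches the lower limit, the flusher starts flushing regions to relieve memory pressure; once it reaches the upper limit, writes are blocked (both cases are described later);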

private long blockingStoreFilesNumber;

private long blockingWaitTime;

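blockingStoreFilesNumber: when a flush is about to be executed for a region, if any Store in that region already has more StoreFiles than blockingStoreFilesNumber (configured by hbase.hstore.blockingStoreFiles; hbase.hstore.compactionThreshold is only used to derive the value when the former is set to -1), the flush of that region is delayed by at most blockingWaitTime (hbase.hstore.blockingWaitTime).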

Flush requests

All region flush requests are placed in a DelayQueue, so every element put into the queue must implement the Delayed interface:

interface FlushQueueEntry extends Delayed {
}

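Flush requests fall into two types: "empty" requests and real requests. An "empty" request is mainly used to wake up the thread; a real request is a region flush request.

"Empty" request: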

/**
* Token to insert into the flush queue that ensures that the flusher does
* not sleep
*/
static class WakeupFlushThread implements FlushQueueEntry {
  @Override
  public long getDelay(TimeUnit unit) {
    return 0;
  }

  @Override
  public int compareTo(Delayed o) {
    return -1;
  }
}

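The "empty" request exists only to wake the thread up and carries no real content. Its delay is 0, which means it can be taken from the queue immediately.

Real request: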

/**
* Datastructure used in the flush queue. Holds region and retry count.
* Keeps tabs on how old this object is. Implements {@link Delayed}. On
* construction, the delay is zero. When added to a delay queue, we'll come
* out near immediately. Call {@link #requeue(long)} passing delay in
* milliseconds before readding to delay queue if you want it to stay there
* a while.
*/
static class FlushRegionEntry implements FlushQueueEntry {
  private final HRegion region;
  private final long createTime;
  private long whenToExpire;
  private int requeueCount = 0;

  FlushRegionEntry(final HRegion r) {
    this.region = r;
    this.createTime = System.currentTimeMillis();
    this.whenToExpire = this.createTime;
  }

  /**
   * @param maximumWait
   * @return True if we have been delayed > <code>maximumWait</code>
   *         milliseconds.
   */
  public boolean isMaximumWait(final long maximumWait) {
    return (System.currentTimeMillis() - this.createTime) > maximumWait;
  }

  /**
   * @return Count of times {@link #resetDelay()} was called; i.e this is
   *         number of times we've been requeued.
   */
  public int getRequeueCount() {
    return this.requeueCount;
  }

  /**
   * @param when
   *          When to expire, when to come up out of the queue. Specify
   *          in milliseconds. This method adds
   *          System.currentTimeMillis() to whatever you pass.
   * @return This.
   */
  public FlushRegionEntry requeue(final long when) {
    this.whenToExpire = System.currentTimeMillis() + when;
    this.requeueCount++;
    return this;
  }

  @Override
  public long getDelay(TimeUnit unit) {
    return unit.convert(this.whenToExpire - System.currentTimeMillis(),
        TimeUnit.MILLISECONDS);
  }

  @Override
  public int compareTo(Delayed other) {
    return Long.valueOf(
        getDelay(TimeUnit.MILLISECONDS)
            - other.getDelay(TimeUnit.MILLISECONDS)).intValue();
  }

  @Override
  public String toString() {
    return "[flush region "
        + Bytes.toStringBinary(region.getRegionName()) + "]";
  }
}

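region: the HRegion instance that issued the flush request;

createTime: the time at which the flush request was created;

whenToExpire: the time at which the flush request expires, that is, becomes available from the queue;

requeueCount: the number of times the flush request has been put back into the queue. Some flush requests have to be postponed, so they are requeued.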
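
How such an entry behaves inside a DelayQueue can be seen with a stripped-down analogue. This is a sketch under assumptions: DelayedEntrySketch is a hypothetical class, a region name string replaces HRegion, and compareTo uses Long.compare instead of narrowing a long difference to int as the original does (a swapped-in alternative that avoids overflow for very large differences).

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified analogue of FlushRegionEntry.
class DelayedEntrySketch implements Delayed {
  final String regionName;
  private long whenToExpire = System.currentTimeMillis(); // zero delay on construction
  private int requeueCount = 0;

  DelayedEntrySketch(String regionName) {
    this.regionName = regionName;
  }

  // Postpone the entry by delayMs milliseconds before it is re-added to the queue.
  DelayedEntrySketch requeue(long delayMs) {
    this.whenToExpire = System.currentTimeMillis() + delayMs;
    this.requeueCount++;
    return this;
  }

  int getRequeueCount() {
    return requeueCount;
  }

  @Override
  public long getDelay(TimeUnit unit) {
    return unit.convert(whenToExpire - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
  }

  @Override
  public int compareTo(Delayed other) {
    return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
  }

  // Small demo: a fresh entry is available at once, a requeued one is not.
  static boolean demo() {
    DelayQueue<DelayedEntrySketch> queue = new DelayQueue<>();
    DelayedEntrySketch entry = new DelayedEntrySketch("region-1");
    queue.add(entry);
    boolean immediate = queue.poll() == entry;
    queue.add(entry.requeue(60_000));
    boolean delayed = queue.poll() == null;
    return immediate && delayed;
  }
}
```

The non-blocking poll() only returns an element whose delay has expired, which is why the requeued entry stays in the queue.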

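Constructor

The MemStoreFlusher constructor mainly initializes the variables above. The most important part is the computation of the upper and lower limits on the RegionServer's total MemStore memory consumption: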

long max = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
this.globalMemStoreLimit = globalMemStoreLimit(max, DEFAULT_UPPER,
    UPPER_KEY, conf);
long lower = globalMemStoreLimit(max, DEFAULT_LOWER, LOWER_KEY, conf);
if (lower > this.globalMemStoreLimit) {
  lower = this.globalMemStoreLimit;
  LOG.info("Setting globalMemStoreLimitLowMark == globalMemStoreLimit "
      + "because supplied " + LOWER_KEY + " was > " + UPPER_KEY);
}
this.globalMemStoreLimitLowMark = lower;

The code of the method globalMemStoreLimit is as follows:

/**
* Calculate size using passed <code>key</code> for configured percentage of
* <code>max</code>.
*
* @param max
* @param defaultLimit
* @param key
* @param c
* @return Limit.
*/
static long globalMemStoreLimit(final long max, final float defaultLimit,
    final String key, final Configuration c) {
  float limit = c.getFloat(key, defaultLimit);
  return getMemStoreLimit(max, limit, defaultLimit);
}

static long getMemStoreLimit(final long max, final float limit,
    final float defaultLimit) {
  float effectiveLimit = limit;
  if (limit >= 0.9f || limit < 0.1f) {
    LOG.warn("Setting global memstore limit to default of "
        + defaultLimit
        + " because supplied value outside allowed range of 0.1 -> 0.9");
    effectiveLimit = defaultLimit;
  }
  return (long) (max * effectiveLimit);
}
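
To make the arithmetic concrete: with the defaults above (0.4 and 0.35), a RegionServer with a 10 GB maximum heap would get limits of roughly 4 GB and 3.5 GB. The sketch below restates the same computation without Configuration or logging; MemStoreLimitSketch and lowMark are hypothetical names used only for illustration.

```java
// Hypothetical restatement of the limit arithmetic shown above.
class MemStoreLimitSketch {
  // A fraction outside [0.1, 0.9) falls back to the default fraction.
  static long memStoreLimit(long maxHeap, float limit, float defaultLimit) {
    float effective = (limit >= 0.9f || limit < 0.1f) ? defaultLimit : limit;
    return (long) (maxHeap * effective);
  }

  // The low mark is clamped so that it never exceeds the upper limit.
  static long lowMark(long upperLimit, long lower) {
    return Math.min(lower, upperLimit);
  }
}
```

For example, memStoreLimit(1000, 0.95f, 0.5f) ignores the out-of-range 0.95 and applies the default 0.5, and a configured lower limit larger than the upper limit is reduced to the upper limit.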

The flush loop

Flush requests are processed inside a loop:

@Override
public void run() {
  while (!this.server.isStopped()) {
    FlushQueueEntry fqe = null;
    try {
      ......
    } catch (InterruptedException ex) {
      continue;
    } catch (ConcurrentModificationException ex) {
      continue;
    } catch (Exception ex) {
      LOG.error("Cache flusher failed for entry " + fqe, ex);
      if (!server.checkFileSystem()) {
        break;
      }
    }
  }
  this.regionsInQueue.clear();
  this.flushQueue.clear();

  // Signal anyone waiting, so they see the close flag
  lock.lock();
  try {
    flushOccurred.signalAll();
  } finally {
    lock.unlock();
  }
  LOG.info(getName() + " exiting");
}
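
As long as the HRegionServer has not been asked to stop, this loop keeps running: it repeatedly takes a request fqe from the request queue and executes the flush, with the actual work wrapped in a try/catch block. Once the HRegionServer has been asked to stop, the loop clears the related data structures and wakes up any other threads that are blocked.

A single flush operation
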
wakeupPending.set(false); // allow someone to wake us up again

fqe = flushQueue.poll(threadWakeFrequency,
TimeUnit.MILLISECONDS);

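A flush request is taken from the queue. If the queue is empty, the thread blocks until the timeout expires. wakeupPending.set(false) indicates that, under certain conditions, other threads may again wake the blocked thread by adding an "empty" request (WakeupFlushThread) to the queue.

If the result fqe is null or a WakeupFlushThread instance, the following code is executed:
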
if (fqe == null || fqe instanceof WakeupFlushThread) {
  if (isAboveLowWaterMark()) {
    LOG.debug("Flush thread woke up because memory above low water="
        + StringUtils.humanReadableInt(this.globalMemStoreLimitLowMark));
    if (!flushOneForGlobalPressure()) {
      // Wasn't able to flush any region, but we're above low water mark.
      // This is unlikely to happen, but might happen when closing the
      // entire server - another thread is flushing regions. We'll just
      // sleep a little bit to avoid spinning, and then pretend that
      // we flushed one, so anyone blocked will check again
      lock.lock();
      try {
        Thread.sleep(1000);
        flushOccurred.signalAll();
      } finally {
        lock.unlock();
      }
    }
    // Enqueue another one of these tokens so we'll wake up again
    wakeupFlushThread();
  }
  continue;
}

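In this case no real flush request was obtained. The code checks whether the RegionServer's total MemStore memory consumption has reached the lower limit; if it has, a region must be chosen and flushed to relieve memory pressure.

The memory check is done by the method isAboveLowWaterMark: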

/**
* Return true if we're above the low watermark
*/
private boolean isAboveLowWaterMark() {
return server.getRegionServerAccounting().getGlobalMemstoreSize() >= globalMemStoreLimitLowMark;
}

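If isAboveLowWaterMark returns true, the RegionServer's total MemStore memory consumption has reached the lower limit, and a region must be chosen and flushed to relieve memory pressure. This is done by the method flushOneForGlobalPressure: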

/**
* The memstore across all regions has exceeded the low water mark. Pick one
* region to flush and flush it synchronously (this is called from the flush
* thread)
*
* @return true if successful
*/
private boolean flushOneForGlobalPressure() {
  SortedMap<Long, HRegion> regionsBySize = server
      .getCopyOfOnlineRegionsSortedBySize();
  Set<HRegion> excludedRegions = new HashSet<HRegion>();
  boolean flushedOne = false;
  while (!flushedOne) {
    ......
  }
  return true;
}

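The main idea of the code above is to keep looping until a region has been chosen and flushed successfully. Before the loop starts, all regions on the RegionServer have already been collected, ordered by size, in regionsBySize (a SortedMap keyed by region size, in descending order). If the flush of a chosen region fails because of some exception, the region is saved in excludedRegions so that it is skipped in the next round of selection.

The steps inside the loop are as follows: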

// Find the biggest region that doesn't have too many storefiles
// (might be null!)
HRegion bestFlushableRegion = getBiggestMemstoreRegion(
regionsBySize, excludedRegions, true);

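Choose the region that is currently the best candidate for flushing. It must satisfy two conditions:

(1) The region does not contain more than a certain number of StoreFiles;

(2) Among all regions that satisfy (1), it is the largest.

The code that does this is as follows: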

private HRegion getBiggestMemstoreRegion(
    SortedMap<Long, HRegion> regionsBySize,
    Set<HRegion> excludedRegions, boolean checkStoreFileCount) {
  synchronized (regionsInQueue) {
    for (HRegion region : regionsBySize.values()) {
      // A region that appears in excludedRegions is considered unflushable.
      if (excludedRegions.contains(region)) {
        continue;
      }
      if (checkStoreFileCount && isTooManyStoreFiles(region)) {
        continue;
      }
      return region;
    }
  }
  return null;
}

private boolean isTooManyStoreFiles(HRegion region) {
  for (Store hstore : region.stores.values()) {
    if (hstore.getStorefilesCount() > this.blockingStoreFilesNumber) {
      return true;
    }
  }
  return false;
}

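Because the regions in regionsBySize are already ordered from largest to smallest, it is enough to examine them one by one: the first region that neither appears in excludedRegions nor contains too many StoreFiles (when checkStoreFileCount is true) is bestFlushableRegion.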
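
The selection can be tried out on plain collections. The following is a sketch, not the HBase method: region names are plain strings, the set tooManyStoreFiles is a hypothetical replacement for the isTooManyStoreFiles check, and the descending order is assumed to come from a TreeMap with a reversed comparator.

```java
import java.util.Collections;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch of "first acceptable region in descending size order".
class BiggestRegionSketch {
  // Returns the first (largest) region that passes both filters, or null if none does.
  static String biggest(SortedMap<Long, String> regionsBySize,
                        Set<String> excluded,
                        Set<String> tooManyStoreFiles,
                        boolean checkStoreFileCount) {
    for (String region : regionsBySize.values()) {
      if (excluded.contains(region)) {
        continue; // a previous flush attempt failed for this region
      }
      if (checkStoreFileCount && tooManyStoreFiles.contains(region)) {
        continue; // flushing would add yet another store file
      }
      return region;
    }
    return null;
  }

  // Sample data: sizes as keys, largest first.
  static SortedMap<Long, String> sample() {
    SortedMap<Long, String> bySize = new TreeMap<>(Collections.reverseOrder());
    bySize.put(300L, "r-big");
    bySize.put(200L, "r-mid");
    bySize.put(100L, "r-small");
    return bySize;
  }
}
```

With r-big marked as having too many store files, the checked call returns r-mid, while the unchecked call still returns r-big.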

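bestFlushableRegion may be null (if every region currently holds more StoreFiles than the threshold blockingStoreFilesNumber). To cover that case, the currently largest region is also chosen as a fallback, even if its StoreFile count exceeds blockingStoreFilesNumber.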

// Find the biggest region, total, even if it might have too many
// flushes.
HRegion bestAnyRegion = getBiggestMemstoreRegion(regionsBySize,
    excludedRegions, false);
if (bestAnyRegion == null) {
  LOG.error("Above memory mark but there are no flushable regions!");
  return false;
}

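Here getBiggestMemstoreRegion is called with checkStoreFileCount set to false, so the selection ignores the number of StoreFiles a region contains.

If no bestAnyRegion can be obtained (bestAnyRegion is null), then although memory pressure is high, no region can be chosen for flushing, and the method simply returns false.

There are generally two reasons why no flushable region can be chosen:

(1) During the selection loop, the flush of every region failed (possibly because HDFS is unavailable), so all of them ended up in excludedRegions and the method above returns null;

(2) The RegionServer has started shutting down.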

HRegion regionToFlush;

if (bestFlushableRegion != null
    && bestAnyRegion.memstoreSize.get() > 2 * bestFlushableRegion.memstoreSize.get()) {
  // Even if it's not supposed to be flushed, pick a region if it's more
  // than twice as big as the best flushable one - otherwise when we're
  // under pressure we make lots of little flushes and cause lots of
  // compactions, etc, which just makes life worse!
  if (LOG.isDebugEnabled()) {
    LOG.debug("Under global heap pressure: "
        + "Region "
        + bestAnyRegion.getRegionNameAsString()
        + " has too many "
        + "store files, but is "
        + StringUtils.humanReadableInt(bestAnyRegion.memstoreSize.get())
        + " vs best flushable region's "
        + StringUtils.humanReadableInt(bestFlushableRegion.memstoreSize.get())
        + ". Choosing the bigger.");
  }
  regionToFlush = bestAnyRegion;
} else {
  if (bestFlushableRegion == null) {
    regionToFlush = bestAnyRegion;
  } else {
    regionToFlush = bestFlushableRegion;
  }
}

The final choice, regionToFlush, is decided from bestFlushableRegion and bestAnyRegion:

(1) If bestFlushableRegion is not null but the MemStore of bestAnyRegion is more than twice as large as the MemStore of bestFlushableRegion, then regionToFlush = bestAnyRegion;

(2) Otherwise, if bestFlushableRegion is null, then regionToFlush = bestAnyRegion; if it is not null, regionToFlush = bestFlushableRegion.
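
The two rules can be condensed into one small function. This is an illustrative sketch with hypothetical names (RegionChoiceSketch, choose); regions are represented by name strings and their MemStore sizes are passed in explicitly.

```java
// Hypothetical sketch of the regionToFlush decision.
class RegionChoiceSketch {
  // bestFlushable may be null; bestAny is assumed to be non-null at this point.
  static String choose(String bestFlushable, long flushableSize,
                       String bestAny, long anySize) {
    if (bestFlushable != null && anySize > 2 * flushableSize) {
      // The overall biggest region is more than twice as large: prefer one
      // big flush over many small ones, despite its store file count.
      return bestAny;
    }
    return bestFlushable == null ? bestAny : bestFlushable;
  }
}
```

For instance, with a flushable region of 100 MB and an overall biggest region of 250 MB, the bigger region wins; at 150 MB the flushable region is kept.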

At this point the region to flush, regionToFlush, has been chosen, and it only remains to flush it:

Preconditions.checkState(regionToFlush.memstoreSize.get() > 0);

LOG.info("Flush of region " + regionToFlush
    + " due to global heap pressure");
flushedOne = flushRegion(regionToFlush, true);
if (!flushedOne) {
  LOG.info("Excluding unflushable region " + regionToFlush
      + " - trying to find a different region to flush.");
  excludedRegions.add(regionToFlush);
}

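If the flush of that region fails, that is, flushRegion returns false, the region is added to excludedRegions and the selection loop continues.

If flushOneForGlobalPressure returns false, no region could be chosen for flushing. As the comment says, a likely cause is that the RegionServer is shutting down, in which case other threads take care of flushing the regions. The flusher only needs to sleep for a moment, pretend that it has flushed a region, and then wake up the other threads that are blocked because of memory pressure, so that they can check memory consumption again (why threads are blocked is explained later).

If the result fqe taken from the queue is a FlushRegionEntry instance, the following code is executed directly: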

FlushRegionEntry fre = (FlushRegionEntry) fqe;

if (!flushRegion(fre)) {
break;
}

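The flush of the corresponding region is executed directly. If an error occurs (one considered unrecoverable), the MemStoreFlusher thread leaves its loop and performs cleanup.

MemStore and Put

When a large amount of data is written into HBase, writes may be blocked for memory reasons. There are two main causes:

(1) reclaimMemStoreMemory

This is an instance method of MemStoreFlusher. It is called before the actual region batchMutate operation (which performs the write):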

HRegion region = getRegion(regionName);

if (!region.getRegionInfo().isMetaTable()) {
/*
* This method blocks callers until we're down to a safe
* amount of memstore consumption.
*
* ******************************************************
*/
this.cacheFlusher.reclaimMemStoreMemory();
}

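As shown, for every region that does not belong to a meta table (in practice, regions of user tables), this method is called before data is actually written, and it may block the write.

reclaimMemStoreMemory handles two cases: isAboveHighWaterMark and isAboveLowWaterMark.

isAboveHighWaterMark: the RegionServer's total MemStore memory consumption has reached the upper limit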

if (isAboveHighWaterMark()) {
  lock.lock();
  try {
    boolean blocked = false;
    long startTime = 0;
    while (isAboveHighWaterMark() && !server.isStopped()) {
      if (!blocked) {
        startTime = EnvironmentEdgeManager.currentTimeMillis();
        LOG.info("Blocking updates on "
            + server.toString()
            + ": the global memstore size "
            + StringUtils.humanReadableInt(server
                .getRegionServerAccounting()
                .getGlobalMemstoreSize())
            + " is >= than blocking "
            + StringUtils.humanReadableInt(globalMemStoreLimit)
            + " size");
      }
      blocked = true;
      wakeupFlushThread();
      try {
        // we should be able to wait forever, but we've seen a bug where
        // we miss a notify, so put a 5 second bound on it at least.
        flushOccurred.await(5, TimeUnit.SECONDS);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      }
    }
    if (blocked) {
      final long totalTime = EnvironmentEdgeManager
          .currentTimeMillis() - startTime;
      if (totalTime > 0) {
        this.updatesBlockedMsHighWater.add(totalTime);
      }
      LOG.info("Unblocking updates for server "
          + server.toString());
    }
  } finally {
    lock.unlock();
  }
}

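Before data is written, if memory consumption is found to have reached the upper limit, the writer enters a wait loop that lasts until consumption falls below the upper limit. Before each wait, wakeupFlushThread puts an "empty" element into the flush request queue to make the MemStoreFlusher thread start working (it may choose a region and flush it). The upper-limit check is as follows: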

/**
* Return true if global memory usage is above the high watermark
*/
private boolean isAboveHighWaterMark() {
return server.getRegionServerAccounting().getGlobalMemstoreSize() >= globalMemStoreLimit;
}
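
The waiting pattern, a ReentrantLock plus a Condition with a bounded await, can be sketched in isolation. The class below is hypothetical and much simpler than the real method: it tracks a single counter instead of RegionServerAccounting, uses a 100 ms bound instead of 5 seconds, and gives up when interrupted (the original re-sets the interrupt flag and keeps looping).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: writers block while global usage is at or above the limit.
class HighWaterMarkSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition flushOccurred = lock.newCondition();
  private final AtomicLong globalSize = new AtomicLong();
  private final long limit;

  HighWaterMarkSketch(long limit, long initialSize) {
    this.limit = limit;
    this.globalSize.set(initialSize);
  }

  long size() {
    return globalSize.get();
  }

  // Flusher side: release memory, then wake every blocked writer.
  void flushed(long bytes) {
    globalSize.addAndGet(-bytes);
    lock.lock();
    try {
      flushOccurred.signalAll();
    } finally {
      lock.unlock();
    }
  }

  // Writer side: returns true once usage is below the limit, false if interrupted.
  boolean awaitBelowLimit() {
    lock.lock();
    try {
      while (globalSize.get() >= limit) {
        try {
          // Bounded wait, so a missed signal cannot block the writer forever.
          flushOccurred.await(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return false;
        }
      }
      return true;
    } finally {
      lock.unlock();
    }
  }
}
```

The condition is re-checked in a loop after every wake-up, so both spurious wake-ups and timeouts are handled the same way: the writer proceeds only when usage is really below the limit.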

isAboveLowWaterMark: the RegionServer's total MemStore memory consumption has reached only the lower limit

else if (isAboveLowWaterMark()) {
wakeupFlushThread();
}

In this case the write does not need to be blocked. An "empty" element is simply added to the flush request queue to make MemStoreFlusher start working.

(2) checkResources

/**
* Perform a batch of mutations. It supports only Put and Delete mutations
* and will ignore other types passed.
*
* @param mutationsAndLocks
* the list of mutations paired with their requested lock IDs.
* @return an array of OperationStatus which internally contains the
* OperationStatusCode and the exceptionMessage if any.
* @throws IOException
*/
public OperationStatus[] batchMutate(
    Pair<Mutation, Integer>[] mutationsAndLocks) throws IOException {
  BatchOperationInProgress<Pair<Mutation, Integer>> batchOp =
      new BatchOperationInProgress<Pair<Mutation, Integer>>(mutationsAndLocks);
  boolean initialized = false;
  while (!batchOp.isDone()) {
    checkReadOnly();
    // Check if resources to support an update, may be blocked.
    checkResources();
    ......
  }
  return batchOp.retCodeDetails;
}

In the region's batchMutate, checkResources is called before each round of writing, and it may block the current write.

/*
* Check if resources to support an update.
*
* Here we synchronize on HRegion, a broad scoped lock. Its appropriate
* given we're figuring in here whether this region is able to take on
* writes. This is only method with a synchronize (at time of writing), this
* and the synchronize on 'this' inside in internalFlushCache to send the
* notify.
*/
private void checkResources() throws RegionTooBusyException,
    InterruptedIOException {
  // If catalog region, do not impose resource constraints or block
  // updates.
  if (this.getRegionInfo().isMetaRegion()) {
    return;
  }
  boolean blocked = false;
  long startTime = 0;
  while (this.memstoreSize.get() > this.blockingMemStoreSize) {
    requestFlush();
    if (!blocked) {
      startTime = EnvironmentEdgeManager.currentTimeMillis();
      LOG.info("Blocking updates for '"
          + Thread.currentThread().getName()
          + "' on region "
          + Bytes.toStringBinary(getRegionName())
          + ": memstore size "
          + StringUtils.humanReadableInt(this.memstoreSize.get())
          + " is >= than blocking "
          + StringUtils.humanReadableInt(this.blockingMemStoreSize)
          + " size");
    }
    long now = EnvironmentEdgeManager.currentTimeMillis();
    long timeToWait = startTime + busyWaitDuration - now;
    if (timeToWait <= 0L) {
      final long totalTime = now - startTime;
      this.updatesBlockedMs.add(totalTime);
      LOG.info("Failed to unblock updates for region " + this + " '"
          + Thread.currentThread().getName() + "' in "
          + totalTime + "ms. The region is still busy.");
      throw new RegionTooBusyException("region is flushing");
    }
    blocked = true;
    synchronized (this) {
      try {
        wait(Math.min(timeToWait, threadWakeFrequency));
      } catch (InterruptedException ie) {
        final long totalTime = EnvironmentEdgeManager
            .currentTimeMillis() - startTime;
        if (totalTime > 0) {
          this.updatesBlockedMs.add(totalTime);
        }
        LOG.info("Interrupted while waiting to unblock updates for region "
            + this
            + " '"
            + Thread.currentThread().getName()
            + "'");
        InterruptedIOException iie = new InterruptedIOException();
        iie.initCause(ie);
        throw iie;
      }
    }
  }
  if (blocked) {
    // Add in the blocked time if appropriate
    final long totalTime = EnvironmentEdgeManager.currentTimeMillis()
        - startTime;
    if (totalTime > 0) {
      this.updatesBlockedMs.add(totalTime);
    }
    LOG.info("Unblocking updates for region " + this + " '"
        + Thread.currentThread().getName() + "'");
  }
}

As the code above shows, the blocking condition is

this.memstoreSize.get() > this.blockingMemStoreSize

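If this condition holds, the write is blocked until the region's MemStore memory consumption drops to the required value or below; if the wait exceeds busyWaitDuration, a RegionTooBusyException is thrown instead.

Here memstoreSize is the current MemStore size of the region about to be written, and blockingMemStoreSize is computed by the following code: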

long flushSize = this.htableDescriptor.getMemStoreFlushSize();

if (flushSize <= 0) {
  flushSize = conf.getLong(HConstants.HREGION_MEMSTORE_FLUSH_SIZE,
      HTableDescriptor.DEFAULT_MEMSTORE_FLUSH_SIZE);
}
this.memstoreFlushSize = flushSize;
this.blockingMemStoreSize = this.memstoreFlushSize
    * conf.getLong("hbase.hregion.memstore.block.multiplier", 2);

As can be seen, blockingMemStoreSize is an integer multiple of memstoreFlushSize.
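
As a worked example, assume a flush size of 128 MB (an illustrative value, not taken from the text above) and the default multiplier of 2 shown in the code: writes to the region then block once its MemStore grows beyond 256 MB. The sketch below uses hypothetical names and only restates this arithmetic.

```java
// Hypothetical sketch of the per-region blocking threshold.
class BlockingSizeSketch {
  // blockingMemStoreSize = memstoreFlushSize * multiplier
  static long blockingMemStoreSize(long flushSizeBytes, long multiplier) {
    return flushSizeBytes * multiplier;
  }

  // Same comparison as the loop condition in checkResources: strictly greater than.
  static boolean shouldBlock(long memstoreSizeBytes, long blockingSizeBytes) {
    return memstoreSizeBytes > blockingSizeBytes;
  }
}
```

Note that the comparison is strict, so a MemStore of exactly 256 MB does not block under these assumed values.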

MemStoreFlusher flushRegion

Whenever the MemStoreFlusher thread takes a flush request element (FlushRegionEntry) from the flush queue, the flush is carried out by the following method.

/*
* A flushRegion that checks store file count. If too many, puts the flush
* on delay queue to retry later.
*
* @param fqe
*
* @return true if the region was successfully flushed, false otherwise. If
* false, there will be accompanying log messages explaining why the log was
* not flushed.
*/
private boolean flushRegion(final FlushRegionEntry fqe) {
  HRegion region = fqe.region;
  if (!fqe.region.getRegionInfo().isMetaRegion()
      && isTooManyStoreFiles(region)) {
    if (fqe.isMaximumWait(this.blockingWaitTime)) {
      LOG.info("Waited "
          + (System.currentTimeMillis() - fqe.createTime)
          + "ms on a compaction to clean up 'too many store files'; waited "
          + "long enough... proceeding with flush of "
          + region.getRegionNameAsString());
    } else {
      // If this is first time we've been put off, then emit a log
      // message.
      if (fqe.getRequeueCount() <= 0) {
        // Note: We don't impose blockingStoreFiles constraint on
        // meta regions
        LOG.warn("Region " + region.getRegionNameAsString()
            + " has too many "
            + "store files; delaying flush up to "
            + this.blockingWaitTime + "ms");
        if (!this.server.compactSplitThread.requestSplit(region)) {
          try {
            this.server.compactSplitThread.requestCompaction(
                region, getName());
          } catch (IOException e) {
            LOG.error(
                "Cache flush failed"
                    + (region != null ? (" for region " + Bytes
                        .toStringBinary(region.getRegionName()))
                        : ""),
                RemoteExceptionHandler.checkIOException(e));
          }
        }
      }
      // Put back on the queue. Have it come back out of the queue
      // after a delay of this.blockingWaitTime / 100 ms.
      this.flushQueue.add(fqe.requeue(this.blockingWaitTime / 100));
      // Tell a lie, it's not flushed but it's ok
      return true;
    }
  }
  return flushRegion(region, false);
}

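Depending on the situation, the code above may take some special action before executing the actual flushRegion operation.

If the region is not a meta region (that is, it belongs to a user table) and contains too many StoreFiles, the following checks are made:

(1) The flush request has reached its maximum wait time, so it must be processed now; a log message is printed and the flush proceeds (the request queue is implemented as a DelayQueue, so each element is ordered by its own expiry time);

(2) The flush request has not yet reached its maximum wait time. Because the region already contains too many StoreFiles, the flush is postponed. Before it is postponed, if this is the first time the request is delayed, a split is requested, and if no split is requested, a compaction is requested instead.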
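
The decision can be summarized in a few lines. The sketch below is illustrative only: FlushDelaySketch, Action and the method names are hypothetical, and the figure of 90000 ms used in the example is an assumed value for blockingWaitTime, not one stated in the text above.

```java
// Hypothetical sketch of the "flush now or requeue" decision.
class FlushDelaySketch {
  enum Action { FLUSH_NOW, REQUEUE }

  static Action decide(boolean metaRegion, boolean tooManyStoreFiles,
                       long waitedMs, long blockingWaitTimeMs) {
    // Mirrors isMaximumWait: the request has waited long enough only when
    // waitedMs is strictly greater than blockingWaitTimeMs.
    boolean waitedLongEnough = waitedMs > blockingWaitTimeMs;
    if (!metaRegion && tooManyStoreFiles && !waitedLongEnough) {
      return Action.REQUEUE;
    }
    return Action.FLUSH_NOW;
  }

  // Each postponement lasts one hundredth of the maximum wait time.
  static long requeueDelayMs(long blockingWaitTimeMs) {
    return blockingWaitTimeMs / 100;
  }
}
```

With an assumed blockingWaitTime of 90000 ms, a postponed request returns from the queue after 900 ms, so it may be requeued many times before the maximum wait is reached.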
