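Scan operations during major compaction
Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.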
https://blog.csdn.net/u014393917/article/details/24419355
When a major compaction is launched, execution starts in CompactSplitThread.CompactionRunner.run
--> region.compact(compaction, store) --> store.compact(compaction) -->
CompactionContext.compact, which launches the compact operation.
The CompactionContext instance is created by storeEngine.createCompaction() in HStore.
The store engine defaults to DefaultStoreEngine and is configured through hbase.hstore.engine.class.
The default CompactionContext instance is DefaultCompactionContext.
DefaultCompactionContext.compact ultimately calls DefaultStoreEngine.compactor to do the work.
The compactor implementation is configured through hbase.hstore.defaultengine.compactor.class; the default implementation is DefaultCompactor.
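The two configuration keys above follow the usual "look up a class name, fall back to a default" pattern. The sketch below only illustrates that lookup with a plain Map; it is not HBase code. The class EngineConfigSketch, its methods, and the com.example.MyCompactor name are hypothetical, and the defaults are written as simple class names exactly as they appear in the text.

```java
import java.util.HashMap;
import java.util.Map;

final class EngineConfigSketch {
    // Configuration keys and default implementation names taken from the text above.
    static final String ENGINE_KEY = "hbase.hstore.engine.class";
    static final String COMPACTOR_KEY = "hbase.hstore.defaultengine.compactor.class";
    static final String DEFAULT_ENGINE = "DefaultStoreEngine";
    static final String DEFAULT_COMPACTOR = "DefaultCompactor";

    // Returns the configured store engine name, or the default when the key is absent.
    static String engineClass(Map<String, String> conf) {
        return conf.getOrDefault(ENGINE_KEY, DEFAULT_ENGINE);
    }

    // Returns the configured compactor name, or the default when the key is absent.
    static String compactorClass(Map<String, String> conf) {
        return conf.getOrDefault(COMPACTOR_KEY, DEFAULT_COMPACTOR);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Nothing configured: both lookups fall back to the defaults.
        System.out.println(engineClass(conf) + " / " + compactorClass(conf));
        // Hypothetical override of the compactor only.
        conf.put(COMPACTOR_KEY, "com.example.MyCompactor");
        System.out.println(engineClass(conf) + " / " + compactorClass(conf));
    }
}
```

With nothing configured, the compaction therefore runs through DefaultStoreEngine and DefaultCompactor, which is the path followed in the rest of this article.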
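DefaultCompactor.compact(request) is then called:
1. Based on the store files to be compacted, build the corresponding list of StoreFileScanner instances.
When the StoreFileScanner instances are created, the ScanQueryMatcher in each scanner is null.
2. Create a StoreScanner instance with ScanType set to ScanType.COMPACT_DROP_DELETES.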
private StoreScanner(Store store, ScanInfo scanInfo, Scan scan,
    List<? extends KeyValueScanner> scanners, ScanType scanType, long smallestReadPoint,
    long earliestPutTs, byte[] dropDeletesFromRow, byte[] dropDeletesToRow) throws IOException
{
  this(store, false, scan, null, scanInfo.getTtl(),
      scanInfo.getMinVersions());
  if (dropDeletesFromRow == null) {
    // This branch runs here; the columns argument passed in is null.
    matcher = new ScanQueryMatcher(scan, scanInfo, null, scanType,
        smallestReadPoint, earliestPutTs, oldestUnexpiredTS);
  } else {
    matcher = new ScanQueryMatcher(scan, scanInfo, null, smallestReadPoint,
        earliestPutTs, oldestUnexpiredTS, dropDeletesFromRow, dropDeletesToRow);
  }
The ScanQueryMatcher constructor:
The columns argument passed in is null.
public ScanQueryMatcher(Scan scan, ScanInfo scanInfo,
    NavigableSet<byte[]> columns, ScanType scanType,
    long readPointToUse, long earliestPutTs, long oldestUnexpiredTS) {
  // In tr the minimum time is 0 and the maximum time is Long.MAX_VALUE.
  this.tr = scan.getTimeRange();
  this.rowComparator = scanInfo.getComparator();
  // The delete markers held by this deletes tracker are cleared every time the scan reaches a new row.
  this.deletes = new ScanDeleteTracker();
  this.stopRow = scan.getStopRow();
  this.startKey =
      KeyValue.createFirstDeleteFamilyOnRow(scan.getStartRow(),
          scanInfo.getFamily());
  // Get the filter instance.
  this.filter = scan.getFilter();
  this.earliestPutTs = earliestPutTs;
  this.maxReadPointToTrackVersions = readPointToUse;
  this.timeToPurgeDeletes = scanInfo.getTimeToPurgeDeletes();
  // The value is false here.
  /* how to deal with deletes */
  this.isUserScan = scanType ==
      ScanType.USER_SCAN;
  // The value is false here. scanInfo.getKeepDeletedCells() defaults to false and can be
  // set through the KEEP_DELETED_CELLS attribute of the table's column family.
  // scan.isRaw() is controlled by the _raw_ attribute set on the scan via setAttribute; it defaults to false.
  // keep deleted cells: if compaction or raw scan
  this.keepDeletedCells =
      (scanInfo.getKeepDeletedCells() && !isUserScan) || scan.isRaw();
  // The value is false here: this is a major compaction, so deleted data is not retained.
  // As above, scan.isRaw() defaults to false.
  // retain deletes: if minor compaction or raw scan
  this.retainDeletesInOutput = scanType ==
      ScanType.COMPACT_RETAIN_DELETES || scan.isRaw();
  // The value is false at this point.
  // seePastDeleteMarker: user initiated scans
  this.seePastDeleteMarkers = scanInfo.getKeepDeletedCells() && isUserScan;
  // Get the maximum number of versions to query; here scan.maxversion and scanInfo.maxversion hold the same value.
  int maxVersions =
      scan.isRaw() ? scan.getMaxVersions() :
      Math.min(scan.getMaxVersions(),
          scanInfo.getMaxVersions());
  // The columns field is set to a ScanWildcardColumnTracker instance and hasNullColumn is set to true.
  // Single branch to deal with two types of reads (columns vs all in family)
  if (columns == null || columns.size() == 0) {
    // there is always a null column in the wildcard column query.
    hasNullColumn = true;
    // The index in the columns field is the position of the column currently being compared;
    // it is reset each time a new row is compared.
    // use a specialized scan for wildcard column tracker.
    this.columns = new ScanWildcardColumnTracker(
        scanInfo.getMinVersions(), maxVersions, oldestUnexpiredTS);
  } else {
    // This part is never executed during a compaction.
    // whether there is null column in the explicit column query
    hasNullColumn = (columns.first().length == 0);
    // We can share the ExplicitColumnTracker, diff is we reset
    // between rows, not between storefiles.
    this.columns = new ExplicitColumnTracker(columns,
        scanInfo.getMinVersions(), maxVersions, oldestUnexpiredTS);
  }
}
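To see at a glance why all the delete-related flags end up false for a major compaction, the boolean expressions from the constructor can be evaluated in isolation. The sketch below copies only those four expressions; the class DeleteFlagsSketch and its parameter names are hypothetical stand-ins, with familyKeepsDeletedCells playing the role of scanInfo.getKeepDeletedCells() and rawScan the role of scan.isRaw().

```java
final class DeleteFlagsSketch {
    // The three scan types referred to in the constructor above.
    enum ScanType { USER_SCAN, COMPACT_RETAIN_DELETES, COMPACT_DROP_DELETES }

    final boolean isUserScan;
    final boolean keepDeletedCells;
    final boolean retainDeletesInOutput;
    final boolean seePastDeleteMarkers;

    DeleteFlagsSketch(ScanType scanType, boolean familyKeepsDeletedCells, boolean rawScan) {
        // Same expressions as in the ScanQueryMatcher constructor shown above.
        isUserScan = scanType == ScanType.USER_SCAN;
        keepDeletedCells = (familyKeepsDeletedCells && !isUserScan) || rawScan;
        retainDeletesInOutput = scanType == ScanType.COMPACT_RETAIN_DELETES || rawScan;
        seePastDeleteMarkers = familyKeepsDeletedCells && isUserScan;
    }

    @Override
    public String toString() {
        return "isUserScan=" + isUserScan
            + " keepDeletedCells=" + keepDeletedCells
            + " retainDeletesInOutput=" + retainDeletesInOutput
            + " seePastDeleteMarkers=" + seePastDeleteMarkers;
    }

    public static void main(String[] args) {
        // Default column family settings (KEEP_DELETED_CELLS off) and a non-raw scan.
        for (ScanType t : ScanType.values()) {
            System.out.println(t + " -> " + new DeleteFlagsSketch(t, false, false));
        }
    }
}
```

With the defaults, COMPACT_DROP_DELETES leaves all four flags false, while COMPACT_RETAIN_DELETES (a minor compaction) differs only in retainDeletesInOutput being true. That single flag is what later lets a minor compaction write delete markers to its output.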
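Analysis of ScanQueryMatcher.match, the method that decides whether a kv is included
When StoreScanner.next(kvlist, limit) fetches the kvs of the next row,
it calls ScanQueryMatcher.match to filter the data; that method is analyzed here.
For what each value returned by match actually does, see the following method in StoreScanner:
public boolean next(List<Cell> outResult, int limit) .....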
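As a rough mental model of how a next call consumes the match codes, here is a toy loop. It is a deliberately simplified sketch and not the StoreScanner implementation: the class NextLoopSketch, its Cell type, and the dropped flag are hypothetical, and only three codes (INCLUDE, SKIP, DONE) are modeled, with no seeking, filters, or limits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NextLoopSketch {
    enum MatchCode { INCLUDE, SKIP, DONE }

    // Hypothetical stand-in for a kv: row, qualifier, and whether the matcher should drop it.
    static final class Cell {
        final String row;
        final String qualifier;
        final boolean dropped;
        Cell(String row, String qualifier, boolean dropped) {
            this.row = row;
            this.qualifier = qualifier;
            this.dropped = dropped;
        }
    }

    private final List<Cell> cells;   // cells sorted by row, as a scanner would deliver them
    private int pos = 0;
    private String currentRow;

    NextLoopSketch(List<Cell> cells) {
        this.cells = cells;
    }

    // Toy matcher: a cell of another row ends the current next call; dropped cells are skipped.
    MatchCode match(Cell c) {
        if (!c.row.equals(currentRow)) {
            return MatchCode.DONE;
        }
        return c.dropped ? MatchCode.SKIP : MatchCode.INCLUDE;
    }

    // Collects one row into out. Returns true if more rows may follow, false when exhausted.
    boolean next(List<String> out) {
        if (pos >= cells.size()) {
            return false;
        }
        currentRow = cells.get(pos).row;
        while (pos < cells.size()) {
            Cell c = cells.get(pos);
            switch (match(c)) {
                case INCLUDE:
                    out.add(c.row + ":" + c.qualifier);
                    pos++;
                    break;
                case SKIP:
                    pos++;
                    break;
                case DONE:
                    return true;   // leave the cell in place for the following call
            }
        }
        return false;
    }

    public static void main(String[] args) {
        NextLoopSketch scanner = new NextLoopSketch(Arrays.asList(
            new Cell("r1", "a", false), new Cell("r1", "b", true), new Cell("r2", "a", false)));
        List<String> row = new ArrayList<>();
        boolean more;
        do {
            row.clear();
            more = scanner.next(row);
            System.out.println(row);
        } while (more);
    }
}
```

The point to take away is the meaning of DONE: it ends the current next call for one row, not the whole scan, which matches the row comparison at the start of match below.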
public MatchCode match(KeyValue kv) throws IOException
{
  // Call the filter's filterAllRemaining method; if it returns true, this scan is finished.
  if (filter != null && filter.filterAllRemaining()) {
    return MatchCode.DONE_SCAN;
  }
  // Get the byte array backing the kv.
  byte[] bytes = kv.getBuffer();
  // Start position of the KV within bytes.
  int offset = kv.getOffset();
  // Get the length of the key.
  // Layout of a KeyValue (sizes in bytes; ~ means variable length):
  //   size  | 4      | 4    | 2      | ~   | 1     | ~  | ~      | 8         | 1      | ~     |
  //   field | keylen | vlen | rowlen | row | cflen | cf | column | timestamp | kvtype | value |
  int keyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
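  // Move to where the rowkey length is recorded (skipping keylen and vlen).
  offset += KeyValue.ROW_OFFSET;
  // Position where the rowkey length is recorded.
  int initialOffset = offset;
  // Get the rowkey length.
  short rowLength = Bytes.toShort(bytes, offset, Bytes.SIZEOF_SHORT);
  // Move to the start of the rowkey.
  offset += Bytes.SIZEOF_SHORT;
  // Compare the rowkey of the kv passed in with the rowkey of the first kv of the current row,
  // i.e. check whether the kv belongs to the same row.
  int ret = this.rowComparator.compareRows(row, this.rowOffset, this.rowLength,
      bytes, offset, rowLength);
  // If the rowkey of the kv passed in is greater than the rowkey of the current row, the kv belongs to the next row,
  // so the current next call ends (not the scan, only this next call: all kvs of its row have been found).
  if (ret <= -1) {
    return MatchCode.DONE;
  // Otherwise the kv passed in belongs to an earlier row, and the current scanner must be moved down one row.
  } else if (ret >= 1) {
    // could optimize this, if necessary?
    // Could also be called SEEK_TO_CURRENT_ROW, but this
    // should be rare/never happens.
    return MatchCode.SEEK_NEXT_ROW;
  }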
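  // Optimization: decide whether to skip the flow below and move the current scanner straight to the next row.
  // stickyNextRow becomes true when any of the following holds:
  // 1. ColumnTracker.done returns true.
  // 2. ColumnTracker.checkColumn returns SEEK_NEXT_ROW.
  // 3. filter.filterKeyValue(kv) returns NEXT_ROW.
  // 4. ColumnTracker.checkVersions returns INCLUDE_AND_SEEK_NEXT_ROW.
  // The ColumnTracker implementation is ScanWildcardColumnTracker when the scan's columns are null
  // or the scan is a compaction scan; otherwise it is ExplicitColumnTracker.
  // optimize case.
  if (this.stickyNextRow)
    return MatchCode.SEEK_NEXT_ROW;
  // A ScanWildcardColumnTracker instance returns false here;
  // an ExplicitColumnTracker instance returns whether the current column index has reached
  // or passed the number of requested columns.
  if (this.columns.done()) {
    stickyNextRow = true;
    return MatchCode.SEEK_NEXT_ROW;
  }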
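  // Move to where the family length is recorded.
  // Passing rowLength
  offset += rowLength;
  // Get the family length.
  // Skipping family
  byte familyLength = bytes[offset];
  // Move past the family length byte and the family name, to the start of the qualifier.
  offset += familyLength + 1;
  // Get the length of the column qualifier.
  int qualLength = keyLength -
      (offset - initialOffset) -
      KeyValue.TIMESTAMP_TYPE_SIZE;
  // Get the timestamp of the kv.
  long timestamp = Bytes.toLong(bytes, initialOffset + keyLength -
      KeyValue.TIMESTAMP_TYPE_SIZE);
  // Check whether the timestamp is within the allowed range; this mainly checks whether the TTL has expired.
  // check for early out based on timestamp alone
  if (columns.isDone(timestamp)) {
    // If the kv's TTL has expired, a ScanWildcardColumnTracker instance returns SEEK_NEXT_COL directly;
    // this is the case used by compaction.
    // An ExplicitColumnTracker instance checks whether there is a next column: if so it returns
    // SEEK_NEXT_COL, otherwise SEEK_NEXT_ROW.
    return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
  }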
  /*
   * The delete logic is pretty complicated now.
   * This is corroborated by the following:
   * 1. The store might be instructed to keep deleted rows around.
   * 2. A scan can optionally see past a delete marker now.
   * 3. If deleted rows are kept, we have to find out when we can
   *    remove the delete markers.
   * 4. Family delete markers are always first (regardless of their TS)
   * 5. Delete markers should not be counted as version
   * 6. Delete markers affect puts of the *same* TS
   * 7. Delete marker need to be version counted together with puts
   *    they affect
   */
  // Get the type of the kv.
  byte type = bytes[initialOffset + keyLength - 1];
  // If the kv is a delete marker:
  if (kv.isDelete()) {
    // By default keepDeletedCells is false, so this if branch is entered.
    if (!keepDeletedCells) {
      // first ignore delete markers if the scanner can do so, and the
      // range does not include the marker
      //
      // during flushes and compactions also ignore delete markers newer
      // than the readpoint of any open scanner, this prevents deleted
      // rows that could still be seen by a scanner from being collected
      // The value is true here: the tr of the scan defaults to allTime = true.
      boolean includeDeleteMarker = seePastDeleteMarkers ?
          tr.withinTimeRange(timestamp) :
          tr.withinOrAfterTimeRange(timestamp);
      // Add the delete kv to the DeleteTracker; during compaction the implementation is ScanDeleteTracker.
      if (includeDeleteMarker
          && kv.getMvccVersion() <= maxReadPointToTrackVersions) {
        this.deletes.add(bytes, offset, qualLength, timestamp, type);
      }
      // Can't early out now, because DelFam come before any other keys
    }
    // This if is entered for a minor compaction or a raw scan (retainDeletesInOutput is true),
    // or for a compaction scan when the current time minus the kv's timestamp has not yet exceeded
    // the time configured by hbase.hstore.time.to.purge.deletes (default 0),
    // or when the kv's mvcc value is greater than the current maximum read point.
    // The scan here belongs to a major compaction, so it is not entered.
    if (retainDeletesInOutput
        || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes)
        || kv.getMvccVersion() > maxReadPointToTrackVersions) {
      // always include or it is not time yet to check whether it is OK
      // to purge deletes or not
      if (!isUserScan) {
        // if this is not a user scan (compaction), we can filter this deletemarker right here
        // otherwise (i.e. a "raw" scan) we fall through to normal version and timerange checking
        return MatchCode.INCLUDE;
      }
    // The following check is normally not entered.
    } else if (keepDeletedCells) {
      if (timestamp < earliestPutTs) {
        // keeping delete rows, but there are no puts older than
        // this delete in the store files.
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      }
      // else: fall through and do version counting on the
      // delete markers
    // For a delete marker kv, SKIP is returned directly once it has been added to the DeleteTracker.
    } else {
      return MatchCode.SKIP;
    }
    // note the following next else if...
    // delete marker are not subject to other delete markers
  } else if (!this.deletes.isEmpty()) {
    // For a kv that is not a delete marker, check whether the tracked delete markers cover this version of the kv.
    // a. If a DeleteFamily marker is tracked and the timestamp of the current kv does not exceed
    //    the timestamp of that marker, return FAMILY_DELETED.
    // b. If a DeleteFamilyVersion marker (a delete issued with an explicit timestamp) covers this
    //    version, return FAMILY_VERSION_DELETED.
    // c. If the qualifier held in the deleteTracker equals the qualifier of the kv passed in and the
    //    tracked marker is a DeleteColumn, return COLUMN_DELETED.
    //    Otherwise, if the qualifiers are equal and the timestamp of the kv passed in equals the
    //    timestamp of the marker, a specific version was deleted: return VERSION_DELETED.
    // d. Otherwise nothing was deleted: return NOT_DELETED.
    DeleteResult deleteResult = deletes.isDeleted(bytes, offset, qualLength,
        timestamp);
    switch (deleteResult) {
      case FAMILY_DELETED:
      case COLUMN_DELETED:
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      case VERSION_DELETED:
      case FAMILY_VERSION_DELETED:
        return MatchCode.SKIP;
      case NOT_DELETED:
        break;
      default:
        throw new RuntimeException("UNEXPECTED");
    }
  }
  // Check whether the timestamp of the kv passed in falls within the scan's time range;
  // a default scan includes all time.
  int timestampComparison = tr.compare(timestamp);
  // If the kv's time is above the maximum time, return SKIP.
  if (timestampComparison >= 1) {
    return MatchCode.SKIP;
  } else if (timestampComparison <= -1) {
    // If the kv's time is below the minimum time, return SEEK_NEXT_COL or SEEK_NEXT_ROW.
    return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
  }
  // If the time is within the normal range: for a compaction scan, columns.checkColumn returns INCLUDE.
  // For the other cases see the ExplicitColumnTracker implementation.
  // STEP 1: Check if the column is part of the requested columns
  MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, type);
  // This if branch is entered.
  if (colChecker == MatchCode.INCLUDE) {
    // Run the filter and return a value according to its response.
    // This is not explained further here; it is easy to follow.
    ReturnCode filterResponse = ReturnCode.SKIP;
    // STEP 2: Yes, the column is part of the requested columns. Check if filter is present
    if (filter != null) {
      // STEP 3: Filter the key value and return if it filters out
      filterResponse = filter.filterKeyValue(kv);
      switch (filterResponse) {
      case SKIP:
        return MatchCode.SKIP;
      case NEXT_COL:
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      case NEXT_ROW:
        stickyNextRow = true;
        return MatchCode.SEEK_NEXT_ROW;
      case SEEK_NEXT_USING_HINT:
        return MatchCode.SEEK_NEXT_USING_HINT;
      default:
        // It means it is either include or include and seek next
        break;
      }
    }
    /*
     * STEP 4: Reaching this step means the column is part of the requested columns and either
     * the filter is null or the filter has returned INCLUDE or INCLUDE_AND_NEXT_COL response.
     * Now check the number of versions needed. This method call returns SKIP, INCLUDE,
     * INCLUDE_AND_SEEK_NEXT_ROW, INCLUDE_AND_SEEK_NEXT_COL.
     *
     * FilterResponse             ColumnChecker               Desired behavior
     * INCLUDE                    SKIP                        row has already been included, SKIP.
     * INCLUDE                    INCLUDE                     INCLUDE
     * INCLUDE                    INCLUDE_AND_SEEK_NEXT_COL   INCLUDE_AND_SEEK_NEXT_COL
     * INCLUDE                    INCLUDE_AND_SEEK_NEXT_ROW   INCLUDE_AND_SEEK_NEXT_ROW
     * INCLUDE_AND_SEEK_NEXT_COL  SKIP                        row has already been included, SKIP.
     * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE                     INCLUDE_AND_SEEK_NEXT_COL
     * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_COL   INCLUDE_AND_SEEK_NEXT_COL
     * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_ROW   INCLUDE_AND_SEEK_NEXT_ROW
     *
     * In all the above scenarios, we return the column checker return value except for
     * FilterResponse (INCLUDE_AND_SEEK_NEXT_COL) and ColumnChecker(INCLUDE)
     */
    colChecker =
        columns.checkVersions(bytes, offset, qualLength, timestamp, type,
            kv.getMvccVersion() > maxReadPointToTrackVersions);
    // Optimize with stickyNextRow
    stickyNextRow = colChecker == MatchCode.INCLUDE_AND_SEEK_NEXT_ROW ? true : stickyNextRow;
    return (filterResponse == ReturnCode.INCLUDE_AND_NEXT_COL &&
        colChecker == MatchCode.INCLUDE) ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL
        : colChecker;
  }
  stickyNextRow = (colChecker == MatchCode.SEEK_NEXT_ROW) ? true
      : stickyNextRow;
  return colChecker;
}
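The offset arithmetic at the top of match is easier to follow on a concrete buffer. The sketch below builds a byte array in the KeyValue layout described earlier (keylen, vlen, rowlen, row, cflen, cf, column, timestamp, kvtype, value) and then walks it with the same steps as match. It uses java.nio.ByteBuffer instead of the HBase Bytes helper, and the class KeyValueLayoutSketch with its encode and describe methods is hypothetical, written only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class KeyValueLayoutSketch {
    static final int ROW_OFFSET = 8;           // keylen (4 bytes) + vlen (4 bytes)
    static final int TIMESTAMP_TYPE_SIZE = 9;  // timestamp (8 bytes) + kvtype (1 byte)

    // Builds a buffer in the layout: keylen|vlen|rowlen|row|cflen|cf|column|timestamp|kvtype|value.
    static byte[] encode(String row, String family, String qualifier,
                         long timestamp, byte type, String value) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        int keyLength = 2 + r.length + 1 + f.length + q.length + TIMESTAMP_TYPE_SIZE;
        ByteBuffer buf = ByteBuffer.allocate(ROW_OFFSET + keyLength + v.length); // big-endian by default
        buf.putInt(keyLength).putInt(v.length);
        buf.putShort((short) r.length).put(r);
        buf.put((byte) f.length).put(f);
        buf.put(q);
        buf.putLong(timestamp).put(type);
        buf.put(v);
        return buf.array();
    }

    // Walks the buffer with the same offset steps as ScanQueryMatcher.match.
    static String describe(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int offset = 0;
        int keyLength = buf.getInt(offset);
        offset += ROW_OFFSET;                    // skip keylen and vlen
        int initialOffset = offset;              // where rowlen is recorded
        short rowLength = buf.getShort(offset);
        offset += 2;                             // start of the row key
        String row = new String(bytes, offset, rowLength, StandardCharsets.UTF_8);
        offset += rowLength;                     // where cflen is recorded
        byte familyLength = bytes[offset];
        String family = new String(bytes, offset + 1, familyLength, StandardCharsets.UTF_8);
        offset += familyLength + 1;              // start of the qualifier
        int qualLength = keyLength - (offset - initialOffset) - TIMESTAMP_TYPE_SIZE;
        String qualifier = new String(bytes, offset, qualLength, StandardCharsets.UTF_8);
        long timestamp = buf.getLong(initialOffset + keyLength - TIMESTAMP_TYPE_SIZE);
        byte type = bytes[initialOffset + keyLength - 1];
        return row + "/" + family + ":" + qualifier + "/" + timestamp + "/" + type;
    }

    public static void main(String[] args) {
        // The type byte 4 is just an example value for this sketch.
        byte[] kv = encode("r1", "cf", "q", 42L, (byte) 4, "v");
        System.out.println(describe(kv));   // prints "r1/cf:q/42/4"
    }
}
```

Note that the qualifier length is never stored: it is whatever remains of the key after the row, the family, and the trailing timestamp and type bytes, which is exactly how qualLength is derived in match.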
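Difference between major and minor compaction when writing the new store file
For a major compaction, when the writer is closed,
MAJOR_COMPACTION_KEY=true is written to the file's metadata to record that the file came from a major compaction.
This value is mainly used to control whether a minor compaction selects this store file.
if (writer != null)
{
  writer.appendMetadata(fd.maxSeqId, request.isMajor());
  writer.close();
  newFiles.add(writer.getPath());
}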