Analysis of the UserScan Processing Flow
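Preliminary notes
A UserScan is a scanner operation started from a client or from a coprocessor (cp).
The caching attribute of a Scan sets how many rows are returned on each next call.
The batch attribute sets how many KVs the RS may read on each nextKv call (within a single row).
When creating a Scan instance, it is best to set both the family and the column, which gives the most efficient query.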
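The interaction between caching and batch can be illustrated with a small back-of-the-envelope model. This is only a sketch in plain Java, not HBase code: the class ScanSizing and its methods are hypothetical names, and it assumes that each Result holds at most batch KVs of one row and that each next RPC carries at most caching Results. It ignores the open and close calls and any size limit.

```java
// Simplified model (hypothetical helper, not part of HBase).
final class ScanSizing {
    private ScanSizing() {}

    /** Number of Result objects one row yields; batch <= 0 is treated as "no limit". */
    static int resultsForRow(int kvCount, int batch) {
        if (kvCount <= 0) {
            return 0;
        }
        if (batch <= 0) {
            return 1;
        }
        return (kvCount + batch - 1) / batch; // ceiling division
    }

    /** Lower bound on the number of next RPCs needed to ship totalResults. */
    static int rpcCalls(int totalResults, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (totalResults + caching - 1) / caching; // ceiling division
    }
}
```

Under these assumptions a row with 25 KVs and batch set to 10 is delivered as three Results, and seven Results with caching set to 3 need at least three next calls.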
On the client side, a Scan instance is created and a ClientScanner instance is obtained through the following method of HTable:
    public ResultScanner getScanner(final Scan scan)
The callable attribute of the new ClientScanner instance holds a newly created ScannerCallable instance.
callable.prepare(tries != 0); looks up, in the meta table, the location of the region that contains the scan's startkey,
and reads the HRegionInfo of that location.
ClientScanner.currentRegion is set to that region, i.e. the region that contains the startkey.
ClientScanner.next issues the RPC call to the RS,
which invokes HRegionServer.scan:
    public ScanResponse scan(final RpcController controller, final ScanRequest request)
On ClientScanner.next, the first step is an openScanner operation, which returns a scannerId.
In ScannerCallable.call:
    if (scannerId == -1L) {
      this.scannerId = openScanner();
    } else {
The openScanner method starts a scan by calling rs.scan over RPC:
    ScanRequest request = RequestConverter.buildScanRequest(
        getLocation().getRegionInfo().getRegionName(), this.scan, 0, false);
    try {
      ScanResponse response = getStub().scan(null, request);
      long id = response.getScannerId();
      if (logScannerActivity) {
        LOG.info("Open scanner=" + id + " for scan=" + scan.toString()
            + " on region " + getLocation().toString());
      }
      return id;
How HRegionServer.scan handles openScanner:
    public ScanResponse scan(final RpcController controller, final ScanRequest request)
        throws ServiceException {
      Leases.Lease lease = null;
      String scannerName = null;
      ........................................ (much code not shown)
      requestCount.increment();
      int ttl = 0;
      HRegion region = null;
      RegionScanner scanner = null;
      RegionScannerHolder rsh = null;
      boolean moreResults = true;
      boolean closeScanner = false;
      ScanResponse.Builder builder = ScanResponse.newBuilder();
      if (request.hasCloseScanner()) {
        closeScanner = request.getCloseScanner();
      }
      int rows = 1;
      if (request.hasNumberOfRows()) {
        rows = request.getNumberOfRows();
      }
      if (request.hasScannerId()) {
        ................................. (much code not shown)
      } else {
Get the requested HRegion instance, i.e. the HRegion that contains the startkey.
        region = getRegion(request.getRegion());
        ClientProtos.Scan protoScan = request.getScan();
        boolean isLoadingCfsOnDemandSet = protoScan.hasLoadColumnFamiliesOnDemand();
        Scan scan = ProtobufUtil.toScan(protoScan);
        // if the request doesn't set this, get the default region setting.
        if (!isLoadingCfsOnDemandSet) {
          scan.setLoadColumnFamiliesOnDemand(region.isLoadingCfsOnDemandDefault());
        }
        scan.getAttribute(Scan.SCAN_ATTRIBUTES_METRICS_ENABLE);
If the scan sets no family, all families of the region are used as the scan's families.
        region.prepareScanner(scan);
        if (region.getCoprocessorHost() != null) {
          scanner = region.getCoprocessorHost().preScannerOpen(scan);
        }
        if (scanner == null) {
Run HRegion.getScanner, which creates an HRegion.RegionScannerImpl instance.
          scanner = region.getScanner(scan);
        }
        if (region.getCoprocessorHost() != null) {
          scanner = region.getCoprocessorHost().postScannerOpen(scan, scanner);
        }
Add the new RegionScanner to the scanners collection
and assign it a scannerId (a random value);
scannerName is the string form of scannerId. Expiry monitoring is registered as well.
The timeout is configured with hbase.client.scanner.timeout.period, default 60000 ms;
older versions use hbase.regionserver.lease.period.
The expiry check thread is implemented by Leases.
Scanner expiry is handled by an
HRegionServer.ScannerListener.leaseExpired instance.
        scannerId = addScanner(scanner, region);
        scannerName = String.valueOf(scannerId);
        ttl = this.scannerLeaseTimeoutPeriod;
      }
      ............................................ (much code not shown)
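The lease bookkeeping around a registered scanner can be pictured with the following minimal sketch. It is plain Java and not the HBase Leases class: ScannerLeases, its methods and the explicit nowMs clock parameter are all hypothetical names chosen for illustration. It only models a deadline per scanner name, plus the removal of the lease while a request is being served.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of scanner leases (hypothetical, not the HBase implementation).
final class ScannerLeases {
    private final long timeoutMs;
    // scannerName -> absolute deadline in milliseconds
    private final Map<String, Long> deadlines = new HashMap<>();

    ScannerLeases(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Registers (or re-registers) a lease; the deadline restarts from nowMs. */
    void addLease(String scannerName, long nowMs) {
        deadlines.put(scannerName, nowMs + timeoutMs);
    }

    /** Removes the lease, e.g. while a request is in progress; true if it existed. */
    boolean removeLease(String scannerName) {
        return deadlines.remove(scannerName) != null;
    }

    /** True only if a lease is registered and its deadline has passed. */
    boolean isExpired(String scannerName, long nowMs) {
        Long deadline = deadlines.get(scannerName);
        return deadline != null && nowMs >= deadline;
    }
}
```

With a 60000 ms timeout, a lease added at time 0 is still valid at 59999 and expired at 60000. A lease that was removed cannot expire, and adding it again restarts the timeout.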
How HRegion.getScanner creates the RegionScanner instance
    public RegionScanner getScanner(Scan scan) throws IOException {
      return getScanner(scan, null);
    }
This delegates to the overload below; at this point the KeyValueScanner list passed in is null.
    protected RegionScanner getScanner(Scan scan,
        List<KeyValueScanner> additionalScanners) throws IOException {
      startRegionOperation(Operation.SCAN);
      try {
        // Verify families are all valid
        prepareScanner(scan);
        if (scan.hasFamilies()) {
          for (byte[] family : scan.getFamilyMap().keySet()) {
            checkFamily(family);
          }
        }
        return instantiateRegionScanner(scan, additionalScanners);
      } finally {
        closeRegionOperation();
      }
    }
Finally, an HRegion.RegionScannerImpl instance is created:
    protected RegionScanner instantiateRegionScanner(Scan scan,
        List<KeyValueScanner> additionalScanners) throws IOException {
      return new RegionScannerImpl(scan, additionalScanners, this);
    }
The constructor that builds the RegionScanner instance:
    RegionScannerImpl(Scan scan, List<KeyValueScanner> additionalScanners, HRegion region)
        throws IOException {
      this.region = region;
      this.maxResultSize = scan.getMaxResultSize();
      if (scan.hasFilter()) {
        this.filter = new FilterWrapper(scan.getFilter());
      } else {
        this.filter = null;
      }
      this.batch = scan.getBatch();
      if (Bytes.equals(scan.getStopRow(), HConstants.EMPTY_END_ROW) && !scan.isGetScan()) {
        this.stopRow = null;
      } else {
        this.stopRow = scan.getStopRow();
      }
      // If we are doing a get, we want to be [startRow,endRow] normally
      // it is [startRow,endRow) and if startRow=endRow we get nothing.
      this.isScan = scan.isGetScan() ? -1 : 0;
      // synchronize on scannerReadPoints so that nobody calculates
      // getSmallestReadPoint, before scannerReadPoints is updated.
      IsolationLevel isolationLevel = scan.getIsolationLevel();
      synchronized (scannerReadPoints) {
        if (isolationLevel == IsolationLevel.READ_UNCOMMITTED) {
          // This scan can read even uncommitted transactions
          this.readPt = Long.MAX_VALUE;
          MultiVersionConsistencyControl.setThreadReadPoint(this.readPt);
        } else {
          this.readPt = MultiVersionConsistencyControl.resetThreadReadPoint(mvcc);
        }
        scannerReadPoints.put(this, this.readPt);
      }
      // Here we separate all scanners into two lists - scanner that provide data required
      // by the filter to operate (scanners list) and all others (joinedScanners list).
      List<KeyValueScanner> scanners = new ArrayList<KeyValueScanner>();
      List<KeyValueScanner> joinedScanners = new ArrayList<KeyValueScanner>();
      if (additionalScanners != null) {
        scanners.addAll(additionalScanners);
      }
Iterate over every store that takes part in the scan and create its StoreScanner instance.
Normally joinedHeap ends up null.
      for (Map.Entry<byte[], NavigableSet<byte[]>> entry :
          scan.getFamilyMap().entrySet()) {
        Store store = stores.get(entry.getKey());
Create the StoreScanner instance through HStore.getScanner(scan, columns):
        KeyValueScanner scanner = store.getScanner(scan, entry.getValue());
        if (this.filter == null || !scan.doLoadColumnFamiliesOnDemand()
            || this.filter.isFamilyEssential(entry.getKey())) {
          scanners.add(scanner);
        } else {
          joinedScanners.add(scanner);
        }
      }
Create the KeyValueHeap instance. This moves every StoreScanner to the startkey position and determines the top StoreScanner.
      this.storeHeap = new KeyValueHeap(scanners, comparator);
      if (!joinedScanners.isEmpty()) {
        this.joinedHeap = new KeyValueHeap(joinedScanners, comparator);
      }
    }
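The idea behind a heap of scanners, where the scanner holding the smallest current key is always on top, can be sketched with a standard priority queue. This is a simplified illustration and not the KeyValueHeap code: SortedMergeSketch and Head are hypothetical names, and plain strings stand in for KeyValues with their comparator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified k-way merge over already sorted sources (hypothetical, not KeyValueHeap).
final class SortedMergeSketch {
    private SortedMergeSketch() {}

    /** One source together with its current (smallest unread) key. */
    private static final class Head {
        final String key;
        final Iterator<String> rest;

        Head(String key, Iterator<String> rest) {
            this.key = key;
            this.rest = rest;
        }
    }

    /** Merges sorted sources into one sorted list by always taking the smallest head. */
    static List<String> merge(List<List<String>> sortedSources) {
        PriorityQueue<Head> heap = new PriorityQueue<>((a, b) -> a.key.compareTo(b.key));
        for (List<String> source : sortedSources) {
            Iterator<String> it = source.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head top = heap.poll();
            out.add(top.key);
            if (top.rest.hasNext()) {
                // Re-insert the source with its next key so the heap stays ordered.
                heap.add(new Head(top.rest.next(), top.rest));
            }
        }
        return out;
    }
}
```

Each source plays the role of one StoreScanner: only its current key sits in the heap, so picking the next KV costs a heap operation rather than a comparison against every store.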
The HStore.getScanner(scan, columns) method that returns the StoreScanner instance:
    public KeyValueScanner getScanner(Scan scan,
        final NavigableSet<byte[]> targetCols) throws IOException {
      lock.readLock().lock();
      try {
        KeyValueScanner scanner = null;
        if (this.getCoprocessorHost() != null) {
          scanner = this.getCoprocessorHost().preStoreScannerOpen(this, scan, targetCols);
        }
        if (scanner == null) {
          scanner = new StoreScanner(this, getScanInfo(), scan, targetCols);
        }
        return scanner;
      } finally {
        lock.readLock().unlock();
      }
    }
The StoreScanner constructor:
    public StoreScanner(Store store, ScanInfo scanInfo, Scan scan,
        final NavigableSet<byte[]> columns) throws IOException {
      this(store, scan.getCacheBlocks(), scan, columns, scanInfo.getTtl(),
          scanInfo.getMinVersions());
If the scan's _raw_ attribute is set, columns must be null.
      if (columns != null && scan.isRaw()) {
        throw new DoNotRetryIOException(
            "Cannot specify any column for a raw scan");
      }
      matcher = new ScanQueryMatcher(scan, scanInfo, columns,
          ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
          oldestUnexpiredTS);
Get the StoreFileScanners. A StoreFileScanner references a StoreFile.Reader, which in turn references an HFileReaderV2.
If the HFileReaderV2 instance already exists in the StoreFile.Reader it is not created again, which shortens scanner creation time.
      // Pass columns to try to filter out unnecessary StoreFiles.
      List<KeyValueScanner> scanners = getScannersNoCompaction();
      // Seek all scanners to the start of the Row (or if the exact matching row
      // key does not exist, then to the start of the next matching Row).
      // Always check bloom filter to optimize the top row seek for delete
      // family marker.
      if (explicitColumnQuery && lazySeekEnabledGlobally) {
        for (KeyValueScanner scanner : scanners) {
          scanner.requestSeek(matcher.getStartKey(), false, true);
        }
      } else {
        if (!isParallelSeekEnabled) {
          for (KeyValueScanner scanner : scanners) {
            scanner.seek(matcher.getStartKey());
          }
        } else {
          parallelSeek(scanners, matcher.getStartKey());
        }
      }
      // set storeLimit
      this.storeLimit = scan.getMaxResultsPerColumnFamily();
      // set rowOffset
      this.storeOffset = scan.getRowOffsetPerColumnFamily();
      // Combine all seeked scanners with a heap
      heap = new KeyValueHeap(scanners, store.getComparator());
Register as an observer: when the store files change, the updated store files are added to this StoreScanner.
      this.store.addChangedReaderObserver(this);
    }
Issuing the scan RPC
After the client's openScanner operation has returned a scannerId, the scan itself is issued.
ScannerCallable.call makes the call when scannerId is not -1:
    Result[] rrs = null;
    ScanRequest request = null;
    try {
      incRPCcallsMetrics();
      request = RequestConverter.buildScanRequest(scannerId, caching, false, nextCallSeq);
      ScanResponse response = null;
      PayloadCarryingRpcController controller = new PayloadCarryingRpcController();
      try {
        controller.setPriority(getTableName());
        response = getStub().scan(controller, request);
        ................................... (some code omitted here)
        nextCallSeq++;
        long timestamp = System.currentTimeMillis();
        // Results are returned via controller
        CellScanner cellScanner = controller.cellScanner();
        rrs = ResponseConverter.getResults(cellScanner, response);
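The nextCallSeq counter that the client increments after each call is matched by a counter on the server, and a mismatch means a call was lost or repeated. A minimal sketch of that handshake, in plain Java with a hypothetical class name CallSeqSketch, could look as follows. Returning false stands in for the exception the real server throws.

```java
// Simplified model of the sequence check (hypothetical, not HBase code).
final class CallSeqSketch {
    // Sequence number the server expects on the next call; starts at 0.
    private long expected = 0;

    /**
     * Accepts the call only if the client's sequence number matches.
     * Returns false where the real server would throw an out-of-order exception.
     */
    boolean accept(long seqFromClient) {
        if (seqFromClient != expected) {
            return false;
        }
        expected++; // the next call must carry the incremented value
        return true;
    }
}
```

A client that retries call number 1 after the server has already served it is rejected, which prevents the retry from silently skipping a batch of rows.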
How HRegionServer.scan handles the scan itself:
Read the caching value of the scan; it determines how many rows are returned to the client.
A row that contains several KVs still counts as one row.
    int rows = 1;
    if (request.hasNumberOfRows()) {
      rows = request.getNumberOfRows();
    }
If the client passes a scannerId, i.e. a value other than -1, this is not an openScanner operation. Check whether the scanner has expired.
    if (request.hasScannerId()) {
      rsh = scanners.get(scannerName);
      if (rsh == null) {
        LOG.info("Client tried to access missing scanner " + scannerName);
        throw new UnknownScannerException(
            "Name: " + scannerName + ", already closed?");
      }
This mainly checks whether the region has been split or re-opened. If it has, a NotServingRegionException is thrown.
      scanner = rsh.s;
      HRegionInfo hri = scanner.getRegionInfo();
      region = getRegion(hri.getRegionName());
      if (region != rsh.r) { // Yes, should be the same instance
        throw new NotServingRegionException("Region was re-opened after the scanner"
            + scannerName + " was created: " + hri.getRegionNameAsString());
      }
    } else {
      ................................... (code that creates the RegionScanner omitted here)
    }
rows > 0 means caching is in effect. For a scan the default is 1; when the scan sets caching, the scan's value is used.
    if (rows > 0) {
      // if nextCallSeq does not match throw Exception straight away. This needs to be
      // performed even before checking of Lease.
      // See HBASE-5974
Check whether nextCallSeq is set. It is 0 on the first call and is incremented by one on every call; the client increments its own counter in the same way.
      if (request.hasNextCallSeq()) {
        if (rsh == null) {
          rsh = scanners.get(scannerName);
        }
        if (rsh != null) {
          if (request.getNextCallSeq() != rsh.nextCallSeq) {
            throw new OutOfOrderScannerNextException("Expected nextCallSeq: " + rsh.nextCallSeq
                + " But the nextCallSeq got from client: " + request.getNextCallSeq()
                + "; request=" + TextFormat.shortDebugString(request));
          }
          // Increment the nextCallSeq value which is the next expected from client.
          rsh.nextCallSeq++;
        }
      }
      try {
First remove the lease from lease management, so that a lookup that takes longer than the expiry time does not trigger a timeout.
        // Remove lease while its being processed in server; protects against case
        // where processing of request takes > lease expiration time.
        lease = leases.removeLease(scannerName);
Create the list of results to return, sized by scan.caching.
        List<Result> results = new ArrayList<Result>(rows);
        long currentScanResultSize = 0;
        boolean done = false;
Call the cp's preScannerNext. If it returns true, the scan is not executed.
        // Call coprocessor. Get region info from scanner.
        if (region != null && region.getCoprocessorHost() != null) {
          Boolean bypass = region.getCoprocessorHost().preScannerNext(
              scanner, results, rows);
          if (!results.isEmpty()) {
            for (Result r : results) {
              if (maxScannerResultSize < Long.MAX_VALUE) {
                for (Cell kv : r.rawCells()) {
                  // TODO
                  currentScanResultSize += KeyValueUtil.ensureKeyValue(kv).heapSize();
                }
              }
            }
          }
          if (bypass != null && bypass.booleanValue()) {
            done = true;
          }
        }
Run the scan.
This happens when the cp's preScannerNext returned false or no cp (mainly RegionObserver) is configured.
The maximum size returned to the client is configured with hbase.client.scanner.max.result.size; the default is Long.MAX_VALUE.
If the scan also sets maxResultSize, the scan's value is used.
        if (!done) {
          long maxResultSize = scanner.getMaxResultSize();
          if (maxResultSize <= 0) {
            maxResultSize = maxScannerResultSize;
          }
          List<Cell> values = new ArrayList<Cell>();
          MultiVersionConsistencyControl.setThreadReadPoint(scanner.getMvccReadPoint());
          region.startRegionOperation(Operation.SCAN);
          try {
            int i = 0;
            synchronized (scanner) {
The iteration starts here and calls the regionScanner (HRegion.RegionScannerImpl.nextRaw(List)) to search.
The number of iterations is the caching value set on the scan; the iteration also stops when RegionScanner.nextRaw(List) returns false.
              for (; i < rows && currentScanResultSize < maxResultSize; i++) {
A return value of true means there is more data and the query can continue; false means the region holds no more matching data.
                // Collect values to be returned here
                boolean moreRows = scanner.nextRaw(values);
                if (!values.isEmpty()) {
                  if (maxScannerResultSize < Long.MAX_VALUE) {
                    for (Cell kv : values) {
                      currentScanResultSize += KeyValueUtil.ensureKeyValue(kv).heapSize();
                    }
                  }
                  results.add(Result.create(values));
                }
                if (!moreRows) {
                  break;
                }
                values.clear();
              }
            }
            region.readRequestsCount.add(i);
          } finally {
            region.closeRegionOperation();
          }
          // coprocessor postNext hook
          if (region != null && region.getCoprocessorHost() != null) {
            region.getCoprocessorHost().postScannerNext(scanner, results, rows, true);
          }
        }
If there is no more data to search, set the response's moreResults to false.
        // If the scanner's filter - if any - is done with the scan
        // and wants to tell the client to stop the scan. This is done by passing
        // a null result, and setting moreResults to false.
        if (scanner.isFilterDone() && results.isEmpty()) {
          moreResults = false;
          results = null;
        } else {
Add the results to the response. The codec is read from hbase.client.rpc.codec,
which falls back to the value of hbase.client.default.rpc.codec; the default is KeyValueCodec.
When the codec is not null, results is turned into an iterator and an anonymous CellScanner implementation is created
and set on the controller that was passed in with the scan. This improves the read performance of the query.
When no codec is configured, the results list is put directly into the response, so the response may be rather large.
          addResults(builder, results, controller);
        }
      } finally {
Put the lease back into the lease manager. The lease tracks how long the client has gone without issuing a scan call.
        // We're done. On way out re-add the above removed lease.
        // Adding resets expiration time on lease.
        if (scanners.containsKey(scannerName)) {
          if (lease != null) leases.addLease(lease);
          ttl = this.scannerLeaseTimeoutPeriod;
        }
      }
    }
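The two stop conditions of the server-side loop, the row count and the accumulated size, can be sketched in isolation. This is a simplified model in plain Java, not the HRegionServer code: ScanBatchSketch and nextBatch are hypothetical names, each row is represented only by the sizes of its cells, and an exhausted iterator stands in for nextRaw returning false.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified model of one server-side next call (hypothetical, not HBase code).
final class ScanBatchSketch {
    private ScanBatchSketch() {}

    /**
     * Collects rows until either `caching` rows were read or the accumulated
     * cell size reached `maxResultSize`. The size is checked before each row,
     * so the batch may exceed the limit by up to one row.
     */
    static List<List<Integer>> nextBatch(Iterator<List<Integer>> rows, int caching,
                                         long maxResultSize) {
        List<List<Integer>> results = new ArrayList<>();
        long currentSize = 0;
        for (int i = 0; i < caching && currentSize < maxResultSize; i++) {
            if (!rows.hasNext()) {
                break; // no more rows in this region
            }
            List<Integer> row = rows.next();
            for (int cellSize : row) {
                currentSize += cellSize;
            }
            results.add(row);
        }
        return results;
    }
}
```

With three rows of 60 bytes each, caching 10 and a limit of 100 bytes, the first call returns two rows (120 bytes, one row over the limit) and the second call returns the remaining row.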
The client reads the response data in ScannerCallable.call:
    rrs = ResponseConverter.getResults(cellScanner, response);
The implementation of ResponseConverter.getResults:
    public static Result[] getResults(CellScanner cellScanner, ScanResponse response)
        throws IOException {
      if (response == null) return null;
      // If cellscanner, then the number of Results to return is the count of elements in the
      // cellsPerResult list. Otherwise, it is how many results are embedded inside the response.
      int noOfResults = cellScanner != null ?
          response.getCellsPerResultCount() : response.getResultsCount();
      Result[] results = new Result[noOfResults];
      for (int i = 0; i < noOfResults; i++) {
When a codec is configured, the RS creates an anonymous cellScanner implementation when it responds.
        if (cellScanner != null) {
          ......................................
          int noOfCells = response.getCellsPerResult(i);
          List<Cell> cells = new ArrayList<Cell>(noOfCells);
          for (int j = 0; j < noOfCells; j++) {
            try {
              if (cellScanner.advance() == false) {
                .....................................
                String msg = "Results sent from server=" + noOfResults + ". But only got " + i
                    + " results completely at client. Resetting the scanner to scan again.";
                LOG.error(msg);
                throw new DoNotRetryIOException(msg);
              }
            } catch (IOException ioe) {
              ...........................................
              LOG.error("Exception while reading cells from result."
                  + "Resetting the scanner to scan again.", ioe);
              throw new DoNotRetryIOException("Resetting the scanner.", ioe);
            }
            cells.add(cellScanner.current());
          }
          results[i] = Result.create(cells);
        } else {
Otherwise no codec is set and the data is read directly from the response.
          // Result is pure pb.
          results[i] = ProtobufUtil.toResult(response.getResults(i));
        }
      }
      return results;
    }
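The regrouping step in getResults, where a flat stream of cells is cut into results using the per-result cell counts, can be shown with a small self-contained sketch. It is plain Java with hypothetical names (CellRegroupSketch, regroup); strings stand in for cells, and an IllegalStateException stands in for the DoNotRetryIOException thrown when the stream ends early.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified model of rebuilding results from a flat cell stream (hypothetical).
final class CellRegroupSketch {
    private CellRegroupSketch() {}

    /** Cuts the cell stream into groups; cellsPerResult[i] is the size of group i. */
    static List<List<String>> regroup(Iterator<String> cells, int[] cellsPerResult) {
        List<List<String>> results = new ArrayList<>(cellsPerResult.length);
        for (int i = 0; i < cellsPerResult.length; i++) {
            List<String> row = new ArrayList<>(cellsPerResult[i]);
            for (int j = 0; j < cellsPerResult[i]; j++) {
                if (!cells.hasNext()) {
                    // The stream ended before all announced cells arrived.
                    throw new IllegalStateException("Expected " + cellsPerResult.length
                        + " results but only " + i + " were complete");
                }
                row.add(cells.next());
            }
            results.add(row);
        }
        return results;
    }
}
```

The counts travel in the response message while the cells travel in a separate payload, so the receiver needs both to rebuild the results.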
ClientScanner.next then uses nextScanner to obtain the next region and opens a connection to scan its data.
    do {
      ......................... (some code omitted here)
      if (values != null && values.length > 0) {
        for (Result rs : values) {
          cache.add(rs);
          for (Cell kv : rs.rawCells()) {
            // TODO make method in Cell or CellUtil
            remainingResultSize -= KeyValueUtil.ensureKeyValue(kv).heapSize();
          }
          countdown--;
          this.lastResult = rs;
        }
      }
    } while (remainingResultSize > 0 && countdown > 0 &&
        nextScanner(countdown, values == null));
For this kind of query, a ClientScanner can be obtained and kept open without calling close.
Before the RS timeout expires, a certain amount of data is fetched from the RS at regular intervals
and stored in the ClientScanner's cache.
Each next call then reads directly from the cache.
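The client-side cache described above can be sketched as a queue that is refilled in chunks. This is a simplified model in plain Java, not the ClientScanner code: CachedScannerSketch is a hypothetical name, an iterator of strings stands in for the region server, and each refill counts as one RPC. The extra call a real client needs to learn that the scan is exhausted is not modelled.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

// Simplified model of a caching client scanner (hypothetical, not HBase code).
final class CachedScannerSketch {
    private final Iterator<String> server; // stands in for the remote scanner
    private final int caching;             // rows fetched per simulated RPC
    private final Queue<String> cache = new ArrayDeque<>();
    private int rpcCount = 0;

    CachedScannerSketch(Iterator<String> server, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        this.server = server;
        this.caching = caching;
    }

    /** Returns the next row, refilling the cache when it is empty; null at the end. */
    String next() {
        if (cache.isEmpty()) {
            if (!server.hasNext()) {
                return null;
            }
            rpcCount++; // one simulated round trip fills the cache
            for (int countdown = caching; countdown > 0 && server.hasNext(); countdown--) {
                cache.add(server.next());
            }
        }
        return cache.poll();
    }

    int rpcCount() {
        return rpcCount;
    }
}
```

Reading five rows with caching set to 2 takes three simulated round trips here, compared with five when every next call goes to the server.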
Analysis of HRegion.RegionScannerImpl.nextRaw(List)
RegionScannerImpl is the implementation of the RegionScanner interface.
While a scan runs, the RS fetches data through regionScanner.nextRaw(List)
and uses regionScanner.isFilterDone to check whether the search of this region is finished.
nextRaw calls an overloaded method; batch is the value set on the scan for the maximum number of KVs of a single row that each call may read.
    public boolean nextRaw(List<Cell> outResults)
        throws IOException {
      return nextRaw(outResults, batch);
    }
    public boolean nextRaw(List<Cell> outResults, int limit) throws IOException {
      boolean returnResult;
Call the nextInternal method.
      if (outResults.isEmpty()) {
        // Usually outResults is empty. This is true when next is called
        // to handle scan or get operation.
        returnResult = nextInternal(outResults, limit);
      } else {
        List<Cell> tmpList = new ArrayList<Cell>();
        returnResult = nextInternal(tmpList, limit);
        outResults.addAll(tmpList);
      }
Call filter.reset to clear the filter state that belongs to the current row.
      resetFilters();
If filter.filterAllRemaining() returns true, the search conditions for the current region are exhausted and no further lookup may run.
There is nothing left to search, i.e. there are no more rows to look for.
      if (isFilterDone()) {
        return false;
      }
      ................................ (some code omitted here)
      return returnResult;
    }
The nextInternal processing flow:
    private boolean nextInternal(List<Cell> results, int limit)
        throws IOException {
      if (!results.isEmpty()) {
        throw new IllegalArgumentException("First parameter should be an empty list");
      }
      RpcCallContext rpcCall = RpcServer.getCurrentCall();
      // The loop here is used only when at some point during the next we determine
      // that due to effects of filters or otherwise, we have an empty row in the result.
      // Then we loop and try again. Otherwise, we must get out on the first iteration via return,
      // "true" if there's more data to read, "false" if there isn't (storeHeap is at a stop row,
      // and joinedHeap has no more data to read for the last row (if set, joinedContinuationRow).
      while (true) {
        if (rpcCall != null) {
          // If a user specifies a too-restrictive or too-slow scanner, the
          // client might time out and disconnect while the server side
          // is still processing the request. We should abort aggressively
          // in that case.
          long afterTime = rpcCall.disconnectSince();
          if (afterTime >= 0) {
            throw new CallerDisconnectedException(
                "Aborting on region " + getRegionNameAsString() + ", call " +
                this + " after " + afterTime + " ms, since " +
                "caller disconnected");
          }
        }
Get the smallest current KV after the seek to the startkey.
        // Let's see what we have in the storeHeap.
        KeyValue current = this.storeHeap.peek();
        byte[] currentRow = null;
        int offset = 0;
        short length = 0;
        if (current != null) {
          currentRow = current.getBuffer();
          offset = current.getRowOffset();
          length = current.getRowLength();
        }
Check whether the stopkey has been reached; if so, return false. joinedContinuationRow belongs to the joined lookup across several cfs and can be ignored here.
        boolean stopRow = isStopRow(currentRow, offset, length);
        // Check if we were getting data from the joinedHeap and hit the limit.
        // If not, then it's main path - getting results from storeHeap.
        if (joinedContinuationRow == null) {
          // First, check if we are at a stop row. If so, there are no more results.
          if (stopRow) {
If this is the stopRow and filter.hasFilterRow returns true,
filterRowCells can inspect the KV list to be returned and can also modify it.
            if (filter != null && filter.hasFilterRow()) {
              filter.filterRowCells(results);
            }
            return false;
          }
filter.filterRowKey decides whether the key must be excluded: it returns true to exclude the row and false otherwise.
          // Check if rowkey filter wants to exclude this row. If so, loop to next.
          // Technically, if we hit limits before on this row, we don't need this call.
          if (filterRowKey(currentRow, offset, length)) {
If the rowkey must be excluded, check whether there is a next row.
If there is no next row, return false: the search of the current region is finished.
Otherwise clear the current results and search again.
            boolean moreRows = nextRow(currentRow, offset, length);
            if (!moreRows) return false;
            results.clear();
            continue;
          }
Run next on the StoreScanners of all stores that this scan needs in the region, and put what is found into the results list.
If the row contains several KVs and the number found reaches the limit passed in, an empty sentinel KV, KV_LIMIT, is returned
(the number found has reached limit (batch), the maximum number of KVs scanned per row).
Otherwise the limit has not been reached, but all qualifying KVs of the current row have been found, and the KV now at the top of the heap is returned.
If the returned KV is not KV_LIMIT, it is either null or the first KV of the next row.
          KeyValue nextKv = populateResult(results, this.storeHeap, limit, currentRow, offset,
              length);
When the limit is reached, filter.hasFilterRow must be false,
otherwise an IncompatibleFilterException is thrown.
When the limit is reached, true is returned so that the caller can continue reading.
Tip: if a row in HBase may contain many KVs, it is best to set the batch attribute on the scan; otherwise the search runs until all KVs of the row have been read.
          // Ok, we are good, let's try to get some results from the main heap.
          if (nextKv == KV_LIMIT) {
            if (this.filter != null && filter.hasFilterRow()) {
              throw new IncompatibleFilterException(
                  "Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
            }
            return true; // We hit the limit.
          }
Check whether the end row has been reached. This line shows that the stoprow is exclusive, because nextKv is always the first KV of the next row.
          stopRow = nextKv == null ||
              isStopRow(nextKv.getBuffer(), nextKv.getRowOffset(), nextKv.getRowLength());
          // save that the row was empty before filters applied to it.
          final boolean isEmptyRow = results.isEmpty();
If filter.hasFilterRow returns true,
filterRowCells can inspect the KV list to be returned and can also modify it.
          // We have the part of the row necessary for filtering (all of it, usually).
          // First filter with the filterRow(List).
          if (filter != null && filter.hasFilterRow()) {
            filter.filterRowCells(results);
          }
If the search of the current row found no valid KV, i.e. the results list is empty, check whether there is a next row.
If there is one, search again. Otherwise the end of the current region has been reached and no further search is possible; return false.
          if (isEmptyRow) {
            boolean moreRows = nextRow(currentRow, offset, length);
            if (!moreRows) return false;
            results.clear();
            // This row was totally filtered out, if this is NOT the last row,
            // we should continue on. Otherwise, nothing else to do.
            if (!stopRow) continue;
            return false;
          }
          // Ok, we are done with storeHeap for this row.
          // Now we may need to fetch additional, non-essential data into row.
          // These values are not needed for filter to work, so we postpone their
          // fetch to (possibly) reduce amount of data loads from disk.
          if (this.joinedHeap != null) {
            .................................. (code for the joined lookup, not shown and not analyzed)
          }
        } else {
Joined query across several stores; not analyzed, since it does not normally occur.
          // Populating from the joined heap was stopped by limits, populate some more.
          populateFromJoinedHeap(results, limit);
        }
        // We may have just called populateFromJoinedMap and hit the limits. If that is
        // the case, we need to call it again on the next next() invocation.
        if (joinedContinuationRow != null) {
          return true;
        }
If results is empty after this pass, nothing was found. Check whether there is a next row; if so, search again,
otherwise return false to signal that the search of this region is finished.
        // Finally, we are done with both joinedHeap and storeHeap.
        // Double check to prevent empty rows from appearing in result. It could be
        // the case when SingleColumnValueExcludeFilter is used.
        if (results.isEmpty()) {
          boolean moreRows = nextRow(currentRow, offset, length);
          if (!moreRows) return false;
          if (!stopRow) continue;
        }
If this is not the stoprow, there may be a next row, so next can be called again. Otherwise the search of this region is finished.
        // We are done. Return the result.
        return !stopRow;
      }
    }
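The effect of the limit (batch) on reading a single row can be illustrated with a small sketch. It is plain Java and not the populateResult code: RowBatchSketch, populate and rowOf are hypothetical names, cells are "row/qualifier" strings already sorted by row, and a returned index stands in for the heap position. The sketch assumes the output list is empty on entry, as nextInternal requires.

```java
import java.util.List;

// Simplified model of reading one row in chunks of at most `limit` cells (hypothetical).
final class RowBatchSketch {
    private RowBatchSketch() {}

    /**
     * Copies cells of the row that starts at index `from` into `out`, stopping at the
     * end of that row or once `limit` cells were copied (limit <= 0 means no limit).
     * Returns the index of the first cell that was not read.
     */
    static int populate(List<String> cells, int from, int limit, List<String> out) {
        if (from >= cells.size()) {
            return from;
        }
        String row = rowOf(cells.get(from));
        int i = from;
        while (i < cells.size() && rowOf(cells.get(i)).equals(row)) {
            out.add(cells.get(i));
            i++;
            if (limit > 0 && out.size() == limit) {
                break; // limit reached; the rest of the row is left for the next call
            }
        }
        return i;
    }

    /** Extracts the row part of a "row/qualifier" cell name. */
    static String rowOf(String cell) {
        return cell.substring(0, cell.indexOf('/'));
    }
}
```

For the cells r1/a, r1/b, r1/c, r2/a and a limit of 2, the first call yields r1/a and r1/b, the second yields only r1/c because the row ends, and the third starts the next row.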
How ScanQueryMatcher.match works for a UserScan
For a user scan, the ScanQueryMatcher is created during new RegionScannerImpl(scan, additionalScanners, this);
The matcher instance is created with the following code when the StoreScanner is constructed:
    matcher = new ScanQueryMatcher(scan, scanInfo, columns,
        ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
        oldestUnexpiredTS);
matcher.isUserScan is true in this case.
    public MatchCode match(KeyValue kv) throws IOException {
Check whether the search of the current region is finished. PageFilter, for example, works by controlling this filter method.
      if (filter != null && filter.filterAllRemaining()) {
        return MatchCode.DONE_SCAN;
      }
      byte[] bytes = kv.getBuffer();
      int offset = kv.getOffset();
      int keyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
      offset += KeyValue.ROW_OFFSET;
      int initialOffset = offset;
      short rowLength = Bytes.toShort(bytes, offset, Bytes.SIZEOF_SHORT);
      offset += Bytes.SIZEOF_SHORT;
Check whether the KV passed in belongs to the current row, i.e. whether the rowkey is the same. If the current rowkey is smaller than the rowkey passed in,
next has already moved on to the following row. Return DONE: the search of the current row is finished.
      int ret = this.rowComparator.compareRows(row, this.rowOffset, this.rowLength,
          bytes, offset, rowLength);
      if (ret <= -1) {
        return MatchCode.DONE;
      } else if (ret >= 1) {
If the current rowkey is greater than the rowkey passed in, the KV returned by next is smaller than the current one, so a next-row seek is performed.
        // could optimize this, if necessary?
        // Could also be called SEEK_TO_CURRENT_ROW, but this
        // should be rare/never happens.
        return MatchCode.SEEK_NEXT_ROW;
      }
Whether to skip comparing the remaining KVs of the current row. This is an optimization.
      // optimize case.
      if (this.stickyNextRow)
        return MatchCode.SEEK_NEXT_ROW;
If all columns requested by the scan have been found for the current row, the remaining KVs of the row that the scan does not ask for
are not compared at all; a next-row seek is performed.
      if (this.columns.done()) {
        stickyNextRow = true;
        return MatchCode.SEEK_NEXT_ROW;
      }
      // Passing rowLength
      offset += rowLength;
      // Skipping family
      byte familyLength = bytes[offset];
      offset += familyLength + 1;
      int qualLength = keyLength -
          (offset - initialOffset) - KeyValue.TIMESTAMP_TYPE_SIZE;
Check whether the TTL of the current KV has expired. If it has, check whether the scan has another column: if so, return SEEK_NEXT_COL,
otherwise return SEEK_NEXT_ROW.
      long timestamp = Bytes.toLong(bytes, initialOffset +
          keyLength - KeyValue.TIMESTAMP_TYPE_SIZE);
      // check for early out based on timestamp alone
      if (columns.isDone(timestamp)) {
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      }
      /*
       * The delete logic is pretty complicated now.
       * This is corroborated by the following:
       * 1. The store might be instructed to keep deleted rows around.
       * 2. A scan can optionally see past a delete marker now.
       * 3. If deleted rows are kept, we have to find out when we can
       *    remove the delete markers.
       * 4. Family delete markers are always first (regardless of their TS)
       * 5. Delete markers should not be counted as version
       * 6. Delete markers affect puts of the *same* TS
       * 7. Delete marker need to be version counted together with puts
       *    they affect
       */
      byte type = bytes[initialOffset + keyLength - 1];
If the current KV is a delete marker:
      if (kv.isDelete()) {
This branch is entered: the delete KV is added to the DeleteTracker, which by default is a ScanDeleteTracker.
        if (!keepDeletedCells) {
          // first ignore delete markers if the scanner can do so, and the
          // range does not include the marker
          //
          // during flushes and compactions also ignore delete markers newer
          // than the readpoint of any open scanner, this prevents deleted
          // rows that could still be seen by a scanner from being collected
          boolean includeDeleteMarker = seePastDeleteMarkers ?
              tr.withinTimeRange(timestamp) :
              tr.withinOrAfterTimeRange(timestamp);
          if (includeDeleteMarker
              && kv.getMvccVersion() <= maxReadPointToTrackVersions) {
            this.deletes.add(bytes, offset, qualLength, timestamp, type);
          }
          // Can't early out now, because DelFam come before any other keys
        }
This check is not entered, because a user scan does not retain deleted data.
        if (retainDeletesInOutput
            || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp)
                <= timeToPurgeDeletes)
            || kv.getMvccVersion() > maxReadPointToTrackVersions) {
          // always include or it is not time yet to check whether it is OK
          // to purge deltes or not
          if (!isUserScan) {
            // if this is not a user scan (compaction), we can filter this deletemarker right here
            // otherwise (i.e. a "raw" scan) we fall through to normal version and timerange checking
            return MatchCode.INCLUDE;
          }
        } else if (keepDeletedCells) {
          if (timestamp < earliestPutTs) {
            // keeping delete rows, but there are no puts older than
            // this delete in the store files.
            return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
          }
          // else: fall through and do version counting on the
          // delete markers
        } else {
          return MatchCode.SKIP;
        }
        // note the following next else if...
        // delete marker are not subject to other delete markers
      } else if (!this.deletes.isEmpty()) {
If the DeleteTracker is not empty, i.e. the current row contains delete markers, check whether the current KV has been deleted.
Tip: in a comparison a delete KV sorts before a normal KV, so the delete KV is found first when next runs.
If the KV is deleted, the result depends on the delete type. If a single version was deleted, return SKIP.
Otherwise, if the scan has another column to read, return SEEK_NEXT_COL.
Otherwise the current row has no further KVs to search; return SEEK_NEXT_ROW.
        DeleteResult deleteResult = deletes.isDeleted(bytes, offset, qualLength,
            timestamp);
        switch (deleteResult) {
          case FAMILY_DELETED:
          case COLUMN_DELETED:
            return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
          case VERSION_DELETED:
          case FAMILY_VERSION_DELETED:
            return MatchCode.SKIP;
          case NOT_DELETED:
            break;
          default:
            throw new RuntimeException("UNEXPECTED");
        }
      }
Check whether the KV's timestamp lies within the time range requested by the scan.
      int timestampComparison = tr.compare(timestamp);
If it is beyond the scan's maximum time, return SKIP.
      if (timestampComparison >= 1) {
        return MatchCode.SKIP;
      } else if (timestampComparison <= -1) {
If it is below the scan's minimum time and the scan has another column to read, return SEEK_NEXT_COL.
Otherwise the current row has no further KVs to search; return SEEK_NEXT_ROW.
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      }
Check whether the column of the current KV is in the column list specified by the scan. If it is, the result is INCLUDE.
Otherwise, if the scan has another column to read, return SEEK_NEXT_COL.
Otherwise the current row has no further KVs to search; return SEEK_NEXT_ROW.
      // STEP 1: Check if the column is part of the requested columns
      MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, type);
If the column is one of the columns requested by the scan:
      if (colChecker == MatchCode.INCLUDE) {
        ReturnCode filterResponse = ReturnCode.SKIP;
        // STEP 2: Yes, the column is part of the requested columns. Check if filter is present
        if (filter != null) {
Run filter.filterKeyValue and act on the filter's result.
          // STEP 3: Filter the key value and return if it filters out
          filterResponse = filter.filterKeyValue(kv);
          switch (filterResponse) {
            case SKIP:
              return MatchCode.SKIP;
            case NEXT_COL:
If the scan has another column to read, return SEEK_NEXT_COL.
Otherwise the current row has no further KVs to search; return SEEK_NEXT_ROW.
              return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
            case NEXT_ROW:
              stickyNextRow = true;
              return MatchCode.SEEK_NEXT_ROW;
            case SEEK_NEXT_USING_HINT:
              return MatchCode.SEEK_NEXT_USING_HINT;
            default:
              // It means it is either include or include and seek next
              break;
          }
        }
        /*
         * STEP 4: Reaching this step means the column is part of the requested columns and either
         * the filter is null or the filter has returned INCLUDE or INCLUDE_AND_NEXT_COL response.
         * Now check the number of versions needed. This method call returns SKIP, INCLUDE,
         * INCLUDE_AND_SEEK_NEXT_ROW, INCLUDE_AND_SEEK_NEXT_COL.
         *
         * FilterResponse             ColumnChecker               Desired behavior
         * INCLUDE                    SKIP                        row has already been included, SKIP.
         * INCLUDE                    INCLUDE                     INCLUDE
         * INCLUDE                    INCLUDE_AND_SEEK_NEXT_COL   INCLUDE_AND_SEEK_NEXT_COL
         * INCLUDE                    INCLUDE_AND_SEEK_NEXT_ROW   INCLUDE_AND_SEEK_NEXT_ROW
         * INCLUDE_AND_SEEK_NEXT_COL  SKIP                        row has already been included, SKIP.
         * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE                     INCLUDE_AND_SEEK_NEXT_COL
         * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_COL   INCLUDE_AND_SEEK_NEXT_COL
         * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_ROW   INCLUDE_AND_SEEK_NEXT_ROW
         *
         * In all the above scenarios, we return the column checker return value except for
         * FilterResponse (INCLUDE_AND_SEEK_NEXT_COL) and ColumnChecker(INCLUDE)
         */
This mainly checks whether the KV falls within the maximum number of versions of the scan. At this point the KV is included
unless it exceeds the maximum number of versions to scan or its TTL has expired.
        colChecker = columns.checkVersions(bytes, offset, qualLength, timestamp, type,
            kv.getMvccVersion() > maxReadPointToTrackVersions);
        // Optimize with stickyNextRow
        stickyNextRow = colChecker == MatchCode.INCLUDE_AND_SEEK_NEXT_ROW ? true : stickyNextRow;
        return (filterResponse == ReturnCode.INCLUDE_AND_NEXT_COL &&
            colChecker == MatchCode.INCLUDE) ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL
            : colChecker;
      }
      stickyNextRow = (colChecker == MatchCode.SEEK_NEXT_ROW) ? true
          : stickyNextRow;
      return colChecker;
    }
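The time range step of match can be isolated in a small sketch. It is a simplified model and not the ScanQueryMatcher code: TimeRangeMatchSketch and its Code enum are hypothetical names, and the sketch assumes a half-open range [minStamp, maxStamp). It also folds the "next column or next row" decision into a single SEEK_NEXT_COL value, and INCLUDE here only means the KV passes this one check.

```java
// Simplified model of the timestamp range check (hypothetical, not HBase code).
final class TimeRangeMatchSketch {
    private TimeRangeMatchSketch() {}

    enum Code { INCLUDE, SKIP, SEEK_NEXT_COL }

    /** Assumes minStamp is inclusive and maxStamp is exclusive. */
    static Code check(long timestamp, long minStamp, long maxStamp) {
        if (timestamp >= maxStamp) {
            // Too new: versions are sorted newest first, so an older version of
            // the same column may still fall inside the range.
            return Code.SKIP;
        }
        if (timestamp < minStamp) {
            // Too old: every later version of this column is older still.
            return Code.SEEK_NEXT_COL;
        }
        return Code.INCLUDE;
    }
}
```

The asymmetry follows from the sort order: skipping one KV is enough when it is too new, while a KV that is too old rules out the rest of its column.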