When the information to be searched is stored in a database, Solr cannot query the database directly. Instead, a Solr component must first index that information on the search server, after which it can be served to clients.

1. SolrDispatchFilter

The role of SolrDispatchFilter is to map the request URL onto a handler defined in solrconfig.xml. For example, a request to /solr/mycore/select is resolved to the SolrCore named mycore and then to the handler registered at /select in that core's solrconfig.xml.

The actions it can take are:

  enum Action {
    PASSTHROUGH, FORWARD, RETURN, RETRY, ADMIN, REMOTEQUERY, PROCESS
  }

PASSTHROUGH: pass the request on through the webapp to the Restlet layer.

FORWARD: forward the rewritten URL (without the path prefix and core/collection name) to the Restlet layer.

RETURN: return control with no further processing; this is typically produced after an error has been set and the response already written.

RETRY: retry the request; it is set when no working core is found.

ADMIN: the request targets the admin API and is handled by handleAdminRequest() (see call() below).

REMOTEQUERY: the request is proxied to a remote core via remoteQuery(coreUrl + path, ...) (see call() below).

PROCESS: the request is processed locally by the matching request handler.

Note: the "core" here is a SolrCore obtained from the CoreContainer.

SolrDispatchFilter indirectly implements javax.servlet.Filter; its core logic lives in doFilter():

  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain, boolean retry) throws IOException, ServletException {
    if (!(request instanceof HttpServletRequest)) return;

    AtomicReference<ServletRequest> wrappedRequest = new AtomicReference<>();
    if (!authenticateRequest(request, response, wrappedRequest)) { // the response and status code have already been sent
      return;
    }
    if (wrappedRequest.get() != null) {
      request = wrappedRequest.get();
    }
    if (cores.getAuthenticationPlugin() != null) {
      log.debug("User principal: {}", ((HttpServletRequest) request).getUserPrincipal());
    }

    // No need to even create the HttpSolrCall object if this path is excluded.
    if (excludePatterns != null) {
      String servletPath = ((HttpServletRequest) request).getServletPath();
      for (Pattern p : excludePatterns) {
        Matcher matcher = p.matcher(servletPath);
        if (matcher.lookingAt()) {
          chain.doFilter(request, response);
          return;
        }
      }
    }

    HttpSolrCall call = getHttpSolrCall((HttpServletRequest) request, (HttpServletResponse) response, retry);
    try {
      Action result = call.call();
      switch (result) {
        case PASSTHROUGH:
          chain.doFilter(request, response);
          break;
        case RETRY:
          doFilter(request, response, chain, true);
          break;
        case FORWARD:
          request.getRequestDispatcher(call.getPath()).forward(request, response);
          break;
      }
    } finally {
      call.destroy();
    }
  }

SolrDispatchFilter then delegates to HttpSolrCall's call() method for the actual handling.

2. Handling the request with HttpSolrCall

The HttpSolrCall constructor:

  public HttpSolrCall(SolrDispatchFilter solrDispatchFilter, CoreContainer cores,
                      HttpServletRequest request, HttpServletResponse response, boolean retry) {
    this.solrDispatchFilter = solrDispatchFilter;
    this.cores = cores;
    this.req = request;
    this.response = response;
    this.retry = retry;
    this.requestType = RequestType.UNKNOWN;
    queryParams = SolrRequestParsers.parseQueryString(req.getQueryString());
  }
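
The last line already splits the raw query string into SolrParams. A minimal standalone sketch of that parsing step, assuming the solr-core jar is on the classpath and using a made-up query string:

import org.apache.solr.common.params.SolrParams;
import org.apache.solr.servlet.SolrRequestParsers;

public class QueryStringSketch {
  public static void main(String[] args) {
    // Same call the constructor makes on req.getQueryString(); the string here is hypothetical.
    SolrParams qp = SolrRequestParsers.parseQueryString("q=*:*&rows=10&fl=id,name");
    System.out.println(qp.get("q"));    // *:*
    System.out.println(qp.get("rows")); // 10
  }
}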

The call() method carries out the complete request handling:

  /**
   * This method processes the request.
   */
  public Action call() throws IOException {
    MDCLoggingContext.reset();
    MDCLoggingContext.setNode(cores);

    if (cores == null) {
      sendError(503, "Server is shutting down or failed to initialize");
      return RETURN;
    }

    if (solrDispatchFilter.abortErrorMessage != null) {
      sendError(500, solrDispatchFilter.abortErrorMessage);
      return RETURN;
    }

    try {
      init();
      /* Authorize the request if
         1. Authorization is enabled, and
         2. The requested resource is not a known static file
      */
      if (cores.getAuthorizationPlugin() != null) {
        AuthorizationContext context = getAuthCtx();
        log.info(context.toString());
        AuthorizationResponse authResponse = cores.getAuthorizationPlugin().authorize(context);
        if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && !(authResponse.statusCode == HttpStatus.SC_OK)) {
          sendError(authResponse.statusCode,
              "Unauthorized request, Response code: " + authResponse.statusCode);
          return RETURN;
        }
      }

      HttpServletResponse resp = response;
      switch (action) {
        case ADMIN:
          handleAdminRequest();
          return RETURN;
        case REMOTEQUERY:
          remoteQuery(coreUrl + path, resp);
          return RETURN;
        case PROCESS:
          final Method reqMethod = Method.getMethod(req.getMethod());
          HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
          // unless we have been explicitly told not to, do cache validation
          // if we fail cache validation, execute the query
          if (config.getHttpCachingConfig().isNever304() ||
              !HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) {
            SolrQueryResponse solrRsp = new SolrQueryResponse();
            /* even for HEAD requests, we need to execute the handler to
             * ensure we don't get an error (and to make sure the correct
             * QueryResponseWriter is selected and we get the correct
             * Content-Type)
             */
            SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
            execute(solrRsp);
            HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
            Iterator<Map.Entry<String, String>> headers = solrRsp.httpHeaders();
            while (headers.hasNext()) {
              Map.Entry<String, String> entry = headers.next();
              resp.addHeader(entry.getKey(), entry.getValue());
            }
            QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
            if (invalidStates != null) solrReq.getContext().put(CloudSolrClient.STATE_VERSION, invalidStates);
            writeResponse(solrRsp, responseWriter, reqMethod);
          }
          return RETURN;
        default:
          return action;
      }
    } catch (Throwable ex) {
      sendError(ex);
      // walk the entire cause chain to search for an Error
      Throwable t = ex;
      while (t != null) {
        if (t instanceof Error) {
          if (t != ex) {
            SolrDispatchFilter.log.error("An Error was wrapped in another exception - please report complete stacktrace on SOLR-6161", ex);
          }
          throw (Error) t;
        }
        t = t.getCause();
      }
      return RETURN;
    } finally {
      MDCLoggingContext.clear();
    }
  }

3. Getting the handler

RequestHandlerBase looks up the handler:

  /**
   * Get the request handler registered to a given name.
   *
   * This function is thread safe.
   */
  public static SolrRequestHandler getRequestHandler(String handlerName, PluginBag<SolrRequestHandler> reqHandlers) {
    if (handlerName == null) return null;
    SolrRequestHandler handler = reqHandlers.get(handlerName);
    int idx = 0;
    if (handler == null) {
      for (; ; ) {
        idx = handlerName.indexOf('/', idx + 1);
        if (idx > 0) {
          String firstPart = handlerName.substring(0, idx);
          handler = reqHandlers.get(firstPart);
          if (handler == null) continue;
          if (handler instanceof NestedRequestHandler) {
            return ((NestedRequestHandler) handler).getSubHandler(handlerName.substring(idx));
          }
        } else {
          break;
        }
      }
    }
    return handler;
  }
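
To make the prefix walk concrete, here is a standalone sketch (plain Java, not Solr code) of the same strategy with a plain Map: try the full path first, then walk the '/'-separated prefixes from shortest to longest until a registered handler is found. The handler registrations below are hypothetical:

import java.util.HashMap;
import java.util.Map;

public class HandlerLookupSketch {

  static String lookup(String name, Map<String, String> handlers) {
    String h = handlers.get(name);                          // exact match first
    int idx = 0;
    while (h == null && (idx = name.indexOf('/', idx + 1)) > 0) {
      h = handlers.get(name.substring(0, idx));             // e.g. "/admin" for "/admin/system"
    }
    return h;
  }

  public static void main(String[] args) {
    Map<String, String> handlers = new HashMap<>();
    handlers.put("/select", "SearchHandler");               // hypothetical registrations
    handlers.put("/admin", "AdminHandlers");                // plays the role of a NestedRequestHandler prefix

    System.out.println(lookup("/select", handlers));        // SearchHandler (exact hit)
    System.out.println(lookup("/admin/system", handlers));  // AdminHandlers (prefix hit)
    System.out.println(lookup("/unknown", handlers));       // null
  }
}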

4. Processing the request with handleRequest

RequestHandlerBase's handleRequest() method processes the request:

  public void handleRequest(SolrQueryRequest req, SolrQueryResponse rsp) {
    numRequests.incrementAndGet();
    TimerContext timer = requestTimes.time();
    try {
      if (pluginInfo != null && pluginInfo.attributes.containsKey(USEPARAM))
        req.getContext().put(USEPARAM, pluginInfo.attributes.get(USEPARAM));
      SolrPluginUtils.setDefaults(this, req, defaults, appends, invariants);
      req.getContext().remove(USEPARAM);
      rsp.setHttpCaching(httpCaching);
      handleRequestBody(req, rsp);
      // count timeouts
      NamedList header = rsp.getResponseHeader();
      if (header != null) {
        Object partialResults = header.get("partialResults");
        boolean timedOut = partialResults == null ? false : (Boolean) partialResults;
        if (timedOut) {
          numTimeouts.incrementAndGet();
          rsp.setHttpCaching(false);
        }
      }
    } catch (Exception e) {
      if (e instanceof SolrException) {
        SolrException se = (SolrException) e;
        if (se.code() == SolrException.ErrorCode.CONFLICT.code) {
          // TODO: should we allow this to be counted as an error (numErrors++)?
        } else {
          SolrException.log(SolrCore.log, e);
        }
      } else {
        SolrException.log(SolrCore.log, e);
        if (e instanceof SyntaxError) {
          e = new SolrException(SolrException.ErrorCode.BAD_REQUEST, e);
        }
      }
      rsp.setException(e);
      numErrors.incrementAndGet();
    } finally {
      timer.stop();
    }
  }
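
The SolrPluginUtils.setDefaults() call above layers the handler's configured defaults/appends/invariants under or over the user's request parameters. A rough sketch of the defaults layer alone, using SolrParams.wrapDefaults (the parameter values are made up, and the real method additionally handles appends and invariants):

import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.SolrParams;

public class DefaultsSketch {
  public static void main(String[] args) {
    ModifiableSolrParams userParams = new ModifiableSolrParams();
    userParams.set("q", "title:solr");   // what the client sent

    ModifiableSolrParams defaults = new ModifiableSolrParams();
    defaults.set("q", "*:*");            // what solrconfig.xml configured as defaults
    defaults.set("rows", "10");

    // User parameters win; defaults only fill the gaps.
    SolrParams merged = SolrParams.wrapDefaults(userParams, defaults);
    System.out.println(merged.get("q"));    // title:solr
    System.out.println(merged.get("rows")); // 10
  }
}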

5. The concrete request lands in each handler's handleRequestBody() method; taking DataImportHandler as an example:

  @Override
  @SuppressWarnings("unchecked")
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
      throws Exception {
    rsp.setHttpCaching(false);

    //TODO: figure out why just the first one is OK...
    ContentStream contentStream = null;
    Iterable<ContentStream> streams = req.getContentStreams();
    if (streams != null) {
      for (ContentStream stream : streams) {
        contentStream = stream;
        break;
      }
    }
    SolrParams params = req.getParams();
    NamedList defaultParams = (NamedList) initArgs.get("defaults");
    RequestInfo requestParams = new RequestInfo(req, getParamsMap(params), contentStream);
    String command = requestParams.getCommand();

    if (DataImporter.SHOW_CONF_CMD.equals(command)) {
      String dataConfigFile = params.get("config");
      String dataConfig = params.get("dataConfig");
      if (dataConfigFile != null) {
        dataConfig = SolrWriter.getResourceAsString(req.getCore().getResourceLoader().openResource(dataConfigFile));
      }
      if (dataConfig == null) {
        rsp.add("status", DataImporter.MSG.NO_CONFIG_FOUND);
      } else {
        // Modify incoming request params to add wt=raw
        ModifiableSolrParams rawParams = new ModifiableSolrParams(req.getParams());
        rawParams.set(CommonParams.WT, "raw");
        req.setParams(rawParams);
        ContentStreamBase content = new ContentStreamBase.StringStream(dataConfig);
        rsp.add(RawResponseWriter.CONTENT, content);
      }
      return;
    }

    rsp.add("initArgs", initArgs);
    String message = "";

    if (command != null) {
      rsp.add("command", command);
    }
    // If importer is still null
    if (importer == null) {
      rsp.add("status", DataImporter.MSG.NO_INIT);
      return;
    }

    if (command != null && DataImporter.ABORT_CMD.equals(command)) {
      importer.runCmd(requestParams, null);
    } else if (importer.isBusy()) {
      message = DataImporter.MSG.CMD_RUNNING;
    } else if (command != null) {
      if (DataImporter.FULL_IMPORT_CMD.equals(command)
          || DataImporter.DELTA_IMPORT_CMD.equals(command) ||
          IMPORT_CMD.equals(command)) {
        importer.maybeReloadConfiguration(requestParams, defaultParams);
        UpdateRequestProcessorChain processorChain =
            req.getCore().getUpdateProcessorChain(params);
        UpdateRequestProcessor processor = processorChain.createProcessor(req, rsp);
        SolrResourceLoader loader = req.getCore().getResourceLoader();
        DIHWriter sw = getSolrWriter(processor, loader, requestParams, req);

        if (requestParams.isDebug()) {
          if (debugEnabled) {
            // Synchronous request for the debug mode
            importer.runCmd(requestParams, sw);
            rsp.add("mode", "debug");
            rsp.add("documents", requestParams.getDebugInfo().debugDocuments);
            if (requestParams.getDebugInfo().debugVerboseOutput != null) {
              rsp.add("verbose-output", requestParams.getDebugInfo().debugVerboseOutput);
            }
          } else {
            message = DataImporter.MSG.DEBUG_NOT_ENABLED;
          }
        } else {
          // Asynchronous request for normal mode
          if (requestParams.getContentStream() == null && !requestParams.isSyncMode()) {
            importer.runAsync(requestParams, sw);
          } else {
            importer.runCmd(requestParams, sw);
          }
        }
      } else if (DataImporter.RELOAD_CONF_CMD.equals(command)) {
        if (importer.maybeReloadConfiguration(requestParams, defaultParams)) {
          message = DataImporter.MSG.CONFIG_RELOADED;
        } else {
          message = DataImporter.MSG.CONFIG_NOT_RELOADED;
        }
      }
    }
    rsp.add("status", importer.isBusy() ? "busy" : "idle");
    rsp.add("importResponse", message);
    rsp.add("statusMessages", importer.getStatusMessages());
  }
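
For reference, such a request can be issued from a SolrJ client. This is a hedged sketch, assuming a SolrJ 6.x-style builder, a core named mycore, and DataImportHandler registered at /dataimport in solrconfig.xml (all three are assumptions, not taken from the code above):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class DihFullImportSketch {
  public static void main(String[] args) throws Exception {
    // Core name and handler path are hypothetical.
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
    ModifiableSolrParams p = new ModifiableSolrParams();
    p.set("command", "full-import");   // handleRequestBody() branches on this value
    p.set("clean", "true");            // optional: clear the index before importing
    QueryRequest req = new QueryRequest(p);
    req.setPath("/dataimport");        // send to DataImportHandler instead of /select
    System.out.println(req.process(client));
    client.close();
  }
}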

6. The data import operation

It comes in two flavors, full import and delta import:

  void runCmd(RequestInfo reqParams, DIHWriter sw) {
    String command = reqParams.getCommand();
    if (command.equals(ABORT_CMD)) {
      if (docBuilder != null) {
        docBuilder.abort();
      }
      return;
    }
    if (!importLock.tryLock()) {
      LOG.warn("Import command failed . another import is running");
      return;
    }
    try {
      if (FULL_IMPORT_CMD.equals(command) || IMPORT_CMD.equals(command)) {
        doFullImport(sw, reqParams);
      } else if (command.equals(DELTA_IMPORT_CMD)) {
        doDeltaImport(sw, reqParams);
      }
    } finally {
      importLock.unlock();
    }
  }

Taking the full import as an example:

  public void doFullImport(DIHWriter writer, RequestInfo requestParams) {
    LOG.info("Starting Full Import");
    setStatus(Status.RUNNING_FULL_DUMP);
    try {
      DIHProperties dihPropWriter = createPropertyWriter();
      setIndexStartTime(dihPropWriter.getCurrentTimestamp());
      docBuilder = new DocBuilder(this, writer, dihPropWriter, requestParams);
      checkWritablePersistFile(writer, dihPropWriter);
      docBuilder.execute();
      if (!requestParams.isDebug())
        cumulativeStatistics.add(docBuilder.importStatistics);
    } catch (Exception e) {
      SolrException.log(LOG, "Full Import failed", e);
      docBuilder.handleError("Full Import failed", e);
    } finally {
      setStatus(Status.IDLE);
      DocBuilder.INSTANCE.set(null);
    }
  }

7. EntityProcessorWrapper and the SQL implementation class SqlEntityProcessor

EntityProcessorWrapper.init() sets up the common state and then delegates to the wrapped processor:

  @Override
  public void init(Context context) {
    rowcache = null;
    this.context = context;
    resolver = (VariableResolver) context.getVariableResolver();
    if (entityName == null) {
      onError = resolver.replaceTokens(context.getEntityAttribute(ON_ERROR));
      if (onError == null) onError = ABORT;
      entityName = context.getEntityAttribute(ConfigNameConstants.NAME);
    }
    delegate.init(context);
  }

During initialization this triggers SqlEntityProcessor's own init():

  public void init(Context context) {
    super.init(context);
    dataSource = context.getDataSource();
  }

ContextImpl supplies the data source via getDataSource():

  @Override
  public DataSource getDataSource() {
    if (ds != null) return ds;
    if (epw == null) { return null; }
    if (epw != null && epw.getDatasource() == null) {
      epw.setDatasource(dataImporter.getDataSourceInstance(epw.getEntity(), epw.getEntity().getDataSourceName(), this));
    }
    if (epw != null && epw.getDatasource() != null && docBuilder != null && docBuilder.verboseDebug &&
        Context.FULL_DUMP.equals(currentProcess())) {
      // debug is not yet implemented properly for deltas
      epw.setDatasource(docBuilder.getDebugLogger().wrapDs(epw.getDatasource()));
    }
    return epw.getDatasource();
  }

DataImporter resolves the data source configuration:

  public DataSource getDataSourceInstance(Entity key, String name, Context ctx) {
    Map<String, String> p = requestLevelDataSourceProps.get(name);
    if (p == null)
      p = config.getDataSources().get(name);
    if (p == null)
      p = requestLevelDataSourceProps.get(null); // for default data source
    if (p == null)
      p = config.getDataSources().get(null);
    if (p == null)
      throw new DataImportHandlerException(SEVERE,
          "No dataSource :" + name + " available for entity :" + key.getName());
    String type = p.get(TYPE);
    DataSource dataSrc = null;
    if (type == null) {
      dataSrc = new JdbcDataSource();
    } else {
      try {
        dataSrc = (DataSource) DocBuilder.loadClass(type, getCore()).newInstance();
      } catch (Exception e) {
        wrapAndThrow(SEVERE, e, "Invalid type for data source: " + type);
      }
    }
    try {
      Properties copyProps = new Properties();
      copyProps.putAll(p);
      Map<String, Object> map = ctx.getRequestParameters();
      if (map.containsKey("rows")) {
        int rows = Integer.parseInt((String) map.get("rows"));
        if (map.containsKey("start")) {
          rows += Integer.parseInt((String) map.get("start"));
        }
        copyProps.setProperty("maxRows", String.valueOf(rows));
      }
      dataSrc.init(ctx, copyProps);
    } catch (Exception e) {
      wrapAndThrow(SEVERE, e, "Failed to initialize DataSource: " + key.getDataSourceName());
    }
    return dataSrc;
  }
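
The property map p above normally comes from a <dataSource> element in the DIH configuration. A hedged sketch of the properties that typically reach JdbcDataSource.init() for a MySQL source (the attribute names are the usual DIH ones; the connection values are made up):

import java.util.Properties;

public class JdbcPropsSketch {
  public static void main(String[] args) {
    // Mirrors <dataSource type="JdbcDataSource" driver="..." url="..." user="..." password="..."/>
    Properties props = new Properties();
    props.setProperty("driver", "com.mysql.jdbc.Driver");          // hypothetical driver
    props.setProperty("url", "jdbc:mysql://localhost:3306/mydb");  // hypothetical URL
    props.setProperty("user", "solr");
    props.setProperty("password", "secret");
    // getDataSourceInstance() copies these into copyProps, possibly adds maxRows,
    // and finally calls dataSrc.init(ctx, copyProps).
    System.out.println(props);
  }
}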

8. Querying the results

JdbcDataSource runs the SQL statement through its inner ResultSetIterator:

  public ResultSetIterator(String query) {
    try {
      Connection c = getConnection();
      stmt = c.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
      stmt.setFetchSize(batchSize);
      stmt.setMaxRows(maxRows);
      LOG.debug("Executing SQL: " + query);
      long start = System.nanoTime();
      if (stmt.execute(query)) {
        resultSet = stmt.getResultSet();
      }
      LOG.trace("Time taken for sql :"
          + TimeUnit.MILLISECONDS.convert(System.nanoTime() - start, TimeUnit.NANOSECONDS));
      colNames = readFieldNames(resultSet.getMetaData());
    } catch (Exception e) {
      wrapAndThrow(SEVERE, e, "Unable to execute query: " + query);
    }
    if (resultSet == null) {
      rSetIterator = new ArrayList<Map<String, Object>>().iterator();
      return;
    }
    rSetIterator = new Iterator<Map<String, Object>>() {
      @Override
      public boolean hasNext() {
        return hasnext();
      }

      @Override
      public Map<String, Object> next() {
        return getARow();
      }

      @Override
      public void remove() { /* do nothing */ }
    };
  }

Solr supports both full and delta index builds from a database. The code above traces the full import from end to end; the delta import follows the same path, so it is not repeated here.
