lucene segment会包含所有的索引文件，如tim tip等，可以认为是mini的独立索引

A Lucene index segment can be viewed as a "mini" index or a shard. Each segment is a collection of all needed files for an index, including .tim and .tip. If you list your Lucene index directory, you'll see files belonging to the same segment have the same names with all different types. In fact, if you force a merge, you'll get an index of one single segment.

Each segment contains an index of a subset of your document collection. Lucene usually creates a new segment when new documents are added to a working index, to avoid (or rather delay and batch later) reindexing cost.

When a search is executed, Lucene will fan that query over all segments, and all the index wide statistics required for relevance ranking (such as idf) are combined, so from the client's perspective, the ranking is the same as searching from an index of one segment. Note that the other famous stat, tf, is per-document, so it is already available at the segment reader layer.

Now things get more interesting when you have Lucene indexes across machines (as the case in Solr Cloud, which is one of the distributed search service built on Lucene). Due to performance and complexity, Solr Cloud don't aggregate global stats across clusters (yet), so each machine would use their own stats on the index it holds (which could be consisted of multiple segments :).

摘自：https://www.quora.com/Are-the-individual-tim-and-tip-files-term-dictionaries-of-a-Lucene-index-segment-updated-when-a-new-segment-is-added-to-Lucene

lucene segment会包含所有的索引文件，如tim tip等，可以认为是mini的独立索引的更多相关文章

Solr4.8.0源码分析(9)之Lucene的索引文件(2)
Solr4.8.0源码分析(9)之Lucene的索引文件(2) 一. Segments_N文件一个索引对应一个目录,索引文件都存放在目录里面.Solr的索引文件存放在Solr/Home下的core/ ...
Solr4.8.0源码分析(8)之Lucene的索引文件(1)
Solr4.8.0源码分析(8)之Lucene的索引文件(1) 题记:最近有幸看到觉先大神的Lucene的博客,感觉自己之前学习的以及工作的太为肤浅,所以决定先跟随觉先大神的博客学习下Lucene的原 ...
Solr4.8.0源码分析(12)之Lucene的索引文件(5)
Solr4.8.0源码分析(12)之Lucene的索引文件(5) 1. 存储域数据文件(.fdt和.fdx) Solr4.8.0里面使用的fdt和fdx的格式是lucene4.1的.为了提升压缩比,S ...
Solr4.8.0源码分析(11)之Lucene的索引文件(4)
Solr4.8.0源码分析(11)之Lucene的索引文件(4) 1. .dvd和.dvm文件 .dvm是存放了DocValue域的元数据,比如DocValue偏移量. .dvd则存放了DocValu ...
Solr4.8.0源码分析(10)之Lucene的索引文件(3)
Solr4.8.0源码分析(10)之Lucene的索引文件(3) 1. .si文件 .si文件存储了段的元数据,主要涉及SegmentInfoFormat.java和Segmentinfo.java这 ...
Lucene索引文件组成
Lucene的索引里面存了些什么,如何存放的,也即Lucene的索引文件格式,是读懂Lucene源代码的一把钥匙. 当我们真正进入到Lucene源代码之中的时候,我们会发现: Lucene的索引过程, ...
Lucene索引文件学习
最近在做搜索,抽空看一下lucene,资料挺多的,不过大部分都是3.x了--在对着官方文档大概看一下. 优化后的lucene索引文件(4.9.0) 一.段文件 1.段文件:segments_5p和s ...
lucene大索引文件分布式存储方案
这几天实现了个Lucene分布式检索的模块,采用的分布式方案是将数据分块,分别生成N个索引文件,放到N个节点上运行.检索时,对每一个节点发出查询请求,将N个节点返回的结果归并,然后生成一个新的结果.如 ...
sphinx索引文件进一步说明——最好是结合lucene一起看，直觉告诉我二者本质无异
摘自:http://blog.csdn.net/cangyingzhijia/article/details/8592441 Sphinx使用的文件包括 "sph", " ...

随机推荐

CDOJ 1225 Game Rooms
Game Rooms Time Limit: 4000/4000MS (Java/Others) Memory Limit: 65535/65535KB (Java/Others) Your ...
python013 Python3 条件控制
Python3 条件控制Python条件语句是通过一条或多条语句的执行结果(True或者False)来决定执行的代码块.可以通过下图来简单了解条件语句的执行过程: if 语句Python中if语句的一 ...
oracle 9i/10g/11g(11.2.0.3)安装包和PATCH下载地址汇总
今天上PUB看见一位热心人汇总了这么个地址列表,转发来空间: 把下面的地址复制到讯雷里就可以下载. -------------------------------------------------- ...
[BZOJ4207]Can
[BZOJ4207]Can 试题描述这个问题是源于一个在棋盘上玩的,由Sid Sackson设计的名叫Can't stop的游戏的.这个问题与Can't stop有一定的相似之处,但是不需要玩过Ca ...
Monkey King（左偏树）
洛谷传送门每次给出要争吵的猴子a和b,用并查集判断如果他们是朋友输出-1 如果不是,找出a,b在的堆的根A,B,分别合并A,B的左右孩子,再合并一下. 之后把A,B的数据更改一下:权值除以2,左右孩 ...
java打开本地应用程序(调用cmd)---Runtime用法详解
有时候我们需要借助java程序打开电脑自带的一些程序,可以直接打开或者借助cmd命令窗口打开一些常用的应用程序或者脚本,在cmd窗口执行的命令都可以通过这种方式运行. 例如: package cn.x ...
FastDFS+nginx+php的完整应用[转储]
FastDFS功能简介: FastDFS是一个开源的轻量级分布式文件系统,它对文件进行管理,功能包括:文件存储.文件同步.文件访问(文件上传.文件下载)等,解决了大容量存储和负载均衡的问题.特别适合以 ...
flex里InputText不能输入中文
最近做项目都没做任何的更新,今天突然遇到在flex里的InputText无法进行中文输入,晚上查找了下资料,很多原因说是flashplayer的一个BUG. 在网上找到两种解决办法: 1.会出现这种情 ...
Codeforces Beta Round #57 (Div. 2) E. Enemy is weak
求满足条件的三元组的个数,可以转换求一元组和二元组组成的满足条件的三元组的个数,且对于(x),(y,z),x > y,且x出现的p_x < p_y. x可直接枚举O(n),此时需要往后查询 ...
T1503 愚蠢的宠物 codevs
http://codevs.cn/problem/1503/ 时间限制: 1 s 空间限制: 128000 KB 题目等级 : 黄金 Gold 题目描述 Description 大家都知道,sh ...

lucene segment会包含所有的索引文件，如tim tip等，可以认为是mini的独立索引

lucene segment会包含所有的索引文件，如tim tip等，可以认为是mini的独立索引的更多相关文章

随机推荐

热门专题