Lucene Index, Store, and TermVector options explained
In Lucene 3.0 (the latest release at the time of writing), the Field options are as follows:
Field options for indexing
Index.ANALYZED – use the analyzer to break the Field’s value into a stream of separate tokens and make each token searchable.
Index.NOT_ANALYZED – do index the field, but do not analyze the String. Instead, treat the Field’s entire value as a single token and make that token searchable.
Index.ANALYZED_NO_NORMS – an advanced variant of Index.ANALYZED which does not store norms information in the index.
Index.NOT_ANALYZED_NO_NORMS – just like Index.NOT_ANALYZED, but also do not store norms.
Index.NO – don’t make this field’s value available for searching at all.
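To make the difference between ANALYZED and NOT_ANALYZED concrete, here is a minimal stand-alone sketch. It does not use Lucene: the class IndexModeSketch and both methods are hypothetical names, and the "analyzer" is a simplified stand-in that lowercases the value and splits on non-alphanumeric characters. Real analyzers behave differently depending on which one you choose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-alone model; it does not use or reproduce Lucene's classes.
class IndexModeSketch {

    // ANALYZED-style behaviour with a simplified stand-in analyzer:
    // lowercase the value, then split on runs of non-alphanumeric characters.
    static List<String> analyzed(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // NOT_ANALYZED-style behaviour: the entire value becomes one searchable token.
    static List<String> notAnalyzed(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add(value);
        return tokens;
    }

    public static void main(String[] args) {
        String value = "Lucene-In-Action.pdf";
        System.out.println(analyzed(value));    // [lucene, in, action, pdf]
        System.out.println(notAnalyzed(value)); // [Lucene-In-Action.pdf]
    }
}
```

In this model a search for "action" would find the analyzed field, while the non-analyzed field only matches the exact string "Lucene-In-Action.pdf". That is why identifiers such as file names are usually not analyzed.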
Field options for storing fields
Store.YES – store the value. When the value is stored, the original String in its entirety is recorded in the index and may be retrieved by an IndexReader.
Store.NO – do not store the value. This is often used along with Index.ANALYZED to index a large text field that doesn’t need to be retrieved in its original form.
Field options for term vectors
TermVector.YES – record the unique terms that occurred, and their counts, in each document, but do not store any positions or offsets information.
TermVector.WITH_POSITIONS – record the unique terms and their counts, and also the positions of each occurrence of every term, but no offsets.
TermVector.WITH_OFFSETS – record the unique terms and their counts, with the offsets (start & end character position) of each occurrence of every term, but no positions.
TermVector.WITH_POSITIONS_OFFSETS – store unique terms and their counts, along with positions and offsets.
TermVector.NO – do not store any term vector information.
If Index.NO is specified for a field, then you must also specify TermVector.NO.
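The following sketch shows what the information listed for WITH_POSITIONS_OFFSETS looks like for one document: each unique term, its count, the token position of every occurrence, and the start and end character offsets. It is a stand-alone model rather than Lucene's implementation. TermVectorSketch, TermInfo, and build are hypothetical names, and tokens are split on whitespace only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone model of a per-document term vector; not Lucene code.
class TermVectorSketch {

    // Everything recorded for one unique term in the document.
    static class TermInfo {
        int count;                                      // number of occurrences
        final List<Integer> positions = new ArrayList<>(); // token positions, starting at 0
        final List<int[]> offsets = new ArrayList<>();  // {start, end} character offsets, end exclusive
    }

    // Splits on whitespace and records count, positions and offsets per term.
    static Map<String, TermInfo> build(String text) {
        Map<String, TermInfo> vector = new TreeMap<>();
        int position = 0;
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip whitespace before the next token.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the token itself.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                String term = text.substring(start, i);
                TermInfo info = vector.computeIfAbsent(term, k -> new TermInfo());
                info.count++;
                info.positions.add(position);
                info.offsets.add(new int[] {start, i});
                position++;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, TermInfo> e : build("to be or not to be").entrySet()) {
            TermInfo info = e.getValue();
            System.out.println(e.getKey() + " count=" + info.count + " positions=" + info.positions);
        }
    }
}
```

In terms of this model, TermVector.YES corresponds to keeping only the terms and counts, WITH_POSITIONS adds the positions list, WITH_OFFSETS adds the offsets list, and WITH_POSITIONS_OFFSETS keeps both.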
Some examples of how these options are used:
Index | Store | TermVector | Example usage
NOT_ANALYZED | YES | NO | Identifiers (file names, primary keys), telephone and Social Security numbers, URLs, personal names, dates
ANALYZED | YES | WITH_POSITIONS_OFFSETS | Document title, document abstract
ANALYZED | NO | WITH_POSITIONS_OFFSETS | Document body
NO | YES | NO | Document type, database primary key
NOT_ANALYZED | NO | NO | Hidden keywords
When Lucene builds the inverted index, by default it stores all necessary information to implement the Vector Space model. This model requires the count of every term that occurred in the document, as well as the positions of each occurrence (needed for phrase searches).
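You can tell Lucene to skip indexing term frequencies and positions for a field by calling this method on the Field instance:
Field.setOmitTermFreqAndPositions(true)
Queries that depend on positions, such as phrase queries, will then not work on that field.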
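To see why positions matter for phrase searches, consider this minimal stand-alone sketch. It is not Lucene code: PositionsSketch and phraseMatches are hypothetical names, and the position lists are written by hand for the text "to be or not to be". A two-term phrase matches when some occurrence of the second term sits exactly one position after an occurrence of the first.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of phrase matching on positions; not Lucene code.
class PositionsSketch {

    // Returns true if the second term occurs directly after the first term
    // at least once, judged purely from the recorded token positions.
    static boolean phraseMatches(List<Integer> firstPositions, List<Integer> secondPositions) {
        for (int p : firstPositions) {
            if (secondPositions.contains(p + 1)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Token positions in "to be or not to be".
        List<Integer> to = Arrays.asList(0, 4);
        List<Integer> be = Arrays.asList(1, 5);
        List<Integer> not = Arrays.asList(3);

        System.out.println(phraseMatches(to, be));  // true: "to be"
        System.out.println(phraseMatches(be, not)); // false: "be not" never occurs
    }
}
```

Without the position lists, the index could only say that both terms occur somewhere in the document, which is not enough to decide whether they are adjacent.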
Source: http://www.cnblogs.com/fxjwind/archive/2011/07/04/2097705.html