Solr Suggest组件的使用
- text:搜索关键字域,用户输入的搜索关键字是在该域上进行匹配,使用TextField,并进行store;
- exacttext: 与text的唯一区别是使用StringField并且不进行Store;
- contexts: 该域也是用于过滤的,只不过它为比较次要的过滤条件域;
- key: 用于搜索字域,用户输入的搜索关键字分词后的Term在这个域上进行匹配;
- content: 就是一个Term集合,用于contexts上的域进行TermQuery,在关键词的基础上再加个限制条件让返回的热词列表更符合要求,例如分类,分组等信息(给定限定范围,搜索衬衫,在男装范围内);
- weight:指定一个数字类型(int, long)的域,搜索结果将按照该域进行降序排序;
- payload:存储一个额外信息,以ByteBuf存储(其实就是byte[]方式存入索引),当搜索返回后,可以通过LookupResult结果对象的payload属性返回并反序列化该值。
- allTermRequired: 搜索阶段,是否所有用户输入的关键词都需要全部匹配;
- key:用户输入的搜索关键字,再返回给你
- highlightKey:其实就是经过高亮的搜索关键字文本,假如你在搜索的时候设置了需要关键字高亮
- value:即InputInterator接口中weight方法的返回值,即返回的当前热词的权重值,排序就是根据这个值排的
- payload:就是InputInterator接口中payload方法中指定的payload信息,设计这个payload就是用来让你存一些任意你想存的信息,这就留给你们自己去发挥想象了。
- contexts:同理即InputInterator接口中contexts方法的返回值再原样返回给你。
@Override
public void build(InputIterator iter) throws IOException { if (searcherMgr != null) {
searcherMgr.close();
searcherMgr = null;
} if (writer != null) {
writer.close();
writer = null;
} boolean success = false;
try {
// First pass: build a temporary normal Lucene index,
// just indexing the suggestions as they iterate:
writer = new IndexWriter(dir,
getIndexWriterConfig(getGramAnalyzer(), IndexWriterConfig.OpenMode.CREATE));
//long t0 = System.nanoTime(); // TODO: use threads?
BytesRef text;
while ((text = iter.next()) != null) {
BytesRef payload;
if (iter.hasPayloads()) {
payload = iter.payload();
} else {
payload = null;
} add(text, iter.contexts(), iter.weight(), payload);
} public void add(BytesRef text, Set<BytesRef> contexts, long weight, BytesRef payload) throws IOException {
ensureOpen();
writer.addDocument(buildDocument(text, contexts, weight, payload));
}
private Document buildDocument(BytesRef text, Set<BytesRef> contexts, long weight, BytesRef payload) throws IOException {
String textString = text.utf8ToString();
Document doc = new Document();
FieldType ft = getTextFieldType();
doc.add(new Field(TEXT_FIELD_NAME, textString, ft));
doc.add(new Field("textgrams", textString, ft));
doc.add(new StringField(EXACT_TEXT_FIELD_NAME, textString, Field.Store.NO));
doc.add(new BinaryDocValuesField(TEXT_FIELD_NAME, text));
doc.add(new NumericDocValuesField("weight", weight));
if (payload != null) {
doc.add(new BinaryDocValuesField("payloads", payload));
}
if (contexts != null) {
for(BytesRef context : contexts) {
doc.add(new StringField(CONTEXTS_FIELD_NAME, context, Field.Store.NO));
doc.add(new SortedSetDocValuesField(CONTEXTS_FIELD_NAME, context));
}
}
return doc;
}
private static final Sort SORT = new Sort(new SortField("weight", SortField.Type.LONG, true));
if (allTermsRequired) {
occur = BooleanClause.Occur.MUST;
} else {
occur = BooleanClause.Occur.SHOULD;
}
try (TokenStream ts = queryAnalyzer.tokenStream("", new StringReader(key.toString()))) {
//long t0 = System.currentTimeMillis();
ts.reset();
final CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
final OffsetAttribute offsetAtt = ts.addAttribute(OffsetAttribute.class);
String lastToken = null;
query = new BooleanQuery.Builder();
int maxEndOffset = -1;
matchedTokens = new HashSet<>();
while (ts.incrementToken()) {
if (lastToken != null) {
matchedTokens.add(lastToken);
query.add(new TermQuery(new Term(TEXT_FIELD_NAME, lastToken)), occur);
}
lastToken = termAtt.toString();
if (lastToken != null) {
maxEndOffset = Math.max(maxEndOffset, offsetAtt.endOffset());
}
}
Set<BytesRef> contexts = new HashSet<>();
contexts.add(new BytesRef(region.getBytes("UTF8")));
List<Lookup.LookupResult> results = suggester.lookup(name, contexts, 2, true, false);
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">default</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">suggest</str>
<str name="weightField"></str>
<str name="suggestAnalyzerFieldType">string</str>
<str name="buildOnStartup">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler"
startup="lazy" >
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
writer = new IndexWriter(dir,
getIndexWriterConfig(getGramAnalyzer(), IndexWriterConfig.OpenMode.CREATE));
BytesRef text;
while ((text = iter.next()) != null) {
BytesRef payload;
if (iter.hasPayloads()) {
payload = iter.payload();
} else {
payload = null;
} add(text, iter.contexts(), iter.weight(), payload);
}
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<entity name="gt_brand" query="
select brand_id, brand_name, brand_pinyin, brand_name_second, sort from gt_goods_brand
" >
<field column="brand_id" name="id"/>
<field column="brand_name" name="brand_name"/>
<field column="brand_pinyin" name="brand_pinyin"/>
<field column="brand_name_second" name="brand_name_second"/>
<field column="sort" name="sort"/>
</entity>
Directory indexDir = FSDirectory.open(Paths.get("/Users/xxx/develop/tools/solr-5.5.0/server/solr/suggest/data/index"));
StandardAnalyzer analyzer = new StandardAnalyzer();
AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester(indexDir, analyzer); DirectoryReader directoryReader = DirectoryReader.open(indexDir);
DocumentDictionary documentDictionary = new DocumentDictionary(directoryReader, "brand_pinyin", "sort", "brand_name");
suggester.build(documentDictionary.getEntryIterator()); List<Lookup.LookupResult> cha = suggester.lookup("nijiazhubao", 5, false, false);
for (Lookup.LookupResult lookupResult : cha) {
// System.out.println(lookupResult.key);
// System.out.println(lookupResult.value);
System.out.println(new String(lookupResult.payload.bytes, "UTF8"));
}
<str name="field">brand_pinyin</str>
<str name="weightField">sort</str>
<str name="payloadField">brand_name</str>
<str name="suggestAnalyzerFieldType">string</str>
<str name="buildOnStartup">true</str>
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">default</str>
<str name="lookupImpl">FuzzyLookupFactory</str> <!-- org.apache.solr.spelling.suggest.fst -->
<str name="dictionaryImpl">DocumentDictionaryFactory</str> <!-- org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory -->
<str name="field">category_name</str>
<str name="weightField"></str>
<str name="suggestAnalyzerFieldType">string</str>
</lst>
</searchComponent> <searchComponent name="suggest1" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">default</str>
<str name="lookupImpl">FuzzyLookupFactory</str> <!-- org.apache.solr.spelling.suggest.fst -->
<str name="dictionaryImpl">DocumentDictionaryFactory</str> <!-- org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory -->
<str name="field">brand_name</str>
<str name="weightField"></str>
<str name="suggestAnalyzerFieldType">string</str>
</lst>
</searchComponent> <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">5</str>
</lst>
<arr name="components">
<str>suggest</str>
<str>suggest1</str>
</arr>
</requestHandler>
出现问题:
suggest: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /Users/xxx/develop/tools/solr-5.5.0/server/solr/suggest/data/analyzingInfixSuggesterIndexDir/write.lock
String indexPath = params.get(INDEX_PATH) != null
? params.get(INDEX_PATH).toString()
: DEFAULT_INDEX_PATH;
if (new File(indexPath).isAbsolute() == false) {
indexPath = core.getDataDir() + File.separator + indexPath;
}
Solr Suggest组件的使用的更多相关文章
- solr suggest智能提示配置
目录 配置文件 Java代码 遇到的问题 回到顶部 配置文件 solrconfig.xml <searchComponent name="suggest" class=&qu ...
- Solr各组件之间的关系图
原文地址:http://blog.csdn.net/clj198606061111/article/details/20854419
- solr的suggest模块
solr的suggest模块 solr有个suggest模块,用来实现下拉提醒功能,就是输入了一个文本之后,进行文本建议查找的功能. suggest请求的url http://localhost:89 ...
- Solr 6.7学习笔记(04)-- Suggest
当我们使用baidu或者Google时,你输入很少的字符,就会自动跳出来一些建议选项,在Solr里,我们称之为Suggest,在solrconfig.xml里做一些简单的配置,即可实现这一功能.配置如 ...
- 转载:Solr的自动完成实现方式(第三部分:Suggester方式续)
转自:http://www.cnblogs.com/ibook360/archive/2011/11/30/2269126.html 在之前的两个部分(part1.part2)中,我们学会了如何配置和 ...
- 转载:Solr的自动完成实现方式(第二部分:Suggester方式)
转自:http://www.cnblogs.com/ibook360/archive/2011/11/30/2269077.html 在Solr的自动完成/自动补充实现介绍(第一部分) 中我介绍了怎么 ...
- Solr4.3之检索建议suggest
原文链接:http://www.656463.com/article/Efm26v.htm 很多才学solr的人,都容易把solr spellcheck和solr suggest混淆,误以为他们是一样 ...
- solr入门之solr的拼写检查功能的应用级别尝试
今天主要是收集了些拼写检查方面的资料和 尝试使用一下拼写检查的功能--=遇到了不少问题 拼写检查的四种配置眼下我仅仅算是成功了半个吧 --------------------------------- ...
- Solr入门之(1)前言与概述
一.前言:为何选择Solr 由于搜索引擎功能在门户社区中对提高用户体验有着重在门户社区中涉及大量需要搜索引擎的功能需求,目前在实现搜索引擎的方案上有几种方案可供选择: 1. 基于Lucene自己进行封 ...
随机推荐
- POJ 3468 A Simple Problem with Integers(线段树:区间更新)
http://poj.org/problem?id=3468 题意: 给出一串数,每次在一个区间内增加c,查询[a,b]时输出a.b之间的总和. 思路: 总结一下懒惰标记的用法吧. 比如要对一个区间范 ...
- hdu 2444 The Accomodation of Students 判断二分图+二分匹配
The Accomodation of Students Time Limit: 5000/1000 MS (Java/Others) Memory Limit: 32768/32768 K ( ...
- c++ 将容量设置为容器的长度(shrink_to_fit)
#include <iostream> #include <vector> using namespace std; int main () { vector<); co ...
- android平台蓝牙编程
Android平台支持蓝牙网络协议栈,实现蓝牙设备之间数据的无线传输. 本文档描述了怎样利用android平台提供的蓝牙API去实现蓝牙设备之间的通信,蓝牙设备之间的通信主要包括了四个步骤:设置蓝牙设 ...
- 51nod-1455-dp/缩小范围
1455 宝石猎人 题目来源: CodeForces 基准时间限制:2 秒 空间限制:131072 KB 分值: 40 难度:4级算法题 收藏 关注 苏塞克岛是一个有着30001个小岛的群岛,这 ...
- App6种常见的数据加载设计
App6种常见的数据加载设计 设计师在进行APP设计的设计时,往往会更加专注于界面长什么样,界面和界面之间怎么跳转,给予用户什么样的操作反馈,却偏偏特别容易忽略掉一个比较重要的环节,就是APP数据加载 ...
- BZOJ2314 士兵的放置
树形DP,恩然后就不会了... 先写了个错的离谱程序...果然WA了 然后开始乱搞,欸,对了! 令f[i], g[i], h[i]分别表示i号节点自己放士兵,被儿子上的士兵控制,不被儿子上的士兵控制但 ...
- 006PHP文件处理—— 目录操作 删除目录 删除置顶类型文件
<?php /** * 目录操作 删除目录 删除置顶类型文件 */ //echo rmdir('61') or die('目录删除失败'); //删除一个目录中有其他文件的内容的方法: //第1 ...
- Ctrl+K,Ctrl+D
先按下Ctrl+K,然后按下Ctrl+D可以自动调整代码.
- 20181009-7 选题 Scrum立会报告+燃尽图 06
Scrum立会报告+燃尽图(06)选题 此作业要求参见:https://edu.cnblogs.com/campus/nenu/2018fall/homework/2196 一.小组介绍 组长:刘莹莹 ...