Grouping and facet counts in Lucene with FieldCache

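If you want Lucene to group results, for example by category, you are in for some work: Lucene core (at least in the version this code targets) does not support grouping out of the box.

When you need this feature, you may end up using Solr, the search engine built on Lucene.

You can also code it yourself with FieldCache and a single field, grouping over the index directly. One example is building a category tree, where top-level categories contain subcategories.

This can be fiddly to implement, mainly because Lucene offers little support for it and reference material is scarce.

(All of the code below comes from my own tests; it can be adapted to your needs with small changes.)

// GroupCollector: the collector object used for group counting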

import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;

public class GroupCollector extends Collector {

    private GroupField gf = new GroupField(); // holds the group counts
    private int docBase; // doc ID offset of the current segment
    // FieldCache values, indexed by index-wide doc ID
    private String[] fc;

    @Override
    public boolean acceptsDocsOutOfOrder() {
        // Counting does not depend on the order in which documents arrive
        return true;
    }

    @Override
    public void collect(int doc) throws IOException {
        // doc is relative to the current segment; add docBase to get the index-wide doc ID
        final int docId = doc + this.docBase;
        // Hand the value to GroupField, which counts the documents per distinct value
        this.gf.addValue(this.fc[docId]);
    }

    @Override
    public void setNextReader(IndexReader reader, int docBase) throws IOException {
        // Lucene calls this once per segment with that segment's doc ID offset
        this.docBase = docBase;
    }

    @Override
    public void setScorer(Scorer scorer) throws IOException {
        // Scores are not needed for counting
    }

    public GroupField getGf() {
        return this.gf;
    }

    public void setGf(GroupField gf) {
        this.gf = gf;
    }

    public int getDocBase() {
        return this.docBase;
    }

    public void setDocBase(int docBase) {
        this.docBase = docBase;
    }

    public String[] getFc() {
        return this.fc;
    }

    public void setFc(String[] fc) {
        this.fc = fc;
    }

}
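The interplay between the segment-local `doc`, `docBase`, and the cached value array is easy to get wrong, so here is a Lucene-free sketch of the same bookkeeping. The class name `DocBaseSketch`, the method `countAll`, and the sample data are made up for illustration; the array stands in for what `FieldCache.DEFAULT.getStrings` returns for the top-level reader, and the segment sizes are assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Lucene-free sketch: rebase segment-local doc IDs and count cached field values. */
final class DocBaseSketch {

    /**
     * fieldCache holds one value per index-wide doc ID (hypothetical stand-in for FieldCache).
     * segmentSizes lists the number of documents in each segment, in index order.
     */
    static Map<String, Integer> countAll(String[] fieldCache, int[] segmentSizes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int docBase = 0; // plays the role of the offset passed to setNextReader
        for (int size : segmentSizes) {
            for (int doc = 0; doc < size; doc++) { // doc is segment-local, as in collect(int doc)
                String value = fieldCache[docBase + doc];
                if (value != null && !value.isEmpty()) {
                    counts.merge(value, 1, Integer::sum);
                }
            }
            docBase += size; // the next segment starts where this one ended
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] cache = {"books", "toys", "books", null, "toys", "books"};
        System.out.println(countAll(cache, new int[] {2, 3, 1}));
    }
}
```

If the offset were never updated, every segment would read from the start of the array again and the counts would only reflect the first segment's slice of the cache.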

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Holds the per-field result of the group counting.
 */
public class GroupField {

    /**
     * Field name
     */
    private String name;
    /**
     * List of goods category objects, one per distinct category
     */
    private List<SimpleCategory> values = new ArrayList<SimpleCategory>();
    /**
     * Maps each field value (category name) to its document count
     */
    private Map<String, Integer> countMap = new HashMap<String, Integer>();

    public Map<String, Integer> getCountMap() {
        return this.countMap;
    }

    public void setCountMap(Map<String, Integer> countMap) {
        this.countMap = countMap;
    }

    public String getName() {
        return this.name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public List<SimpleCategory> getValues() {
        return this.values;
    }

    public void setValues(List<SimpleCategory> values) {
        this.values = values;
    }

    /**
     * Builds the list of category objects and counts documents per category.
     *
     * @param value comma-separated record:
     *              categoryId,categoryName,parentId,sortIndex,parentCategoryName
     */
    public void addValue(String value) {
        if ((value == null) || "".equals(value)) return;
        // The field stores several attributes in one string, separated by commas
        final String[] temp = value.split(",");
        // Skip records that do not have all five attributes
        if (temp.length < 5) return;

        if (this.countMap.get(temp[1]) == null) {
            // First document of this category: start the count at 1
            this.countMap.put(temp[1], 1);
            // Build the category object for this group
            final SimpleCategory simpleCategory = new SimpleCategory();

            simpleCategory.setCategoryId(Integer.parseInt(temp[0]));
            simpleCategory.setCategoryName(temp[1]);
            simpleCategory.setParentId(Integer.parseInt(temp[2]));
            simpleCategory.setSortIndex(temp[3]);
            simpleCategory.setParentCategoryName(temp[4]);
            // simpleCategory.setAdImag(temp[5]);
            // simpleCategory.setParentAdImage(temp[6]);
            this.values.add(simpleCategory);
        }
        else {
            this.countMap.put(temp[1], this.countMap.get(temp[1]) + 1);
        }
        // for( String str : temp ){
        // if(countMap.get(str)==null){
        // countMap.put(str,1);
        // values.add(str);
        // }
        // else{
        // countMap.put(str, countMap.get(str)+1);
        // }
        // }
    }
    // class ValueComparator implements Comparator<String>{
    //
    // // public int compare(String value0, String value1) {
    // // if(countMap.get(value0)>countMap.get(value1)){
    // // return -1;
    // // }
    // // else if(countMap.get(value0)<countMap.get(value1)){
    // // return 1;
    // // }
    // // return 0;
    // // }
    // }
}
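The parsing step in `addValue` depends on the record layout `categoryId,categoryName,parentId,sortIndex,parentCategoryName` stored in the field. Below is a standalone sketch of that parsing which returns an empty `Optional` for short records or non-numeric IDs instead of throwing. `CategoryRecordSketch` and its `Parsed` holder are hypothetical names introduced only for this example, not part of the code above.

```java
import java.util.Optional;

/** Sketch: defensive parsing of one comma-separated category record. */
final class CategoryRecordSketch {

    /** Hypothetical holder for the five attributes of a record. */
    static final class Parsed {
        final int categoryId;
        final String categoryName;
        final int parentId;
        final String sortIndex;
        final String parentCategoryName;

        Parsed(int categoryId, String categoryName, int parentId,
               String sortIndex, String parentCategoryName) {
            this.categoryId = categoryId;
            this.categoryName = categoryName;
            this.parentId = parentId;
            this.sortIndex = sortIndex;
            this.parentCategoryName = parentCategoryName;
        }
    }

    static Optional<Parsed> parse(String value) {
        if (value == null || value.isEmpty()) {
            return Optional.empty();
        }
        // limit -1 keeps trailing empty attributes instead of dropping them
        String[] parts = value.split(",", -1);
        if (parts.length < 5) {
            return Optional.empty(); // incomplete record
        }
        try {
            return Optional.of(new Parsed(
                    Integer.parseInt(parts[0].trim()), parts[1],
                    Integer.parseInt(parts[2].trim()), parts[3], parts[4]));
        } catch (NumberFormatException e) {
            return Optional.empty(); // an ID was not a number
        }
    }

    public static void main(String[] args) {
        parse("12,Phones,3,0001,Electronics")
                .ifPresent(p -> System.out.println(p.categoryName + " under " + p.parentCategoryName));
    }
}
```

Whether a bad record should be skipped silently or logged is a design choice; for a facet count, skipping is usually acceptable, but it hides indexing mistakes.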

Build whatever object you want to return yourself:

/**
* Converts the goods category field CategoryIndex stored in the Lucene index into a goods category object.
*
* @author xiaozd
*
*/
public class SimpleCategory extends BaseModel {

private static final long serialVersionUID = -2345212345526771266L;
   private int parentId;
   private int categoryId;
   private String categoryName;
   private String sortIndex;
   private int goodsCount;
   private String parentCategoryName;

public int getParentId() {
return this.parentId;
}

public void setParentId(int parentId) {
this.parentId = parentId;
}

public int getCategoryId() {
return this.categoryId;
}

public void setCategoryId(int categoryId) {
this.categoryId = categoryId;
}

public String getCategoryName() {
return this.categoryName;
}

public void setCategoryName(String categoryName) {
this.categoryName = categoryName;
}

public String getSortIndex() {
return this.sortIndex;
}

public void setSortIndex(String sortIndex) {
this.sortIndex = sortIndex;
}

public static long getSerialversionuid() {
return SimpleCategory.serialVersionUID;
}

public int getGoodsCount() {
return this.goodsCount;
}

public void setGoodsCount(int goodsCount) {
this.goodsCount = goodsCount;
}

public String getParentCategoryName() {
return this.parentCategoryName;
}

public void setParentCategoryName(String parentCategoryName) {
this.parentCategoryName = parentCategoryName;
}

}

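// Needs: java.io.File, java.io.IOException, java.util.ArrayList, java.util.List,
// org.apache.lucene.index.IndexReader, org.apache.lucene.search.FieldCache,
// org.apache.lucene.search.IndexSearcher, org.apache.lucene.search.MatchAllDocsQuery,
// org.apache.lucene.store.FSDirectory
/**
 * Returns all goods categories by grouping over the index.
 *
 * @return one SimpleCategory per distinct category found in the index
 */
public List<SimpleCategory> getGoodsCategory() {

    List<SimpleCategory> values = new ArrayList<SimpleCategory>();
    IndexReader reader = null;
    try {
        reader = IndexReader.open(FSDirectory.open(new File(luceneSearchPath)), true); // only searching, so read-only=true

        // Load the "categoryIndex" field values into the FieldCache
        final String[] fc = FieldCache.DEFAULT.getStrings(reader, "categoryIndex");
        IndexSearcher searcher = new IndexSearcher(reader);
        // GroupCollector is the custom collector that does the group counting
        GroupCollector myCollector = new GroupCollector();
        myCollector.setFc(fc);
        searcher.search(new MatchAllDocsQuery(), myCollector);
        // GroupField holds the result of the group counting
        GroupField gf = myCollector.getGf();
        values = gf.getValues();
        for (SimpleCategory value : values) {
            System.out.println("Category: " + value.getCategoryName()
                    + "  count: " + gf.getCountMap().get(value.getCategoryName())
                    + "  parent category: " + value.getParentCategoryName());
        }

    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Release the index files once the counts have been collected
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    return values;
}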
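The category tree mentioned at the start can be assembled from the flat list that `getGoodsCategory()` returns, since every entry carries its own ID and its parent's ID. The sketch below shows one possible way, grouping entries by parent ID. `CategoryTreeSketch` and `Node` are hypothetical stand-ins (so the example runs without `SimpleCategory` and `BaseModel`), and treating parent ID 0 as "top level" is an assumption about the data, not something the original states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: turn a flat category list into a parent-to-children map. */
final class CategoryTreeSketch {

    /** Hypothetical stand-in for SimpleCategory with only the fields the tree needs. */
    static final class Node {
        final int categoryId;
        final int parentId;
        final String name;

        Node(int categoryId, int parentId, String name) {
            this.categoryId = categoryId;
            this.parentId = parentId;
            this.name = name;
        }
    }

    /** Groups the nodes by parentId; the key is the parent's category ID. */
    static Map<Integer, List<Node>> childrenByParent(List<Node> nodes) {
        Map<Integer, List<Node>> tree = new TreeMap<>();
        for (Node n : nodes) {
            tree.computeIfAbsent(n.parentId, k -> new ArrayList<>()).add(n);
        }
        return tree;
    }

    /** Prints the subtree below parentId, indenting one level per depth. */
    static void print(Map<Integer, List<Node>> tree, int parentId, String indent) {
        for (Node n : tree.getOrDefault(parentId, Collections.<Node>emptyList())) {
            System.out.println(indent + n.name);
            print(tree, n.categoryId, indent + "  ");
        }
    }

    public static void main(String[] args) {
        List<Node> flat = Arrays.asList(
                new Node(1, 0, "Electronics"), // parent 0 = top level (assumed convention)
                new Node(2, 1, "Phones"),
                new Node(3, 1, "Laptops"),
                new Node(4, 0, "Books"));
        print(childrenByParent(flat), 0, "");
    }
}
```

The recursive print assumes the parent links contain no cycles; real data would deserve a depth limit or a visited set.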

http://blog.csdn.net/xiaozhengdong/article/details/7035607
