Lucene: Implementing CRUD Operations

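IndexSearcher indexSearcher = new IndexSearcher(LuceneUtils.getDirectory()); // specify the index directory to use
This line causes thread-safety problems. There should be only one IndexSearcher object globally, so keep a single instance in ArticleDocumentUtils and reference that instead.
For efficiency, IndexWriter also buffers changes in memory, so commit() must be called before they are written to the index files.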

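Index optimization
  Every time data is added, many small files appear in the index directory. Merging these small files improves efficiency.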

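// Optimize: merge the many small files into one large file
LuceneUtils.getIndexWriter().optimize();

// Configure how many small files accumulate before they are merged automatically into one large file (minimum 2, default 10)
LuceneUtils.getIndexWriter().setMergeFactor(3);
The merge is triggered automatically when data is added.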

LuceneUtils.java

 package cn.itcast._util;

 import java.io.File;
 import java.io.IOException;

 import org.apache.lucene.analysis.Analyzer;
 import org.apache.lucene.analysis.standard.StandardAnalyzer;
 import org.apache.lucene.index.IndexWriter;
 import org.apache.lucene.index.IndexWriter.MaxFieldLength;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.FSDirectory;
 import org.apache.lucene.util.Version;

 public class LuceneUtils {

     private static Directory directory; // index directory
     private static Analyzer analyzer; // analyzer (tokenizer)
     private static volatile IndexWriter indexWriter; // volatile so the double-checked locking below is safe

     static {
         try {
             // The index directory should really be read from a configuration file
             directory = FSDirectory.open(new File("./indexDir"));
             analyzer = new StandardAnalyzer(Version.LUCENE_30);
         } catch (IOException e) {
             throw new RuntimeException(e);
         }
     }

     /**
      * Returns the globally unique IndexWriter.
      *
      * @return the shared IndexWriter
      */
     public static IndexWriter getIndexWriter() {
         // Initialize the IndexWriter the first time it is used
         if (indexWriter == null) {
             synchronized (LuceneUtils.class) { // guard against concurrent initialization
                 if (indexWriter == null) {
                     try {
                         indexWriter = new IndexWriter(directory, analyzer, MaxFieldLength.LIMITED);
                         System.out.println("=== IndexWriter initialized ===");
                     } catch (Exception e) {
                         throw new RuntimeException(e);
                     }
                     // Register code that runs before the JVM exits.
                     // Doing this inside the lock ensures the hook is registered only once.
                     Runtime.getRuntime().addShutdownHook(new Thread() {
                         public void run() {
                             try {
                                 indexWriter.close();
                                 System.out.println("=== IndexWriter closed ===");
                             } catch (Exception e) {
                                 throw new RuntimeException(e);
                             }
                         }
                     });
                 }
             }
         }
         return indexWriter;
     }

     public static Directory getDirectory() {
         return directory;
     }

     public static Analyzer getAnalyzer() {
         return analyzer;
     }

 }
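
The lazy, one-time initialization in getIndexWriter() can be tried out without Lucene on the classpath. The following is a minimal sketch, assuming a stand-in resource: `SharedResource` and `ResourceHolder` are hypothetical names that do not appear in the original code, and the resource only counts how often it is constructed. It shows the two details that matter for this pattern: the shared field is `volatile`, and the shutdown hook is registered inside the lock so that it is added exactly once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive, single-instance object such as an IndexWriter.
final class SharedResource implements AutoCloseable {
    static final AtomicInteger CREATED = new AtomicInteger(); // counts constructions
    private volatile boolean closed;

    SharedResource() {
        CREATED.incrementAndGet();
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical holder using double-checked locking on a volatile field.
final class ResourceHolder {
    private static volatile SharedResource instance;

    private ResourceHolder() {
    }

    static SharedResource get() {
        SharedResource local = instance; // single volatile read on the fast path
        if (local == null) {
            synchronized (ResourceHolder.class) {
                local = instance;
                if (local == null) {
                    local = new SharedResource();
                    final SharedResource toClose = local;
                    // Registered inside the lock, so only the creating thread adds a hook.
                    Runtime.getRuntime().addShutdownHook(new Thread(toClose::close));
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Every call to `ResourceHolder.get()` returns the same object, and the resource is closed by the hook when the JVM exits.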

ArticleDocumentUtils.java

 package cn.itcast._util;

 import org.apache.lucene.document.Document;
 import org.apache.lucene.document.Field;
 import org.apache.lucene.document.Field.Index;
 import org.apache.lucene.document.Field.Store;
 import org.apache.lucene.util.NumericUtils;

 import cn.itcast._domain.Article;

 public class ArticleDocumentUtils {

     /**
      * Converts an Article into a Document.
      *
      * @param article the article to convert
      * @return the Document to be indexed
      */
     public static Document articleToDocument(Article article) {
         Document doc = new Document();
         String idStr = NumericUtils.intToPrefixCoded(article.getId()); // always use Lucene's utility class to turn the number into a string
         doc.add(new Field("id", idStr, Store.YES, Index.NOT_ANALYZED)); // note: a unique identifier is normally indexed with Index.NOT_ANALYZED
         doc.add(new Field("title", article.getTitle(), Store.YES, Index.ANALYZED));
         doc.add(new Field("content", article.getContent(), Store.YES, Index.ANALYZED));
         return doc;
     }

     /**
      * Converts a Document back into an Article.
      *
      * @param doc the Document read from the index
      * @return the reconstructed Article
      */
     public static Article documentToArticle(Document doc) {
         Article article = new Article();
         Integer id = NumericUtils.prefixCodedToInt(doc.get("id")); // always use Lucene's utility class to turn the string back into a number
         article.setId(id);
         article.setTitle(doc.get("title"));
         article.setContent(doc.get("content"));
         return article;
     }

 }
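
The comments above insist on converting ids with Lucene's utility class. Two things are at stake: the same encoding has to be used when indexing and when building the Term for delete and update, and plain decimal strings do not sort in numeric order. The sketch below only illustrates the ordering problem. `SortableIds` is a hypothetical helper that uses zero-padding; it is not the prefix-coded format that NumericUtils actually produces.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only (not Lucene's encoding).
final class SortableIds {

    // Fixed-width, zero-padded encoding for non-negative ints.
    static String encode(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("negative ids are not handled in this sketch");
        }
        return String.format(Locale.ROOT, "%010d", id);
    }

    static int decode(String encoded) {
        return Integer.parseInt(encoded);
    }

    // Sorts the ids as plain decimal strings (lexicographic order).
    static List<String> sortedPlain(int... ids) {
        List<String> out = new ArrayList<String>();
        for (int id : ids) {
            out.add(Integer.toString(id));
        }
        Collections.sort(out);
        return out;
    }

    // Sorts the ids as zero-padded strings; lexicographic order now matches numeric order.
    static List<String> sortedEncoded(int... ids) {
        List<String> out = new ArrayList<String>();
        for (int id : ids) {
            out.add(encode(id));
        }
        Collections.sort(out);
        return out;
    }
}
```

With the ids 9, 10 and 2, `sortedPlain` puts "10" before "2", while `sortedEncoded` keeps 2, 9, 10 in numeric order.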

QueryResult.java

 package cn.itcast._domain;

 import java.util.List;

 public class QueryResult {

     private List list; // one page of data
     private int count; // total number of matching records

     public QueryResult(List list, int count) {
         this.list = list;
         this.count = count;
     }

     public List getList() {
         return list;
     }

     public void setList(List list) {
         this.list = list;
     }

     public int getCount() {
         return count;
     }

     public void setCount(int count) {
         this.count = count;
     }

 }
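
Because QueryResult stores a raw List, callers have to cast the result back to List<Article>. A generic variant avoids that cast. This is a minimal sketch of one possible alternative, not the class used by the rest of this post; the name `PagedResult` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical generic counterpart of QueryResult.
final class PagedResult<T> {
    private final List<T> list; // one page of data
    private final int count;    // total number of matching records

    PagedResult(List<T> list, int count) {
        // Defensive copy so later changes to the caller's list do not leak in.
        this.list = Collections.unmodifiableList(new ArrayList<T>(list));
        this.count = count;
    }

    List<T> getList() {
        return list;
    }

    int getCount() {
        return count;
    }
}
```

A caller holding a `PagedResult<Article>` can then iterate over `getList()` directly, with no unchecked cast.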

ArticleIndexDao.java

 package cn.itcast.b_indexdao;

 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.List;

 import org.apache.lucene.document.Document;
 import org.apache.lucene.index.Term;
 import org.apache.lucene.queryParser.MultiFieldQueryParser;
 import org.apache.lucene.queryParser.QueryParser;
 import org.apache.lucene.search.IndexSearcher;
 import org.apache.lucene.search.Query;
 import org.apache.lucene.search.ScoreDoc;
 import org.apache.lucene.search.TopDocs;
 import org.apache.lucene.util.NumericUtils;
 import org.apache.lucene.util.Version;

 import cn.itcast._domain.Article;
 import cn.itcast._domain.QueryResult;
 import cn.itcast._util.ArticleDocumentUtils;
 import cn.itcast._util.LuceneUtils;

 public class ArticleIndexDao {

     /**
      * Saves an article to the index (creates the index entry).
      *
      * @param article the article to index
      */
     public void save(Article article) {
         // 1. Convert the Article into a Document
         Document doc = ArticleDocumentUtils.articleToDocument(article);
         // 2. Add it to the index
         try {
             LuceneUtils.getIndexWriter().addDocument(doc); // add
             LuceneUtils.getIndexWriter().commit(); // commit the change
         } catch (Exception e) {
             throw new RuntimeException(e);
         }
     }

     /**
      * Deletes an index entry.
      *
      * Term: one keyword occurring in one field (as recorded in the index's term dictionary).
      *
      * @param id the id of the article to delete
      */
     public void delete(Integer id) {
         try {
             String idStr = NumericUtils.intToPrefixCoded(id); // always use Lucene's utility class to turn the number into a string
             Term term = new Term("id", idStr);
             LuceneUtils.getIndexWriter().deleteDocuments(term); // delete every Document that contains this Term
             LuceneUtils.getIndexWriter().commit(); // commit the change
         } catch (Exception e) {
             throw new RuntimeException(e);
         }
     }

     /**
      * Updates an index entry.
      *
      * @param article the article holding the new values
      */
     public void update(Article article) {
         try {
             Term term = new Term("id", NumericUtils.intToPrefixCoded(article.getId())); // always use Lucene's utility class to turn the number into a string
             Document doc = ArticleDocumentUtils.articleToDocument(article);
             // An update is a delete followed by an add, equivalent to:
             // indexWriter.deleteDocuments(term);
             // indexWriter.addDocument(doc);
             LuceneUtils.getIndexWriter().updateDocument(term, doc);
             LuceneUtils.getIndexWriter().commit(); // commit the change
         } catch (Exception e) {
             throw new RuntimeException(e);
         }
     }

     /**
      * Search with paging support.
      *
      * @param queryString
      *            the query condition
      * @param first
      *            index in the result list at which to start fetching data
      * @param max
      *            maximum number of records to fetch (if fewer remain, all remaining records are returned)
      *
      * @return one page of data plus the total number of matching records
      */
     public QueryResult search(String queryString, int first, int max) {
         IndexSearcher indexSearcher = null;
         try {
             // 1. Turn the query string into a Query object (searching title and content)
             QueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_30, new String[] { "title", "content" }, LuceneUtils.getAnalyzer());
             Query query = queryParser.parse(queryString);

             // 2. Run the query to get the intermediate result
             indexSearcher = new IndexSearcher(LuceneUtils.getDirectory());
             TopDocs topDocs = indexSearcher.search(query, first + max); // returns at most the top n hits, so n must be large enough to cover the requested page
             int count = topDocs.totalHits; // total number of matching records

             // 3. Process the data
             List<Article> list = new ArrayList<Article>();
             int endIndex = Math.min(first + max, topDocs.scoreDocs.length); // compute the end boundary
             for (int i = first; i < endIndex; i++) { // take only the requested page
                 // Fetch the actual Document by its internal id
                 int docId = topDocs.scoreDocs[i].doc;
                 Document doc = indexSearcher.doc(docId);
                 // Convert the Document into an Article
                 Article article = ArticleDocumentUtils.documentToArticle(doc);
                 list.add(article);
             }

             // 4. Wrap the result and return it
             return new QueryResult(list, count);
         } catch (Exception e) {
             throw new RuntimeException(e);
         } finally {
             // Close the IndexSearcher
             if (indexSearcher != null) {
                 try {
                     indexSearcher.close();
                 } catch (IOException e) {
                     throw new RuntimeException(e);
                 }
             }
         }
     }

 }
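
The paging arithmetic in search() is easy to get wrong, so it helps to look at it in isolation. The sketch below works on an ordinary in-memory list instead of Lucene's scoreDocs array; `PageWindow` and its methods are hypothetical names introduced only for this illustration. It follows the same rule as the method above: request first + max hits, then keep only the indices from first up to the smaller of first + max and the number of hits actually returned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the boundary computation in search().
final class PageWindow {

    // Number of top hits that must be requested so that the page is covered.
    static int hitsToRequest(int first, int max) {
        return first + max;
    }

    // Returns the elements in [first, min(first + max, hits.size())).
    static <T> List<T> slice(List<T> hits, int first, int max) {
        if (first < 0 || max < 0) {
            throw new IllegalArgumentException("first and max must be non-negative");
        }
        int endIndex = Math.min(first + max, hits.size());
        List<T> page = new ArrayList<T>();
        for (int i = first; i < endIndex; i++) {
            page.add(hits.get(i));
        }
        return page;
    }
}
```

With 25 hits and a page size of 10, the third page (first = 20) contains the 5 remaining hits, and a fourth page (first = 30) is empty rather than an error.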

Search without paging

     public List<Article> searchArticle(String condition) {
         // Run the search
         List<Article> list = new ArrayList<Article>();
         IndexSearcher indexSearcher = null;
         try {
             // 1. Turn the query string into a Query object (searching title and content)
             QueryParser queryParser = new MultiFieldQueryParser(
                     Version.LUCENE_30, new String[] { "title", "content" },
                     LuceneUtils.getAnalyzer());
             Query query = queryParser.parse(condition);

             // 2. Run the query to get the intermediate result
             // Open a searcher on the index directory (an IndexWriter cannot be assigned to an IndexSearcher)
             indexSearcher = new IndexSearcher(LuceneUtils.getDirectory());
             TopDocs topDocs = indexSearcher.search(query, 1000); // return at most the top n hits
             int count = topDocs.totalHits;
             System.out.println("scoreDocs.length" + topDocs.scoreDocs.length); // same as count while there are at most 1000 hits
             System.out.println("count" + count);
             ScoreDoc[] scoreDocs = topDocs.scoreDocs;

             // 3. Process the results
             for (int i = 0; i < scoreDocs.length; i++) {
                 ScoreDoc scoreDoc = scoreDocs[i];
                 float score = scoreDoc.score; // relevance score
                 int docId = scoreDoc.doc; // internal id of the Document
                 // Fetch the Document by its id
                 Document document = indexSearcher.doc(docId);
                 // Convert the Document into an Article
                 Article article = ArticleDocumentUtils.documentToArticle(document);
                 list.add(article);
             }
         } catch (Exception e) {
             throw new RuntimeException(e); // keep the cause
         } finally {
             try {
                 if (null != indexSearcher)
                     indexSearcher.close();
             } catch (Exception e) {
                 e.printStackTrace();
             }
         }
         return list;
     }

ArticleIndexDaoTest.java

 package cn.itcast.b_indexdao;

 import java.util.List;

 import org.junit.Test;

 import cn.itcast._domain.Article;
 import cn.itcast._domain.QueryResult;

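 public class ArticleIndexDaoTest {

     private ArticleIndexDao indexDao = new ArticleIndexDao();

     @Test
     public void testSave() {
         // Prepare the data
         Article article = new Article();
         article.setId(1);
         article.setTitle("Setting up the Lucene development environment");
         article.setContent("If an information retrieval system only went to the Internet to look for answers after the user issued a search request, it could not return results within a limited time.");
         // Put it into the index
         indexDao.save(article);
     }

     @Test
     public void testSave_25() {
         for (int i = 1; i <= 25; i++) {
             // Prepare the data
             Article article = new Article();
             article.setId(i);
             article.setTitle("Setting up the Lucene development environment");
             article.setContent("If an information retrieval system only went to the Internet to look for answers after the user issued a search request, it could not return results within a limited time.");
             // Put it into the index
             indexDao.save(article);
         }
     }

     @Test
     public void testDelete() {
         indexDao.delete(1);
     }

     @Test
     public void testUpdate() {
         // Prepare the data
         Article article = new Article();
         article.setId(1);
         article.setTitle("Setting up the Lucene development environment");
         article.setContent("This is the updated content");
         // Update it in the index
         indexDao.update(article);
     }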

     // Paged search
     @Test
     public void testSearch() {
         // Prepare the query condition
         String queryString = "lucene";
         // String queryString = "hibernate";

         // Run the search
         // QueryResult qr = indexDao.search(queryString, 0, 10000);
         // QueryResult qr = indexDao.search(queryString, 0, 10); // page 1, 10 per page
         // QueryResult qr = indexDao.search(queryString, 10, 10); // page 2, 10 per page
         QueryResult qr = indexDao.search(queryString, 20, 10); // page 3, 10 per page

         // Show the results
         System.out.println("Total hits: " + qr.getCount());
         for (Article a : (List<Article>) qr.getList()) {
             System.out.println("------------------------------");
             System.out.println("id = " + a.getId());
             System.out.println("title = " + a.getTitle());
             System.out.println("content = " + a.getContent());
         }
     }

 }

Test for the search without paging

     @Test
     public void testSearchArticle() {
         // Prepare the query condition
         String queryString = "lucene";
         // String queryString = "hibernate";

         // Run the search
         List<Article> list = indexDao.searchArticle(queryString);

         // Show the results
         System.out.println("Total hits: " + list.size());
         for (Article a : list) {
             System.out.println("------------------------------");
             System.out.println("id = " + a.getId());
             System.out.println("title = " + a.getTitle());
             System.out.println("content = " + a.getContent());
         }
     }
