Lucene02--Getting-Started Program

Development environment:

Win10

IDEA

JDK1.8

1. Create a plain Maven project

1.1 Add the dependencies

    <dependencies>
        <!-- JUnit unit testing -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
        <!-- Lucene core library -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>7.4.0</version>
        </dependency>
        <!-- Lucene query parser -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
            <version>7.4.0</version>
        </dependency>
        <!-- Lucene default analyzer library -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
            <version>7.4.0</version>
        </dependency>
        <!-- Lucene result highlighting -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-highlighter</artifactId>
            <version>7.4.0</version>
        </dependency>
    </dependencies>

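1.2 Create the test class

2. Write a document to the index

2.1 Basic workflow

Steps:

2.1.1 Create the index directory object and specify the index location.

2.1.2 Create an IndexWriterConfig object and specify the analyzer.

2.1.3 Create an IndexWriter object:

1) Specify the index directory.

2) Specify the IndexWriterConfig object.

2.1.4 Create a Document object.

2.1.5 Create Field objects and add them to the Document.

2.1.6 Use the IndexWriter to write the Document to the index.

2.1.7 Close the IndexWriter.

2.2 Implementation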

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test;

import java.io.File;
import java.io.IOException;
import java.nio.file.Path;

/**
 * @author PC-Black
 * @version v1.0
 * @date 2019/7/19 10:00
 * @description Getting-started example: writes one document to a Lucene index and queries it
 **/
public class LuceneTest {

    @Test
    public void addOneDoc() throws IOException {
        // 1 Create the index directory object and specify the index location
        // 1.1 Create the index location
        Path path = new File("D:\\lucene").toPath();
        // 1.2 Create the directory object bound to that location
        FSDirectory directory = FSDirectory.open(path);

        // 2 Create the IndexWriterConfig object and specify the analyzer
        // 2.1 Create the analyzer that defines the tokenization rules.
        //     The standard analyzer emits one token per character for Chinese text.
        StandardAnalyzer standardAnalyzer = new StandardAnalyzer();
        // 2.2 Create the writer configuration bound to the analyzer
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(standardAnalyzer);

        // 3 Create the IndexWriter from the index directory and the IndexWriterConfig
        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

        // 4 Create the Document object
        Document document = new Document();

        // 5 Create the Field objects and add them to the Document
        // 5.1 Create the Field objects.
        //     StringField values are indexed as a single term without analysis;
        //     TextField values are run through the analyzer.
        StringField docIdField = new StringField("docId", "1", Field.Store.YES);
        TextField titleField = new TextField("title", "我的祖国", Field.Store.YES);
        TextField contentField = new TextField("content", "我的祖国是一个伟大的国家", Field.Store.YES);
        StringField scoreField = new StringField("score", "100", Field.Store.YES);
        // 5.2 Add the fields to the Document
        document.add(docIdField);
        document.add(titleField);
        document.add(contentField);
        document.add(scoreField);

        // 6 Use the IndexWriter to write the Document to the index
        indexWriter.addDocument(document);

        // 7 Close the IndexWriter
        indexWriter.close();
    }
}

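2.3 Run the addOneDoc() method

2.4 Check the location where the index was generated

2.5 Inspect the index with the Luke tool

Note: the Luke version used here is luke-7.4.0, which matches the Lucene version, so it can open an index created by Lucene 7.4.0. This Luke build was compiled with JDK 9, so JDK 9 is also needed to run the tool.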

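3. Query documents in the index

3.1 Query workflow

Steps:

3.1.1 Create the index directory object and specify the index location.

3.1.2 Create the index reader (IndexReader), passing in the directory object.

3.1.3 Create the index searcher (IndexSearcher), passing in the index reader.

3.1.4 Create a TermQuery object, specifying the field to search and the keyword.

3.1.5 Run the search with the index searcher.

3.1.6 Get the query results, iterate over them, and print them.

3.1.7 Close the IndexReader.

3.2 Implementation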

    @Test
    public void queryDoc() throws IOException {
        // 1 Create the index directory object and specify the index location
        FSDirectory fsDirectory = FSDirectory.open(new File("D:\\lucene").toPath());

        // 2 Create the index reader (IndexReader) from the directory object.
        //   DirectoryReader is the concrete subclass; it is held through the parent type.
        IndexReader indexReader = DirectoryReader.open(fsDirectory);

        // 3 Create the index searcher (IndexSearcher) from the index reader
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        // 4 Create the TermQuery, specifying the field and the keyword.
        //   A TermQuery matches the exact term; the keyword is not analyzed.
        TermQuery termQuery = new TermQuery(new Term("title", "我的"));

        // 5 Run the search. Argument 1: the query; argument 2: maximum number of hits to return
        TopDocs topDocs = indexSearcher.search(termQuery, 10);

        // 6 Get the results, iterate over them, and print them
        // 6.1 Get the hits
        ScoreDoc[] scoreDocs = topDocs.scoreDocs;
        // 6.2 Iterate over the hits
        for (ScoreDoc scoreDoc : scoreDocs) {
            // 6.3 Get Lucene's internal document id (not the stored "docId" field)
            int docId = scoreDoc.doc;
            // 6.4 Use the IndexSearcher to load the Document by that id
            Document document = indexSearcher.doc(docId);
            // 6.5 Read the value of each stored field
            if (null != document) {
                String title = document.get("title");
                String content = document.get("content");
                String score = document.get("score");
                System.out.println("docId=" + docId);
                System.out.println("title=" + title);
                System.out.println("content=" + content);
                System.out.println("score=" + score);
            }
        }

        // 7 Close the IndexReader
        indexReader.close();
    }

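3.3 Run the queryDoc method

With the keyword set to "我的", the console shows that no data was found.

The reason is that the document was indexed with the standard analyzer, which splits Chinese text into single characters. The query targets the title field with the keyword "我的", and a TermQuery only matches an exact term. The title field contains the terms "我", "的", "祖" and "国", but no term "我的", so nothing is found.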
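To make the mismatch concrete, here is a minimal sketch in plain Java (not Lucene code) of the situation described above: the title is split into single-character terms when it is indexed, while the lookup compares the query string against those terms as-is. The class SingleCharIndexSketch and its methods tokenize, add, and termLookup are hypothetical names invented for this illustration; Lucene's real analyzer and index structures are considerably more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of single-character tokenization plus exact term lookup (illustration only). */
public class SingleCharIndexSketch {

    // term -> ids of the documents whose text produced that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Splits text into one token per character, skipping whitespace. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (!Character.isWhitespace(codePoint)) {
                tokens.add(new String(Character.toChars(codePoint)));
            }
            i += Character.charCount(codePoint);
        }
        return tokens;
    }

    /** Indexes a document: every token of the text becomes a term pointing at docId. */
    public void add(int docId, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Exact lookup: the query string is NOT tokenized, it must equal an indexed term. */
    public Set<Integer> termLookup(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        SingleCharIndexSketch index = new SingleCharIndexSketch();
        index.add(1, "我的祖国");
        System.out.println(index.termLookup("我的")); // two characters: no such term
        System.out.println(index.termLookup("我"));   // one character: matches document 1
    }
}
```

Running main prints [] for "我的" and [1] for "我", which mirrors what the console shows for the two keywords.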

Change the keyword to "我" and run the query again: