(1)源码 
https://github.com/medcl/elasticsearch-analysis-ik

 
(2)releases 
https://github.com/medcl/elasticsearch-analysis-ik/releases

(3)复制zip地址 
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.1.1/elasticsearch-analysis-ik-6.1.1.zip

4.2 安装插件

(1)elasticsearch-plugin

[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.1.1/elasticsearch-analysis-ik-6.1.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.1.1/elasticsearch-analysis-ik-6.1.1.zip
[=================================================] 100%
-> Installed analysis-ik
[es@node1 elasticsearch-6.1.1]$ ll plugins/
total 0
drwxr-xr-x 2 es es 199 Jan 7 08:52 analysis-ik
[es@node1 elasticsearch-6.1.1]$

(2)查看目录

[es@node1 elasticsearch-6.1.1]$ ll plugins/analysis-ik/
total 1420
-rw-r--r-- 1 es es 263965 Jan 7 08:52 commons-codec-1.9.jar
-rw-r--r-- 1 es es 61829 Jan 7 08:52 commons-logging-1.2.jar
-rw-r--r-- 1 es es 51658 Jan 7 08:52 elasticsearch-analysis-ik-6.1.1.jar
-rw-r--r-- 1 es es 736658 Jan 7 08:52 httpclient-4.5.2.jar
-rw-r--r-- 1 es es 326724 Jan 7 08:52 httpcore-4.4.4.jar
-rw-r--r-- 1 es es 2666 Jan 7 08:52 plugin-descriptor.properties
[es@node1 elasticsearch-6.1.1]$

4.3 重启elasticsearch

[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch
[2018-01-07T09:01:17,283][INFO ][o.e.n.Node ] [] initializing ...
[2018-01-07T09:01:17,421][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [14.3gb], net total_space [21.9gb], types [rootfs]
[2018-01-07T09:01:17,422][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] heap size [1007.3mb], compressed ordinary object pointers [true]
[2018-01-07T09:01:17,484][INFO ][o.e.n.Node ] node name [cNWkQjt] derived from node ID [cNWkQjt9SzKFNtyx8IIu-A]; set [node.name] to override
[2018-01-07T09:01:17,484][INFO ][o.e.n.Node ] version[6.1.1], pid[3445], build[bd92e7f/2017-12-17T20:23:25.338Z], OS[Linux/3.10.0-514.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_112/25.112-b15]
[2018-01-07T09:01:17,485][INFO ][o.e.n.Node ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/opt/elasticsearch-6.1.1, -Des.path.conf=/opt/elasticsearch-6.1.1/config]
[2018-01-07T09:01:19,000][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [aggs-matrix-stats]
[2018-01-07T09:01:19,000][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [analysis-common]
[2018-01-07T09:01:19,000][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [ingest-common]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-expression]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-mustache]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-painless]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [mapper-extras]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [parent-join]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [percolator]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [reindex]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [repository-url]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [transport-netty4]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [tribe]
[2018-01-07T09:01:19,003][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded plugin [analysis-ik]
[2018-01-07T09:01:21,678][INFO ][o.e.d.DiscoveryModule ] [cNWkQjt] using discovery type [zen]
[2018-01-07T09:01:22,567][INFO ][o.e.n.Node ] initialized
[2018-01-07T09:01:22,568][INFO ][o.e.n.Node ] [cNWkQjt] starting ...
[2018-01-07T09:01:22,803][INFO ][o.e.t.TransportService ] [cNWkQjt] publish_address {192.168.80.131:9300}, bound_addresses {192.168.80.131:9300}
[2018-01-07T09:01:22,837][INFO ][o.e.b.BootstrapChecks ] [cNWkQjt] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-01-07T09:01:25,940][INFO ][o.e.c.s.MasterService ] [cNWkQjt] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{Xvho5gpPTuavakz227C_uA}{192.168.80.131}{192.168.80.131:9300}
[2018-01-07T09:01:25,949][INFO ][o.e.c.s.ClusterApplierService] [cNWkQjt] new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{Xvho5gpPTuavakz227C_uA}{192.168.80.131}{192.168.80.131:9300}, reason: apply cluster state (from master [master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{Xvho5gpPTuavakz227C_uA}{192.168.80.131}{192.168.80.131:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-01-07T09:01:25,993][INFO ][o.e.h.n.Netty4HttpServerTransport] [cNWkQjt] publish_address {192.168.80.131:9200}, bound_addresses {192.168.80.131:9200}
[2018-01-07T09:01:25,993][INFO ][o.e.n.Node ] [cNWkQjt] started
[2018-01-07T09:01:26,077][INFO ][o.w.a.d.Monitor ] try load config from /opt/elasticsearch-6.1.1/config/analysis-ik/IKAnalyzer.cfg.xml
[2018-01-07T09:01:26,799][INFO ][o.e.g.GatewayService ] [cNWkQjt] recovered [2] indices into cluster_state
[2018-01-07T09:01:27,526][INFO ][o.e.c.r.a.AllocationService] [cNWkQjt] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[test][2], [test][0]] ...]).

4.3 测试IK中文分词器的基本功能

(1)ik_smart 
其中pretty本意”漂亮的”,表示以美观的形式打印出JSON格式响应。

GET _analyze?pretty
{
"analyzer": "ik_smart",
"text":"安徽省长江流域"
}

分词结果

{
"tokens": [ { "token": "安徽省", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0 }, { "token": "长江流域", "start_offset": 3, "end_offset": 7, "type": "CN_WORD", "position": 1 } ] }

 
(2)ik_max_word

GET _analyze?pretty
{
"analyzer": "ik_max_word",
"text":"安徽省长江流域"
}

分词结果

{
"tokens": [ { "token": "安徽省", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0 }, { "token": "安徽", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 1 }, { "token": "省长", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 2 }, { "token": "长江流域", "start_offset": 3, "end_offset": 7, "type": "CN_WORD", "position": 3 }, { "token": "长江", "start_offset": 3, "end_offset": 5, "type": "CN_WORD", "position": 4 }, { "token": "江流", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 5 }, { "token": "流域", "start_offset": 5, "end_offset": 7, "type": "CN_WORD", "position": 6 } ] }

(3)新词

GET _analyze?pretty
{
"analyzer": "ik_smart",
"text": "王者荣耀"
}

分词结果

{
"tokens": [ { "token": "王者", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 0 }, { "token": "荣耀", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 1 } ] }

4.4 扩展字典

(1)查看已有词典

[es@node1 analysis-ik]$ pwd
/opt/elasticsearch-6.1.1/config/analysis-ik
[es@node1 analysis-ik]$ ll
total 8260
-rw-rw---- 1 es bigdata 5225922 Jan 7 08:52 extra_main.dic -rw-rw---- 1 es bigdata 63188 Jan 7 08:52 extra_single_word.dic -rw-rw---- 1 es bigdata 63188 Jan 7 08:52 extra_single_word_full.dic -rw-rw---- 1 es bigdata 10855 Jan 7 08:52 extra_single_word_low_freq.dic -rw-rw---- 1 es bigdata 156 Jan 7 08:52 extra_stopword.dic -rw-rw---- 1 es bigdata 625 Jan 7 08:52 IKAnalyzer.cfg.xml -rw-rw---- 1 es bigdata 3058510 Jan 7 08:52 main.dic -rw-rw---- 1 es bigdata 123 Jan 7 08:52 preposition.dic -rw-rw---- 1 es bigdata 1824 Jan 7 08:52 quantifier.dic -rw-rw---- 1 es bigdata 164 Jan 7 08:52 stopword.dic -rw-rw---- 1 es bigdata 192 Jan 7 08:52 suffix.dic -rw-rw---- 1 es bigdata 752 Jan 7 08:52 surname.dic [es@node1 analysis-ik]$

(2)自定义词典

[es@node1 analysis-ik]$ mkdir custom
[es@node1 analysis-ik]$ vi custom/new_word.dic
[es@node1 analysis-ik]$ cat custom/new_word.dic
老铁
王者荣耀
洪荒之力
共有产权房
一带一路
[es@node1 analysis-ik]$

(3)更新配置

[es@node1 analysis-ik]$ vi IKAnalyzer.cfg.xml
[es@node1 analysis-ik]$ cat IKAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer 扩展配置</comment>
<!--用户可以在这里配置自己的扩展字典 -->
<entry key="ext_dict">custom/new_word.dic</entry>
<!--用户可以在这里配置自己的扩展停止词字典-->
<entry key="ext_stopwords"></entry>
<!--用户可以在这里配置远程扩展字典 -->
<!-- <entry key="remote_ext_dict">words_location</entry> -->
<!--用户可以在这里配置远程扩展停止词字典-->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
[es@node1 analysis-ik]$

(4)重启elasticsearch

[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch
[2018-01-07T10:00:23,032][INFO ][o.e.n.Node ] [] initializing ...
[2018-01-07T10:00:23,170][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [14.3gb], net total_space [21.9gb], types [rootfs]
[2018-01-07T10:00:23,171][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] heap size [1007.3mb], compressed ordinary object pointers [true]
[2018-01-07T10:00:23,209][INFO ][o.e.n.Node ] node name [cNWkQjt] derived from node ID [cNWkQjt9SzKFNtyx8IIu-A]; set [node.name] to override
[2018-01-07T10:00:23,210][INFO ][o.e.n.Node ] version[6.1.1], pid[3574], build[bd92e7f/2017-12-17T20:23:25.338Z], OS[Linux/3.10.0-514.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_112/25.112-b15]
[2018-01-07T10:00:23,210][INFO ][o.e.n.Node ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/opt/elasticsearch-6.1.1, -Des.path.conf=/opt/elasticsearch-6.1.1/config]
[2018-01-07T10:00:24,717][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [aggs-matrix-stats]
[2018-01-07T10:00:24,717][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [analysis-common]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [ingest-common]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-expression]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-mustache]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-painless]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [mapper-extras]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [parent-join]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [percolator]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [reindex]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [repository-url]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [transport-netty4]
[2018-01-07T10:00:24,720][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [tribe]
[2018-01-07T10:00:24,720][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded plugin [analysis-ik]
[2018-01-07T10:00:27,866][INFO ][o.e.d.DiscoveryModule ] [cNWkQjt] using discovery type [zen]
[2018-01-07T10:00:28,794][INFO ][o.e.n.Node ] initialized
[2018-01-07T10:00:28,795][INFO ][o.e.n.Node ] [cNWkQjt] starting ...
[2018-01-07T10:00:29,047][INFO ][o.e.t.TransportService ] [cNWkQjt] publish_address {192.168.80.131:9300}, bound_addresses {192.168.80.131:9300}
[2018-01-07T10:00:29,093][INFO ][o.e.b.BootstrapChecks ] [cNWkQjt] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-01-07T10:00:32,210][INFO ][o.e.c.s.MasterService ] [cNWkQjt] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{N6t0NiDmQp2vlrbx-FtcUQ}{192.168.80.131}{192.168.80.131:9300}
[2018-01-07T10:00:32,217][INFO ][o.e.c.s.ClusterApplierService] [cNWkQjt] new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{N6t0NiDmQp2vlrbx-FtcUQ}{192.168.80.131}{192.168.80.131:9300}, reason: apply cluster state (from master [master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{N6t0NiDmQp2vlrbx-FtcUQ}{192.168.80.131}{192.168.80.131:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-01-07T10:00:32,285][INFO ][o.e.h.n.Netty4HttpServerTransport] [cNWkQjt] publish_address {192.168.80.131:9200}, bound_addresses {192.168.80.131:9200}
[2018-01-07T10:00:32,286][INFO ][o.e.n.Node ] [cNWkQjt] started
[2018-01-07T10:00:32,326][INFO ][o.w.a.d.Monitor ] try load config from /opt/elasticsearch-6.1.1/config/analysis-ik/IKAnalyzer.cfg.xml
[2018-01-07T10:00:32,905][INFO ][o.w.a.d.Monitor ] [Dict Loading] custom/new_word.dic
[2018-01-07T10:00:33,279][INFO ][o.e.g.GatewayService ] [cNWkQjt] recovered [2] indices into cluster_state
[2018-01-07T10:00:34,092][INFO ][o.e.c.r.a.AllocationService] [cNWkQjt] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[test][3]] ...]).

从输出信息中可以看到

[Dict Loading] custom/new_word.dic

说明自定义词典已经加载了。

(5)重启Kibana 
重启Kibana后,从新执行下面命令

GET _analyze?pretty
{
"analyzer": "ik_smart",
"text":"王者荣耀"
}

分词结果

{
"tokens": [ { "token": "王者荣耀", "start_offset": 0, "end_offset": 4, "type": "CN_WORD", "position": 0 } ] }

IK分词器插件的更多相关文章

  1. docker上安装elasticsearch和ik分词器插件和header,实现分词功能

    docker run -di --name=tensquare_es -p 9200: -p 9300:9300 elasticsearch:5.6.8 创建elasticsearch容器(如果版本不 ...

  2. es-07-head插件-ik分词器插件

    5.x以后, es对head插件的支持并不是特别好 而且kibana的功能越来越强大, 建议使用kibana 1, head插件安装 在一台机器上安装head插件就可以了 1), 更新,安装依赖 su ...

  3. IK分词器插件elasticsearch-analysis-ik 6.1.1

    http://88250.b3log.org/full-text-search-elasticsearch#b3_solo_h3_0 IK分词器插件 (1)源码 https://github.com/ ...

  4. Linux下,非Docker启动Elasticsearch 6.3.0,安装ik分词器插件,以及使用Kibana测试Elasticsearch,

    Linux下,非Docker启动Elasticsearch 6.3.0 查看java版本,需要1.8版本 java -version yum -y install java 创建用户,因为elasti ...

  5. Elasticsearch 7.x - IK分词器插件(ik_smart,ik_max_word)

    一.安装IK分词器 Elasticsearch也需要安装IK分析器以实现对中文更好的分词支持. 去Github下载最新版elasticsearch-ik https://github.com/medc ...

  6. 通过docker安装elasticsearch和安装ik分词器插件及安装kibana

    前提: 已经安装好docker运行环境: 步骤: 1.安装elasticsearch 6.2.2版本,目前最新版是7.2.0,这里之所以选择6.2.2是因为最新的SpringBoot2.1.6默认支持 ...

  7. windows elasticsearch使用ik分词器插件后启动报错java.security.AccessControlException: access denied ("java.io.FilePermission" "D:...........\plugins\ik-analyzer\config\IKAnalyzer.cfg.xml" "read")

    删除es安装文件夹中空格,遂解决......(哭

  8. Elastic Stack 笔记(二)Elasticsearch5.6 安装 IK 分词器和 Head 插件

    博客地址:http://www.moonxy.com 一.前言 Elasticsearch 作为开源搜索引擎服务器,其核心功能在于索引和搜索数据.索引是把文档写入 Elasticsearch 的过程, ...

  9. Restful认识和 IK分词器的使用

    什么是Restful风格 Restful是一种面向资源的架构风格,可以简单理解为:使用URL定位资源,用HTTP动词(GET,POST,DELETE,PUT)描述操作. 使用Restful的好处: 透 ...

随机推荐

  1. Java中的反射该如何使用?

    1. 什么是反射 反射是一种功能强大且复杂的机制.Java反射说的是在运行状态中,对于任何一个类,我们都能够知道这个类有哪些方法和属性.对于任何一个对象,我们都能够对它的方法和属性进行调用.我们把这种 ...

  2. Yii2配置

    最外层:配置文件,params Yii2导航 <?php NavBar::begin([ 'brandLabel' => '大海', 'brandUrl' => Yii::$app- ...

  3. luoguP1273 有线电视网 [树形dp]

    题目描述 某收费有线电视网计划转播一场重要的足球比赛.他们的转播网和用户终端构成一棵树状结构,这棵树的根结点位于足球比赛的现场,树叶为各个用户终端,其他中转站为该树的内部节点. 从转播站到转播站以及从 ...

  4. Core Data could not fulfill a fault

    做项目的时候在iOS4系统遇到过这样一个crash,console显示的错误信息是"Core Data could not fulfill a fault". 字面意思是什么?&q ...

  5. (转)JAVA国际化

    转:http://www.cnblogs.com/jjtech/archive/2011/02/14/1954291.html 国际化英文单词为:Internationalization,又称I18N ...

  6. (转)在Source Insight中看Python代码

    http://blog.csdn.net/lvming404/archive/2009/03/18/4000394.aspx SI是个很强大的代码查看修改工具,以前用来看C,C++都是相当happy的 ...

  7. 2019 牛客多校第六场 B Shorten IPv6 Address

    题目链接:https://ac.nowcoder.com/acm/contest/886/B 题目大意 给定一个 128 位的二进制 ip 地址,让你以 16 位一组,每组转成 16 进制,用冒号连接 ...

  8. 2019 牛客多校第一场 E ABBA

    题目链接:https://ac.nowcoder.com/acm/contest/881/E 题目大意 问有多少个由 (n + m) 个 ‘A’ 和 (n + m) 个 ‘B’,组成的字符串能被分割成 ...

  9. Git 学习(二)Git 基础

    Git 基础 Git 在保存和对待各种信息的时候与其它版本控制系统如 SVN 等等有很大差异,尽管操作起来的命令形式非常相近,理解这些差异将有助于防止你使用中的困惑. Git 记录的是什么? 如果有使 ...

  10. SpringMVC(day1搭建SpringWebMvc项目)

    MVC和webMVC的区别 Model(模型) 数据模型,提供要展示的数据,因此包含数据和行为,行为是用来处理这些数据的.不过现在一般都分离开来:Value Object(数据) 和 服务层(行为). ...