Download the IK package

https://github.com/medcl/elasticsearch-analysis-ik

https://github.com/medcl/elasticsearch-analysis-ik/releases

Extract

tar -zxvf elasticsearch-analysis-ik-5.4.0.tar.gz

Install

Copy the elasticsearch-analysis-ik-5.4.0 folder into the elasticsearch-5.4.0/plugins directory.

Restart

Restart Elasticsearch so that the plugin is loaded.

Test the IK analyzer

Create an index

PUT test

Test fine-grained tokenization (ik_max_word)

GET test/_analyze?analyzer=ik_max_word
{
  "text":"武汉市长江大桥"
}
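The same request can also be built outside Kibana. The following is a minimal Python sketch, not part of the original walkthrough: the helper name `build_analyze_request` is hypothetical, and it places `analyzer` in the JSON body instead of the query string, on the assumption that this form is also accepted by the 5.x node used here (later Elasticsearch releases are stricter about query-string parameters on `_analyze`). The request is only constructed, not sent.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build (but do not send) an _analyze request.

    Hypothetical helper: the analyzer name travels in the JSON body.
    """
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_analyze_request(
    "http://localhost:9200", "test", "ik_max_word", "武汉市长江大桥"
)
print(req.get_method(), req.full_url)

# Sending it requires a running node with the IK plugin installed:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```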

Response

{
  "tokens": [
    {
      "token": "武汉市",
      "start_offset": 0,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "武汉",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "市长",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "长江大桥",
      "start_offset": 3,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 3
    },
    {
      "token": "长江",
      "start_offset": 3,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "大桥",
      "start_offset": 5,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 5
    }
  ]
}
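To read the response, it helps to see that `start_offset` and `end_offset` are character positions in the original text. The short Python sketch below checks this against the values returned above; the helper name `slice_matches` is hypothetical, and it assumes every character is in the Basic Multilingual Plane, so that Python string indexes line up with the offsets Elasticsearch reports.

```python
text = "武汉市长江大桥"

# (token, start_offset, end_offset) copied from the ik_max_word response.
tokens = [
    ("武汉市", 0, 3),
    ("武汉", 0, 2),
    ("市长", 2, 4),
    ("长江大桥", 3, 7),
    ("长江", 3, 5),
    ("大桥", 5, 7),
]


def slice_matches(source, token_spans):
    """For each token, check that source[start:end] equals the token text."""
    return [source[start:end] == token for token, start, end in token_spans]


results = slice_matches(text, tokens)
for (token, start, end), ok in zip(tokens, results):
    print(f"{token}: text[{start}:{end}] matches = {ok}")
```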

Test coarse-grained tokenization (ik_smart)

GET test/_analyze?analyzer=ik_smart
{
  "text":"武汉市长江大桥"
}

Only two tokens are returned:

{
  "tokens": [
    {
      "token": "武汉市",
      "start_offset": 0,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "长江大桥",
      "start_offset": 3,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}
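One way to see the difference between the two analyzers is to check whether their tokens overlap. In the responses shown here, ik_max_word emits overlapping tokens while ik_smart splits the text into non-overlapping pieces. The sketch below is illustrative only: `has_overlap` is a hypothetical helper, and the token lists are abbreviated copies of the two responses.

```python
def has_overlap(tokens):
    """Return True if any two tokens cover overlapping character ranges."""
    max_end = 0
    spans = sorted((t["start_offset"], t["end_offset"]) for t in tokens)
    for start, end in spans:
        if start < max_end:  # begins before an earlier token has ended
            return True
        max_end = max(max_end, end)
    return False


ik_max_word_tokens = [
    {"token": "武汉市", "start_offset": 0, "end_offset": 3},
    {"token": "武汉", "start_offset": 0, "end_offset": 2},
    {"token": "市长", "start_offset": 2, "end_offset": 4},
    {"token": "长江大桥", "start_offset": 3, "end_offset": 7},
    {"token": "长江", "start_offset": 3, "end_offset": 5},
    {"token": "大桥", "start_offset": 5, "end_offset": 7},
]
ik_smart_tokens = [
    {"token": "武汉市", "start_offset": 0, "end_offset": 3},
    {"token": "长江大桥", "start_offset": 3, "end_offset": 7},
]

print("ik_max_word overlaps:", has_overlap(ik_max_word_tokens))
print("ik_smart overlaps:", has_overlap(ik_smart_tokens))
```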

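Test the IK analyzer with the official example

The example is written as curl commands; when pasted into Kibana, they are automatically converted into the console's REST-style query DSL.

1.create an index

curl -XPUT http://localhost:9200/index

2.create a mapping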

curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
        "properties": {
            "content": {
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_max_word"
            }
        }
}'

3.index some docs

curl -XPOST http://localhost:9200/index/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'

curl -XPOST http://localhost:9200/index/fulltext/2 -d'
{"content":"公安部：各地校车将享最高路权"}
'

curl -XPOST http://localhost:9200/index/fulltext/3 -d'
{"content":"中韩渔警冲突调查：韩警平均每天扣1艘中国渔船"}
'

curl -XPOST http://localhost:9200/index/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'

4.query with highlighting

A highlighting query wraps each matched keyword in the returned results with the strings given in pre_tags and post_tags.

curl -XPOST http://localhost:9200/index/fulltext/_search  -d'
{
    "query" : { "match" : { "content" : "中国" }},
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "content" : {}
        }
    }
}
'

In the returned results, the matched keywords are wrapped in the pre_tags and post_tags:

{
    "took": 14,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 2,
        "hits": [
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "4",
                "_score": 2,
                "_source": {
                    "content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"
                },
                "highlight": {
                    "content": [
                        "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首 "
                    ]
                }
            },
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "3",
                "_score": 2,
                "_source": {
                    "content": "中韩渔警冲突调查：韩警平均每天扣1艘中国渔船"
                },
                "highlight": {
                    "content": [
                        "均每天扣1艘<tag1>中国</tag1>渔船 "
                    ]
                }
            }
        ]
    }
}
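A client consuming this response usually wants the highlighted terms per document. The sketch below shows one way to pull them out with a regular expression; `highlighted_terms` is a hypothetical helper, the response dictionary is an abbreviated copy of the result above, and the default tags assume the first pre_tags/post_tags pair was the one applied, as it is in that result.

```python
import re


def highlighted_terms(search_response, field, pre_tag="<tag1>", post_tag="</tag1>"):
    """Map each hit's _id to the terms wrapped in pre_tag/post_tag for `field`."""
    pattern = re.compile(re.escape(pre_tag) + "(.*?)" + re.escape(post_tag))
    terms = {}
    for hit in search_response["hits"]["hits"]:
        fragments = hit.get("highlight", {}).get(field, [])
        terms[hit["_id"]] = [m for frag in fragments for m in pattern.findall(frag)]
    return terms


# Abbreviated copy of the search response shown above.
response = {
    "hits": {
        "hits": [
            {
                "_id": "4",
                "highlight": {
                    "content": ["<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首 "]
                },
            },
            {
                "_id": "3",
                "highlight": {"content": ["均每天扣1艘<tag1>中国</tag1>渔船 "]},
            },
        ]
    }
}

print(highlighted_terms(response, "content"))
```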
