Still haven't learned CUD (create, update, delete) in ES?

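Recently I had to work with es at my day job, but I wasn't familiar with querying documents in es, so I've spent the past two weeks studying it and skimming through "Elasticsearch: The Definitive Guide". A little slacking off, and another day is gone.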

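es is a real-time distributed search and analytics engine built on Lucene. Today we won't go into its use cases; instead, let's look at creating, updating, and deleting indices (and documents) in es.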

Environment: CentOS 7, Elasticsearch 6.8.3, JDK 8

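(At the time of writing the latest es was version 7, which as far as I could tell needed JDK 11 or later, so I installed es 6.8.3 instead.)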

All the examples below use the student index.

1. Creating an index

PUT http://192.168.197.100:9200/student

{

    "mappings":{

      "_doc":{ //"_doc" is the type; in es6 an index can hold only one type

        "properties":{

          "id": {

            "type": "keyword"

          },

          "name":{

            "type":"text",

            "index":true,

            "analyzer":"standard"

          },

          "age":{

            "type":"integer",

            "fields": {

              "keyword": {

                "type": "keyword",

                "ignore_above":256

              }

            }

          },

          "birthday":{

            "type":"date"

          },

          "gender":{

            "type":"keyword"

          },

          "grade":{

            "type":"text",

            "fields":{

              "keyword":{

                "type":"keyword",

                 "ignore_above":256

              }

            }

          },

          "class":{

            "type":"text",

            "fields":{

              "keyword":{

                "type":"keyword",

                 "ignore_above":256

              }

            }

          }

        }

      }

    },

    "settings":{

      //number of primary shards

      "number_of_shards" : 1,

      //number of replicas per primary shard

      "number_of_replicas" : 1

    }

}

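How the text and keyword types differ:

(1) text: the value is analyzed (split into terms) when it is indexed, and full-text queries against it are analyzed as well; use it for search

(2) keyword: the value is indexed as a single term without analysis; use it for exact matching, sorting, and aggregations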

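The index attribute controls how (and whether) a field is indexed for search. In es 2.x and earlier, string fields took one of three values:

(1) analyzed: the field is analyzed and supports fuzzy, full-text matching, loosely comparable to LIKE in SQL

(2) not_analyzed: the field is indexed as-is and only matches exactly, like "=" in SQL

(3) no: the field is not searchable

Since es 5, string has been replaced by text (analyzed) and keyword (not analyzed), and index takes a boolean: true (searchable, the default) or false.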

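The analyzer attribute sets the analyzer. For Chinese text the ik analyzer is the usual choice, and you can also define a custom analyzer.

number_of_shards is the number of primary shards. It defaults to 5 in es 6.x and cannot be changed once the index has been created.

number_of_replicas is the number of replica copies of each primary shard. It defaults to 1 and can be changed later.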

On success, es returns a JSON string like this:

{"acknowledged": true, "shards_acknowledged": true, "index": "student"}
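As a rough illustration, the create-index call can be built from Python using only the standard library. This is a minimal sketch, not the author's setup: the helper names `build_create_index_request` and `is_acknowledged` are made up for this example, and the mapping is trimmed to two fields for brevity. The request is only constructed here, not sent.

```python
import json
import urllib.request

ES_URL = "http://192.168.197.100:9200"  # host used throughout this post


def build_create_index_request(index, mappings, settings):
    """Build (but do not send) the PUT request that creates an index."""
    body = json.dumps({"mappings": mappings, "settings": settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def is_acknowledged(response_text):
    """Return True when the reply has both acknowledged flags set to true."""
    reply = json.loads(response_text)
    return bool(reply.get("acknowledged")) and bool(reply.get("shards_acknowledged"))


# Hypothetical, trimmed-down version of the student mapping (two fields only).
student_mappings = {
    "_doc": {
        "properties": {
            "id": {"type": "keyword"},
            "name": {"type": "text", "analyzer": "standard"},
        }
    }
}
student_settings = {"number_of_shards": 1, "number_of_replicas": 1}

request = build_create_index_request("student", student_mappings, student_settings)
# To actually send it against a running node:
#   reply = urllib.request.urlopen(request).read().decode("utf-8")
#   print(is_acknowledged(reply))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before anything touches the cluster.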

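How do you inspect the index once it has been created? The request below returns its mapping (a GET on /student alone also returns the settings and aliases):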

GET http://192.168.197.100:9200/student/_mapping

In es6 an index can hold only one type, such as the "_doc" type used above.

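Compared with a relational database, the usual rough analogy is: an index is like a database, a type like a table, a document like a row, and a field like a column.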

2. Updating an index

//Change the number of replicas to 2

PUT http://192.168.197.100:9200/student/_settings

{

  "number_of_replicas":2

}
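The same settings update can be sketched in Python with the standard library. The helper name `build_replica_update` is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

ES_URL = "http://192.168.197.100:9200"  # host used throughout this post


def build_replica_update(index, replicas):
    """Build (but do not send) a PUT to /<index>/_settings that changes the replica count."""
    if replicas < 0:
        raise ValueError("number_of_replicas cannot be negative")
    body = json.dumps({"number_of_replicas": replicas}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


update = build_replica_update("student", 2)
# To send it: urllib.request.urlopen(update).read().decode("utf-8")
```

Note that a replica is never placed on the same node as its primary, so on a single-node setup the extra replicas stay unassigned and the cluster health shows yellow.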

3. Deleting an index

//Delete a single index

DELETE http://192.168.197.100:9200/student

//Delete all indices (this cannot be undone, so use it with care)

DELETE http://192.168.197.100:9200/_all

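4. Comparing the default standard analyzer with the ik analyzer

The default analyzer in es is standard. It splits English text on word boundaries (roughly, whitespace and punctuation) and lowercases each token, but it breaks Chinese text into individual characters, so it is a poor fit for Chinese.

For example, standard on English text:

//This API shows how a piece of text gets analyzed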

POST http://192.168.197.100:9200/_analyze

{

  "text":"the People's Republic of China",

  "analyzer":"standard"

}

The result:

{

    "tokens": [

        {

            "token": "the",

            "start_offset": 0,

            "end_offset": 3,

            "type": "<ALPHANUM>",

            "position": 0

        },

        {

            "token": "people's",

            "start_offset": 4,

            "end_offset": 12,

            "type": "<ALPHANUM>",

            "position": 1

        },

        {

            "token": "republic",

            "start_offset": 13,

            "end_offset": 21,

            "type": "<ALPHANUM>",

            "position": 2

        },

        {

            "token": "of",

            "start_offset": 22,

            "end_offset": 24,

            "type": "<ALPHANUM>",

            "position": 3

        },

        {

            "token": "china",

            "start_offset": 25,

            "end_offset": 30,

            "type": "<ALPHANUM>",

            "position": 4

        }

    ]

}

On Chinese text:

POST http://192.168.197.100:9200/_analyze

{

  "text":"中华人民共和国万岁",

  "analyzer":"standard"

}

The result:

{

    "tokens": [

        {

            "token": "中",

            "start_offset": 0,

            "end_offset": 1,

            "type": "<IDEOGRAPHIC>",

            "position": 0

        },

        {

            "token": "华",

            "start_offset": 1,

            "end_offset": 2,

            "type": "<IDEOGRAPHIC>",

            "position": 1

        },

        {

            "token": "人",

            "start_offset": 2,

            "end_offset": 3,

            "type": "<IDEOGRAPHIC>",

            "position": 2

        },

        {

            "token": "民",

            "start_offset": 3,

            "end_offset": 4,

            "type": "<IDEOGRAPHIC>",

            "position": 3

        },

        {

            "token": "共",

            "start_offset": 4,

            "end_offset": 5,

            "type": "<IDEOGRAPHIC>",

            "position": 4

        },

        {

            "token": "和",

            "start_offset": 5,

            "end_offset": 6,

            "type": "<IDEOGRAPHIC>",

            "position": 5

        },

        {

            "token": "国",

            "start_offset": 6,

            "end_offset": 7,

            "type": "<IDEOGRAPHIC>",

            "position": 6

        },

        {

            "token": "万",

            "start_offset": 7,

            "end_offset": 8,

            "type": "<IDEOGRAPHIC>",

            "position": 7

        },

        {

            "token": "岁",

            "start_offset": 8,

            "end_offset": 9,

            "type": "<IDEOGRAPHIC>",

            "position": 8

        }

    ]

}
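To make the behaviour in the two results above concrete, here is a toy approximation in Python. It is not the algorithm the standard analyzer actually uses (that one follows Unicode text segmentation rules inside Lucene); it is a deliberately simplified regex that assumes only ASCII words and common CJK ideographs, and the name `toy_standard_tokens` is made up for this sketch.

```python
import re

# Either a single CJK ideograph, or a run of ASCII letters/digits
# that may contain inner apostrophes (so "People's" stays one token).
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def toy_standard_tokens(text):
    """Very rough imitation of the standard analyzer: split, then lowercase."""
    return [match.group(0).lower() for match in _TOKEN.finditer(text)]


english = toy_standard_tokens("the People's Republic of China")
chinese = toy_standard_tokens("中华人民共和国万岁")
print(english)
print(chinese)
```

The English sentence comes out as five lowercase words, while the Chinese sentence falls apart into nine single characters, which is why a search for a Chinese word against a standard-analyzed field matches on characters rather than words.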

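The ik analyzer can segment Chinese text into words. It ships two analyzers, ik_smart and ik_max_word.

(1) ik_smart: the coarsest-grained segmentation, producing the fewest tokens

For example: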

POST http://192.168.197.100:9200/_analyze

{

  "text":"中华人民共和国万岁",

  "analyzer":"ik_smart"

}

The result:

{

    "tokens": [

        {

            "token": "中华人民共和国",

            "start_offset": 0,

            "end_offset": 7,

            "type": "CN_WORD",

            "position": 0

        },

        {

            "token": "万岁",

            "start_offset": 7,

            "end_offset": 9,

            "type": "CN_WORD",

            "position": 1

        }

    ]

}

(2) ik_max_word: the finest-grained segmentation, emitting as many words as it can find in the text

For example:

POST http://192.168.197.100:9200/_analyze

{

  "text":"中华人民共和国万岁",

  "analyzer":"ik_max_word"

}

The result:

{

    "tokens": [

        {

            "token": "中华人民共和国",

            "start_offset": 0,

            "end_offset": 7,

            "type": "CN_WORD",

            "position": 0

        },

        {

            "token": "中华人民",

            "start_offset": 0,

            "end_offset": 4,

            "type": "CN_WORD",

            "position": 1

        },

        {

            "token": "中华",

            "start_offset": 0,

            "end_offset": 2,

            "type": "CN_WORD",

            "position": 2

        },

        {

            "token": "华人",

            "start_offset": 1,

            "end_offset": 3,

            "type": "CN_WORD",

            "position": 3

        },

        {

            "token": "人民共和国",

            "start_offset": 2,

            "end_offset": 7,

            "type": "CN_WORD",

            "position": 4

        },

        {

            "token": "人民",

            "start_offset": 2,

            "end_offset": 4,

            "type": "CN_WORD",

            "position": 5

        },

        {

            "token": "共和国",

            "start_offset": 4,

            "end_offset": 7,

            "type": "CN_WORD",

            "position": 6

        },

        {

            "token": "共和",

            "start_offset": 4,

            "end_offset": 6,

            "type": "CN_WORD",

            "position": 7

        },

        {

            "token": "国",

            "start_offset": 6,

            "end_offset": 7,

            "type": "CN_CHAR",

            "position": 8

        },

        {

            "token": "万岁",

            "start_offset": 7,

            "end_offset": 9,

            "type": "CN_WORD",

            "position": 9

        },

        {

            "token": "万",

            "start_offset": 7,

            "end_offset": 8,

            "type": "TYPE_CNUM",

            "position": 10

        },

        {

            "token": "岁",

            "start_offset": 8,

            "end_offset": 9,

            "type": "COUNT",

            "position": 11

        }

    ]

}
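When comparing analyzers it helps to reduce each _analyze reply to just its token strings. The sketch below does that with the Python standard library. The helper names `build_analyze_request` and `token_texts` are hypothetical, and the sample reply is the ik_smart result from this post trimmed to the fields the code reads.

```python
import json
import urllib.request

ES_URL = "http://192.168.197.100:9200"  # host used throughout this post


def build_analyze_request(text, analyzer):
    """Build (but do not send) a POST to /_analyze for the given text and analyzer."""
    body = json.dumps({"text": text, "analyzer": analyzer}, ensure_ascii=False)
    return urllib.request.Request(
        url=f"{ES_URL}/_analyze",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def token_texts(response_text):
    """Extract the token strings, in order, from an _analyze reply."""
    return [entry["token"] for entry in json.loads(response_text)["tokens"]]


# The ik_smart reply shown earlier in this post, trimmed to two fields per token.
IK_SMART_REPLY = (
    '{"tokens": [{"token": "中华人民共和国", "position": 0},'
    ' {"token": "万岁", "position": 1}]}'
)

request = build_analyze_request("中华人民共和国万岁", "ik_smart")
smart_tokens = token_texts(IK_SMART_REPLY)
print(smart_tokens)
```

Running `token_texts` on the ik_max_word reply in the same way gives the twelve overlapping tokens listed above, which makes the difference in granularity easy to see side by side.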

The ik analyzer on English text:

POST http://192.168.197.100:9200/_analyze

{

  "text":"the People's Republic of China",

  "analyzer":"ik_smart"

}

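The result: ik drops the unimportant words (stop words, here "the" and "of"), whereas the standard analyzer keeps them. (My English has decayed to the point where I can't remember what kind of words a, an, and the are; as a Chinese speaker, that's nothing to be proud of.)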

{

    "tokens": [

        {

            "token": "people",

            "start_offset": 4,

            "end_offset": 10,

            "type": "ENGLISH",

            "position": 0

        },

        {

            "token": "s",

            "start_offset": 11,

            "end_offset": 12,

            "type": "ENGLISH",

            "position": 1

        },

        {

            "token": "republic",

            "start_offset": 13,

            "end_offset": 21,

            "type": "ENGLISH",

            "position": 2

        },

        {

            "token": "china",

            "start_offset": 25,

            "end_offset": 30,

            "type": "ENGLISH",

            "position": 3

        }

    ]

}

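5. Adding a document

You can add arbitrary fields; with the default dynamic mapping, es adds any field that is not in the mapping automatically.

//1 is the value of "_id"; it must be unique, and es generates one for you if you POST to /student/_doc without an id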

POST http://192.168.197.100:9200/student/_doc/1

{

  "id":1,

  "name":"tom",

  "age":20,

  "gender":"male",

  "grade":"7",

  "class":"1"

}

6. Updating a document

POST http://192.168.197.100:9200/student/_doc/1/_update

{

  "doc":{

    "name":"jack"

  }

}

7. Deleting a document

//1 is the value of "_id"

DELETE http://192.168.197.100:9200/student/_doc/1
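The three document operations differ only in HTTP method, path, and body, which a small Python sketch makes visible. It uses only the standard library; the helper names (`_request`, `add_document`, `update_document`, `delete_document`) are made up for this example, and the requests are built but not sent.

```python
import json
import urllib.request

ES_URL = "http://192.168.197.100:9200"  # host used throughout this post


def _request(method, path, body=None):
    """Build (but do not send) a request; a body, if given, is encoded as JSON."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(f"{ES_URL}{path}", data=data, method=method, headers=headers)


def add_document(index, doc_id, source):
    """Index a whole document under the given _id."""
    return _request("POST", f"/{index}/_doc/{doc_id}", source)


def update_document(index, doc_id, changed_fields):
    """Partial update: only the fields under "doc" are merged into the stored document."""
    return _request("POST", f"/{index}/_doc/{doc_id}/_update", {"doc": changed_fields})


def delete_document(index, doc_id):
    """Delete the document with the given _id."""
    return _request("DELETE", f"/{index}/_doc/{doc_id}")


add = add_document("student", 1, {"id": 1, "name": "tom", "age": 20})
update = update_document("student", 1, {"name": "jack"})
delete = delete_document("student", 1)
# To send any of them: urllib.request.urlopen(add).read().decode("utf-8")
```

These paths follow the es 6.x style used in this post; es 7 moved partial updates to /student/_update/1, so the update helper would need a different path there.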

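That covers the basics of creating, updating, and deleting an index in es, along with adding, updating, and deleting documents. To keep this post from running too long, document queries will be covered in the next one.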
