elasticsearch入门使用（四）索引、安装IK分词器及增删改查数据

一、查看、创建索引

创建一个名字为user索引：

curl -X PUT 'localhost:9200/stu'

{"acknowledged":true,"shards_acknowledged":true,"index":"stu"}

二、查看索引：http://192.168.56.101:9200/_cat/indices?v IP地址请修改为自己的IP

pri：分片数量 rep：副本集

三、删除索引

curl -X DELETE 'localhost:9200/stu'

{"acknowledged":true}

四、安装ik6.2.2分词器，注意ik的版本最好跟es的版本保持一致

cd /

/usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.2/elasticsearch-analysis-ik-6.2.2.zip

重新启动elasticsearch

systemctl restart elasticsearch

测试IK中文分词器是否安装成功

curl -XGET -H 'Content-Type: application/json' 'http://localhost:9200/_analyze?pretty' -d '{ "analyzer" : "ik_max_word", "text": "中华人民共和国国歌" }'

返回json

{

  "tokens" : [

    { "token" : "中华人民共和国",  "start_offset" : 0,  "end_offset" : 7,  "type" : "CN_WORD", "position" : 0  },

    { "token" : "中华人民",  "start_offset" : 0,  "end_offset" : 4, "type" : "CN_WORD",  "position" : 1  },

    { "token" : "中华",  "start_offset" : 0,  "end_offset" : 2,  "type" : "CN_WORD",  "position" : 2 },

    { "token" : "华人", "start_offset" : 1,  "end_offset" : 3, "type" : "CN_WORD", "position" : 3 },

    { "token" : "人民共和国",  "start_offset" : 2,  "end_offset" : 7,  "type" : "CN_WORD", "position" : 4 },

    { "token" : "人民",  "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD",  "position" : 5 },

    { "token" : "共和国",  "start_offset" : 4,  "end_offset" : 7, "type" : "CN_WORD",  "position" : 6 },

    { "token" : "共和", "start_offset" : 4,  "end_offset" : 6, "type" : "CN_WORD",  "position" : 7  },

    { "token" : "国",  "start_offset" : 6, "end_offset" : 7,  "type" : "CN_CHAR",  "position" : 8 },

    { "token" : "国歌",  "start_offset" : 7,  "end_offset" : 9,  "type" : "CN_WORD", "position" : 9 }

  ]

}

五、设置索引

假设这个是我们的数据结构，数据类型覆盖还是比较全

stu:索引名称

person：Type名称

analyzer：字段文本分词器，默认是analyzed

search_analyzer：搜索分词器，默认是analyzed

ik_max_word：中文分词器

curl -XGET -H 'Content-Type: application/json' 'http://127.0.0.1:9200/stu' -d '

{

  "mappings": {

    "person": {

      "dynamic":true,

      "dynamic_date_formats":["yyyy-MM-dd hh:mm:ss", "yyyy-MM-dd" ],

      "properties": {

        "id": { "type": "integer",  "store":true  },

        "name": { "type": "text", "store": true },

        "cname": { "type": "text",  "analyzer": "ik_max_word", "search_analyzer": "ik_max_word", "store": true },

        "age": { "type": "integer"  },

        "score": { "type": "float"  },

        "email": { "type": "text", "store":true },

        "birthday": {  "type": "date", "format":"yyyy-MM-dd","store":true },

        "regdate": { "type": "date", "format":"yyyy-MM-dd hh:mm:ss","store":true  },

        "city": { "type": "keyword", "analyzer": "keyword", "store":true },

        "address": { "type": "text", "analyzer": "ik_max_word" }

      }

    }

  }

}'

六、新增数据

curl -XPOST -H 'Content-Type: application/json' '127.0.0.1:9200/stu/person' -d '

{

  "id": "11",

  "name": "zhang san",

  "cnname": "张三",

  "age": 20,

  "score":80.8,

  "email":"zhang.san@163.com",

  "birthday":"2000-03-03",

  "regdate":"2018-03-03T15:33:33Z",

  "city":"PEK",

  "address":"上海市闸北区保德路389号"

}'

'127.0.0.1:9200/stu/person' 不指定的话会分配一个ID，如下"_id":"aiO0EWIB1IWtAj8my_8s"

{"_index":"stu","_type":"person","_id":"aiO0EWIB1IWtAj8my_8s","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

注意：以下参数如果在person后指定id=abc123的话会根据这个ID更新或者新增数据,result=created/updated

curl -XPOST -H 'Content-Type: application/json' '127.0.0.1:9200/stu/person/abc123' -d '

{

  "id": "111",

  "name": "li si",

  "cnname": "李四",

  "age": 21,

  "score":98.9,

  "email":"lisi@qq.com",

  "birthday":"2008-03-03",

  "regdate":"2019-03-03T15:33:33Z",

  "city":"SHA",

  "address":"江苏省苏州市园区现代大道188号"

}'

{"_index":"stu","_type":"person","_id":"abc123","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

六、删除数据

指定具体的_id删除

curl -X DELETE 'localhost:9200/stu/person/Dpts-2EBY6Wnp0_K3NkH'

根据Query DSL删除，参考语法。官方：Delete By Query API

curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/_delete_by_query?pretty' -d '

{

  "query": {

    "bool":{

      "filter": [

        {"term":{ "city": "pek" }}

      ]

    }

  }

}'

七、修改数据

注意：更新操作会重新更新索引

带ID全部字段更新(参考上面带ID新增数据，同样的道理)

略...
带ID部分字段更新 Elasticsearch Reference [6.2] » Document APIs » Update API

注意doc和script不能同时在一次请求里POST

更新id=abc123设置name="lisi" ,并新增一个属性 "bugs": 0

curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/abc123/_update?pretty' -d '

{

    "doc" : {

      "name" : "lisi",

      "bugs": 0

   }

}'

更新id=abc123设置年龄+=4

curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/abc123/_update?pretty' -d '

{

    "script" : {

        "source": "ctx._source.age += 4"

    }

}'

3. _update_by_query根据条件更新 Elasticsearch Reference [6.2] » Document APIs » Update By Query API

在不更新源文件的情况下根据index更新文档，也可以用于新增字段属性

_update_by_query支持script,如果script和doc同时存在会忽略doc

_update_by_query在执时候快照内部索引，当文档生成快照是document正则变更的话将会发生版本冲突，否则的话会更新版本号

0不是一个有效的版本号，因此版本号为0不支持_update_by_query更新

所有的更新和查询失败都会终止，如果只是想进行简单的类似计数器类的功能可以在请求参数里加conflicts=proceed重新尝试更新

URL参数

除了标准的pretty参数外，Update_By_Query还可以支持refresh, wait_for_completion, wait_for_active_shards, timeout and scroll

refresh：URL发送refresh参数会在update完之后更新所有分片的索引,与Index API中的refresh不一样的是只会接受新数据进行索引

wait_for_completion：如果请求中包含wait_for_completion=false，则会进行与检查启动request返回一个task，可以被Index API取消或者查看状态

wait_for_active_shards：控制在处理请求之前必须激活多少个分片副本

timeout：设置分片的从不可用变成可用的时间，

scroll：由于Update_By_Query会进行所有上下文检索，默认时间是5分钟，实例 ?scroll=10m 修改为10分钟

requests_per_second：设置一个正整数，控制等待时间内每个批次操作索引的数量

完整实例：

GET stu/_update_by_query?pretty&conflicts=proceed&refresh=true&timeout=1s

{

  "script": {

   "source": "ctx._source.name=\"lisi2\";ctx._source.bugs=10"

  },

  "query": {

    "bool": {

      "filter": [

        {"term": { "id": "111" } }

      ]

    }

  }

}

下面错误示范：无法更新bugs。

curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/_update_by_query?conflicts=proceed&pretty' -d '

{

  "script": {

    "source": "ctx._source.age++",

    "bugs": 10

  },

  "query": {

    "bool":{

      "filter": [

        {"term":{ "id": "111" }},

        {"term":{ "name": "lisi" }}

      ]

    }

  }

}'

八、查询数据

查询部分请参考 elasticsearch入门使用（三） Query DSL