Elasticsearch: Beware the Performance Cost of inner hits

1. Introduction to inner hits

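Elasticsearch provides the nested data type for documents that contain sub-documents. It solves the problem of sub-document fields being split apart and flattened, which destroys the association between fields that belong to the same sub-document.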
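The flattening problem can be illustrated with a small simulation. This is only a sketch in plain Python, not Elasticsearch code; the field names `ip` and `host` and all helper names are hypothetical.

```python
# Illustrative only: mimics how an array of objects behaves when its fields
# are flattened (plain object mapping) versus kept together (nested mapping).
# The field names "ip" and "host" are made up for this example.

def flatten_object_array(field, items):
    """Collapse a list of sub-objects into one value list per field."""
    flat = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def flat_match(flat, conditions):
    """Flattened matching: every condition is checked independently."""
    return all(value in flat.get(key, []) for key, value in conditions.items())


def nested_match(field, items, conditions):
    """Nested matching: all conditions must hold inside ONE sub-object."""
    prefix_len = len(field) + 1  # strip "content." from the condition keys
    return any(
        all(item.get(key[prefix_len:]) == value for key, value in conditions.items())
        for item in items
    )


rows = [
    {"ip": "192.168.1.1", "host": "web-01"},
    {"ip": "10.0.0.7", "host": "db-01"},
]
flat = flatten_object_array("content", rows)
wanted = {"content.ip": "192.168.1.1", "content.host": "db-01"}

print(flat_match(flat, wanted))               # True: a false positive
print(nested_match("content", rows, wanted))  # False: no single row has both
```

The flattened form matches because the IP and the host each appear somewhere in the document, even though no single row contains both.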

Inner hits complement nested queries: when a query matches on sub-documents, inner hits make it easy to control which of the matching sub-documents are returned.

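2. Data description

For the data structure and index setup, see the post "elasticsearch支持大table格式数据的搜索" (Elasticsearch support for searching large table-format data).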

3. The problem

A simple search by IP matches only one root document and returns ten sub-documents with highlighting.

Query:

{
  "_source": {
    "excludes": [
      "content"
    ]
  },
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10,
            "highlight": {
              "fields": {
                "*": {}
              },
              "fragment_size": 1000
            }
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

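The request takes 3111 ms. It matches a single document and returns only ten highlighted sub-documents, so it should not take anywhere near that long.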

{
    "took":3111,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
        ]
    }
}

4. Locating the problem

Run the following request, which uses the profile API to see how long the query itself takes:



{
  "profile": true,
  "_source": {
    "excludes": [
      "content"
    ]
  },
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10,
            "highlight": {
              "fields": {
                "*": {}
              },
              "fragment_size": 1000
            }
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

The profile section shows that the whole search takes less than 20 ms, so the query itself is clearly not the cause.

{
    "took":2859,
    "timed_out":false,
    "profile":{
        "shards":[
            {
                "searches":[
                    {
                        "query":[
                            {
                                "type":"BooleanQuery",
                                "time":"9.9ms",
                                "time_in_nanos":9945310,
                                "breakdown":{
                                    "score":9349172,
                                    "build_scorer_count":6,
                                    "match_count":0,
                                    "create_weight":398951,
                                    "next_doc":1262,
                                    "match":0,
                                    "create_weight_count":1,
                                    "next_doc_count":1,
                                    "score_count":1,
                                    "build_scorer":176010,
                                    "advance":19905,
                                    "advance_count":1
                                }
                            }
                        ],
                        "rewrite_time":41647,
                        "collector":[
                            {
                                "name":"CancellableCollector",
                                "reason":"search_cancelled",
                                "time":"9.3ms",
                                "time_in_nanos":9376796,
                                "children":[
                                    {
                                        "name":"SimpleTopScoreDocCollector",
                                        "reason":"search_top_hits",
                                        "time":"9.3ms",
                                        "time_in_nanos":9355874
                                    }
                                ]
                            }
                        ]
                    }
                ],
                "aggregations":[
                ]
            }
        ]
    }
}
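The comparison between profiled time and `took` can be made explicit with a few lines of Python. This is a rough sketch: the helper name `profiled_nanos` is hypothetical, and simply adding query, rewrite and collector time is an approximation of the time the profile accounts for, not an official formula.

```python
def profiled_nanos(response):
    """Sum query, rewrite and top-level collector time over all profiled shards."""
    total = 0
    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            total += sum(q["time_in_nanos"] for q in search["query"])
            total += search["rewrite_time"]
            # Only top-level collectors are summed; children are already included.
            total += sum(c["time_in_nanos"] for c in search["collector"])
    return total


# Trimmed copy of the profile response shown above (breakdown and children omitted).
response = {
    "took": 2859,
    "profile": {"shards": [{"searches": [{
        "query": [{"type": "BooleanQuery", "time_in_nanos": 9945310}],
        "rewrite_time": 41647,
        "collector": [{"name": "CancellableCollector", "time_in_nanos": 9376796}],
    }]}]},
}

query_ms = profiled_nanos(response) / 1_000_000
unexplained_ms = response["took"] - query_ms
print(f"profiled: {query_ms:.1f} ms, outside the profile: {unexplained_ms:.1f} ms")
# profiled: 19.4 ms, outside the profile: 2839.6 ms
```

Almost all of the 2859 ms falls outside what the profile reports, which points at work done after the matching documents have been found.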

Is highlighting the problem?

Remove the highlight section from the request and run the following query:

{
  "_source": {
    "excludes": [
      "content"
    ]
  },
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

The execution time barely changes:

{
    "took":3117,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                "inner_hits":{
                    "content":{
                        "hits":{
                            "total":400000,
                            "max_score":0.001722915,
                            "hits":[
                            ]
                        }
                    }
                }
            }
        ]
    }
}

That leaves only the returned documents as a possible cause.

Disable returning the root document's source and run the following query:

{
  "_source": false,
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

The time still barely changes:

{
    "took":2915,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                "inner_hits":{
                    "content":{
                        "hits":{
                            "total":400000,
                            "max_score":0.001722915,
                            "hits":[
                            ]
                        }
                    }
                }
            }
        ]
    }
}

Change the query so that no sub-documents are returned (inner hits size set to 0) and run the following:

{
  "_source": false,
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 0
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

Now the request completes in 10 ms:

{
    "took":10,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                "_type":"_doc",
                "_score":0.001722915,
                "inner_hits":{
                    "content":{
                        "hits":{
                            "total":400000,
                            "max_score":0,
                            "hits":[
                            ]
                        }
                    }
                }
            }
        ]
    }
}

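5. Root cause analysis

The experiments above show that returning the ten sub-documents is what inflates the execution time. Intuitively, simply returning ten small documents should not take this long.

Inner hits offers from and size to control how many sub-documents are returned, and it is tempting to use them the way you would in an ordinary search. Here, however, size defaults to 3, and from + size must not exceed 100 (the default of index.max_inner_result_window). Exceeding the limit produces the following error: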

{
    "type":"illegal_argument_exception",
    "reason":"Inner result window is too large, the inner hit definition's [null]'s from + size must be less than or equal to: [100] but was [101]. This limit can be set by changing the [index.max_inner_result_window] index level setting."
}
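A client can check the window before sending the request. The following is a minimal sketch; the helper name `check_inner_window` is hypothetical, and the default of 100 is taken from the error message above.

```python
# Default limit as reported by the error message above; an index may override
# it through the index.max_inner_result_window setting.
DEFAULT_MAX_INNER_RESULT_WINDOW = 100


def check_inner_window(from_, size, limit=DEFAULT_MAX_INNER_RESULT_WINDOW):
    """Return True when from + size fits inside the inner result window."""
    if from_ < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    return from_ + size <= limit


print(check_inner_window(0, 100))  # True: exactly at the limit
print(check_inner_window(1, 100))  # False: 101 exceeds the limit
```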

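The existence of this limit suggests that inner hits are not cheap, and that the cost comes from how nested types are stored and how inner hits are implemented. The root document and all of its sub-documents are stored together in the root document's _source field, so returning a sub-document requires loading and parsing the root document's _source and then locating the sub-document inside it. As the responses above show, only one root document matched, but it has 400,000 sub-documents. That many sub-documents inevitably make the _source very large, which is what drives the execution time up. The official documentation describes this as follows: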

Nested documents don't have a _source field, because the entire source of document is stored with the root document under its _source field. To include the source of just the nested document, the source of the root document is parsed and just the relevant bit for the nested document is included as source in the inner hit. Doing this for each matching nested document has an impact on the time it takes to execute the entire search request, especially when size and the inner hits' size are set higher than the default. To avoid the relatively expensive source extraction for nested inner hits, one can disable including the source and solely rely on doc values fields.
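The effect described in the quote can be imitated outside Elasticsearch. The sketch below is a simulation under an assumption, namely that the whole root source is parsed once for every inner hit; the actual implementation may differ, and the field names and helper names are hypothetical.

```python
import json
import time


def build_root_source(n_children):
    """Serialize a root document holding n_children nested rows (made-up fields)."""
    rows = [{"ip": f"192.168.{i // 256 % 256}.{i % 256}", "seq": i} for i in range(n_children)]
    return json.dumps({"title": "table", "content": rows})


def extract_inner_hits(root_source, offsets):
    """Return the nested rows at the given offsets.

    Assumption: the full root source is parsed once per requested inner hit,
    so the cost per hit grows with the size of the root document.
    """
    hits = []
    for offset in offsets:
        parsed = json.loads(root_source)        # whole document is parsed
        hits.append(parsed["content"][offset])  # only one row is kept
    return hits


def timed_extract(n_children, offsets):
    """Build a root source, extract the rows, and report the elapsed seconds."""
    source = build_root_source(n_children)
    start = time.perf_counter()
    hits = extract_inner_hits(source, offsets)
    return hits, time.perf_counter() - start


small_hits, small_s = timed_extract(1_000, range(10))
large_hits, large_s = timed_extract(100_000, range(10))
print(f"1k children: {small_s * 1000:.1f} ms, 100k children: {large_s * 1000:.1f} ms")
```

Both calls return the same ten small rows, yet the second one has to work through a source that is roughly a hundred times larger for each of them.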

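6. Solutions

A single document is stored on a single shard, so adding shards cannot speed up this query.
The documentation suggests disabling _source and relying on doc values fields, but in testing this brought essentially no improvement in query time.
Reducing the number of returned sub-documents lowers the query time significantly; for example, the response below returns 3: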

{
    "took":967,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                "_type":"_doc",
                "_score":0.001722915,
                "inner_hits":{
                    "content":{
                        "hits":{
                            "total":100008,
                            "max_score":0.001722915
                        }
                    }
                }
            }
        ]
    }
}
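The request bodies used throughout this post differ only in their inner_hits section, so they can be generated from one helper when repeating the experiments. This is a sketch: `build_nested_search` is a hypothetical name, and `content.ip` in the usage lines is an assumed field name, since the real mapping is not shown here.

```python
import json


def build_nested_search(ip_prefix, inner_size=3, inner_source=True, docvalue_fields=None):
    """Build the nested query from this post with a configurable inner_hits section."""
    inner_hits = {"from": 0, "size": inner_size}
    if not inner_source:
        inner_hits["_source"] = False  # skip source extraction for inner hits
    if docvalue_fields:
        inner_hits["docvalue_fields"] = list(docvalue_fields)
    return {
        "_source": {"excludes": ["content"]},
        "query": {"bool": {"should": {"nested": {
            "path": "content",
            "query": {"query_string": {"query": f"{ip_prefix}*", "fields": ["content.*"]}},
            "inner_hits": inner_hits,
            "score_mode": "avg",
            "ignore_unmapped": True,
        }}}},
        "size": 20,
        "timeout": "20s",
    }


# Variant with only three inner hits.
few_hits = build_nested_search("192.168.1.1", inner_size=3)

# Variant without inner-hit source, relying on doc values ("content.ip" is assumed).
doc_values_only = build_nested_search(
    "192.168.1.1", inner_size=10, inner_source=False, docvalue_fields=["content.ip"]
)
print(json.dumps(few_hits["query"]["bool"]["should"]["nested"]["inner_hits"]))
```

Sending each variant and comparing the took values reproduces the comparison above; the second variant corresponds to the doc values approach that showed little benefit in the tests described here.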
