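Using Elasticsearch as a Database: Sorting After Aggregation

With https://github.com/taowen/es-monitor you can query Elasticsearch using SQL. A bucket aggregation sometimes produces a large number of buckets, of which only a few are of interest. The simplest approach is to sort the buckets and keep only the first few.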

ORDER BY _term

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
SELECT ipo_year, COUNT(*) FROM symbol GROUP BY ipo_year ORDER BY ipo_year LIMIT 2
EOF
{"COUNT(*)": 4, "ipo_year": 1972}
{"COUNT(*)": 1, "ipo_year": 1973}

Elasticsearch

{
  "aggs": {
    "ipo_year": {
      "terms": {
        "field": "ipo_year",
        "order": [
          {
            "_term": "asc"
          }
        ],
        "size": 2
      },
      "aggs": {}
    }
  },
  "size": 0
}

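Because ipo_year is the GROUP BY field, ordering by it is expressed as _term, which refers to the bucket key. (In Elasticsearch 6.0 and later, _term is deprecated in favor of _key.)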

{
  "hits": {
    "hits": [],
    "total": 6714,
    "max_score": 0.0
  },
  "_shards": {
    "successful": 1,
    "failed": 0,
    "total": 1
  },
  "took": 3,
  "aggregations": {
    "ipo_year": {
      "buckets": [
        {
          "key": 1972,
          "doc_count": 4
        },
        {
          "key": 1973,
          "doc_count": 1
        }
      ],
      "sum_other_doc_count": 2893,
      "doc_count_error_upper_bound": 0
    }
  },
  "timed_out": false
}
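
The mapping from ORDER BY and LIMIT onto the terms aggregation shown above can be sketched in Python. This is only an illustration: the helper name terms_agg_ordered and its parameters are hypothetical and are not taken from es-monitor, whose actual translation code may look quite different.

```python
import json


def terms_agg_ordered(field, order_key="_term", direction="asc", limit=10):
    """Build a terms-aggregation request body (hypothetical helper).

    order_key is "_term" to sort by the bucket key (the GROUP BY field)
    or "_count" to sort by the number of documents in each bucket.
    """
    return {
        "aggs": {
            field: {
                "terms": {
                    "field": field,
                    "order": [{order_key: direction}],  # ORDER BY
                    "size": limit,                      # LIMIT
                },
                "aggs": {},
            }
        },
        "size": 0,  # return no hits, only the aggregation result
    }


# SELECT ipo_year, COUNT(*) FROM symbol
# GROUP BY ipo_year ORDER BY ipo_year LIMIT 2
request = terms_agg_ordered("ipo_year", "_term", "asc", 2)
print(json.dumps(request, indent=2))
```

Passing "_count" instead of "_term" as the order key gives the request used in the next section.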

ORDER BY _count

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
SELECT ipo_year, COUNT(*) AS ipo_count FROM symbol GROUP BY ipo_year ORDER BY ipo_count LIMIT 2
EOF
{"ipo_count": 1, "ipo_year": 1973}
{"ipo_count": 2, "ipo_year": 1980}

Elasticsearch

{
  "aggs": {
    "ipo_year": {
      "terms": {
        "field": "ipo_year",
        "order": [
          {
            "_count": "asc"
          }
        ],
        "size": 2
      },
      "aggs": {}
    }
  },
  "size": 0
}

{
  "hits": {
    "hits": [],
    "total": 6714,
    "max_score": 0.0
  },
  "_shards": {
    "successful": 1,
    "failed": 0,
    "total": 1
  },
  "took": 2,
  "aggregations": {
    "ipo_year": {
      "buckets": [
        {
          "key": 1973,
          "doc_count": 1
        },
        {
          "key": 1980,
          "doc_count": 2
        }
      ],
      "sum_other_doc_count": 2895,
      "doc_count_error_upper_bound": -1
    }
  },
  "timed_out": false
}
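
To show how a response like the one above turns into the SQL-style rows printed by the tool, here is a minimal sketch. The function buckets_to_rows is a hypothetical name introduced for illustration, and the response dict is trimmed to the aggregations part of the output shown above; es-monitor itself may do this differently.

```python
def buckets_to_rows(response, agg_name, count_alias="COUNT(*)"):
    """Flatten terms-aggregation buckets into one row per bucket.

    Hypothetical helper: the bucket key goes under agg_name and
    doc_count goes under count_alias.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return [
        {agg_name: bucket["key"], count_alias: bucket["doc_count"]}
        for bucket in buckets
    ]


# Aggregations part of the response for ORDER BY ipo_count LIMIT 2.
response = {
    "aggregations": {
        "ipo_year": {
            "buckets": [
                {"key": 1973, "doc_count": 1},
                {"key": 1980, "doc_count": 2},
            ],
            "sum_other_doc_count": 2895,
            "doc_count_error_upper_bound": -1,
        }
    }
}

rows = buckets_to_rows(response, "ipo_year", count_alias="ipo_count")
for row in rows:
    print(row)
```

The buckets already arrive in the requested order, so the client only needs to rename fields, not sort again.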

ORDER BY a metric

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
    SELECT ipo_year, MAX(market_cap) AS max_market_cap FROM symbol
    GROUP BY ipo_year ORDER BY max_market_cap LIMIT 2
EOF
{"max_market_cap": 826830000.0, "ipo_year": 1982}
{"max_market_cap": 847180000.0, "ipo_year": 2016}

Elasticsearch

{
  "aggs": {
    "ipo_year": {
      "terms": {
        "field": "ipo_year",
        "order": [
          {
            "max_market_cap": "asc"
          }
        ],
        "size": 2
      },
      "aggs": {
        "max_market_cap": {
          "max": {
            "field": "market_cap"
          }
        }
      }
    }
  },
  "size": 0
}

{
  "hits": {
    "hits": [],
    "total": 6714,
    "max_score": 0.0
  },
  "_shards": {
    "successful": 1,
    "failed": 0,
    "total": 1
  },
  "took": 20,
  "aggregations": {
    "ipo_year": {
      "buckets": [
        {
          "max_market_cap": {
            "value": 826830000.0
          },
          "key": 1982,
          "doc_count": 4
        },
        {
          "max_market_cap": {
            "value": 847180000.0
          },
          "key": 2016,
          "doc_count": 6
        }
      ],
      "sum_other_doc_count": 2888,
      "doc_count_error_upper_bound": -1
    }
  },
  "timed_out": false
}
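
When ordering by a metric, the key inside "order" has to match the name of a sub-aggregation defined under the same terms aggregation, as in the request above. A minimal sketch of building such a request follows; terms_agg_ordered_by_metric and its parameter names are hypothetical and only illustrate the shape of the request.

```python
def terms_agg_ordered_by_metric(field, metric_name, metric_type,
                                metric_field, direction="asc", limit=10):
    """Build a terms aggregation ordered by a metric sub-aggregation.

    Hypothetical helper: metric_name is used both as the name of the
    sub-aggregation and as the order key, so the two always agree.
    """
    return {
        "aggs": {
            field: {
                "terms": {
                    "field": field,
                    "order": [{metric_name: direction}],
                    "size": limit,
                },
                "aggs": {
                    metric_name: {metric_type: {"field": metric_field}},
                },
            }
        },
        "size": 0,
    }


# SELECT ipo_year, MAX(market_cap) AS max_market_cap FROM symbol
# GROUP BY ipo_year ORDER BY max_market_cap LIMIT 2
request = terms_agg_ordered_by_metric(
    "ipo_year", "max_market_cap", "max", "market_cap", limit=2
)
terms = request["aggs"]["ipo_year"]["terms"]
sub_aggs = request["aggs"]["ipo_year"]["aggs"]
print(terms["order"], list(sub_aggs))
```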

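HISTOGRAM and ORDER BY

Besides the terms aggregation, other aggregations also support ordering, but the support is incomplete. For example, the histogram aggregation accepts an order parameter but not size (so ORDER BY works but LIMIT does not). The Elasticsearch team has planned a generic way to support LIMIT, but it has not been implemented yet: https://github.com/elastic/elasticsearch/issues/14928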

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
    SELECT ipo_year_range, MAX(market_cap) AS max_market_cap FROM symbol
    GROUP BY histogram(ipo_year, 10) AS ipo_year_range ORDER BY ipo_year_range
EOF
{"ipo_year_range": 1970, "max_market_cap": 18370000000.0}
{"ipo_year_range": 1980, "max_market_cap": 522690000000.0}
{"ipo_year_range": 1990, "max_market_cap": 230940000000.0}
{"ipo_year_range": 2000, "max_market_cap": 470490000000.0}
{"ipo_year_range": 2010, "max_market_cap": 287470000000.0}

Elasticsearch

{
  "aggs": {
    "ipo_year_range": {
      "aggs": {
        "max_market_cap": {
          "max": {
            "field": "market_cap"
          }
        }
      },
      "histogram": {
        "field": "ipo_year",
        "interval": 10,
        "order": {
          "_key": "asc"
        }
      }
    }
  },
  "size": 0
}

{
  "hits": {
    "hits": [],
    "total": 6714,
    "max_score": 0.0
  },
  "_shards": {
    "successful": 1,
    "failed": 0,
    "total": 1
  },
  "took": 2,
  "aggregations": {
    "ipo_year_range": {
      "buckets": [
        {
          "max_market_cap": {
            "value": 18370000000.0
          },
          "key": 1970,
          "doc_count": 5
        },
        {
          "max_market_cap": {
            "value": 522690000000.0
          },
          "key": 1980,
          "doc_count": 155
        },
        {
          "max_market_cap": {
            "value": 230940000000.0
          },
          "key": 1990,
          "doc_count": 598
        },
        {
          "max_market_cap": {
            "value": 470490000000.0
          },
          "key": 2000,
          "doc_count": 745
        },
        {
          "max_market_cap": {
            "value": 287470000000.0
          },
          "key": 2010,
          "doc_count": 1395
        }
      ]
    }
  },
  "timed_out": false
}
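
Because the histogram aggregation returns every bucket, one possible workaround for the missing LIMIT is to trim the buckets on the client. The sketch below is an assumption about how that could be done, not something es-monitor is known to do; limit_histogram_buckets is a hypothetical name, and the bucket data is copied from the response above (doc_count omitted for brevity).

```python
def limit_histogram_buckets(response, agg_name, metric_name, limit,
                            descending=False):
    """Sort histogram buckets by a metric on the client, then keep `limit`.

    Hypothetical helper that emulates ORDER BY <metric> LIMIT <n> for an
    aggregation that does not accept a size parameter.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    ordered = sorted(
        buckets,
        key=lambda bucket: bucket[metric_name]["value"],
        reverse=descending,
    )
    return [
        {agg_name: b["key"], metric_name: b[metric_name]["value"]}
        for b in ordered[:limit]
    ]


response = {
    "aggregations": {
        "ipo_year_range": {
            "buckets": [
                {"key": 1970, "max_market_cap": {"value": 18370000000.0}},
                {"key": 1980, "max_market_cap": {"value": 522690000000.0}},
                {"key": 1990, "max_market_cap": {"value": 230940000000.0}},
                {"key": 2000, "max_market_cap": {"value": 470490000000.0}},
                {"key": 2010, "max_market_cap": {"value": 287470000000.0}},
            ]
        }
    }
}

# Emulates: ORDER BY max_market_cap DESC LIMIT 2
top = limit_histogram_buckets(
    response, "ipo_year_range", "max_market_cap", limit=2, descending=True
)
for row in top:
    print(row)
```

This only works well when the number of histogram buckets is small, since all of them still travel over the network before being discarded.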
