Implementing SQL Queries and Statistics with Solr (Repost)
Original article: http://shiyanjun.cn/archives/78.html
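Cloudera has released Impala, a query and analytics tool built on the Hadoop platform. Anyone familiar with SQL can use Impala to run queries and analyses, although Impala's SQL differs subtly from the SQL of relational databases.
Below, we design a table and use its data to map SQL query and statistics statements onto equivalent Solr queries. The translation process is quite interesting, and it shows off some of Solr's better features.
The structure of the example table is shown in the figure:
For each case we give a SQL statement, translate it into a Solr query, and then compare the results.
Query comparison
- Combined-condition query
SQL query: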
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
FROM v_i_event
WHERE prov_id = 1 AND net_type = 1 AND area_id = 10304 AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
ORDER BY log_id LIMIT 10;
The SQL query result is shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1 AND net_type:1 AND area_id:10304 AND time_type:1 AND time_id:[20130801 TO 20130815]&sort=log_id asc&start=0&rows=10
The Solr result is as follows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
</lst>
<result name="response" numFound="77" start="0">
<doc>
<int name="log_id">6827</int>
<long name="start_time">1375072117</long>
<long name="end_time">1375081683</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">11002</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6827</int>
<long name="start_time">1375072117</long>
<long name="end_time">1375081683</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">11000</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">14001</int>
<int name="cnt">5</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">11002</int>
<int name="cnt">23</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">10200</int>
<int name="cnt">55</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">14000</int>
<int name="cnt">4</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">11000</int>
<int name="cnt">1</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">10201</int>
<int name="cnt">31</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">8002</int>
<int name="cnt">8</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6851</int>
<long name="start_time">1375142158</long>
<long name="end_time">1375146391</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10304</int>
<int name="idt_id">8000</int>
<int name="cnt">30</int>
<int name="net_type">1</int>
</doc>
</result>
</response>
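Comparing the two results, they are the same except for the order of idt_id (ascending in Impala, descending in Solr). Both queries sort only by log_id, so neither system guarantees the relative order of rows that share a log_id.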
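The URL above is written in readable form: it contains spaces and characters such as `[`, `]`, and `:` that should be percent-encoded before the request is sent. The following is a minimal Python sketch of one way to assemble it, assuming the same host, core, and field names as the example; the helper name `build_select_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL copied from the example above; adjust it to your own deployment.
SOLR_SELECT = "http://slave1:8888/solr-cloud/i_event/select"

def build_select_url(filters, fields, sort, start=0, rows=10):
    """Build a percent-encoded Solr select URL from SQL-like pieces."""
    params = [
        ("q", "*:*"),                   # match everything; filtering happens in fq
        ("fl", ",".join(fields)),       # SELECT column list
        ("fq", " AND ".join(filters)),  # WHERE clause
        ("sort", sort),                 # ORDER BY
        ("start", start),               # offset of the first row
        ("rows", rows),                 # LIMIT
    ]
    return SOLR_SELECT + "?" + urlencode(params)

url = build_select_url(
    filters=["prov_id:1", "net_type:1", "area_id:10304", "time_type:1",
             "time_id:[20130801 TO 20130815]"],
    fields=["log_id", "start_time", "end_time", "prov_id", "city_id",
            "area_id", "idt_id", "cnt", "net_type"],
    sort="log_id asc",
)
print(url)
```

Keeping each SQL clause as a separate argument makes the correspondence explicit: `fl` plays the role of the column list, `fq` of WHERE, `sort` of ORDER BY, and `start`/`rows` of LIMIT.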
- Grouped statistics on a single field
SQL query:
SELECT prov_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
FROM v_i_event
GROUP BY prov_id;
The SQL query result is shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&rows=0&indent=true
The Solr result is as follows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2</int>
</lst>
<result name="response" numFound="4088" start="0"></result>
<lst name="stats">
<lst name="stats_fields">
<lst name="cnt">
<double name="min">0.0</double>
<double name="max">1258.0</double>
<long name="count">4088</long>
<long name="missing">0</long>
<double name="sum">32587.0</double>
<double name="sumOfSquares">9170559.0</double>
<double name="mean">7.971379647749511</double>
<double name="stddev">46.69344567709268</double>
<lst name="facets" />
</lst>
</lst>
</lst>
</response>
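Comparing the results, Solr returns additional statistics such as the standard deviation (stddev), and the values it shares with the SQL result agree. Note that the URL above carries no grouping parameter, so these figures cover all 4088 matching documents rather than one row per prov_id.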
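A per-field breakdown, as GROUP BY prov_id asks for, can be requested with the `f.<field>.stats.facet` parameter that this article uses in a later example. The sketch below only builds the query string; the helper name `stats_params` is hypothetical, and whether the parameter is available depends on the Solr version in use.

```python
from urllib.parse import urlencode

def stats_params(stat_field, group_fields=(), filters=()):
    """Return Solr query parameters for aggregate statistics on stat_field.

    Each entry in group_fields adds an independent single-field breakdown,
    roughly one GROUP BY <field> per entry.
    """
    params = [("q", "*:*"), ("stats", "true"), ("stats.field", stat_field)]
    for field in group_fields:
        params.append((f"f.{stat_field}.stats.facet", field))
    for fq in filters:
        params.append(("fq", fq))
    # rows=0: only the statistics are needed, not the documents themselves
    params += [("rows", 0), ("indent", "true")]
    return params

query = urlencode(stats_params("cnt", group_fields=["prov_id"]))
print(query)
```

In the response, the per-group figures would then appear under the `facets` element, with `sum`, `mean`, `max`, `min`, and `count` corresponding to SUM, AVG, MAX, MIN, and COUNT.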
- IN-condition query
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
FROM v_i_event
WHERE prov_id = 1 AND net_type = 1 AND city_id IN (106,103) AND idt_id IN (12011,5004,6051,6056,8002) AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
ORDER BY log_id, start_time DESC LIMIT 10;
The SQL query result is shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1 AND net_type:1 AND (city_id:106 OR city_id:103) AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND time_id:[20130801 TO 20130815]&sort=log_id asc,start_time desc&start=0&rows=10
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1&fq=net_type:1&fq=(city_id:106 OR city_id:103)&fq=(idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002)&fq=time_type:1&fq=time_id:[20130801 TO 20130815]&sort=log_id asc,start_time desc&start=0&rows=10
The Solr result is as follows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">6</int>
</lst>
<result name="response" numFound="63" start="0">
<doc>
<int name="log_id">6553</int>
<long name="start_time">1374054184</long>
<long name="end_time">1374054254</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">12011</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6553</int>
<long name="start_time">1374054184</long>
<long name="end_time">1374054254</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">5004</int>
<int name="cnt">2</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6555</int>
<long name="start_time">1374055060</long>
<long name="end_time">1374055158</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">70104</int>
<int name="idt_id">5004</int>
<int name="cnt">3</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6555</int>
<long name="start_time">1374055060</long>
<long name="end_time">1374055158</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">70104</int>
<int name="idt_id">12011</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6595</int>
<long name="start_time">1374292508</long>
<long name="end_time">1374292639</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">5004</int>
<int name="cnt">4</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6611</int>
<long name="start_time">1374461233</long>
<long name="end_time">1374461245</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">5004</int>
<int name="cnt">1</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6612</int>
<long name="start_time">1374461261</long>
<long name="end_time">1374461269</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">5004</int>
<int name="cnt">1</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6612</int>
<long name="start_time">1374461261</long>
<long name="end_time">1374461269</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">12011</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6613</int>
<long name="start_time">1374461422</long>
<long name="end_time">1374461489</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">6056</int>
<int name="cnt">1</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6613</int>
<long name="start_time">1374461422</long>
<long name="end_time">1374461489</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">6051</int>
<int name="cnt">1</int>
<int name="net_type">1</int>
</doc>
</result>
</response>
The two results are consistent.
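The translation of IN lists and inclusive ranges used above is mechanical, so it can be captured in two small helpers. This is a sketch only; the function names `in_clause` and `between_clause` are hypothetical, and values are inserted as-is, which is adequate for the integer fields in this example.

```python
def in_clause(field, values):
    """Translate SQL `field IN (v1, v2, ...)` into a parenthesized Solr OR group."""
    if not values:
        raise ValueError("IN list must not be empty")
    return "(" + " OR ".join(f"{field}:{v}" for v in values) + ")"

def between_clause(field, low, high):
    """Translate SQL `field >= low AND field <= high` into an inclusive range."""
    return f"{field}:[{low} TO {high}]"

filters = [
    "prov_id:1",
    "net_type:1",
    in_clause("city_id", [106, 103]),
    in_clause("idt_id", [12011, 5004, 6051, 6056, 8002]),
    "time_type:1",
    between_clause("time_id", 20130801, 20130815),
]

# First URL form: a single fq with every condition ANDed together.
single_fq = " AND ".join(filters)
# Second URL form: one fq parameter per condition.
multiple_fq = [("fq", f) for f in filters]
```

Both forms select the same documents. With separate `fq` parameters, Solr can cache each filter independently, which may help when individual conditions recur across queries.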
- Range query with an open (exclusive) bound
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
FROM v_i_event
WHERE net_type = 1 AND idt_id IN (12011,5004,6051,6056,8002) AND time_type = 1 AND start_time >= 1373598465 AND end_time < 1374055254
ORDER BY log_id, start_time, idt_id DESC LIMIT 30;
The SQL query result is shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254] AND -start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1&fq=idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002&fq=time_type:1&fq=start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
The Solr result is as follows. Note that all three URLs apply both bounds to start_time, whereas the SQL statement places the upper bound on end_time:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">5</int>
</lst>
<result name="response" numFound="4" start="0">
<doc>
<int name="log_id">6553</int>
<long name="start_time">1374054184</long>
<long name="end_time">1374054254</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">12011</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6553</int>
<long name="start_time">1374054184</long>
<long name="end_time">1374054254</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">10307</int>
<int name="idt_id">5004</int>
<int name="cnt">2</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6555</int>
<long name="start_time">1374055060</long>
<long name="end_time">1374055158</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">70104</int>
<int name="idt_id">12011</int>
<int name="cnt">0</int>
<int name="net_type">1</int>
</doc>
<doc>
<int name="log_id">6555</int>
<long name="start_time">1374055060</long>
<long name="end_time">1374055158</long>
<int name="prov_id">1</int>
<int name="city_id">103</int>
<int name="area_id">70104</int>
<int name="idt_id">5004</int>
<int name="cnt">3</int>
<int name="net_type">1</int>
</doc>
</result>
</response>
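The URLs above express "greater than or equal to the lower bound and strictly less than the upper bound" as an inclusive range plus a negative filter that removes the upper endpoint. The sketch below generates that pair of filters, together with a single-clause variant using mixed brackets. The function names are hypothetical, and the mixed-bracket form assumes a Solr version whose query parser accepts `[low TO high}` (4.0 or later, to the best of my knowledge).

```python
def half_open_range(field, low, high):
    """Filters for `field >= low AND field < high`, in the style used above:
    an inclusive range plus a negative filter that drops the upper endpoint."""
    return [f"{field}:[{low} TO {high}]", f"-{field}:{high}"]

def half_open_range_mixed(field, low, high):
    """The same condition as one clause: `[` is inclusive, `}` is exclusive.
    Assumption: the target Solr version supports mixed range brackets."""
    return f"{field}:[{low} TO {high}}}"

fqs = half_open_range("start_time", 1373598465, 1374055254)
one_clause = half_open_range_mixed("start_time", 1373598465, 1374055254)
```

The two-filter form matches what the article does; the one-clause form avoids the extra negative filter where it is supported.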
- Grouped statistics on multiple fields (count only)
SQL query:
SELECT city_id, area_id, COUNT(cnt) AS count_cnt
FROM v_i_event
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id, area_id;
The SQL query result is shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&facet=true&facet.pivot=city_id,area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true
The Solr result is as follows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">72</int>
</lst>
<result name="response" numFound="1171" start="0"></result>
<lst name="facet_counts">
<lst name="facet_queries" />
<lst name="facet_fields" />
<lst name="facet_dates" />
<lst name="facet_ranges" />
<lst name="facet_pivot">
<arr name="city_id,area_id">
<lst>
<str name="field">city_id</str>
<int name="value">103</int>
<int name="count">678</int>
<arr name="pivot">
<lst>
<str name="field">area_id</str>
<int name="value">10307</int>
<int name="count">298</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10315</int>
<int name="count">120</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10317</int>
<int name="count">86</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10304</int>
<int name="count">67</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10310</int>
<int name="count">49</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">70104</int>
<int name="count">48</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10308</int>
<int name="count">6</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">0</int>
<int name="count">2</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10311</int>
<int name="count">2</int>
</lst>
</arr>
</lst>
<lst>
<str name="field">city_id</str>
<int name="value">0</int>
<int name="count">463</int>
<arr name="pivot">
<lst>
<str name="field">area_id</str>
<int name="value">0</int>
<int name="count">395</int>
</lst>
<lst>
<str name="field">area_id</str>
<int name="value">10307</int>
<int name="count">68</int>
</lst>
</arr>
</lst>
<lst>
<str name="field">city_id</str>
<int name="value">106</int>
<int name="count">10</int>
<arr name="pivot">
<lst>
<str name="field">area_id</str>
<int name="value">10304</int>
<int name="count">10</int>
</lst>
</arr>
</lst>
<lst>
<str name="field">city_id</str>
<int name="value">110</int>
<int name="count">8</int>
<arr name="pivot">
<lst>
<str name="field">area_id</str>
<int name="value">0</int>
<int name="count">8</int>
</lst>
</arr>
</lst>
<lst>
<str name="field">city_id</str>
<int name="value">118</int>
<int name="count">8</int>
<arr name="pivot">
<lst>
<str name="field">area_id</str>
<int name="value">10316</int>
<int name="count">8</int>
</lst>
</arr>
</lst>
<lst>
<str name="field">city_id</str>
<int name="value">105</int>
<int name="count">4</int>
<arr name="pivot">
<lst>
<str name="field">area_id</str>
<int name="value">0</int>
<int name="count">4</int>
</lst>
</arr>
</lst>
</arr>
</lst>
</lst>
</response>
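Comparing the results: the Solr output has to be merged across the nested groups above to obtain the final count for each (city_id, area_id) pair, and once merged it matches the SQL result.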
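The merge step can be done on the client by walking the nested pivot structure and emitting one row per (city_id, area_id) pair, which is the shape the SQL GROUP BY returns. A minimal sketch using Python's standard XML parser follows; the function name `flatten_pivot` is hypothetical, and the sample document is a trimmed excerpt of the response above.

```python
import xml.etree.ElementTree as ET

def flatten_pivot(xml_text, pivot_name):
    """Turn a two-level facet.pivot response into (outer, inner, count) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for arr in root.iter("arr"):
        if arr.get("name") != pivot_name:
            continue  # skip the nested "pivot" arrays; they are handled below
        for outer in arr.findall("lst"):
            outer_value = int(outer.find("int[@name='value']").text)
            inner_arr = outer.find("arr[@name='pivot']")
            if inner_arr is None:
                continue  # no second-level breakdown for this value
            for inner in inner_arr.findall("lst"):
                inner_value = int(inner.find("int[@name='value']").text)
                count = int(inner.find("int[@name='count']").text)
                rows.append((outer_value, inner_value, count))
    return rows

SAMPLE = """<response><lst name="facet_counts"><lst name="facet_pivot">
<arr name="city_id,area_id">
  <lst><str name="field">city_id</str><int name="value">0</int><int name="count">463</int>
    <arr name="pivot">
      <lst><str name="field">area_id</str><int name="value">0</int><int name="count">395</int></lst>
      <lst><str name="field">area_id</str><int name="value">10307</int><int name="count">68</int></lst>
    </arr></lst>
  <lst><str name="field">city_id</str><int name="value">106</int><int name="count">10</int>
    <arr name="pivot">
      <lst><str name="field">area_id</str><int name="value">10304</int><int name="count">10</int></lst>
    </arr></lst>
</arr></lst></lst></response>"""

rows = flatten_pivot(SAMPLE, "city_id,area_id")
for city_id, area_id, count_cnt in rows:
    print(city_id, area_id, count_cnt)
```

Each tuple corresponds to one row of the SQL result: city_id, area_id, count_cnt.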
- Grouped statistics on multiple fields (supports count, sum, max, min, and other functions)
Solr handles independent grouped statistics over several fields in a single request well. This is equivalent to running two SQL statements with GROUP BY clauses, each of which aggregates over a single field.
SQL query:
SELECT city_id, COUNT(cnt) AS count_cnt
FROM v_i_event
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id;

SELECT area_id, COUNT(cnt) AS count_cnt
FROM v_i_event
WHERE prov_id = 1 AND net_type = 1
GROUP BY area_id;
The SQL results are omitted here.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&f.cnt.stats.facet=city_id&f.cnt.stats.facet=area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true
The Solr result is as follows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">6</int>
</lst>
<result name="response" numFound="1171" start="0"></result>
<lst name="stats">
<lst name="stats_fields">
<lst name="cnt">
<double name="min">0.0</double>
<double name="max">167.0</double>
<long name="count">1171</long>
<long name="missing">0</long>
<double name="sum">3701.0</double>
<double name="sumOfSquares">249641.0</double>
<double name="mean">3.1605465414175917</double>
<double name="stddev">14.260812879164407</double>
<lst name="facets">
<lst name="city_id">
<lst name="0">
<double name="min">0.0</double>
<double name="max">167.0</double>
<long name="count">463</long>
<long name="missing">0</long>
<double name="sum">2783.0</double>
<double name="sumOfSquares">238819.0</double>
<double name="mean">6.010799136069115</double>
<double name="stddev">21.92524420257807</double>
<lst name="facets" />
</lst>
<lst name="110">
<double name="min">0.0</double>
<double name="max">1.0</double>
<long name="count">8</long>
<long name="missing">0</long>
<double name="sum">3.0</double>
<double name="sumOfSquares">3.0</double>
<double name="mean">0.375</double>
<double name="stddev">0.5175491695067657</double>
<lst name="facets" />
</lst>
<lst name="106">
<double name="min">0.0</double>
<double name="max">0.0</double>
<long name="count">10</long>
<long name="missing">0</long>
<double name="sum">0.0</double>
<double name="sumOfSquares">0.0</double>
<double name="mean">0.0</double>
<double name="stddev">0.0</double>
<lst name="facets" />
</lst>
<lst name="105">
<double name="min">0.0</double>
<double name="max">0.0</double>
<long name="count">4</long>
<long name="missing">0</long>
<double name="sum">0.0</double>
<double name="sumOfSquares">0.0</double>
<double name="mean">0.0</double>
<double name="stddev">0.0</double>
<lst name="facets" />
</lst>
<lst name="103">
<double name="min">0.0</double>
<double name="max">55.0</double>
<long name="count">678</long>
<long name="missing">0</long>
<double name="sum">915.0</double>
<double name="sumOfSquares">10819.0</double>
<double name="mean">1.3495575221238938</double>
<double name="stddev">3.7625525739676986</double>
<lst name="facets" />
</lst>
<lst name="118">
<double name="min">0.0</double>
<double name="max">0.0</double>
<long name="count">8</long>
<long name="missing">0</long>
<double name="sum">0.0</double>
<double name="sumOfSquares">0.0</double>
<double name="mean">0.0</double>
<double name="stddev">0.0</double>
<lst name="facets" />
</lst>
</lst>
<lst name="area_id">
<lst name="10308">
<double name="min">0.0</double>
<double name="max">1.0</double>
<long name="count">6</long>
<long name="missing">0</long>
<double name="sum">1.0</double>
<double name="sumOfSquares">1.0</double>
<double name="mean">0.16666666666666666</double>
<double name="stddev">0.408248290463863</double>
<lst name="facets" />
</lst>
<lst name="10310">
<double name="min">0.0</double>
<double name="max">5.0</double>
<long name="count">49</long>
<long name="missing">0</long>
<double name="sum">40.0</double>
<double name="sumOfSquares">108.0</double>
<double name="mean">0.8163265306122449</double>
<double name="stddev">1.2528878206593208</double>
<lst name="facets" />
</lst>
<lst name="0">
<double name="min">0.0</double>
<double name="max">167.0</double>
<long name="count">409</long>
<long name="missing">0</long>
<double name="sum">2722.0</double>
<double name="sumOfSquares">238550.0</double>
<double name="mean">6.6552567237163816</double>
<double name="stddev">23.243931908854</double>
<lst name="facets" />
</lst>
<lst name="10311">
<double name="min">0.0</double>
<double name="max">0.0</double>
<long name="count">2</long>
<long name="missing">0</long>
<double name="sum">0.0</double>
<double name="sumOfSquares">0.0</double>
<double name="mean">0.0</double>
<double name="stddev">0.0</double>
<lst name="facets" />
</lst>
<lst name="10304">
<double name="min">0.0</double>
<double name="max">55.0</double>
<long name="count">77</long>
<long name="missing">0</long>
<double name="sum">370.0</double>
<double name="sumOfSquares">9476.0</double>
<double name="mean">4.805194805194805</double>
<double name="stddev">10.064318107786017</double>
<lst name="facets" />
</lst>
<lst name="70104">
<double name="min">0.0</double>
<double name="max">3.0</double>
<long name="count">48</long>
<long name="missing">0</long>
<double name="sum">51.0</double>
<double name="sumOfSquares">117.0</double>
<double name="mean">1.0625</double>
<double name="stddev">1.1560433254047038</double>
<lst name="facets" />
</lst>
<lst name="10307">
<double name="min">0.0</double>
<double name="max">12.0</double>
<long name="count">366</long>
<long name="missing">0</long>
<double name="sum">274.0</double>
<double name="sumOfSquares">768.0</double>
<double name="mean">0.7486338797814208</double>
<double name="stddev">1.2418218134151426</double>
<lst name="facets" />
</lst>
<lst name="10315">
<double name="min">0.0</double>
<double name="max">4.0</double>
<long name="count">120</long>
<long name="missing">0</long>
<double name="sum">143.0</double>
<double name="sumOfSquares">359.0</double>
<double name="mean">1.1916666666666667</double>
<double name="stddev">1.2588899560996694</double>
<lst name="facets" />
</lst>
<lst name="10316">
<double name="min">0.0</double>
<double name="max">0.0</double>
<long name="count">8</long>
<long name="missing">0</long>
<double name="sum">0.0</double>
<double name="sumOfSquares">0.0</double>
<double name="mean">0.0</double>
<double name="stddev">0.0</double>
<lst name="facets" />
</lst>
<lst name="10317">
<double name="min">0.0</double>
<double name="max">5.0</double>
<long name="count">86</long>
<long name="missing">0</long>
<double name="sum">100.0</double>
<double name="sumOfSquares">262.0</double>
<double name="mean">1.1627906976744187</double>
<double name="stddev">1.3093371930442208</double>
<lst name="facets" />
</lst>
</lst>
</lst>
</lst>
</lst>
</lst>
</response>
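To line this response up with the two GROUP BY statements, each `facets` block can be read into a mapping from group value to its statistics. The sketch below does this with Python's standard XML parser; the function name `grouped_stats` is hypothetical, and the sample is a trimmed excerpt of the response above.

```python
import xml.etree.ElementTree as ET

def grouped_stats(xml_text, stat_field, group_field):
    """Return {group_value: {stat_name: number}} for one stats.facet breakdown."""
    root = ET.fromstring(xml_text)
    field_node = root.find(
        f".//lst[@name='stats_fields']/lst[@name='{stat_field}']")
    group_node = field_node.find(
        f"lst[@name='facets']/lst[@name='{group_field}']")
    result = {}
    for group in group_node.findall("lst"):
        stats = {}
        for child in group:
            if child.tag == "long":
                stats[child.get("name")] = int(child.text)
            elif child.tag == "double":
                stats[child.get("name")] = float(child.text)
            # nested <lst name="facets"/> elements are ignored
        result[group.get("name")] = stats
    return result

SAMPLE = """<response><lst name="stats"><lst name="stats_fields"><lst name="cnt">
<double name="sum">3701.0</double>
<lst name="facets"><lst name="city_id">
  <lst name="110"><double name="min">0.0</double><double name="max">1.0</double>
    <long name="count">8</long><double name="sum">3.0</double>
    <double name="mean">0.375</double><lst name="facets" /></lst>
  <lst name="106"><double name="min">0.0</double><double name="max">0.0</double>
    <long name="count">10</long><double name="sum">0.0</double>
    <double name="mean">0.0</double><lst name="facets" /></lst>
</lst></lst></lst></lst></lst></response>"""

by_city = grouped_stats(SAMPLE, "cnt", "city_id")
for city_id, s in by_city.items():
    # sum, mean, max, min, count correspond to SUM, AVG, MAX, MIN, COUNT in SQL
    print(city_id, s["sum"], s["mean"], s["max"], s["min"], s["count"])
```

Calling the same function with `group_field="area_id"` on the full response would give the counterpart of the second SQL statement.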
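- Joint grouped statistics on multiple fields (supports count, sum, max, min, and other functions)
SQL query:
SELECT city_id, area_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
FROM v_i_event
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id, area_id;
The SQL query result is shown in the figure:
Solr currently has no simple way to support this kind of query. To get these statistics, the schema has to be designed so that one field is multi-valued, and grouping is then done over those values. If the query and analysis patterns of the application are fairly fixed, and you know in advance which fields will be grouped jointly, you can plan a multi-valued field into the schema design to meet this need.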
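As an illustration of planning for joint grouping at design time, the sketch below uses a simpler variant than the multi-valued field the text describes: a single derived key that joins the grouping fields, computed before indexing. This is an assumption on my part, not the author's design; the field name `city_area` and the helper `with_group_key` are hypothetical, and the schema would need a matching indexed field.

```python
def with_group_key(doc, fields=("city_id", "area_id"), key_field="city_area"):
    """Return a copy of doc with a derived key joining the grouping fields."""
    enriched = dict(doc)
    enriched[key_field] = "_".join(str(doc[name]) for name in fields)
    return enriched

def split_group_key(key):
    """Recover the (city_id, area_id) pair from a derived key."""
    return tuple(int(part) for part in key.split("_"))

doc = with_group_key({"log_id": 6553, "city_id": 103, "area_id": 10307, "cnt": 2})

# Statistics per combined key, using the same stats.facet mechanism as the
# independent grouping example; `city_area` is the hypothetical derived field.
params = [
    ("q", "*:*"),
    ("stats", "true"),
    ("stats.field", "cnt"),
    ("f.cnt.stats.facet", "city_area"),
    ("fq", "prov_id:1 AND net_type:1"),
    ("rows", 0),
]
```

Each facet bucket in the response would then stand for one (city_id, area_id) group, and `split_group_key` turns the bucket name back into the two column values.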
视频链接:http://v.youku.com/v_show/id_XMzE3MjEyMzkyMA==.html?spm=a2h3j.8428770.3416059.1 文案+美工:http://ww ...