背景介绍
记录共128W条!
SELECT cpe_id, COUNT(*) restarts
FROM business_log
WHERE operate_time>='2012-12-05 00:00:00' AND operate_time<'2018-01-05 00:00:00' AND operate_type=3 AND result=0
GROUP BY cpe_id
尝试对原SQL语句进行优化后发现,统计速度依旧没有获得满意的提升。单独运行条件查询语句(不包含GROUP BY和COUNT函数)后发现,查询的结果数据量只有6655条,耗时0.825s;加上统计语句后,时间飙升至3s。
原理
mysql 解释:
MySQL说明文档中关于优化GROUP BY的部分指出:The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any)。即,GROUP BY语句会扫描全表并新建一个临时表用来分组存放数据,然后根据临时表中的分组对数据执行聚合函数。现在的问题聚焦在:如果GROUP BY和WHERE在同一个语句中,这个“全表”指的是物理表还是WHERE过滤后的数据集合?
SELECT cpe_id, COUNT(*) restarts
FROM (
SELECT cpe_id
FROM business_log
WHERE operate_time>='2012-12-05 00:00:00' AND operate_time<'2018-01-05 00:00:00' AND operate_type=3 AND result=0
) t
GROUP BY cpe_id
---------------------
如上述语句所示,在查询语句外包了一个统计语句。执行结果:0.851s。时间消耗大幅减少!
结论
利用GROUP BY统计大数据时,应当将查询与统计分离,优化查询语句。
-
实际业务中
原查询过程
SELECT SUM(a.reportcount)reportcount,a.OrganizationId,a.organizationcode,a.organizationname,org.UpperComCode,org.organizationlevel FROM (SELECT COUNT(1) ReportCount, o.OrganizationCode ,o.organizationname, o.id OrganizationId,o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.OrganizationCode in ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000') and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361 GROUP BY o.OrganizationCode,o.organizationname, o.id,o.organizationlevel union all SELECT COUNT(1) ReportCount, o.OrganizationCode,o.organizationname, o.id OrganizationId,o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.UpperComCode IN('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000')and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361 GROUP BY o.OrganizationCode,o.organizationname, o.id ,o.organizationlevel union all SELECT COUNT(1) ReportCount , o.UpperComCode OrganizationCode,o.ParentOrgName OrganizationName, o.ParentOrgId OrganizationId,o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.UpperComCode IN(SELECT OrganizationCode FROM organization WHERE UpperComCode IN ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000' )and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361) GROUP BY o.UpperComCode ,o.ParentOrgName, o.ParentOrgId,o.organizationlevel union all SELECT SUM(reportcount) reportcount , organization.UpperComCode OrganizationCode, organization.ParentOrgName OrganizationName,organization.ParentOrgId OrganizationId,organization.OrganizationLevel FROM ( SELECT COUNT(1) reportcount , o.UpperComCode newcode, o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.UpperComCode IN( SELECT OrganizationCode FROM organization WHERE UpperComCode IN(SELECT OrganizationCode FROM organization WHERE UpperComCode IN ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000' )and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361 ) ) GROUP BY o.UpperComCode, o.organizationlevel ) a JOIN organization ON a.newcode = organization.OrganizationCode GROUP BY organization.UpperComCode , organization.ParentOrgName ,organization.ParentOrgId ,organization.OrganizationLevel ) a JOIN organization org ON a.organizationcode=org.organizationcode GROUP BY a.organizationcode,a.organizationname,org.UpperComCode,org.organizationlevel,a.OrganizationId ORDER BY OrganizationLevel
优化结果
SELECT COUNT(1) AS reportcount,a.OrganizationId,a.organizationcode,a.organizationname,a.UpperComCode,a.organizationlevel FROM (
SELECT r.*,r.id AS reportid, o.OrganizationCode ,o.organizationname, o.id OrganizationId,o.organizationlevel ,o.UpperComCode
FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode
WHERE o.OrganizationCode IN ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000') AND r.ReportTime>'2010/4/25 0:00:00' AND r.ReportReasonSubmitCode IN ('A10006','A10007') AND r.ReportTime<'2019/5/20 0:00:00' AND r.LossTime>'2010/4/11 0:00:00' AND r.LossTime<'2019/5/6 0:00:00' AND r.InsuranceType IN (5,6,7,8) AND r.CatasCollectionId=0
UNION ALL
SELECT r.id AS reportid , o.OrganizationCode,o.organizationname, o.id OrganizationId,o.organizationlevel ,o.UpperComCode
FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode
WHERE o.UpperComCode IN('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000')AND r.ReportTime>'2010/4/25 0:00:00' AND r.ReportReasonSubmitCode IN ('A10006','A10007') AND r.ReportTime<'2019/5/20 0:00:00' AND r.LossTime>'2010/4/11 0:00:00' AND r.LossTime<'2019/5/6 0:00:00' AND r.InsuranceType IN (5,6,7,8) AND r.CatasCollectionId=0
UNION ALL
SELECT r.id AS reportid , o.UpperComCode OrganizationCode,o.ParentOrgName OrganizationName, o.ParentOrgId OrganizationId,ou.organizationlevel ,ou.UpperComCode
FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode
JOIN organization ou ON o.UpperComCode=ou.OrganizationCode
WHERE o.UpperComCode IN
(SELECT OrganizationCode FROM organization WHERE UpperComCode IN
('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000' )AND r.ReportTime>'2010/4/25 0:00:00' AND r.ReportReasonSubmitCode IN ('A10006','A10007') AND r.ReportTime<'2019/5/20 0:00:00' AND r.LossTime>'2010/4/11 0:00:00' AND r.LossTime<'2019/5/6 0:00:00' AND r.InsuranceType IN (5,6,7,8) AND r.CatasCollectionId=0)
) a
GROUP BY a.OrganizationCode,a.organizationname, a.OrganizationId,a.organizationlevel,a.UpperComCode
ORDER BY a.organizationlevel
- Web 性能优化: 使用 Webpack 分离数据的正确方法
摘要: Webpack骚操作. 原文:Web 性能优化: 使用 Webpack 分离数据的正确方法 作者:前端小智 Fundebug经授权转载,版权归原作者所有. 制定向用户提供文件的最佳方式可能是一 ...
- MySQL查询语句执行过程及性能优化-查询过程及优化方法(JOIN/ORDER BY)
在上一篇文章MySQL查询语句执行过程及性能优化-基本概念和EXPLAIN语句简介中介绍了EXPLAIN语句,并举了一个慢查询例子:
- MySQL有关Group By的优化
昨天我写了有关MySQL的loose index scan的相关博文(http://www.cnblogs.com/wingsless/p/5037625.html),后来我发现上次提到的那个优化方法 ...
- Mysql Join语法以及性能优化
引言 内外联结的区别是内联结将去除所有不符合条件的记录,而外联结则保留其中部分.外左联结与外右联结的区别在于如果用A左联结B则A中所有记录都会保留在结果中,此时B中只有符合联结条件的记录,而右联结相反 ...
- mysql问题排查与性能优化
MySQL 问题排查都有哪些手段? 使用 show processlist 命令查看当前所有连接信息. 使用 explain 命令查询 SQL 语句执行计划. 开启慢查询日志,查看慢查询的 SQL. ...
- mysql之数据库添加索引优化查询效率
项目中如果表中的数据过多的话,会影响查询的效率,那么我们需要想办法优化查询,通常添加索引就是我们的选择之一: 1.添加PRIMARY KEY(主键索引) mysql>ALTER TABLE `t ...
- mysql use index () 优化查询的例子
USE INDEX在你查询语句中表名的后面,添加 USE INDEX 来提供你希望 MySQ 去参考的索引列表,就可以让 MySQL 不再考虑其他可用的索引.Eg:SELECT * FROM myta ...
- mysql explain的使用(优化查询)
explain显示了mysql如何使用索引来处理select语句以及连接表.可以帮助选择更好的索引和写出更优化的查询语句. 1.创建数据库 创建的sql语句如下: /* Navicat MySQL D ...
- SqlServer性能优化 查询和索引优化(十二)
查询优化的过程: 查询优化: 功能:分析语句后最终生成执行计划 分析:获取操作语句参数 索引选择 Join算法选择 创建测试的表: select * into EmployeeOp from Adve ...
随机推荐
- Idea配置sbt(window环境)
近开发spark项目使用到scala语言,这里介绍如何在idea上使用sbt来编译项目. 开发环境:windows 1. 下载sbt http://www.scala-sbt.org/download ...
- mybatis 一对多关系
<?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE mapper PUBLIC "- ...
- 将sqlServer上的数据库文件进行盘目的迁移
在数据库客户端创建数据库时要改为.mdf文件,因为附加问价时附加的是.mdf文件: 在里选中相应的数据库 右键->任务-分离 在 剪切到相应的想放置的盘目. 例如迁移到E盘下: 在数据库-> ...
- e654. 获得文本的缩略图
Shape getTextShape(Graphics2D g2d, String str, Font font) { FontRenderContext frc = g2d.getFontRende ...
- 在Access中执行SQL语句
Access在小型系统开发中等到了广泛使用.虽然Access提供了可视化的操作方法,但许多开发人员还是喜欢直接用SQL语句操作数据表.如何在Access中打开SQL视图,对于初次使用Access的程序 ...
- C语言简单选择排序
#include <stdio.h> int main(int argc, char const *argv[]) { // 将数组按照从小到大排序 , , , , , }; int i, ...
- JavaSE(十)之反射
开始接触的时候可能大家都会很模糊到底什么是反射,大家都以为这个东西不重要,其实很重要的,几乎所有的框架都要用到反射,增加灵活度.到了后面几乎动不动就要用到反射. 首先我们先来认识一下对象 学生---- ...
- ubuntu-14.04.2-desktop-amd64.iso:ubuntu-14.04.2-desktop-amd64:安装Oracle11gR2
ubuntu 桌面版的安装不介绍. 如何安装oracle:核心步骤和关键点. ln -sf /bin/bash /bin/sh ln -sf /usr/bin/basename /bin/basena ...
- 生成验证码程序C#
using System; using System.Data; using System.Configuration; using System.Collections; using System. ...
- JS怎样捕获浏览器关闭时间弹出自定义对话框
<script type="text/javascript">window.onbeforeunload = function (e) { e = e || windo ...