【mysql】一条慢查询sql的的分析

这个是我在jobbole.com 上看到的

先给出数据表table结构

mysql> show create table tt \G

*************************** 1. row ***************************

       Table: tt

Create Table: CREATE TABLE `tt` (

  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,

  `name` char(16) NOT NULL DEFAULT '',

  `rule_id` int(10) unsigned NOT NULL DEFAULT '0',

  `status` tinyint(3) unsigned NOT NULL DEFAULT '1',

  PRIMARY KEY (`id`),

  KEY `ttx` (`name`,`rule_id`)

) ENGINE=InnoDB AUTO_INCREMENT=1176504 DEFAULT CHARSET=utf8

1 row in set (0.00 sec)

目前数据库中数据大概有2000W条，之后可能还会不断增长，现在想要查询的是：从表中找出相同姓名，相同规则ID，并且status = 1对应的最后一条记录

写个程序随机加入数据

<?php

mysql_connect('10.0.0.234','root','123456');

mysql_select_db('testdb');

mysql_query("set names utf8");

$strs ='';

$start = 'insert into tt values ';

for($i=0;$i<3000000;$i++){

    $k = range(a,z);

    shuffle($k);

    $str = implode('',$k);

    $num = mt_rand(8,16);

    $in = substr($str,0,$num);

    $strs .= " (NULL,'$in',$num,1),";

    if($i%10000 == 0){

        echo $i."\n";

        $sql = trim($start.$strs,',');

        mysql_query($sql);

        $strs ='';

    }

}

这样300W数据就有了

目前的查询语句是这样的

SELECT * FROM tt WHERE id IN (

SELECT max(id) AS max_id FROM tt GROUP BY name, rule_id)

AND status = 1

看一下查询计划

mysql> explain SELECT * FROM tt WHERE id IN ( SELECT max(id) AS max_id FROM tt GROUP BY name, rule_id) and status=1;

+----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+

| id | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |

+----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+

|  1 | PRIMARY            | tt    | ALL   | NULL          | NULL | NULL    | NULL | 1176818 | Using where |

|  2 | DEPENDENT SUBQUERY | tt    | index | NULL          | ttx  | 52      | NULL |       1 | Using index |

+----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+

索引用上了，也没有filesort，这是不是就很快了，然后执行查询，就卡主了，卡主了，最后只能被Ctrl+C了

那么这个不行，就换一个，这里我想到了，子查询换成join看一看效果

SELECT a.id,b.name,b.rule_id  FROM (select max(id) as id from tt   group by name, rule_id) as a left join tt b on a.id = b.id

看一下查询计划

mysql> explain SELECT a.id,b.name,b.rule_id  FROM (select max(id) as id from tt   group by name, rule_id) as a left join tt b on a.id = b.id;

+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+

| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows    | Extra       |

+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+

|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL | 1176503 |             |

|  1 | PRIMARY     | b          | eq_ref | PRIMARY       | PRIMARY | 4       | a.id |       1 |             |

|  2 | DERIVED     | tt         | index  | NULL          | ttx     | 52      | NULL | 1176818 | Using index |

+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+

执行结果 1.77579775 sec

mysql> show profiles;

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

| Query_ID | Duration   | Query                                                                                                                         |

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

|        6 | 1.77579775 | SELECT a.id,b.name,b.rule_id  FROM (select max(id) as id from tt   group by name, rule_id) as a left join tt b on a.id = b.id |

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

7 rows in set (0.00 sec)

看一看执行过程

mysql> show profile for query  6;

+------------------------------+----------+

| Status                       | Duration |

+------------------------------+----------+

| Sending data                 | 0.000596 |

| Waiting for query cache lock | 0.000002 |

| Sending data                 | 0.000579 |

| Waiting for query cache lock | 0.000002 |

| Sending data                 | 0.000534 |

| Waiting for query cache lock | 0.000002 |

| Sending data                 | 1.101490 |

| end                          | 0.000010 |

| query end                    | 0.000004 |

| closing tables               | 0.000003 |

| removing tmp table           | 0.001369 |

| closing tables               | 0.000010 |

| freeing items                | 0.000024 |

| logging slow query           | 0.000002 |

| logging slow query           | 0.000035 |

| cleaning up                  | 0.000003 |

+------------------------------+----------+

100 rows in set (0.00 sec)

太多的查询缓存lock，关掉它，再看一下查询结果

mysql> show profile for query  10;

+---------------------------+----------+

| Status                    | Duration |

+---------------------------+----------+

| starting                  | 0.000154 |

| checking permissions      | 0.000006 |

| checking permissions      | 0.000005 |

| Opening tables            | 0.000124 |

| System lock               | 0.000122 |

| optimizing                | 0.000007 |

| statistics                | 0.000035 |

| preparing                 | 0.000018 |

| executing                 | 0.000012 |

| Sorting result            | 0.000006 |

| Sending data              | 0.518897 |

| converting HEAP to MyISAM | 0.070147 |

| Sending data              | 0.067123 |

| init                      | 0.000033 |

| optimizing                | 0.000003 |

| statistics                | 0.000020 |

| preparing                 | 0.000009 |

| executing                 | 0.000001 |

| Sending data              | 1.193679 |

| end                       | 0.000011 |

| query end                 | 0.000010 |

| closing tables            | 0.000002 |

| removing tmp table        | 0.001491 |

| closing tables            | 0.000011 |

| freeing items             | 0.000020 |

| logging slow query        | 0.000002 |

| logging slow query        | 0.000050 |

| cleaning up               | 0.000003 |

+---------------------------+----------+

再次分析，我们发现 converting HEAP to MyISAM 这个很耗时

mysql> select @@max_heap_table_size/1024/1024;

+---------------------------------+

| @@max_heap_table_size/1024/1024 |

+---------------------------------+

|                     16.00000000 |

+---------------------------------+

mysql> set max_heap_table_size = 16777216*4;

Query OK, 0 rows affected (0.00 sec)

mysql> select @@max_heap_table_size/1024/1024;

+---------------------------------+

| @@max_heap_table_size/1024/1024 |

+---------------------------------+

|                     64.00000000 |

+---------------------------------+

再看看执行结果 1.77579775 > 1.68962725 ，还是有效果的

mysql> show profiles;

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

| Query_ID | Duration   | Query                                                                                                                         |

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

|       17 | 1.68962725 | SELECT a.id,b.name,b.rule_id  FROM (select max(id) as id from tt   group by name, rule_id) as a left join tt b on a.id = b.id |

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

分析

mysql> show profile for query  17;

+----------------------+----------+

| Status               | Duration |

+----------------------+----------+

| starting             | 0.000120 |

| checking permissions | 0.000004 |

| checking permissions | 0.000003 |

| Opening tables       | 0.000106 |

| System lock          | 0.000113 |

| optimizing           | 0.000007 |

| statistics           | 0.000044 |

| preparing            | 0.000011 |

| executing            | 0.000006 |

| Sorting result       | 0.000002 |

| Sending data         | 0.567858 |

| init                 | 0.000032 |

| optimizing           | 0.000004 |

| statistics           | 0.000017 |

| preparing            | 0.000015 |

| executing            | 0.000002 |

| Sending data         | 1.120159 |

| end                  | 0.000011 |

| query end            | 0.000005 |

| closing tables       | 0.000002 |

| removing tmp table   | 0.001020 |

| closing tables       | 0.000011 |

| freeing items        | 0.000018 |

| logging slow query   | 0.000002 |

| logging slow query   | 0.000056 |

| cleaning up          | 0.000004 |

+----------------------+----------+

好看多了，耗时的地方都在 Sending data 上了，如果硬盘换成PCI-SSD 估计又能提高不少

还有没有其他方法呢，当然有，那么换一种写法

先看查询计划

mysql> explain select max(concat_ws(' ',lpad(id,5,' '),status)) as res, name, rule_id from tt group by name, rule_id ;

+----+-------------+-------+-------+---------------+------+---------+------+---------+-------+

| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra |

+----+-------------+-------+-------+---------------+------+---------+------+---------+-------+

|  1 | SIMPLE      | tt    | index | NULL          | ttx  | 52      | NULL | 1176818 |       |

+----+-------------+-------+-------+---------------+------+---------+------+---------+-------+

　再看看执行结果

mysql> show profiles;

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

| Query_ID | Duration   | Query                                                                                                                         |

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

|       22 | 1.82505025 | select max(concat_ws(' ',lpad(id,5,' '),status)) as res, name, rule_id from tt group by name, rule_id                         |

+----------+------------+-------------------------------------------------------------------------------------------------------------------------------+

好像更慢了 1.82505025 > 1.68962725

分析一下

mysql> show profile for query  22;

+----------------------+----------+

| Status               | Duration |

+----------------------+----------+

| starting             | 0.000157 |

| checking permissions | 0.000007 |

| Opening tables       | 0.000019 |

| System lock          | 0.000020 |

| init                 | 0.000032 |

| optimizing           | 0.000005 |

| statistics           | 0.000016 |

| preparing            | 0.000012 |

| executing            | 0.000008 |

| Sorting result       | 0.000003 |

| Sending data         | 1.824677 |

| end                  | 0.000012 |

| query end            | 0.000005 |

| closing tables       | 0.000009 |

| freeing items        | 0.000016 |

| logging slow query   | 0.000002 |

| logging slow query   | 0.000049 |

| cleaning up          | 0.000004 |

+----------------------+----------+

时间基本上都花费在send data了

这里说一下，max + group by 完全就是一个坑，如果有多个字段要返回数据不能这样写

select  max(id) ,type ,name from table  where type=xx group by name

因为group by默认返回第一条记录

如果像下面那样写，肯定会遇到坑的

select max(id) as res, name, rule_id from tt group by name, rule_id ;

这样也可以，而且快多了，但是如果有其他字段怎么办

这样的语句基本上没有什么可以再优化了，只能换换其他方式了，比如：换SSD+raid10 ，分区/分表/分库+中间件了