A "BUG" in a new MySQL 5.6 query optimizer feature: eq_range_index_dive_limit

This article is reposted from http://www.imysql.cn

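I recently ran into a slow SQL problem. Solving it took a few twists, so I'd like to share it. The SQL itself is not complicated, and the table structure and indexes are fairly simple, although some columns appear in more than one index.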

CREATE TABLE `pre_forum_post` (
`pid` int(10) unsigned NOT NULL,
`fid` mediumint(8) unsigned NOT NULL DEFAULT '0',
`tid` mediumint(8) unsigned NOT NULL DEFAULT '0',
`first` tinyint(1) NOT NULL DEFAULT '0',
`author` varchar(40) NOT NULL DEFAULT '',
`authorid` int(10) unsigned NOT NULL DEFAULT '0',
`subject` varchar(80) NOT NULL DEFAULT '',
`dateline` int(10) unsigned NOT NULL DEFAULT '0',
`message` mediumtext NOT NULL,
`useip` varchar(15) NOT NULL DEFAULT '',
`invisible` tinyint(1) NOT NULL DEFAULT '0',
`anonymous` tinyint(1) NOT NULL DEFAULT '0',
`usesig` tinyint(1) NOT NULL DEFAULT '0',
`htmlon` tinyint(1) NOT NULL DEFAULT '0',
`bbcodeoff` tinyint(1) NOT NULL DEFAULT '0',
`smileyoff` tinyint(1) NOT NULL DEFAULT '0',
`parseurloff` tinyint(1) NOT NULL DEFAULT '0',
`attachment` tinyint(1) NOT NULL DEFAULT '0',
`rate` smallint(6) NOT NULL DEFAULT '0',
`ratetimes` tinyint(3) unsigned NOT NULL DEFAULT '0',
`status` int(10) NOT NULL DEFAULT '0',
`tags` varchar(255) NOT NULL DEFAULT '0',
`comment` tinyint(1) NOT NULL DEFAULT '0',
`replycredit` int(10) NOT NULL DEFAULT '0',
`position` int(8) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`tid`,`position`),
UNIQUE KEY `pid` (`pid`),
KEY `fid` (`fid`),
KEY `displayorder` (`tid`,`invisible`,`dateline`),
KEY `first` (`tid`,`first`),
KEY `new_auth` (`authorid`,`invisible`,`tid`),
KEY `idx_dt` (`dateline`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

root@localhost Fri Aug 1 11:59:56 2014 11:59:56 [test]>show table status like 'pre_forum_post'\G
*************************** 1. row ***************************
Name: pre_forum_post
Engine: MyISAM
Version: 10
Row_format: Dynamic
Rows: 23483977
Avg_row_length: 203
Data_length: 4782024708
Max_data_length: 281474976710655
Index_length: 2466093056
Data_free: 0
Auto_increment: 1
Create_time: 2014-08-01 11:00:56
Update_time: 2014-08-01 11:08:49
Check_time: 2014-08-01 11:12:23
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:

mysql> show index from pre_forum_post;
+----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table          | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| pre_forum_post |          0 | PRIMARY      |            1 | tid         | A         |      838713 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          0 | PRIMARY      |            2 | position    | A         |    23483977 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          0 | pid          |            1 | pid         | A         |    23483977 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | fid          |            1 | fid         | A         |        1470 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | displayorder |            1 | tid         | A         |      838713 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | displayorder |            2 | invisible   | A         |      869776 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | displayorder |            3 | dateline    | A         |    23483977 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | first        |            1 | tid         | A         |      838713 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | first        |            2 | first       | A         |     1174198 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | new_auth     |            1 | authorid    | A         |     1806459 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | new_auth     |            2 | invisible   | A         |     1956998 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | new_auth     |            3 | tid         | A         |    11741988 |     NULL | NULL   |      | BTREE      |         |               |
| pre_forum_post |          1 | idx_dt       |            1 | dateline    | A         |    23483977 |     NULL | NULL   |      | BTREE      |         |               |
+----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Let's see how this SQL actually executes:

mysql> select * from pre_forum_post where tid=7932612 and `invisible` in('0','-2') order by dateline limit 15;
15 rows in set (26.78 sec)
Check the MySQL session status value Handler_read_next:

| Handler_read_next | 17274153 |
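
For reference, a per-query Handler_read_next figure like the one above can be collected roughly as follows. This is a minimal sketch, assuming the mysql command-line client and an account allowed to run FLUSH STATUS (which normally requires the RELOAD privilege); the original post does not show how its numbers were gathered.

```sql
-- Reset the session status counters so earlier statements do not inflate the numbers.
FLUSH STATUS;

-- Run the statement under test (the slow query from this article).
SELECT * FROM pre_forum_post
WHERE tid = 7932612 AND `invisible` IN ('0', '-2')
ORDER BY dateline LIMIT 15;

-- Read back the handler counters for this session only.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Handler_read_next counts requests to read the next row in index order, so a very large value for a LIMIT 15 query is a strong hint that a long index scan is taking place.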
Picking 15 rows after reading more than 17 million entries is, unsurprisingly, very slow. Let's force a more sensible index and look again:

mysql> explain select * from pre_forum_post force index(displayorder) where tid=7932612 and `invisible` in('0','-2') order by dateline limit 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: range
possible_keys: displayorder
key: displayorder
key_len: 4
ref: NULL
rows: 46131
Extra: Using index condition; Using filesort
Now the actual execution time:

mysql> select * from pre_forum_post force index(displayorder) where tid=7932612 and `invisible` in('0','-2') order by dateline limit 15;
15 rows in set (0.08 sec)
Wow, how can it be this fast? The query optimizer really let us down here. Check the session status value Handler_read_next again:

| Handler_read_next | 31188 |
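Compared with the run without the forced index, that is a difference of about 553 times! Fortunately, besides EXPLAIN, 5.6 and later also support OPTIMIZER_TRACE. Comparing the two execution plans shows that the plan without the forced index plays a trick: at the very end, when the optimizer evaluates the ORDER BY clause, it changes the execution plan: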

{
  "reconsidering_access_paths_for_index_ordering": {
    "clause": "ORDER BY",
    "index_order_summary": {
      "table": "`pre_forum_post`",
      "index_provides_order": true,
      "order_direction": "asc",
      "index": "idx_dt",
      "plan_changed": true,
      "access_type": "index_scan"
    } /* index_order_summary */
  } /* reconsidering_access_paths_for_index_ordering */
}
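
A trace like this can be captured in a session roughly as follows. This is a sketch that assumes the mysql command-line client (for the \G terminator); the post itself does not show the exact commands it used.

```sql
-- Turn optimizer tracing on for the current session only.
SET optimizer_trace = 'enabled=on';

-- Run the statement whose plan should be traced.
SELECT * FROM pre_forum_post
WHERE tid = 7932612 AND `invisible` IN ('0', '-2')
ORDER BY dateline LIMIT 15;

-- Read the trace of the last traced statement. A non-zero
-- MISSING_BYTES_BEYOND_MAX_MEM_SIZE means the trace was truncated;
-- raising optimizer_trace_max_mem_size is the usual remedy.
SELECT TRACE, MISSING_BYTES_BEYOND_MAX_MEM_SIZE
FROM information_schema.OPTIMIZER_TRACE\G

-- Turn tracing off again to avoid the extra overhead.
SET optimizer_trace = 'enabled=off';
```

Running the same steps once with and once without force index(displayorder) gives two traces that can be compared section by section.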
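Yet in the earlier analyzing_range_alternatives and considered_execution_plans stages, the optimizer treated several other indexes as viable candidates; only at this final step were they overridden. It looked like a bug in the MySQL 5.6 query optimizer. A Google search showed that someone had indeed already reported a similar problem: MySQL bug 70245: incorrect costing for range scan causes optimizer to choose incorrect index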

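After reading it, I realized it is not really a bug. Starting with 5.6 there is a new option called eq_range_index_dive_limit. Roughly speaking, for queries with many equality comparisons (for example an IN list with many values), it controls how the optimizer estimates the number of rows that may be scanned, so that it can still pick a reasonably suitable index while avoiding the cost of so-called index dives.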

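The optimizer has two ways to produce that estimate:

1. Index dives: costlier, but the estimate is more accurate;
2. Index statistics: cheaper, but the estimate may be less accurate.

Put simply, eq_range_index_dive_limit sets an upper bound on the number of values in the IN list. Once the list reaches that bound, the optimizer switches from option 1 to option 2.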

The default is 10, but many people in the community reported that this was too low, so from 5.7 onward the default was raised to 200.
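
As a quick illustration of working with this option, the statements below inspect and change it. This is a sketch only: the value 200 is used simply because it is the 5.7 default mentioned above, not because it is a recommendation for any particular workload.

```sql
-- Show the server-wide default and the value in effect for this session.
SELECT @@global.eq_range_index_dive_limit  AS global_value,
       @@session.eq_range_index_dive_limit AS session_value;

-- Change it for the current session only. With a limit of N + 1, index dives
-- are still used for up to N equality ranges; 0 means always use index dives.
SET SESSION eq_range_index_dive_limit = 200;

-- Changing the server-wide value requires the appropriate privilege
-- and only affects sessions that connect afterwards.
-- SET GLOBAL eq_range_index_dive_limit = 200;
```

Trying a new value at session scope first keeps the experiment from affecting other connections.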

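Our case today is the reverse, though: with the costlier but more accurate estimate, the optimizer ended up choosing an index that is actually less efficient. So we need to lower the threshold. After setting eq_range_index_dive_limit = 2 (the IN list in the example above has 2 values), let's look at the new query plan: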

mysql> set eq_range_index_dive_limit = 2;

mysql> explain select * from pre_forum_post where tid=7932612 and `invisible` in('0','-2') order by dateline limit 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: range
possible_keys: PRIMARY,displayorder,first
key: displayorder
key_len: 4
ref: NULL
rows: 54
Extra: Using index condition; Using filesort
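
One plausible reading of the rows: 54 estimate, offered here as an assumption rather than something the post or the trace confirms: with index dives switched off for this IN list, the optimizer falls back to index statistics, that is, the average number of rows per distinct (tid, invisible) prefix, multiplied by the number of IN values. The inputs below are the Rows value from show table status and the Cardinality of displayorder at Seq_in_index 2 shown earlier.

```sql
-- 23483977 = Rows from SHOW TABLE STATUS
-- 869776   = Cardinality of displayorder (tid, invisible) from SHOW INDEX
-- 2        = number of values in the IN list
SELECT 23483977 / 869776             AS avg_rows_per_prefix,
       2 * ROUND(23483977 / 869776)  AS estimated_rows;
```

The average comes out at roughly 27 rows per prefix, so two IN values give 54, which matches the EXPLAIN output above.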
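Whoa, the estimated number of rows to scan dropped by roughly another 577 times, nearly 320,000 times lower than at the very start! In this case, changing eq_range_index_dive_limit does produce the optimization, but the more reliable fix is simply to drop the idx_dt index. Yes, that's right: drop this junk, redundant index, because in practice it is of little use. Quite a trap, isn't it?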
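
Before dropping the index for real, its effect can be previewed with an index hint. The sketch below is one cautious way to go about it, not the procedure the original post followed; on a MyISAM table of this size the ALTER TABLE may take a while and should be tried on a copy first.

```sql
-- Step 1: hide idx_dt from the optimizer for this one statement and check the plan.
EXPLAIN SELECT * FROM pre_forum_post IGNORE INDEX (idx_dt)
WHERE tid = 7932612 AND `invisible` IN ('0', '-2')
ORDER BY dateline LIMIT 15;

-- Step 2: once it is clear that no other query depends on idx_dt, drop it.
ALTER TABLE pre_forum_post DROP INDEX idx_dt;
```

If the plan in step 1 uses displayorder, the same result can be expected after the index is gone.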
