Analysis of the MySQL Physical Optimizer Cost Model [Original]

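1. Introduction

When the SQL layer of the MySQL server retrieves data according to a WHERE condition, there are usually several ways to fetch the rows. It estimates the cost of each retrieval method and picks the cheapest one.

Take the predicate where col1=x and col2=y and col3>z, with four indexes available: inx_col1, inx_col2, inx_col3 and inx_col1_col2_col3. The SQL layer has to decide 1) which index to use, 2) whether to do a range scan or a ref scan on that index, and 3) whether a table scan is a viable alternative.

MySQL picks the cheapest of the following retrieval strategies: 1) a range scan on each usable index, 2) a ref scan on each usable index, 3) a table scan. How these costs are computed is the main subject of this article.

Total cost = CPU cost + IO cost.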
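To make the selection rule concrete, here is a minimal C++ sketch of picking the cheapest access path. The AccessPath struct, the function names and any numbers you feed it are hypothetical and only illustrate the idea; none of them are taken from the MySQL source.

```cpp
#include <cstddef>

// One candidate access path with its estimated cost components (hypothetical type).
struct AccessPath {
    const char* name;   // e.g. "table scan", "range scan on inx_col1"
    double cpu_cost;    // estimated CPU cost
    double io_cost;     // estimated IO cost
};

// Total cost as defined in the text: CPU cost + IO cost.
constexpr double total_cost(const AccessPath& p) {
    return p.cpu_cost + p.io_cost;
}

// Return the index of the cheapest path in paths[0..n).
// For n == 0 the result is 0, which is not a valid index, so check n first.
constexpr std::size_t cheapest_path(const AccessPath* paths, std::size_t n) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < n; ++i) {
        if (total_cost(paths[i]) < total_cost(paths[best])) {
            best = i;
        }
    }
    return best;
}
```

A caller would fill an array with one entry per candidate (each index range scan, each ref scan, the table scan) and use the path at the returned index.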

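2. Cost constants

MySQL keeps a copy of its cost constants in memory, held in two classes, Server_cost_constants and SE_cost_constants. Their data members are listed below.

MySQL server cost constants

Server_cost_constants {
    m_row_evaluate_cost             // cost of evaluating the condition predicates on one row
    m_key_compare_cost              // cost of comparing two keys
    m_memory_temptable_create_cost  // cost of creating an in-memory temporary table
    m_memory_temptable_row_cost     // per-row cost of an in-memory temporary table
    m_disk_temptable_create_cost    // cost of creating an on-disk temporary table
    m_disk_temptable_row_cost       // per-row cost of an on-disk temporary table
}

Storage engine cost constants

SE_cost_constants {
    m_memory_block_read_cost  // cost of reading one page from the buffer pool
    m_io_block_read_cost      // cost of reading one page from the file system (buffer miss)
    m_memory_block_read_cost_default
    m_io_block_read_cost_default
}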

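MySQL also keeps a persistent copy of the cost constants in two system tables, mysql.server_cost and mysql.engine_cost, whose entries correspond to the members of the in-memory classes. A DBA can benchmark the actual hardware to find the best-fitting values and then UPDATE the corresponding rows in these tables. Running FLUSH OPTIMIZER_COSTS afterwards loads the changes into memory; sessions that connect after the flush read the new in-memory values and use them to compute costs.

How to adapt the cost constants automatically to the actual hardware and workload is an important research topic.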

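3. Statistics

The statistics the SQL layer needs are supplied by the InnoDB storage engine and are obtained through the API it exposes; the second half of this article lists those functions. InnoDB statistics can optionally be persisted to system tables: mysql.innodb_table_stats stores table statistics and mysql.innodb_index_stats stores index statistics.

Columns of mysql.innodb_table_stats

    database_name             database name
    table_name                table name
    n_rows                    number of rows in the table
    clustered_index_size      number of pages in the clustered index
    sum_of_other_index_sizes  number of pages in the other (non-primary) indexes
    last_update               time these statistics were last updated

Columns of mysql.innodb_index_stats

    database_name  database name
    table_name     table name
    index_name     index name
    stat_name      name of the statistic
    stat_value     value of the statistic
    sample_size    number of pages sampled
    last_update    time these statistics were last updated

The stat_name column takes the following values:

    n_diff_pfxNN   cardinality of the index prefix made of the first NN columns, i.e. the number of distinct values of that prefix
    n_leaf_pages   number of leaf pages in the index
    size           number of pages in the index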
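One typical use of these statistics is estimating how many rows a single equality lookup on an index prefix will match: divide n_rows by the prefix's n_diff_pfxNN. The sketch below shows only that arithmetic. The function name rows_per_key is hypothetical, and the real computation inside InnoDB may apply further adjustments that are not modelled here.

```cpp
// Estimated number of rows matched by one equality lookup on an index prefix:
// table rows divided by the number of distinct prefix values.
// Falls back to n_rows when no distinct-value statistic is available.
constexpr double rows_per_key(double n_rows, double n_diff_pfx) {
    return n_diff_pfx > 0.0 ? n_rows / n_diff_pfx : n_rows;
}
```

For example, a table with 1000 rows and 250 distinct values in the first index column gives an estimate of 4 rows per lookup on that column.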

4. Cost formulas

CPU cost

double row_evaluate_cost(double rows)
{
    return rows * m_server_cost_constants->row_evaluate_cost();
}

Table scan IO cost

Cost_estimate handler::table_scan_cost()
{
    double io_cost= scan_time() * table->cost_model()->page_read_cost(1.0);
    // ... (rest of the function omitted in this excerpt)
}
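The two formulas above combine into the cost of a full table scan: the IO part is the number of pages times the per-page read cost, and the CPU part is the number of rows times the per-row evaluation cost. The following is a standalone sketch of that combination, not MySQL code; the CostConstants struct and the function names are hypothetical, and the optimizer may add fixed terms on top that are not shown here.

```cpp
// Cost constants needed for this sketch (hypothetical aggregate type).
struct CostConstants {
    double row_evaluate_cost;  // CPU cost of evaluating the condition on one row
    double page_read_cost;     // cost of reading one page
};

// CPU cost: rows times the per-row evaluation cost.
constexpr double row_evaluate_cost(const CostConstants& c, double rows) {
    return rows * c.row_evaluate_cost;
}

// IO cost of a table scan: pages in the clustered index times the per-page cost.
constexpr double table_scan_io_cost(const CostConstants& c, double pages) {
    return pages * c.page_read_cost;
}

// Total cost = CPU cost + IO cost, as defined in the introduction.
constexpr double table_scan_total_cost(const CostConstants& c,
                                       double pages, double rows) {
    return table_scan_io_cost(c, pages) + row_evaluate_cost(c, rows);
}
```

With illustrative constants of 0.2 per row and 1.0 per page, a table of 100 pages and 10000 rows would cost about 100 + 2000 = 2100 units.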

Ref and range scan IO cost

IO cost of a clustered index scan

Cost_estimate handler::read_cost(uint index, double ranges, double rows)
{
    double io_cost= read_time(index, static_cast<uint>(ranges),
                              static_cast<ha_rows>(rows)) *
                    table->cost_model()->page_read_cost(1.0);
    // ... (rest of the function omitted in this excerpt)
}

IO cost of a covering secondary index scan (no lookup back into the clustered index)

Cost_estimate handler::index_scan_cost(uint index, double ranges, double rows)
{
    double io_cost= index_only_read_time(index, rows) *
                    table->cost_model()->page_read_cost_index(index, 1.0);
    // ... (rest of the function omitted in this excerpt)
}

IO cost of a non-covering secondary index scan (each entry requires a lookup back into the clustered index)

min( table->cost_model()->page_read_cost(tmp_fanout), tab->worst_seeks )

Estimated cost of reading pages clustered index pages: the page count multiplied by the cost constant

double Cost_model_table::page_read_cost(double pages)

Estimated cost of reading pages pages of the index given by index

double Cost_model_table::page_read_cost_index(uint index, double pages)
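Since the engine has separate constants for a page found in the buffer pool and a page read from disk, one plausible way to turn a page count into a cost is to weight the two constants by the fraction of pages estimated to be in memory. The sketch below assumes exactly that blend and adds the worst_seeks cap from the non-covering formula above. The struct, the function names and the in_memory parameter are hypothetical; this is an illustration of the idea, not the actual Cost_model_table implementation.

```cpp
#include <algorithm>

// Engine cost constants used by this sketch (hypothetical aggregate type).
struct EngineCostConstants {
    double memory_block_read_cost;  // page is already in the buffer pool
    double io_block_read_cost;      // page must be read from disk
};

// Cost of reading `pages` pages when a fraction `in_memory` (0.0 to 1.0)
// of them is expected to be cached.
constexpr double page_read_cost(const EngineCostConstants& c,
                                double pages, double in_memory) {
    return pages * in_memory * c.memory_block_read_cost
         + pages * (1.0 - in_memory) * c.io_block_read_cost;
}

// IO cost of a non-covering secondary index lookup that is expected to
// fetch `fanout` rows from the clustered index, capped at `worst_seeks`.
constexpr double ref_lookup_io_cost(const EngineCostConstants& c, double fanout,
                                    double in_memory, double worst_seeks) {
    return std::min(page_read_cost(c, fanout, in_memory), worst_seeks);
}
```

The cap keeps the estimate for a very unselective index lookup from growing without bound: once the blended page cost exceeds worst_seeks, the cap is used instead.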

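5. InnoDB statistics API

Number of pages occupied by the clustered index (primary key), i.e. the pages read by a full table scan

double ha_innobase::scan_time()

Estimated number of pages read when scanning rows records of the clustered index

double ha_innobase::read_time(uint index, uint ranges, ha_rows rows)

Estimated number of index pages read by a covering scan (no lookup back into the clustered index) of records entries in index keynr

double handler::index_only_read_time(uint keynr, double records)

Estimated number of records of index keynr that fall in the range (min_key, max_key)

ha_rows ha_innobase::records_in_range(
    uint keynr,          /*!< in: index number */
    key_range *min_key,  /*!< in: start key value of the ... */
    key_range *max_key)  /*!< in: range end key val, may ... */

Estimated fraction of the clustered index pages that are in memory

double handler::table_in_memory_estimate()

Estimated fraction of a secondary index's pages that are in memory

double handler::index_in_memory_estimate(uint keyno)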
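To give a feel for what a function such as index_only_read_time computes, here is a standalone sketch of the estimate as I recall it from the MySQL 5.7 handler code: assume index pages are about half full, work out how many entries fit in a page, and divide the record count by that. Treat the exact formula as an assumption rather than a quotation of the source. The parameters block_size, key_length and ref_length are passed in explicitly here, whereas the real function reads them from handler state.

```cpp
// Approximate number of index pages read when scanning `records` entries of
// a covering index.
//   block_size : page size in bytes
//   key_length : length of one index key in bytes
//   ref_length : length of the row reference (primary key) stored with each key
constexpr double index_only_read_time(unsigned block_size, unsigned key_length,
                                      unsigned ref_length, double records) {
    // Entries per page, assuming pages are roughly half full.
    const unsigned keys_per_block =
        block_size / 2 / (key_length + ref_length) + 1;
    // Adding keys_per_block - 1 biases the estimate upward, so that even a
    // single record costs one full page.
    return (records + keys_per_block - 1) / keys_per_block;
}
```

With a 16384-byte page, a 20-byte key and a 6-byte row reference, keys_per_block is 316, so one record costs exactly 1 page and 1000 records cost a little over 4 pages.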

6. Enabling the optimizer trace

set session optimizer_trace="enabled=on";

explain <your SQL statement>;

select * from information_schema.optimizer_trace;

7. Optimizer trace example

     "rows_estimation": [

              {

                "table": "`tab`",

                "range_analysis": {

                  "table_scan": {

                    "rows": 5,

                    "cost": 4.1

                  },

                  "potential_range_indexes": [

                    {

                      "index": "PRIMARY",

                      "usable": false,

                      "cause": "not_applicable"

                    },

                    {

                      "index": "inx_clo2",

                      "usable": true,

                      "key_parts": [

                        "clo2",

                        "clo1"

                      ]

                    },

                    {

                      "index": "inx_clo3",

                      "usable": true,

                      "key_parts": [

                        "clo3",

                        "clo1"

                      ]

                    },

                    {

                      "index": "inx_clo2_clo3",

                      "usable": true,

                      "key_parts": [

                        "clo2",

                        "clo3",

                        "clo1"

                      ]

                    }

                  ],

                  "best_covering_index_scan": {

                    "index": "inx_clo2_clo3",

                    "cost": 2.0606,

                    "chosen": true

                  },

                  "setup_range_conditions": [

                  ],

                  "group_index_range": {

                    "chosen": false,

                    "cause": "not_group_by_or_distinct"

                  },

                  "analyzing_range_alternatives": {

                    "range_scan_alternatives": [

                      {

                        "index": "inx_clo2",

                        "ranges": [

                          "hu <= clo2 <= hu"

                        ],

                        "index_dives_for_eq_ranges": true,

                        "rowid_ordered": true,

                        "using_mrr": false,

                        "index_only": false,

                        "rows": 2,

                        "cost": 3.41,

                        "chosen": false,

                        "cause": "cost"

                      },

                      {

                        "index": "inx_clo3",

                        "ranges": [

                          "huan <= clo3 <= huan"

                        ],

                        "index_dives_for_eq_ranges": true,

                        "rowid_ordered": true,

                        "using_mrr": false,

                        "index_only": false,

                        "rows": 1,

                        "cost": 2.21,

                        "chosen": false,

                        "cause": "cost"

                      },

                      {

                        "index": "inx_clo2_clo3",

                        "ranges": [

                          "hu <= clo2 <= hu AND huan <= clo3 <= huan"

                        ],

                        "index_dives_for_eq_ranges": true,

                        "rowid_ordered": true,

                        "using_mrr": false,

                        "index_only": true,

                        "rows": 1,

                        "cost": 1.21,

                        "chosen": true

                      }

                    ],

                    "analyzing_roworder_intersect": {

                      "intersecting_indexes": [

                        {

                          "index": "inx_clo2_clo3",

                          "index_scan_cost": 1,

                          "cumulated_index_scan_cost": 1,

                          "disk_sweep_cost": 0,

                          "cumulated_total_cost": 1,

                          "usable": true,

                          "matching_rows_now": 1,

                          "isect_covering_with_this_index": true,

                          "chosen": true

                        }

                      ],

                      "clustered_pk": {

                        "clustered_pk_added_to_intersect": false,

                        "cause": "no_clustered_pk_index"

                      },

                      "chosen": false,

                      "cause": "too_few_indexes_to_merge"

                    }

                  },

                  "chosen_range_access_summary": {

                    "range_access_plan": {

                      "type": "range_scan",

                      "index": "inx_clo2_clo3",

                      "rows": 1,

                      "ranges": [

                        "hu <= clo2 <= hu AND huan <= clo3 <= huan"

                      ]

                    },

                    "rows_for_plan": 1,

                    "cost_for_plan": 1.21,

                    "chosen": true

                  }

                }

              }

            ]

          },

          {

            "considered_execution_plans": [

              {

                "plan_prefix": [

                ],

                "table": "`tab`",

                "best_access_path": {

                  "considered_access_paths": [

                    {

                      "access_type": "ref",

                      "index": "inx_clo2",

                      "rows": 2,

                      "cost": 2.4,

                      "chosen": true

                    },

                    {

                      "access_type": "ref",

                      "index": "inx_clo3",

                      "rows": 1,

                      "cost": 1.2,

                      "chosen": true

                    },

                    {

                      "access_type": "ref",

                      "index": "inx_clo2_clo3",

                      "rows": 1,

                      "cost": 1.2,

                      "chosen": false

                    },

                    {

                      "rows_to_scan": 1,

                      "access_type": "range",

                      "range_details": {

                        "used_index": "inx_clo2_clo3"

                      },

                      "resulting_rows": 1,

                      "cost": 1.41,

                      "chosen": false

                    }

                  ]

                },

                "condition_filtering_pct": 40,

                "rows_for_plan": 0.4,

                "cost_for_plan": 1.2,

                "chosen": true

              }

            ]

          },

          {

            "attaching_conditions_to_tables": {

              "original_condition": "((`tab`.`clo2` = 'hu') and (`tab`.`clo3` = 'huan'))",

              "attached_conditions_computation": [

              ],

              "attached_conditions_summary": [

                {

                  "table": "`tab`",

                  "attached": "(`tab`.`clo2` = 'hu')"

                }

              ]

            }

          },

          {

            "refine_plan": [

              {

                "table": "`tab`"

              }

            ]

          }

        ]
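The costs in this trace can be reproduced with the formulas from section 4, which is a useful sanity check when reading a trace. The sketch below does this under several assumptions that are not stated in the article: the MySQL 5.7 default constants (0.2 per evaluated row, 1.0 per page read), a clustered index of a single page for the 5-row table, and the fixed terms 1.1, 1.0 and 0.01 that, as far as I can tell, the 5.7 optimizer adds to table scans and range scans. The function names are hypothetical, and the worst_seeks cap on ref access is left out.

```cpp
// Assumed MySQL 5.7 default cost constants (not given in the article).
constexpr double kRowEvaluateCost = 0.2;  // CPU cost per evaluated row
constexpr double kPageReadCost = 1.0;     // cost per page read

constexpr double row_evaluate_cost(double rows) {
    return rows * kRowEvaluateCost;
}

// Table scan: page reads plus CPU per row, plus assumed fixed terms 1.1 and 1.0.
constexpr double table_scan_cost(double pages, double rows) {
    return pages * kPageReadCost + 1.1 + row_evaluate_cost(rows) + 1.0;
}

// Non-covering range scan: one page read per range and per row,
// plus CPU per row, plus an assumed fixed 0.01.
constexpr double range_scan_cost(double ranges, double rows) {
    return (ranges + rows) * kPageReadCost + row_evaluate_cost(rows) + 0.01;
}

// Covering range scan: index pages only, plus CPU per row, plus 0.01.
constexpr double covering_range_scan_cost(double index_pages, double rows) {
    return index_pages * kPageReadCost + row_evaluate_cost(rows) + 0.01;
}

// Ref access: one page read per matching row plus CPU per row
// (the worst_seeks cap from section 4 is omitted here).
constexpr double ref_cost(double rows) {
    return rows * kPageReadCost + row_evaluate_cost(rows);
}
```

Under these assumptions table_scan_cost(1, 5) gives 4.1, range_scan_cost(1, 2) gives 3.41 for inx_clo2, range_scan_cost(1, 1) gives 2.21 for inx_clo3, covering_range_scan_cost(1, 1) gives 1.21 for inx_clo2_clo3, and ref_cost gives 2.4 for 2 rows and 1.2 for 1 row, all matching the trace. The range entry under considered_access_paths (1.41) equals the range cost 1.21 plus one more row evaluation of 0.2.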
