How the MySQL optimizer estimates the cost of a full table scan
In InnoDB the clustered index is organized on the primary key, and its leaf pages hold the full data records (the table itself is stored in the clustered index).
So how is the cost of a full table scan actually computed?
The cost is the sum of a CPU cost and an I/O cost.
The CPU cost charges one comparison cost for every 5 records examined (the records counted here are index records rather than the data records we usually think of, although for the clustered index the counts are the same), so what we need is the total number of records.
How, then, is the total record count of the table (i.e., the total number of clustered-index records) obtained?
The idea is this: from the index we can get the number of leaf pages, and a page is 16 KB by default, so the total leaf capacity is leaf_page_num * 16K.
Next the minimum length of an index record is computed; since the index may be made up of several fields, they have to be iterated over. Call this length m.
Then total_records = leaf_page_num * 16K / m is the number of index records (the code additionally multiplies by a safety factor of 2). Since one clustered-index record corresponds to one data record, this is the total row count.
Note that this is only an estimate rather than an exact row count: the leaf pages here are data pages, and m is a minimum record length derived from the index definition. Because the internal representation of the clustered index contains every column of the table, m approximates the minimum length of a whole row rather than just the primary key, and dividing by a minimum length makes the result an upper bound on the number of rows, which is exactly what estimate_rows_upper_bound() promises.
The CPU cost is then total_records/5 + 1.
The I/O cost is what ha_innobase::scan_time() returns, namely (double) prebuilt->table->stat_clustered_index_size, the approximate size of the clustered index in database pages.
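To make the two formulas concrete, here is a minimal sketch that recomputes the estimate from the statistics described above. This is not the server's actual cost-model code; the function and variable names are made up for illustration, and the divide-by-5 mirrors the optimizer's TIME_FOR_COMPARE constant mentioned earlier.

#include <cstdint>

/* Minimal sketch of the full-scan cost estimate described above (hypothetical helper). */
double full_scan_cost(uint64_t stat_clustered_index_size, /* clustered index size in pages */
                      uint64_t stat_n_leaf_pages,         /* leaf pages of the clustered index */
                      uint64_t min_rec_len)               /* result of dict_index_calc_min_rec_len() */
{
	const uint64_t page_size = 16 * 1024; /* UNIV_PAGE_SIZE default */

	/* upper bound on the row count, including the safety factor 2 */
	const uint64_t total_records = 2 * stat_n_leaf_pages * page_size / min_rec_len;

	double io_cost  = (double) stat_clustered_index_size; /* what scan_time() returns */
	double cpu_cost = (double) total_records / 5 + 1;     /* one compare cost per 5 records */

	return io_cost + cpu_cost;
}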
/******************************************************************//**
Calculate the time it takes to read a set of ranges through an index
This enables us to optimise reads for clustered indexes.
@return estimated time measured in disk seeks */
UNIV_INTERN
double
ha_innobase::read_time(
/*===================*/
	uint	index,	/*!< in: key number */
	uint	ranges,	/*!< in: how many ranges */
	ha_rows	rows)	/*!< in: estimated number of rows in the ranges */
{
	ha_rows	total_rows;
	double	time_for_scan;

	if (index != table->s->primary_key) {
		/* Not clustered */
		return(handler::read_time(index, ranges, rows));
	}

	if (rows <= 2) {
		return((double) rows);
	}

	/* Assume that the read time is proportional to the scan time for all
	rows + at most one seek per range. */
	time_for_scan = scan_time();

	/* estimate_rows_upper_bound() is the function that estimates the
	total number of records in the table */
	if ((total_rows = estimate_rows_upper_bound()) < rows) {
		return(time_for_scan);
	}

	return(ranges + (double) rows / (double) total_rows * time_for_scan);
}
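Reading the final formula with hypothetical numbers may help:

/* Hypothetical numbers, only to read the formula above:
   ranges = 1, rows = 10,000, total_rows = 100,000, scan_time() = 2,000 pages
   => read_time = 1 + 10,000 / 100,000 * 2,000 = 201 (in "disk seek" units) */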
/*********************************************************************//**
Gives an UPPER BOUND to the number of rows in a table. This is used in
filesort.cc.
@return upper bound of rows */
UNIV_INTERN
ha_rows
ha_innobase::estimate_rows_upper_bound(void)
/*======================================*/
{
	dict_index_t*	index;
	ulonglong	estimate;
	ulonglong	local_data_file_length;
	ulint		stat_n_leaf_pages;

	/* Get the first index of the table, which is the clustered index */
	index = dict_table_get_first_index(prebuilt->table);

	/* Number of leaf pages in the clustered index */
	stat_n_leaf_pages = index->stat_n_leaf_pages;

	/* Total size = number of leaf pages * 16K */
	local_data_file_length = ((ulonglong) stat_n_leaf_pages) * UNIV_PAGE_SIZE;

	/* Calculate a minimum length for a clustered index record and from
	that an upper bound for the number of rows. Since we only calculate
	new statistics in row0mysql.c when a table has grown by a threshold
	factor, we must add a safety factor 2 in front of the formula below. */

	/* 2 * leaf pages * 16K / minimum clustered-index record length
	gives the (upper-bound) number of clustered-index records */
	estimate = 2 * local_data_file_length /
		dict_index_calc_min_rec_len(index);

	DBUG_RETURN((ha_rows) estimate);
}
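Plugging hypothetical values into this function shows the scale of the estimate (all numbers below are made up):

/* Hypothetical values: stat_n_leaf_pages = 1,000, UNIV_PAGE_SIZE = 16,384,
   dict_index_calc_min_rec_len() = 100 bytes
   => estimate = 2 * 1,000 * 16,384 / 100 = 327,680 rows (an upper bound) */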
/*********************************************************************//**
Calculates the minimum record length in an index. */
UNIV_INTERN
ulint
dict_index_calc_min_rec_len(
/*========================*/
	const dict_index_t*	index)	/*!< in: index */
{
	ulint	sum	= 0;
	ulint	i;
	/* COMPACT row format; since the index may consist of several
	fields, iterate over them and add up the minimum byte counts */
	ulint	comp	= dict_table_is_comp(index->table);

	if (comp) {
		ulint nullable = 0;
		sum = REC_N_NEW_EXTRA_BYTES;
		for (i = 0; i < dict_index_get_n_fields(index); i++) {
			const dict_col_t*	col
				= dict_index_get_nth_col(index, i);
			ulint	size = dict_col_get_fixed_size(col, comp);
			sum += size;
			if (!size) {
				size = col->len;
				sum += size < 128 ? 1 : 2;
			}
			if (!(col->prtype & DATA_NOT_NULL)) {
				nullable++;
			}
		}

		/* round the NULL flags up to full bytes */
		sum += UT_BITS_IN_BYTES(nullable);

		return(sum);
	}
	/* (the REDUNDANT row-format branch is omitted in this excerpt) */
}
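As an illustration of the loop above, here is a hand-worked minimum record length for a hypothetical COMPACT-format clustered index over (id INT NOT NULL, created DATETIME NOT NULL, name VARCHAR(64) NULL) with a single-byte character set; the hidden DB_TRX_ID and DB_ROLL_PTR system columns are ignored for simplicity, and the type sizes assume this server version:

/* sum = REC_N_NEW_EXTRA_BYTES (5)
       + 4   id: INT, fixed size
       + 8   created: DATETIME, fixed size
       + 1   name: VARCHAR has fixed size 0, so one length byte (64 < 128 => 1)
       + 1   one nullable column => NULL flags rounded up to 1 byte
       = 19 bytes minimum per clustered-index record */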
The dict_index_t struct
/** InnoDB B-tree index */
typedef struct dict_index_struct dict_index_t;

/** Data structure for an index. Most fields will be
initialized to 0, NULL or FALSE in dict_mem_index_create(). */
struct dict_index_struct{
	index_id_t	id;	/*!< id of the index */
	mem_heap_t*	heap;	/*!< memory heap */
	const char*	name;	/*!< index name */
	const char*	table_name;/*!< table name */
	dict_table_t*	table;	/*!< back pointer to table */
#ifndef UNIV_HOTBACKUP
	unsigned	space:32;
				/*!< space where the index tree is placed */
	unsigned	page:32;/*!< index tree root page number */
#endif /* !UNIV_HOTBACKUP */
	unsigned	type:DICT_IT_BITS;
				/*!< index type (DICT_CLUSTERED, DICT_UNIQUE,
				DICT_UNIVERSAL, DICT_IBUF, DICT_CORRUPT) */
#define MAX_KEY_LENGTH_BITS 12
	unsigned	trx_id_offset:MAX_KEY_LENGTH_BITS;
				/*!< position of the trx id column
				in a clustered index record, if the fields
				before it are known to be of a fixed size,
				0 otherwise */
#if (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
# error (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
#endif
	unsigned	n_user_defined_cols:10;
				/*!< number of columns the user defined to
				be in the index: in the internal
				representation we add more columns */
	unsigned	n_uniq:10;/*!< number of fields from the beginning
				which are enough to determine an index
				entry uniquely */
	unsigned	n_def:10;/*!< number of fields defined so far */
	unsigned	n_fields:10;/*!< number of fields in the index */
	unsigned	n_nullable:10;/*!< number of nullable fields */
	unsigned	cached:1;/*!< TRUE if the index object is in the
				dictionary cache */
	unsigned	to_be_dropped:1;
				/*!< TRUE if this index is marked to be
				dropped in ha_innobase::prepare_drop_index(),
				otherwise FALSE. Protected by
				dict_sys->mutex, dict_operation_lock and
				index->lock.*/
	dict_field_t*	fields;	/*!< array of field descriptions */
#ifndef UNIV_HOTBACKUP
	UT_LIST_NODE_T(dict_index_t)
			indexes;/*!< list of indexes of the table */
	btr_search_t*	search_info; /*!< info used in optimistic searches */
	/*----------------------*/
	/** Statistics for query optimization */
	/* @{ */
	ib_int64_t*	stat_n_diff_key_vals;
				/*!< approximate number of different
				key values for this index, for each
				n-column prefix where n <=
				dict_get_n_unique(index); we
				periodically calculate new
				estimates */
	ib_int64_t*	stat_n_non_null_key_vals;
				/* approximate number of non-null key values
				for this index, for each column where
				n < dict_get_n_unique(index); This
				is used when innodb_stats_method is
				"nulls_ignored". */
	ulint		stat_index_size;
				/*!< approximate index size in
				database pages */
	ulint		stat_n_leaf_pages;
				/*!< approximate number of leaf pages in the
				index tree */
	/* @} */
	rw_lock_t	lock;	/*!< read-write lock protecting the
				upper levels of the index tree */
	trx_id_t	trx_id; /*!< id of the transaction that created this
				index, or 0 if the index existed
				when InnoDB was started up */
#endif /* !UNIV_HOTBACKUP */
#ifdef UNIV_BLOB_DEBUG
	mutex_t		blobs_mutex;
				/*!< mutex protecting blobs */
	void*		blobs;	/*!< map of (page_no,heap_no,field_no)
				to first_blob_page_no; protected by
				blobs_mutex; @see btr_blob_dbg_t */
#endif /* UNIV_BLOB_DEBUG */
#ifdef UNIV_DEBUG
	ulint		magic_n;/*!< magic number */
/** Value of dict_index_struct::magic_n */
# define DICT_INDEX_MAGIC_N	76789786
#endif
};