How the MySQL optimizer estimates the cost of a full table scan

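In InnoDB, the leaf nodes of the clustered index store the index key values together with the rest of the row, so the leaf pages of the clustered index are the data pages themselves.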

So how is the cost of a full table scan computed?

We know that the cost is the CPU cost plus the I/O cost.

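The CPU cost is one unit of cost for every 5 records compared. (Strictly speaking these are index records rather than data records, but for the clustered index their number equals the number of data records.)

How, then, is the total number of records in the table (that is, the total number of index records) obtained?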

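The number of leaf pages can be read from the index. A page is 16 KB by default, so the total size is leaf_page_num * 16K.

Next, compute the length of one record in this index. The index may be made up of several fields, so the code loops over them and sums their sizes; call the result m.

total_records = leaf_page_num * 16K / m is then the number of index records. One clustered index record corresponds to one data record, so this is also the total number of rows. (The code additionally multiplies by a safety factor of 2, so it actually computes 2 * leaf_page_num * 16K / m.)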

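One caveat: the leaf pages are data pages, so m ought to be the length of a full record. At first glance the code appears to use only the primary key length, but dict_index_calc_min_rec_len loops over dict_index_get_n_fields(index), and a clustered index record contains every column of the row, not just the key columns. What it returns is a minimum length, with a variable-length column counted only by its length byte(s), so total_records is an upper bound rather than an exact row count.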

The CPU cost is therefore total_records / 5 + 1.

The I/O cost is (double) (prebuilt->table->stat_clustered_index_size), that is, the approximate number of pages in the clustered index.
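The two formulas above can be put together in a small standalone sketch. This is not InnoDB or optimizer code: the function names and the RECORDS_PER_COST_UNIT constant are hypothetical, and the only things taken from the text are the "one unit per 5 records, plus 1" rule and the "one unit per clustered index page" rule.

```c
#include <assert.h>

/* Hypothetical constant: one unit of CPU cost per 5 compared records. */
#define RECORDS_PER_COST_UNIT 5.0

/* CPU part of the full-scan cost: total_records / 5 + 1. */
double full_scan_cpu_cost(double total_records)
{
    assert(total_records >= 0.0);
    return total_records / RECORDS_PER_COST_UNIT + 1.0;
}

/* I/O part: one unit of cost per page of the clustered index. */
double full_scan_io_cost(unsigned long clustered_index_pages)
{
    return (double) clustered_index_pages;
}

/* Total cost of a full table scan = CPU cost + I/O cost. */
double full_scan_cost(double total_records,
                      unsigned long clustered_index_pages)
{
    return full_scan_cpu_cost(total_records)
           + full_scan_io_cost(clustered_index_pages);
}
```

For example, with 1000 estimated records and 100 clustered index pages, this model gives a CPU cost of 201 and a total cost of 301.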

/******************************************************************//**
Calculate the time it takes to read a set of ranges through an index
This enables us to optimise reads for clustered indexes.
@return    estimated time measured in disk seeks */
UNIV_INTERN
double
ha_innobase::read_time(
/*===================*/
    uint    index,    /*!< in: key number */
    uint    ranges,    /*!< in: how many ranges */
    ha_rows rows)    /*!< in: estimated number of rows in the ranges */
{
    ha_rows total_rows;
    double    time_for_scan;
 
    if (index != table->s->primary_key) {
        /* Not clustered */
        return(handler::read_time(index, ranges, rows));
    }
 
    if (rows <= 2) {
 
        return((double) rows);
    }
 
    /* Assume that the read time is proportional to the scan time for all
    rows + at most one seek per range. */
 
    time_for_scan = scan_time();
 
    // estimate_rows_upper_bound() is the function that computes the total number of records in the table
    if ((total_rows = estimate_rows_upper_bound()) < rows) {
 
        return(time_for_scan);
    }
 
    return(ranges + (double) rows / (double) total_rows * time_for_scan);
}
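The logic of read_time for the clustered index can be modelled as a plain function, which makes the three cases easier to see. This is only a sketch: clustered_read_time is a hypothetical name, the scan time and total row count are passed in as parameters instead of being read from the handler, and the small-row threshold of 2 is my recollection of the InnoDB source rather than something the excerpt itself states.

```c
#include <assert.h>

/* Sketch of the clustered-index branch of read_time().
   ranges:        number of ranges to read
   rows:          estimated rows in those ranges
   total_rows:    upper bound on the rows in the table
   time_for_scan: cost of scanning the whole table */
double clustered_read_time(unsigned ranges,
                           unsigned long long rows,
                           unsigned long long total_rows,
                           double time_for_scan)
{
    assert(time_for_scan >= 0.0);

    /* Very few rows: the cost is simply the row count
       (threshold of 2 is an assumption, see above). */
    if (rows <= 2) {
        return (double) rows;
    }

    /* The range estimate exceeds the table estimate:
       cap the cost at a full scan. */
    if (total_rows < rows) {
        return time_for_scan;
    }

    /* Otherwise: one seek per range plus the share of the
       full scan that the requested rows represent. */
    return ranges + (double) rows / (double) total_rows * time_for_scan;
}
```

With one range, 250 rows out of 1000 and a scan time of 40, the result is 1 + 0.25 * 40 = 11.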
 
/*********************************************************************//**
Gives an UPPER BOUND to the number of rows in a table. This is used in
filesort.cc.
@return    upper bound of rows */
UNIV_INTERN
ha_rows
ha_innobase::estimate_rows_upper_bound(void)
/*======================================*/
{
    dict_index_t*    index;
    ulonglong    estimate;
    ulonglong    local_data_file_length;
    ulint        stat_n_leaf_pages;
 
    // Get the table's first index, which is the clustered index
    index = dict_table_get_first_index(prebuilt->table);
 
    // Number of leaf pages in the clustered index
    stat_n_leaf_pages = index->stat_n_leaf_pages;
 
    // Total size = number of leaf pages * 16K
    local_data_file_length = ((ulonglong) stat_n_leaf_pages) * UNIV_PAGE_SIZE;
 
    /* Calculate a minimum length for a clustered index record and from
    that an upper bound for the number of rows. Since we only calculate
    new statistics in row0mysql.c when a table has grown by a threshold
    factor, we must add a safety factor 2 in front of the formula below. */
 
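    // Compute the minimum record length of the clustered index, then
    // 2 * leaf page count * 16K / minimum record length gives an upper bound on the number of clustered index records
    estimate = 2 * local_data_file_length /
                     dict_index_calc_min_rec_len(index);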
 
    DBUG_RETURN((ha_rows) estimate);
}
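As a worked illustration of the estimate, here is a minimal standalone version of the same arithmetic. The names rows_upper_bound and PAGE_SIZE_BYTES are hypothetical; the page size is assumed to be the default 16 KB, and the factor 2 is the safety factor mentioned in the source comment.

```c
#include <assert.h>

/* Assumed default InnoDB page size: 16 KB. */
#define PAGE_SIZE_BYTES 16384ULL

/* Upper bound on the row count:
   2 * (leaf pages * page size) / minimum record length. */
unsigned long long rows_upper_bound(unsigned long long leaf_pages,
                                    unsigned long long min_rec_len)
{
    unsigned long long data_len;

    assert(min_rec_len > 0);  /* avoid division by zero */

    data_len = leaf_pages * PAGE_SIZE_BYTES;
    return 2 * data_len / min_rec_len;
}
```

For 100 leaf pages and a minimum record length of 32 bytes this yields 2 * 1638400 / 32 = 102400. Because the divisor is a minimum length and the result is doubled, the real row count is normally well below this figure.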
 
/*********************************************************************//**
Calculates the minimum record length in an index. */
UNIV_INTERN
ulint
dict_index_calc_min_rec_len(
/*========================*/
    const dict_index_t*    index)    /*!< in: index */
{
    ulint    sum    = 0;
    ulint    i;
 
    // comp is nonzero for the compact row format. The index may consist of several fields, so loop over them to sum up the total number of bytes
    ulint    comp    = dict_table_is_comp(index->table);
 
    if (comp) {
        ulint nullable = 0;
        sum = REC_N_NEW_EXTRA_BYTES;
        for (i = 0; i < dict_index_get_n_fields(index); i++) {
            const dict_col_t*    col
                = dict_index_get_nth_col(index, i);
            ulint    size = dict_col_get_fixed_size(col, comp);
            sum += size;
            if (!size) {
                size = col->len;
                sum += size < 128 ? 1 : 2;
            }
            if (!(col->prtype & DATA_NOT_NULL)) {
                nullable++;
            }
        }
 
        /* round the NULL flags up to full bytes */
        sum += UT_BITS_IN_BYTES(nullable);
 
        return(sum);
    }
 
    /* (The branch for the old, non-compact row format is omitted in this excerpt.) */
}
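To see what the compact-format branch adds up, the loop can be reproduced over a simplified column description. Everything here is illustrative: struct col_desc and min_rec_len_compact are hypothetical, EXTRA_BYTES_COMPACT stands in for REC_N_NEW_EXTRA_BYTES and is assumed to be 5, and the 128-byte threshold for one versus two length bytes is likewise an assumption based on my reading of the InnoDB source.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for REC_N_NEW_EXTRA_BYTES (compact record header). */
#define EXTRA_BYTES_COMPACT 5

/* Hypothetical, simplified column descriptor. */
struct col_desc {
    unsigned fixed_size; /* size in bytes if fixed-length, 0 if variable */
    unsigned max_len;    /* declared maximum length in bytes */
    int      not_null;   /* nonzero if the column is NOT NULL */
};

/* Minimum record length for the compact row format. */
unsigned long min_rec_len_compact(const struct col_desc *cols, size_t n)
{
    unsigned long sum = EXTRA_BYTES_COMPACT;
    unsigned long nullable = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Fixed-length columns contribute their full size. */
        sum += cols[i].fixed_size;

        if (cols[i].fixed_size == 0) {
            /* Variable-length columns contribute only their length
               byte(s); the data itself may be empty. */
            sum += cols[i].max_len < 128 ? 1 : 2;
        }

        if (!cols[i].not_null) {
            nullable++;
        }
    }

    /* One bit per nullable column, rounded up to whole bytes. */
    sum += (nullable + 7) / 8;

    return sum;
}
```

For a table with a NOT NULL 4-byte integer, a nullable 255-byte variable-length column and a nullable 20-byte variable-length column, the result is 5 + 4 + 2 + 1 + 1 = 13 bytes, far less than a typical stored row, which is why the derived row count is only an upper bound.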

The dict_index_t structure

/** InnoDB B-tree index */
typedef struct dict_index_struct dict_index_t;
 
/** Data structure for an index.  Most fields will be
initialized to 0, NULL or FALSE in dict_mem_index_create(). */
struct dict_index_struct{
    index_id_t    id;    /*!< id of the index */
    mem_heap_t*    heap;    /*!< memory heap */
    const char*    name;    /*!< index name */
    const char*    table_name;/*!< table name */
    dict_table_t*    table;    /*!< back pointer to table */
#ifndef UNIV_HOTBACKUP
    unsigned    space:32;
                /*!< space where the index tree is placed */
    unsigned    page:32;/*!< index tree root page number */
#endif /* !UNIV_HOTBACKUP */
    unsigned    type:DICT_IT_BITS;
                /*!< index type (DICT_CLUSTERED, DICT_UNIQUE,
                DICT_UNIVERSAL, DICT_IBUF, DICT_CORRUPT) */
#define MAX_KEY_LENGTH_BITS 12
    unsigned    trx_id_offset:MAX_KEY_LENGTH_BITS;
                /*!< position of the trx id column
                in a clustered index record, if the fields
                before it are known to be of a fixed size,
                0 otherwise */
#if (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
# error (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
#endif
    unsigned    n_user_defined_cols:10;
                /*!< number of columns the user defined to
                be in the index: in the internal
                representation we add more columns */
    unsigned    n_uniq:10;/*!< number of fields from the beginning
                which are enough to determine an index
                entry uniquely */
    unsigned    n_def:10;/*!< number of fields defined so far */
    unsigned    n_fields:10;/*!< number of fields in the index */
    unsigned    n_nullable:10;/*!< number of nullable fields */
    unsigned    cached:1;/*!< TRUE if the index object is in the
                dictionary cache */
    unsigned    to_be_dropped:1;
                /*!< TRUE if this index is marked to be
                dropped in ha_innobase::prepare_drop_index(),
                otherwise FALSE. Protected by
                dict_sys->mutex, dict_operation_lock and
                index->lock.*/
    dict_field_t*    fields;    /*!< array of field descriptions */
#ifndef UNIV_HOTBACKUP
    UT_LIST_NODE_T(dict_index_t)
            indexes;/*!< list of indexes of the table */
    btr_search_t*    search_info; /*!< info used in optimistic searches */
    /*----------------------*/
    /** Statistics for query optimization */
    /* @{ */
    ib_int64_t*    stat_n_diff_key_vals;
                /*!< approximate number of different
                key values for this index, for each
                n-column prefix where n <=
                dict_get_n_unique(index); we
                periodically calculate new
                estimates */
    ib_int64_t*    stat_n_non_null_key_vals;
                /* approximate number of non-null key values
                for this index, for each column where
                n < dict_get_n_unique(index); This
                is used when innodb_stats_method is
                "nulls_ignored". */
    ulint        stat_index_size;
                /*!< approximate index size in
                database pages */
    ulint        stat_n_leaf_pages;
                /*!< approximate number of leaf pages in the
                index tree */
    /* @} */
    rw_lock_t    lock;    /*!< read-write lock protecting the
                upper levels of the index tree */
    trx_id_t    trx_id; /*!< id of the transaction that created this
                index, or 0 if the index existed
                when InnoDB was started up */
#endif /* !UNIV_HOTBACKUP */
#ifdef UNIV_BLOB_DEBUG
    mutex_t        blobs_mutex;
                /*!< mutex protecting blobs */
    void*        blobs;    /*!< map of (page_no,heap_no,field_no)
                to first_blob_page_no; protected by
                blobs_mutex; @see btr_blob_dbg_t */
#endif /* UNIV_BLOB_DEBUG */
#ifdef UNIV_DEBUG
    ulint        magic_n;/*!< magic number */
/** Value of dict_index_struct::magic_n */
# define DICT_INDEX_MAGIC_N    76789786
#endif
};
