In InnoDB, the leaf nodes of the clustered index store the index key values together with the complete row data (a secondary index leaf stores the key plus the primary key value instead).

So how is the cost of a full table scan computed?

We know that cost = CPU cost + I/O cost.

The CPU cost charges one cost unit for every 5 records compared. (The records here are index records, not separately stored data records; since each clustered index record corresponds to one row, their count equals the number of rows.)

So how is the total number of records in the table (i.e. the total number of clustered index records) obtained?

Concretely: the index exposes its number of leaf pages, and a page is 16 KB by default, so the total leaf capacity is leaf_page_num * 16K.

Next, the length of an index record is computed. Because the index may consist of several fields, the code iterates over them; call the result m.

total_records = leaf_page_num * 16K / m then gives the number of index records. Since one clustered index record corresponds to one data record, this is the total row count.

There is still a subtlety here: the leaf pages hold the data records, while m is computed by dict_index_calc_min_rec_len() as the *minimum* record length of the clustered index (for a clustered index the field list covers all columns of the row, not just the primary key). Because variable-length columns are counted at their smallest possible size, the total_records computed above is not the exact record count but an upper bound.

The CPU cost is then total_records / 5 + 1.

The I/O cost is (double) prebuilt->table->stat_clustered_index_size, i.e. the size of the clustered index in pages.

    /******************************************************************//**
    Calculate the time it takes to read a set of ranges through an index
    This enables us to optimise reads for clustered indexes.
    @return estimated time measured in disk seeks */
    UNIV_INTERN
    double
    ha_innobase::read_time(
    /*===================*/
        uint    index,  /*!< in: key number */
        uint    ranges, /*!< in: how many ranges */
        ha_rows rows)   /*!< in: estimated number of rows in the ranges */
    {
        ha_rows total_rows;
        double  time_for_scan;

        if (index != table->s->primary_key) {
            /* Not clustered */
            return(handler::read_time(index, ranges, rows));
        }

        if (rows <= 2) {

            return((double) rows);
        }

        /* Assume that the read time is proportional to the scan time for all
        rows + at most one seek per range. */

        time_for_scan = scan_time();

        /* estimate_rows_upper_bound() is the function that estimates
        the total number of records in the table */
        if ((total_rows = estimate_rows_upper_bound()) < rows) {

            return(time_for_scan);
        }

        return(ranges + (double) rows / (double) total_rows * time_for_scan);
    }

    /*********************************************************************//**
    Gives an UPPER BOUND to the number of rows in a table. This is used in
    filesort.cc.
    @return upper bound of rows */
    UNIV_INTERN
    ha_rows
    ha_innobase::estimate_rows_upper_bound(void)
    /*======================================*/
    {
        dict_index_t*   index;
        ulonglong       estimate;
        ulonglong       local_data_file_length;
        ulint           stat_n_leaf_pages;

        DBUG_ENTER("estimate_rows_upper_bound");

        /* Get the table's first index, which is the clustered index */
        index = dict_table_get_first_index(prebuilt->table);

        /* Number of leaf pages in the clustered index */
        stat_n_leaf_pages = index->stat_n_leaf_pages;

        /* Total size = number of leaf pages * 16K */
        local_data_file_length = ((ulonglong) stat_n_leaf_pages) * UNIV_PAGE_SIZE;

        /* Calculate a minimum length for a clustered index record and from
        that an upper bound for the number of rows. Since we only calculate
        new statistics in row0mysql.c when a table has grown by a threshold
        factor, we must add a safety factor 2 in front of the formula below. */

        /* 2 * total leaf pages * 16K / minimum clustered index record
        length = estimated number of clustered index records */
        estimate = 2 * local_data_file_length /
            dict_index_calc_min_rec_len(index);

        DBUG_RETURN((ha_rows) estimate);
    }

    /*********************************************************************//**
    Calculates the minimum record length in an index. */
    UNIV_INTERN
    ulint
    dict_index_calc_min_rec_len(
    /*========================*/
        const dict_index_t* index)  /*!< in: index */
    {
        ulint   sum = 0;
        ulint   i;

        /* TRUE if the table uses the COMPACT row format. The index may
        consist of several fields, so iterate over them and sum up the
        minimum number of bytes each contributes. */
        ulint   comp = dict_table_is_comp(index->table);

        if (comp) {
            ulint nullable = 0;
            sum = REC_N_NEW_EXTRA_BYTES;
            for (i = 0; i < dict_index_get_n_fields(index); i++) {
                const dict_col_t*   col
                    = dict_index_get_nth_col(index, i);
                ulint   size = dict_col_get_fixed_size(col, comp);
                sum += size;
                if (!size) {
                    size = col->len;
                    sum += size < 128 ? 1 : 2;
                }
                if (!(col->prtype & DATA_NOT_NULL)) {
                    nullable++;
                }
            }

            /* round the NULL flags up to full bytes */
            sum += UT_BITS_IN_BYTES(nullable);

            return(sum);
        }

        /* ROW_FORMAT=REDUNDANT */
        for (i = 0; i < dict_index_get_n_fields(index); i++) {
            sum += dict_col_get_fixed_size(
                dict_index_get_nth_col(index, i), comp);
        }

        if (sum > 127) {
            sum += 2 * dict_index_get_n_fields(index);
        } else {
            sum += dict_index_get_n_fields(index);
        }

        sum += REC_N_OLD_EXTRA_BYTES;

        return(sum);
    }

The dict_index_t structure:

    /** InnoDB B-tree index */
    typedef struct dict_index_struct dict_index_t;

    /** Data structure for an index. Most fields will be
    initialized to 0, NULL or FALSE in dict_mem_index_create(). */
    struct dict_index_struct{
        index_id_t  id;     /*!< id of the index */
        mem_heap_t* heap;   /*!< memory heap */
        const char* name;   /*!< index name */
        const char* table_name; /*!< table name */
        dict_table_t*   table;  /*!< back pointer to table */
    #ifndef UNIV_HOTBACKUP
        unsigned    space:32;
                    /*!< space where the index tree is placed */
        unsigned    page:32;/*!< index tree root page number */
    #endif /* !UNIV_HOTBACKUP */
        unsigned    type:DICT_IT_BITS;
                    /*!< index type (DICT_CLUSTERED, DICT_UNIQUE,
                    DICT_UNIVERSAL, DICT_IBUF, DICT_CORRUPT) */
    #define MAX_KEY_LENGTH_BITS 12
        unsigned    trx_id_offset:MAX_KEY_LENGTH_BITS;
                    /*!< position of the trx id column
                    in a clustered index record, if the fields
                    before it are known to be of a fixed size,
                    0 otherwise */
    #if (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
    # error (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
    #endif
        unsigned    n_user_defined_cols:10;
                    /*!< number of columns the user defined to
                    be in the index: in the internal
                    representation we add more columns */
        unsigned    n_uniq:10;/*!< number of fields from the beginning
                    which are enough to determine an index
                    entry uniquely */
        unsigned    n_def:10;/*!< number of fields defined so far */
        unsigned    n_fields:10;/*!< number of fields in the index */
        unsigned    n_nullable:10;/*!< number of nullable fields */
        unsigned    cached:1;/*!< TRUE if the index object is in the
                    dictionary cache */
        unsigned    to_be_dropped:1;
                    /*!< TRUE if this index is marked to be
                    dropped in ha_innobase::prepare_drop_index(),
                    otherwise FALSE. Protected by
                    dict_sys->mutex, dict_operation_lock and
                    index->lock.*/
        dict_field_t*   fields; /*!< array of field descriptions */
    #ifndef UNIV_HOTBACKUP
        UT_LIST_NODE_T(dict_index_t)
                    indexes;/*!< list of indexes of the table */
        btr_search_t*   search_info; /*!< info used in optimistic searches */
        /*----------------------*/
        /** Statistics for query optimization */
        /* @{ */
        ib_int64_t* stat_n_diff_key_vals;
                    /*!< approximate number of different
                    key values for this index, for each
                    n-column prefix where n <=
                    dict_get_n_unique(index); we
                    periodically calculate new
                    estimates */
        ib_int64_t* stat_n_non_null_key_vals;
                    /* approximate number of non-null key values
                    for this index, for each column where
                    n < dict_get_n_unique(index); This
                    is used when innodb_stats_method is
                    "nulls_ignored". */
        ulint       stat_index_size;
                    /*!< approximate index size in
                    database pages */
        ulint       stat_n_leaf_pages;
                    /*!< approximate number of leaf pages in the
                    index tree */
        /* @} */
        rw_lock_t   lock;   /*!< read-write lock protecting the
                    upper levels of the index tree */
        trx_id_t    trx_id; /*!< id of the transaction that created this
                    index, or 0 if the index existed
                    when InnoDB was started up */
    #endif /* !UNIV_HOTBACKUP */
    #ifdef UNIV_BLOB_DEBUG
        mutex_t     blobs_mutex;
                    /*!< mutex protecting blobs */
        void*       blobs;  /*!< map of (page_no,heap_no,field_no)
                    to first_blob_page_no; protected by
                    blobs_mutex; @see btr_blob_dbg_t */
    #endif /* UNIV_BLOB_DEBUG */
    #ifdef UNIV_DEBUG
        ulint       magic_n;/*!< magic number */
        /** Value of dict_index_struct::magic_n */
    # define DICT_INDEX_MAGIC_N 76789786
    #endif
    };
