In a Nested Loops Join, for example, the first accessed table is called the outer table and the second one the inner table. In a Hash Join, the first accessed table is the build input and the second one the probe input.

The Stream Aggregate and Merge Join operators require their input data to be already sorted. To provide sorted data, the Query Optimizer may employ an existing index, or it may explicitly introduce a Sort operator.

Hashing is used by the Hash Aggregate and Hash Join operators, both of which work by building a hash table in memory. The Hash Join operator uses memory only for the smaller of its two inputs, which is determined by the Query Optimizer.

Queries using an aggregate function and no GROUP BY clause are called scalar aggregates, as they return a single value, and are always implemented by the Stream Aggregate operator.
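For example, the following query (against the AdventureWorks sample database, used here only for illustration) contains an aggregate function and no GROUP BY clause, so it is a scalar aggregate and returns exactly one row:

-- Scalar aggregate: one aggregate function, no GROUP BY, one row returned.
-- The execution plan typically shows a Stream Aggregate on top of the scan.
SELECT AVG(ListPrice)
FROM Production.Product;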

The purpose of the Stream Aggregate operator is to aggregate values based on groups; its algorithm relies on the fact that its input is already sorted by the GROUP BY clause, so records from the same group are next to each other.
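As an illustration, again assuming AdventureWorks, the following query groups on the leading column of the clustered index of Sales.SalesOrderDetail, so the rows already arrive sorted by the GROUP BY column and a Stream Aggregate can consume them directly, without an explicit Sort:

-- SalesOrderID is the leading key of the clustered index on Sales.SalesOrderDetail,
-- so the scan delivers rows sorted by the GROUP BY column and a Stream Aggregate can be used.
SELECT SalesOrderID, SUM(LineTotal) AS OrderTotal
FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID;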

The Query Optimizer can select a Hash Aggregate for large tables where the data is not sorted, there is no requirement to sort it, and the cardinality estimate predicts only a few groups.
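For instance, a query like the following (AdventureWorks again; the actual operator chosen depends on the optimizer's cardinality estimates) groups a table on a column with only a handful of distinct values and no index that delivers the rows in that order, which is the typical case for a Hash Aggregate:

-- TerritoryID has only a few distinct values and the rows do not arrive sorted by it,
-- so the optimizer may choose a Hash Aggregate instead of sorting the input for a Stream Aggregate.
SELECT TerritoryID, COUNT(*) AS Orders
FROM Sales.SalesOrderHeader
GROUP BY TerritoryID;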

The input shown at the top in a Nested Loops Join plan is known as the outer input and the one at the bottom is the inner input. The algorithm for the Nested Loops Join is very simple: the operator used to access the outer input is executed only once, and the operator used to access the inner input is executed once for every record that qualifies on the outer input.

The Query Optimizer is more likely to choose a Nested Loops Join when the outer input is small and the inner input has an index on the join key. This join type can be especially effective when the inner input is potentially large, because only the inner rows that match the join key need to be accessed through the index.
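A sketch of the kind of query that tends to get a Nested Loops Join, assuming AdventureWorks and a hypothetical customer ID: the filter keeps the outer input small, and the inner table has an index whose leading key is the join column, so it can be sought once per qualifying outer row:

-- The WHERE clause keeps the outer input (SalesOrderHeader) small.
-- SalesOrderDetail is clustered on SalesOrderID, so the inner side can be an Index Seek
-- that is executed once for every qualifying outer row.
SELECT h.SalesOrderID, d.ProductID, d.OrderQty
FROM Sales.SalesOrderHeader AS h
JOIN Sales.SalesOrderDetail AS d
    ON h.SalesOrderID = d.SalesOrderID
WHERE h.CustomerID = 11000;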

One difference between this and a Nested Loops Join is that, in a Merge Join, both input operators are executed only once. You can verify this by looking at the properties of both operators, where you'll find that the number of executions is 1. Another difference is that a Merge Join requires an equality operator in the join predicate and both of its inputs sorted on the join key. In this example, the join predicate has an equality operator.

Given the nature of the Merge Join, the Query Optimizer is more likely to choose this algorithm when faced with medium to large inputs where there is an equality operator on the join predicate and the inputs are sorted.
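As a sketch, assuming AdventureWorks: the following join is on an equality predicate, both tables are medium to large, and both are clustered on the join column, so both inputs already arrive sorted and a Merge Join is a likely choice:

-- Both tables are clustered on SalesOrderID (SalesOrderDetail on SalesOrderID, SalesOrderDetailID),
-- so both inputs are already sorted on the equality join key and no Sort operator is needed.
SELECT h.SalesOrderID, h.OrderDate, d.ProductID
FROM Sales.SalesOrderHeader AS h
JOIN Sales.SalesOrderDetail AS d
    ON h.SalesOrderID = d.SalesOrderID;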

In the same way as the Merge Join, the Hash Join requires an equality operator on the join predicate but, unlike the Merge Join, it does not require its inputs to be sorted. In addition, its operations on both inputs are executed only once, which you can verify by looking at the operator properties as shown before.

However, a Hash Join works by creating a hash table in memory. The Query Optimizer uses a cardinality estimate to detect the smaller of the two inputs, called the build input, and uses it to build a hash table in memory. If there is not enough memory to hold the hash table, SQL Server can use disk space, creating a workfile in tempdb. A Hash Join also blocks, but only while the build input is being hashed. After the build input is hashed, the second table, called the probe input, is read and compared against the hash table, and matching rows are returned. In the execution plan, the table at the top is used as the build input, and the table at the bottom as the probe input.

Finally, note that a behavior called "role reversal" may appear. If the Query Optimizer is not able to correctly estimate which of the two inputs is smaller, the build and probe roles may be reversed at execution time, and this will not be shown on the execution plan.

In summary, the Query Optimizer can choose a Hash Join for large inputs where there is an equality operator on the join predicate. Since both tables are scanned, the cost of a Hash Join is the sum of both inputs.
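As a sketch, again assuming AdventureWorks: the following query joins a small table to a much larger one on an equality predicate, and the larger input does not arrive sorted on the join key from its clustered index, so the optimizer may choose a Hash Join, hashing the smaller table as the build input:

-- Production.Product is the smaller input and would be expected to become the build input;
-- the larger Sales.SalesOrderDetail would then be the probe input.
SELECT p.Name, d.SalesOrderID, d.OrderQty
FROM Production.Product AS p
JOIN Sales.SalesOrderDetail AS d
    ON p.ProductID = d.ProductID;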

The SQL Server Query Optimizer is a cost-based optimizer, and therefore the quality of the execution plans it generates is directly related to the accuracy of its cost estimations.

Statistics contain three major pieces of information: the histogram, the density information, and the string statistics, all of which help with different parts of the cardinality estimation process.

A cardinality estimate is the estimated number of records that will be returned by filtering, JOIN predicates, or GROUP BY operations. Selectivity is a related concept: it can be described as the fraction (or percentage) of rows from an input that satisfy a predicate.
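For example, if a predicate is estimated to qualify 1,000 rows out of a 100,000-row input (hypothetical numbers), the cardinality estimate for that predicate is 1,000 rows and its selectivity is 1,000 / 100,000 = 0.01, or 1%; the closer the selectivity is to 0, the more selective the predicate.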

Statistics are created in several ways: automatically by the Query Optimizer (if the default option to automatically create statistics, AUTO_CREATE_STATISTICS, is on); when an index is created; or when they are explicitly created, for example, by using the CREATE STATISTICS statement. Statistics can be created on one or more columns, and both the index and explicit creation methods support single- and multi-column statistics. However, the statistics that are automatically generated by the Query Optimizer are always single-column statistics.
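For illustration, the explicit method looks like this (the statistics name and the AdventureWorks columns are only an example); creating an index on the same columns would produce an equivalent multi-column statistics object automatically:

-- Explicitly create a multi-column statistics object on Sales.SalesOrderHeader.
-- The first column, CustomerID, is the one for which the histogram will be built.
CREATE STATISTICS stat_CustomerID_TerritoryID
ON Sales.SalesOrderHeader (CustomerID, TerritoryID);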

Both histograms and string statistics are created only for the first column of a statistics object, the latter only if the column is of a string data type.

Density information is calculated for each set of columns forming a prefix in the statistics object.
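You can inspect these pieces with DBCC SHOW_STATISTICS. For the hypothetical statistics object created above, the density vector contains one row per column prefix, that is, one density value for (CustomerID) and one for (CustomerID, TerritoryID):

-- Show only the density vector; STAT_HEADER and HISTOGRAM are the other available sections.
DBCC SHOW_STATISTICS ('Sales.SalesOrderHeader', stat_CustomerID_TerritoryID)
WITH DENSITY_VECTOR;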

The Query Optimizer always uses a sample of the target table when it creates or updates statistics, and the minimum sample size is 8 MB, or the size of the table if it's smaller than 8 MB. The sample size will increase for bigger tables, but it may still only be a small percentage of the table.

String statistics contain the data distribution for string columns, and can help to estimate the cardinality of queries with LIKE conditions.

Density information can be used to improve the Query Optimizer's estimates for GROUP BY operations.

GROUP BY queries can benefit from the estimated number of distinct values, and this information is already available in the density value.
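Density is calculated as 1 / (number of distinct values), so the estimated number of groups produced by a GROUP BY on a column is simply 1 / density; for example, a density of 0.01 on the grouping column implies an estimate of about 100 distinct groups.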

In SQL Server, histograms are created only for the first column of a statistics object, and they compress the information about the distribution of values in that column by partitioning it into subsets called buckets or steps. The maximum number of steps in a histogram is 200, but even if the input has 200 or more unique values, the histogram may still have fewer than 200 steps.
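The histogram steps of a statistics object can be listed with the HISTOGRAM option of DBCC SHOW_STATISTICS (shown here for the hypothetical statistics object created earlier); each returned row is one step, described by RANGE_HI_KEY, RANGE_ROWS, EQ_ROWS, DISTINCT_RANGE_ROWS, and AVG_RANGE_ROWS:

-- List the histogram (at most 200 steps), built on the first column of the statistics object.
DBCC SHOW_STATISTICS ('Sales.SalesOrderHeader', stat_CustomerID_TerritoryID)
WITH HISTOGRAM;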

The purpose of the Query Optimizer, as we're all aware, is to provide an optimum execution plan and, in order to do so, it generates possible alternative execution plans through the use of transformation rules. These alternative plans are stored for the duration of the optimization process in a structure called the memo.
