Common Scenarios to Avoid with Data Warehousing
Database Design
| Rule | Description | Value | Source | Problem Description |
| --- | --- | --- | --- | --- |
| 1 | Excessive sorting and RID lookup operations should be reduced with covering indexes. | | sys.dm_exec_sql_text, sys.dm_exec_cached_plans | A large data warehouse can benefit from more indexes. Indexes can be used to cover queries and avoid sorting. The cost of index overhead is paid only when data is loaded. |
| 2 | Excessive fragmentation: avg_fragmentation_in_percent should be <25% | >25% | sys.dm_db_index_physical_stats | Reducing index fragmentation through index rebuilds can benefit big range scans, which are common in data warehouse and reporting scenarios. |
| 3 | Scans and ranges are common. Look for missing indexes. | >= 1 | Perfmon object SQL Server Access Methods; sys.dm_db_missing_index_group_stats, sys.dm_db_missing_index_groups, sys.dm_db_missing_index_details | A missing index forces large scans, which flush the cache. |
| 4 | Unused indexes should be avoided. | | If an index is never used, it does not appear in the DMV sys.dm_db_index_usage_stats. | Index maintenance for unused indexes should be avoided. |
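Rules 2 and 4 can be sketched as two small checks. This is only an illustration: the `IndexStat` row shape and both function names are hypothetical, and the rows are assumed to have been fetched already from sys.dm_db_index_physical_stats and sys.dm_db_index_usage_stats by whatever client you use.

```python
from typing import Iterable, List, NamedTuple

# Threshold taken from rule 2: more than 25% fragmentation is a problem.
FRAGMENTATION_THRESHOLD_PCT = 25.0


class IndexStat(NamedTuple):
    # Hypothetical row shape. The second field mirrors the
    # avg_fragmentation_in_percent column of sys.dm_db_index_physical_stats.
    index_name: str
    avg_fragmentation_in_percent: float


def fragmented_indexes(stats: Iterable[IndexStat],
                       threshold: float = FRAGMENTATION_THRESHOLD_PCT) -> List[str]:
    """Return the names of indexes whose fragmentation exceeds the threshold."""
    return [s.index_name for s in stats
            if s.avg_fragmentation_in_percent > threshold]


def unused_indexes(all_indexes: Iterable[str],
                   used_indexes: Iterable[str]) -> List[str]:
    """Return indexes that never show up in the usage statistics.

    all_indexes is assumed to come from the catalog, used_indexes from
    sys.dm_db_index_usage_stats. An index absent from the latter has not
    been used since the statistics were last reset.
    """
    used = set(used_indexes)
    return [name for name in all_indexes if name not in used]
```

Note that usage statistics are reset when the instance restarts, so an index should only be treated as unused after the server has run through a representative workload.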
Resource issue: CPU
| Rule | Description | Value | Source | Problem Description |
| --- | --- | --- | --- | --- |
| 1 | Signal waits | >25% | sys.dm_os_wait_stats | Time in the runnable queue is pure CPU wait. |
| 2 | Avoid plan reuse | >25% | Perfmon object SQL Server SQL Statistics | A data warehouse has fewer transactions than OLTP, each with significantly bigger IO. Therefore, having the correct plan is more important than reusing a plan. Unlike OLTP, data warehouse queries are not identical. |
| 3 | Parallelism: CXPACKET waits | <10% | sys.dm_os_wait_stats | Parallelism is desirable in data warehouse or reporting workloads. |
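The two wait-based CPU rules reduce to simple ratios over sys.dm_os_wait_stats. The sketch below assumes each row is a dict keyed by the DMV's wait_type, wait_time_ms and signal_wait_time_ms columns; the function names are made up for illustration.

```python
from typing import Dict, List

WaitRow = Dict[str, object]  # assumed keys: wait_type, wait_time_ms, signal_wait_time_ms


def signal_wait_pct(rows: List[WaitRow]) -> float:
    """Share of total wait time spent in the runnable queue, in percent."""
    total = sum(r["wait_time_ms"] for r in rows)
    signal = sum(r["signal_wait_time_ms"] for r in rows)
    if total == 0:
        return 0.0
    return 100.0 * signal / total


def wait_type_pct(rows: List[WaitRow], wait_type: str) -> float:
    """Share of total wait time attributed to one wait type, in percent."""
    total = sum(r["wait_time_ms"] for r in rows)
    if total == 0:
        return 0.0
    matched = sum(r["wait_time_ms"] for r in rows if r["wait_type"] == wait_type)
    return 100.0 * matched / total
```

With these helpers, rule 1 fires when `signal_wait_pct(rows)` exceeds 25, and rule 3 flags a workload where `wait_type_pct(rows, "CXPACKET")` stays below 10. In practice the idle and background wait types would probably be filtered out first, which this sketch does not do.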
Resource issue: Memory
| Rule | Description | Value | Source | Problem Description |
| --- | --- | --- | --- | --- |
| 1 | Memory grants pending | >1 | Perfmon object SQL Server Memory Manager | No memory grant is available for the query to run. Check for sufficient memory and page life expectancy. |
| 2 | Page life expectancy | Drops by 50% | Perfmon object SQL Server Buffer Manager | Page life expectancy is the average number of seconds a data page stays in cache. Low values could indicate a cache flush caused by a big read. Look for a possible missing index. |
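The "drops by 50%" rule does not say what the drop is measured against. A minimal sketch, assuming the baseline is simply the previous sample of the Page life expectancy counter (the function name and the sampling scheme are illustrative, not from the source):

```python
from typing import List, Sequence


def ple_drops(samples: Sequence[float], drop_fraction: float = 0.5) -> List[int]:
    """Return the indexes of samples that fell by at least drop_fraction
    relative to the sample immediately before them.

    samples is assumed to be a series of Page life expectancy readings
    (in seconds) taken at a regular interval.
    """
    drops = []
    for i in range(1, len(samples)):
        previous, current = samples[i - 1], samples[i]
        if previous > 0 and current <= previous * (1 - drop_fraction):
            drops.append(i)
    return drops
```

A rolling average would be a more forgiving baseline than the previous sample; the comparison itself stays the same.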
Resource issue: IO
| Rule | Description | Value | Source | Problem Description |
| --- | --- | --- | --- | --- |
| 1 | Avg. Disk sec/Read | >20 ms | Perfmon object Physical Disk | Reads should take 4-8 ms without any IO pressure. |
| 2 | Avg. Disk sec/Write | >20 ms | Perfmon object Physical Disk | Sequential writes can be as fast as 1 ms for the transaction log. |
| 3 | Big scans | >1 | Perfmon object SQL Server Access Methods | A missing index forces large scans, which flush the cache. |
| 4 | Top 2 wait stats values are any of the following: ASYNC_IO_COMPLETION, IO_COMPLETION, LOGMGR, WRITELOG, PAGEIOLATCH_x | Top 2 | sys.dm_os_wait_stats | If the top 2 wait_stats values include IO waits, there is an IO bottleneck. |
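The top-2 rule can be sketched as a ranking over sys.dm_os_wait_stats rows. The row shape (dicts keyed by wait_type and wait_time_ms) and the function names are assumptions made for this example.

```python
from typing import Dict, List

# IO-related wait types from rule 4. ASYNC_IO_COMPLETION is spelled the way
# SQL Server reports it; PAGEIOLATCH_x is handled by prefix below.
IO_WAIT_TYPES = {"ASYNC_IO_COMPLETION", "IO_COMPLETION", "LOGMGR", "WRITELOG"}


def is_io_wait(wait_type: str) -> bool:
    """True if the wait type is one of the IO waits named in the rule."""
    return wait_type in IO_WAIT_TYPES or wait_type.startswith("PAGEIOLATCH_")


def top_waits(rows: List[Dict[str, object]], n: int = 2) -> List[str]:
    """Return the n wait types with the highest accumulated wait_time_ms."""
    ordered = sorted(rows, key=lambda r: r["wait_time_ms"], reverse=True)
    return [r["wait_type"] for r in ordered[:n]]


def has_io_bottleneck(rows: List[Dict[str, object]]) -> bool:
    """True if either of the top 2 waits is an IO wait."""
    return any(is_io_wait(w) for w in top_waits(rows, 2))
```

The ranking is only meaningful if benign background waits have been excluded from the rows beforehand, which is left to the caller here.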
Resource issue: Blocking
| Rule | Description | Value | Source | Problem Description |
| --- | --- | --- | --- | --- |
| 1 | Block percentage | >2% | sys.dm_db_index_operational_stats | Frequency of blocks. |
| 2 | Blocked process report | 30 sec | sp_configure, Profiler | Report of blocked statements. |
| 3 | Average row lock waits | >100 ms | sys.dm_db_index_operational_stats | Duration of blocks. |
| 4 | Top 2 wait stats values are any of the following: LCK_M_BU, LCK_M_IS, LCK_M_IU, LCK_M_IX, LCK_M_RIn_NL, LCK_M_RIn_S, LCK_M_RIn_U, LCK_M_RIn_X, LCK_M_RS_S, LCK_M_RS_U, LCK_M_RX_S, LCK_M_RX_U, LCK_M_RX_X, LCK_M_S, LCK_M_SCH_M, LCK_M_SCH_S, LCK_M_SIU, LCK_M_SIX, LCK_M_U, LCK_M_UIX, LCK_M_X | Top 2 | sys.dm_os_wait_stats | If the top 2 wait_stats values include lock waits, there is a blocking bottleneck. Consider using row versioning to minimize shared locking blocks. |
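Rules 1 and 3 can be derived from three columns of sys.dm_db_index_operational_stats: row_lock_count, row_lock_wait_count and row_lock_wait_in_ms. The source does not spell out the formulas, so the definitions below are one plausible reading, and the function names are illustrative.

```python
def block_pct(row_lock_count: int, row_lock_wait_count: int) -> float:
    """Percentage of row lock requests that had to wait (assumed definition
    of "block percentage")."""
    if row_lock_count == 0:
        return 0.0
    return 100.0 * row_lock_wait_count / row_lock_count


def avg_row_lock_wait_ms(row_lock_wait_count: int, row_lock_wait_in_ms: int) -> float:
    """Average duration of a row lock wait, in milliseconds."""
    if row_lock_wait_count == 0:
        return 0.0
    return row_lock_wait_in_ms / row_lock_wait_count
```

For example, an index with 1000 row lock requests, 30 of which waited for a total of 4500 ms, has a block percentage of 3% and an average wait of 150 ms, so it breaks both thresholds.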
In contrast to OLTP applications, reporting and relational data warehouse applications are characterized by a small number of large, mostly distinct transactions. These are frequently SELECT-intensive operations. The implications are significant for database design, resource usage, and system performance.
Reporting and data warehouse performance objectives are as follows:
- Reporting and relational data warehouse designs can have more indexes, because the cost of index maintenance is paid only once, during the batch update process.
- Plan reuse should generally be avoided. Plan reuse may pick up a plan that was good for some other query (with a different data distribution) but is not good for this query. The time taken to generate a plan for a large data warehouse query is not nearly as important as having the right plan.
- Sorts can and should be minimized with correct index usage.
- Missing index situations should be investigated and corrected.
- Large IOs such as range scans benefit from on-disk contiguity. Index fragmentation should be monitored frequently and kept to a minimum with index rebuilds.
- Blocking is generally uncommon as most data warehouse transactions are read operations.
- Parallelism is generally desirable for data warehouse applications.
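The plan reuse objective is usually checked from two SQL Statistics counters, Batch Requests/sec and SQL Compilations/sec. The source does not give a formula, so the ratio below is a commonly used approximation and should be read as an assumption. The function names are hypothetical, and the 25% threshold comes from the CPU table.

```python
PLAN_REUSE_THRESHOLD_PCT = 25.0  # from the CPU rules: reuse above 25% is suspect


def plan_reuse_pct(batch_requests_per_sec: float,
                   sql_compilations_per_sec: float) -> float:
    """Approximate share of batches that ran with an already cached plan.

    Assumed formula: (batch requests - compilations) / batch requests.
    """
    if batch_requests_per_sec <= 0:
        return 0.0
    reused = batch_requests_per_sec - sql_compilations_per_sec
    return max(0.0, 100.0 * reused / batch_requests_per_sec)


def excessive_plan_reuse(batch_requests_per_sec: float,
                         sql_compilations_per_sec: float,
                         threshold: float = PLAN_REUSE_THRESHOLD_PCT) -> bool:
    """True if plan reuse exceeds the data warehouse threshold."""
    return plan_reuse_pct(batch_requests_per_sec, sql_compilations_per_sec) > threshold
```

With 1000 batch requests and 800 compilations per second, reuse is 20% and the rule does not fire. With only 500 compilations it is 50%, which suggests that large queries are running with plans built for other queries.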