I/O requests taking longer than 15 seconds to complete on file: an I/O bottleneck problem

http://mssqlwiki.com/2012/08/27/io-requests-taking-longer-than-15-seconds-to-complete-on-file/

https://blogs.msdn.microsoft.com/sqlsakthi/2011/02/09/troubleshooting-sql-server-io-requests-taking-longer-than-15-seconds-io-stalls-disk-latency/

http://www.cnblogs.com/lyhabc/p/3720666.html (Work-efficiency problems revealed by searching the SQL Server ERRORLOG for errors)

SQLSERVER HAS ENCOUNTERED 1 OCCURRENCE(S) OF I/O REQUESTS TAKING LONGER THAN 15 SECONDS TO COMPLETE ON FILE[E:\DataBase\bar9115]

The error above first appeared in 2012 and has kept occurring ever since.

Do you see warnings like one below in your SQL Server error log?

SQL Server has encountered  x occurrence(s) of I/O requests taking longer than 15 seconds to complete on file .

The OS file handle is 0x000006A4. The offset of the latest long I/O is: 0x00000

(or)

BobMgr::GetBuf: Sort Big Output Buffer write not complete after n seconds.

These warnings indicate a SQL Server I/O bottleneck. SQL Server performance relies heavily on disk performance. An I/O bottleneck can be identified in the following ways:

1. PAGEIOLATCH_xx or WRITELOG wait types in sys.sysprocesses and the DMVs.

2. I/O taking longer than 15 seconds in SQL Server Error log.

{

SQL Server has encountered X occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [ ] in database [IOTEST] (7). The OS file handle is 0x000006A4. The offset of the latest long I/O is: 0x000001

}

3. By looking at I/O latch wait statistics in sys.dm_os_wait_stats

{

SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGEIOLATCH%'
ORDER BY wait_type;

}

4. By looking at pending I/O requests and isolating the disk, file, and database that have the I/O bottleneck.

{

SELECT t1.database_id, t1.file_id, t1.io_stall, t2.io_pending_ms_ticks, t2.scheduler_address
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS t1
JOIN sys.dm_io_pending_io_requests AS t2
    ON t1.file_handle = t2.io_handle;

}

The following are common reasons for an I/O bottleneck in SQL Server.

1. SQL Server is spawning more I/O requests than the disk subsystem can handle.

2. There is a problem in the I/O subsystem itself: a hardware fault, a driver or firmware issue, a misconfiguration, or compression. The disks respond very slowly, and SQL Server is affected as a result.

3. Some other process on the system is saturating the disks with I/O requests. Common examples include antivirus scans and system backups. The I/O requests posted by SQL Server become slow as a result.

How to troubleshoot?

1.  Exclude SQL Server files from antivirus scan.

2. Do not place SQL Server files on compressed drives.

3. Distribute SQL Server data files and transaction log files across drives.

4. If the “I/O request taking longer” warning is for tempdb, enable trace flag 1118 and increase the number of tempdb data files. Refer to: http://support.microsoft.com/kb/2154845

5. If none of the above resolves the issue, collect the perfmon counters below.

Perfmon counters help us work out whether the disk is slow, whether SQL Server is spawning more I/O than the disk can handle, or whether some other process is saturating the disk with I/O.

Note: It is important to get the throughput of the disk subsystem in MB/sec before we look at the disk counters. Normally it will be more than 150 MB/sec for a SAN disk and more than 50 MB/sec for a single disk. When you look at a perfmon counter, look at its Max value.

Avg. Disk sec/Transfer –> Time taken to perform the I/O operation

The ideal value for Avg. Disk sec/Transfer is 0.005-0.010 sec. If this counter is consistently above 0.015 sec, there is a serious I/O bottleneck.

1. Look at Disk Bytes/sec for the periods when Avg. Disk sec/Transfer is greater than 0.015. If it is below 200 MB/sec for a SAN disk or below 50 MB/sec for a single disk, the problem is with the I/O subsystem. Engage the hardware vendor.

2. If Disk Bytes/sec is greater than 200 MB/sec for a SAN disk or greater than 50 MB/sec for a single disk while Avg. Disk sec/Transfer is greater than 0.015, look at Process:IO Data Bytes/Sec for the same period and identify which process is spawning the I/O. If the identified process is not SQL Server, involve the team that supports that process. If the identified process is SQL Server, tune the I/O-intensive SQL Server queries, for example by creating or dropping indexes.

Disk Bytes /sec  –> Total read and write to disk per second in bytes.

Collect the values for each logical disk on which SQL Server files are placed and look at the Max value of this counter. Ideally it should reach the rated throughput of the disk subsystem. If you do not know the throughput of the disk, this value should be greater than 200 MB/sec for a SAN disk or greater than 50 MB/sec for a single disk.

If it is below the expected value you can consider that your disks are not performing well. Involve the hardware vendor.

Important: The value of this counter will be low when there is no I/O happening on the drive. So you have to look at this counter during the time you see I/O warnings, or when Avg. Disk sec/Transfer > 0.010 for the same drive.

Process:IO Data Bytes/Sec –> Total read and write to disk per second in bytes by each process.

Collect this counter for all the processes running on the server. This counter will help us understand if any other process is saturating the disk with excessive I/O.

Example: Consider a disk with a maximum throughput of 250 MB per second. If antivirus is spawning 200 MB of I/O per second, the SQL Server data files are placed on the same drive, and SQL Server is spawning 150 MB per second, the combined demand of 350 MB per second exceeds what the disk can deliver, so there will be I/O waits.
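The arithmetic in this example can be checked with a short sketch. The function name and the process labels are illustrative only and are not part of any SQL Server or perfmon API; the code simply sums the per-process demand and compares it with the disk's maximum throughput.

```python
def disk_overcommit_mb(max_throughput_mb, demand_by_process_mb):
    """Return how many MB/sec of requested I/O exceed what the disk can deliver.

    A result of 0 means the combined demand fits within the disk's throughput.
    """
    total_demand = sum(demand_by_process_mb.values())
    return max(0, total_demand - max_throughput_mb)


# Figures from the example above: a 250 MB/sec disk shared by antivirus and SQL Server.
demand = {"antivirus": 200, "sqlserver": 150}
excess = disk_overcommit_mb(250, demand)
print(f"Demand exceeds disk throughput by {excess} MB/sec")
```

Any positive result means some requests have to queue, which is what shows up as I/O waits.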

Buffer Manager: Page reads/sec + Page writes/sec –> Total number of pages read from and written to disk per second by the SQL Server process. Each page is 8 KB, so multiply by 8 KB to get bytes.

Note: If you are analyzing a collected .BLG file rather than live perfmon data, focus on the Maximum value of each counter, not the average.

 

If (Avg. Disk sec/Transfer >= 0.015) and ((Disk Bytes/sec < 150 MB (for SAN)) or (Disk Bytes/sec < 50 MB (for local)) or (Disk Bytes/sec < speed of disk as per vendor))

{

There is Issue with I/O subsystem (or) driver/firmware issue (or) Misconfiguration in I/O Subsystem.

}

If (Avg. Disk sec/Transfer >= 0.015 consistently) and ((Disk Bytes/sec >= 150 MB (for SAN)) or (Disk Bytes/sec >= 50 MB (for local)) or (Disk Bytes/sec >= speed of disk as per vendor))

{

Identify the process which is posting excessive I/O request using Process:IO Data Bytes/Sec.

If ( Identified process == sqlservr.exe )

{

Identify and tune the queries that are spawning excessive I/O.

(The Reads and Writes columns in Profiler, the Dashboard reports, or sys.dm_exec_query_stats together with sys.dm_exec_sql_text can be used to identify the queries.) Use the Database Engine Tuning Advisor (DTA) to tune them.

}

If ( Identified process != sqlservr.exe )

{

Engage the owner of application which is spawning excessive I/O

}

}
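The decision tree above can be sketched as a small function. This is a minimal illustration and not a tool from the original post: the function name, its parameters, and the return strings are all hypothetical, and the first branch (latency below 0.015 sec) is an assumption added so that every input produces an answer. The process name is compared against `sqlservr`, the name of the SQL Server executable.

```python
MB = 1024 * 1024  # bytes per megabyte


def diagnose_io(avg_sec_per_transfer, disk_bytes_per_sec,
                expected_bytes_per_sec, top_io_process):
    """Classify an I/O warning from the perfmon readings described above.

    expected_bytes_per_sec is the vendor-rated throughput, or the rule of
    thumb from the text (150 MB/sec for SAN, 50 MB/sec for a local disk).
    top_io_process is the process with the highest Process:IO Data Bytes/Sec.
    """
    if avg_sec_per_transfer < 0.015:
        # Assumption: below the threshold the text gives no diagnosis.
        return "no sustained I/O bottleneck indicated"
    if disk_bytes_per_sec < expected_bytes_per_sec:
        # Slow responses although the disk moves less data than it should.
        return "I/O subsystem, driver/firmware, or configuration issue"
    # The disk delivers its expected throughput but is saturated.
    name = top_io_process.lower()
    if name.endswith(".exe"):
        name = name[:-4]
    if name == "sqlservr":
        return "tune the SQL Server queries that spawn excessive I/O"
    return "engage the owner of " + top_io_process


# Hypothetical readings: slow transfers at only 40 MB/sec on a SAN.
print(diagnose_io(0.020, 40 * MB, 150 * MB, "sqlservr"))
# Hypothetical readings: a saturated disk where another process leads the I/O.
print(diagnose_io(0.020, 180 * MB, 150 * MB, "backup_agent"))
```

Feed it the Max values from the same time window, as the note above recommends for .BLG analysis.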

Many thanks to Joseph Pilov from whom I learned many techniques like the one above.


SQL Server has encountered 64 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:XXX in database id 7. The OS file handle is 0x00000000000017E8. The offset of the latest long I/O is: 0x0000264a5b2000
https://social.technet.microsoft.com/Forums/zh-CN/058bc090-02b7-4ec3-9c22-3d486a1b2526/sql-server-alwayson-?forum=sqlserverzhchs

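An I/O request taking longer than 15 s may be caused by disk latency, or by a delay in the I/O request itself. There is a blog post dedicated to this topic that you can read: https://blogs.msdn.microsoft.com/sqlsakthi/2011/02/09/troubleshooting-sql-server-io-requests-taking-longer-than-15-seconds-io-stalls-disk-latency/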

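Each CPU scheduler (see sys.dm_os_schedulers) maintains a list that is used at every context switch to check whether its pending I/Os have completed. If an I/O has been pending for more than 15 s, a counter is incremented, and every five minutes a background thread writes these long-pending I/Os to the error log.
Troubleshooting generally goes in one of three directions: disk latency, stuck I/O, or stalled I/O.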
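The bookkeeping described above can be pictured with a toy model. This is only an illustrative sketch of the described behavior, not SQL Server's actual implementation: the class, its methods, and the choice to count each I/O at most once are all assumptions.

```python
LONG_IO_SECONDS = 15  # threshold from the description above


class LongIoMonitor:
    """Toy model of one scheduler's pending-I/O list."""

    def __init__(self):
        self.pending = {}   # io_id -> (file_name, time the I/O was issued)
        self.flagged = set()  # io_ids already counted as long I/Os
        self.counts = {}    # file_name -> long I/Os since the last report

    def issue(self, io_id, file_name, now):
        self.pending[io_id] = (file_name, now)

    def complete(self, io_id):
        self.pending.pop(io_id, None)
        self.flagged.discard(io_id)

    def on_context_switch(self, now):
        """Check every pending I/O; count each long one once (an assumption)."""
        for io_id, (file_name, issued) in self.pending.items():
            if io_id not in self.flagged and now - issued > LONG_IO_SECONDS:
                self.flagged.add(io_id)
                self.counts[file_name] = self.counts.get(file_name, 0) + 1

    def report(self):
        """What the background thread would log every five minutes."""
        lines = [
            f"encountered {count} occurrence(s) of I/O requests taking longer "
            f"than {LONG_IO_SECONDS} seconds to complete on file [{file_name}]"
            for file_name, count in self.counts.items()
        ]
        self.counts.clear()
        return lines


monitor = LongIoMonitor()
monitor.issue(1, "data.mdf", now=0)   # hypothetical file name
monitor.on_context_switch(now=10)     # still under 15 s: nothing counted
monitor.on_context_switch(now=16)     # over 15 s: counted once
for line in monitor.report():
    print(line)
```

The model shows why the logged count covers a reporting window and not a single event: occurrences accumulate between reports.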

First, use the query below to see how many I/Os are pending:
SELECT SUM(pending_disk_io_count) AS [Number of pending I/Os] FROM sys.dm_os_schedulers

Then use SELECT * FROM sys.dm_io_pending_io_requests to get the details.

Next, use the statement below to determine exactly where the problem is.
SELECT DB_NAME(database_id) AS [Database],[file_id], [io_stall_read_ms],[io_stall_write_ms],[io_stall] FROM sys.dm_io_virtual_file_stats(NULL,NULL)
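One way to read the output of this statement is to sort the files by accumulated stall time and see whether reads or writes dominate. The sketch below assumes the result has been fetched into a list of dictionaries keyed by the column names in the query; the helper names are hypothetical and the sample rows are invented purely for illustration.

```python
def rank_files_by_stall(rows):
    """Return the rows with the largest accumulated io_stall first."""
    return sorted(rows, key=lambda row: row["io_stall"], reverse=True)


def dominant_stall(row):
    """Say whether reads or writes account for more of the stall time."""
    if row["io_stall_read_ms"] >= row["io_stall_write_ms"]:
        return "read"
    return "write"


# Invented sample rows in the shape returned by the query above.
sample_rows = [
    {"Database": "tempdb", "file_id": 1,
     "io_stall_read_ms": 1200, "io_stall_write_ms": 8800, "io_stall": 10000},
    {"Database": "IOTEST", "file_id": 1,
     "io_stall_read_ms": 45000, "io_stall_write_ms": 5000, "io_stall": 50000},
]

for row in rank_files_by_stall(sample_rows):
    print(row["Database"], row["file_id"], row["io_stall"], dominant_stall(row))
```

The stall columns are cumulative, so comparing two snapshots taken some minutes apart is likely to be more telling than a single reading.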

Read the blog post linked above carefully. It is very detailed, and you can follow its steps one by one to find the source of the problem.


 

Sync IOs in nonpreemptive mode longer than 1000 ms

 

There are two general types of I/O performed by SQL Server:
Async: the vast majority of SQL Server I/Os, as outlined in the provided link: https://technet.microsoft.com/en-us/library/aa175396(v=sql.80).aspx
Sync

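A message is written to the log whenever a synchronous I/O call takes longer than 1000 ms.

Summary: the cause is disk I/O performance that cannot keep up. Switching to faster disks is recommended.

Reference: https://blogs.msdn.microsoft.com/bobsql/2016/08/17/how-it-works-sync-ios-in-nonpreemptive-mode-longer-than-1000-ms/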
