【问题描述】:

前段时间有一项目反馈,常用的审批功能有时的响应较慢,多个管理员功能不定期的出现客户端无响应的状况,并且管理员功能一旦出现卡死,也会影响到普通的业务用户致使很多用户无法操作。该系统为混合场景,管理员功能使用的CS客户端,业务用户使用的是WEB。

【环境信息】:

应用服务器:Windows Server 2008 R2 IIS7  .NET 4.0

数据库服务器:AIX Oracle 10.2.0.4

【处理过程】:

经过跟踪发现两个情况:

  1. 确实有部分待审单据响应很慢,可以反复重现,貌似跟单据有关(慢的单据反复打开都会很慢,其他单据则一直没有类似问题),感觉很奇怪;
  2. 管理员功能发生了客户端不响应的情况后,查看对应数据库端的性能视图,没有SQL阻塞信息,也没有明显的执行时间很长、效率很低的SQL;

在出现系统挂起时,抓取应用服务器的dump文件后,发现存在明显的线程阻塞:

  1. 105号线程阻塞了其他4个线程{(9-1)/2},而105号线程在等待数据库的返回,OracleLob,这是个特别的数据类型,需要引起警惕了;
  2. 根据线程堆栈:OperateDataBaseManager.GetPacekage(System.String),我们可以很快定位到程序代码及对应的SQL脚本;
  3. 根据windbg还可以看到,其他线程都在MetadataCache.GetObject()方法上处于被阻塞状态,查看对应的源代码,发现该方法使用了静态对象作为同步锁对象;

下图是线程阻塞对应windbg的log日志

Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
Copyright (c) Microsoft Corporation. All rights reserved. Loading Dump File [D:\dumps\w3wp.DMP]
User Mini Dump File with Full Memory: Only application data is available Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 7 Version 7601 (Service Pack 1) MP (8 procs) Free x64
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Machine Name:
Debug session time: Tue Apr 14 14:22:10.000 2015 (UTC + 8:00)
System Uptime: 19 days 20:26:50.903
Process Uptime: 0 days 5:33:15.000
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
......................................
ntdll!NtWaitForSingleObject+0xa:
00000000`7711135a c3 ret
0:000> .load D:\dumps\sos.dll
0:000> !tp
CPU utilization: 0%
Worker Thread: Total: 29 Running: 6 Idle: 23 MaxLimit: 800 MinLimit: 8
Work Request in Queue: 0
--------------------------------------
Number of Timers: 43
--------------------------------------

Completion Port Thread:Total: 1 Free: 1 MaxFree: 16 CurrentLimit: 0 MaxLimit: 800 MinLimit: 8
0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
47 0000000005532488 9 1 000000001385b1e0 93a0 105 000000017ffefbe0 System.Object
-----------------------------

Total 6125
CCW 4
RCW 11
ComClassFactory 0
Free 4333
0:000> ~105s
ntdll!NtWaitForSingleObject+0xa:
00000000`7711135a c3 ret
0:105> !clrstack
OS Thread Id: 0x93a0 (105)
Child SP IP Call Site
000000000fbac038 000000007711135a [NDirectMethodFrameStandalone: 000000000fbac038] System.Data.Common.UnsafeNativeMethods.OCILobRead(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, UInt32 ByRef, UInt32, IntPtr, UInt32, IntPtr, IntPtr, UInt16, CHARSETFORM)
000000000fbabfb0 000007fef3455640 DomainNeutralILStubClass.IL_STUB_PInvoke(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, UInt32 ByRef, UInt32, IntPtr, UInt32, IntPtr, IntPtr, UInt16, CHARSETFORM)*** WARNING: Unable to verify checksum for System.Data.OracleClient.ni.dll 000000000fbac0f0 000007fef344ea9b System.Data.OracleClient.TracedNativeMethods.OCILobRead(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, Int32 ByRef, UInt32, IntPtr, UInt32, UInt16, CHARSETFORM)
000000000fbac1d0 000007fef347c63d System.Data.OracleClient.OracleLob.Read(Byte[], Int32, Int32)
000000000fbac2a0 000007fef347d483 System.Data.OracleClient.OracleLob.get_Value()
000000000fbac320 000007fef34721b6 System.Data.OracleClient.OracleColumn.GetValue(System.Data.OracleClient.NativeBuffer_RowBuffer)
000000000fbac3d0 000007fef346efd2 System.Data.OracleClient.OracleDataReader.GetValues(System.Object[])
000000000fbac420 000007fedd16faec System.Data.ProviderBase.SchemaMapping.LoadDataRow()*** WARNING: Unable to verify checksum for System.Data.ni.dll 000000000fbac460 000007fedd1674fd System.Data.Common.DataAdapter.FillLoadDataRow(System.Data.ProviderBase.SchemaMapping)
000000000fbac4f0 000007fedd1673d4 System.Data.Common.DataAdapter.FillFromReader(System.Data.DataSet, System.Data.DataTable, System.String, System.Data.ProviderBase.DataReaderContainer, Int32, Int32, System.Data.DataColumn, System.Object)
000000000fbac5b0 000007fedd167271 System.Data.Common.DataAdapter.Fill(System.Data.DataSet, System.String, System.Data.IDataReader, Int32, Int32)
000000000fbac670 000007fedd167ef6 System.Data.Common.DbDataAdapter.FillInternal(System.Data.DataSet, System.Data.DataTable[], Int32, Int32, System.String, System.Data.IDbCommand, System.Data.CommandBehavior)
000000000fbac720 000007fedd167cc4 System.Data.Common.DbDataAdapter.Fill(System.Data.DataSet, Int32, Int32, System.String, System.Data.IDbCommand, System.Data.CommandBehavior)
000000000fbac7d0 000007fedd167ac2 System.Data.Common.DbDataAdapter.Fill(System.Data.DataSet)
000000000fbac880 000007ff002393ee xxxxxx.Platform.Core.DataAccess.Database.GetDataSet(System.String, System.String, System.Data.IDbDataParameter[], Boolean)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Core.DataAccess.dll 000000000fbac950 000007ff002391cd xxxxxx.Platform.Core.DataAccess.Database.GetDataSet(System.String, System.Data.IDbDataParameter[], Boolean)
000000000fbac990 000007ff0023906a xxxxxx.Platform.Core.DataAccess.Database.ExecuteDataSet(System.String, System.Data.IDbDataParameter[])
000000000fbac9f0 000007ff00238810 xxxxxx.Platform.Core.DataAccess.Database.ExecuteDataSet(System.String, System.Object[])
000000000fbaca60 000007ff01566749 xxxxxx.Platform.Workflow.Server.Designer.OperateDataBaseManager.GetPacekage(System.String)*** WARNING: Unable to verify checksum for xxxxxx.Platform.Workflow.Server.dll
*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Workflow.Server.dll 000000000fbacab0 000007ff01566548 xxxxxx.Platform.Workflow.Server.Designer.OperateDataBaseManager.Load(System.String)
000000000fbacb00 000007ff0024bbfe xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetCheckedinVersionContent(System.String, System.Data.DataRow, System.String ByRef)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Resource.MetadataManager.dll 000000000fbacb60 000007ff0024b842 xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetMetadataContentString(System.String, System.String ByRef, System.String)
000000000fbacbe0 000007ff0024b53b xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetMetadataContentFromDB(System.String)
000000000fbacc40 000007ff00243d1a xxxxxx.Platform.Resource.ResourceManager.MetadataCache.GetObject(System.String, xxxxxx.Platform.AppFramework.Service.xxxState)
000000000fbaccb0 000007ff00fb036b xxxxxx.Platform.Resource.MetadataManager.MetadataRestFul.TryGetNewVersionContent(System.IO.BinaryReader, System.IO.BinaryWriter)
000000000fbad418 000007fef98d10b4 [DebuggerU2MCatchHandlerFrame: 000000000fbad418]
000000000fbad500 000007fef98d10b4 [CustomGCFrame: 000000000fbad500]
000000000fbad4c8 000007fef98d10b4 [GCFrame: 000000000fbad4c8]
000000000fbad458 000007fef98d10b4 [GCFrame: 000000000fbad458]
000000000fbad828 000007fef98d10b4 [HelperMethodFrame_PROTECTOBJ: 000000000fbad828] System.RuntimeMethodHandle._InvokeMethodFast(System.IRuntimeMethodInfo, System.Object, System.Object[], System.SignatureStruct ByRef, System.Reflection.MethodAttributes, System.RuntimeType)
000000000fbad970 000007fef895587f System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo, Boolean)*** WARNING: Unable to verify checksum for mscorlib.ni.dll 000000000fbadab0 000007fef89a0426 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo)
000000000fbadb00 000007ff00b4f3b3 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.InternalInvoke(System.String, System.Reflection.MethodInfo, System.Object, System.Object[], Int32[] ByRef, System.Object[] ByRef)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.AppFramework.RestfulService.dll 000000000fbadb60 000007ff00b4ebf0 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.Invoke(System.String, System.String, System.String, Boolean, System.Object[], Int32[] ByRef, System.Object[] ByRef)
000000000fbadc40 000007ff00c28ae8 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.Invoke(System.String, System.String, System.String, Boolean, System.Object[])
000000000fbadca0 000007ff00c2876b xxxxxx.Platform.AppFramework.RESTFulWebService.xxxHttpWebHandler.InvokeStream(System.IO.BinaryReader, System.Web.HttpContext, Boolean)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.AppFramework.RESTFulWebService.dll 000000000fbadd80 000007ff00b4de0c xxxxxx.Platform.AppFramework.RESTFulWebService.xxxHttpWebHandler.ProcessRequest(System.Web.HttpContext)
000000000fbade70 000007fef1eb5c25 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()*** WARNING: Unable to verify checksum for System.Web.ni.dll 000000000fbadf30 000007fef1e8337a System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
000000000fbadfe0 000007fef1e97030 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
000000000fbae0a0 000007fef1e82879 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
000000000fbae100 000007fef1e8777c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
000000000fbae180 000007fef1e86477 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
000000000fbae2b0 000007fef25ed6dd DomainNeutralILStubClass.IL_STUB_COMtoCLR(Int64, Int32, Int32 ByRef)
000000000fbae5e8 000007fef993e93e [GCFrame: 000000000fbae5e8]
000000000fbae628 000007fef993e93e [ContextTransitionFrame: 000000000fbae628]
000000000fbae668 000007fef993e93e [GCFrame: 000000000fbae668]
000000000fbae850 000007fef993e93e [ComMethodFrame: 000000000fbae850]
0:105> ~* e!clrstack
OS Thread Id: 0x2aec (0)
Unable to walk the managed stack. The current thread is likely not a
managed thread. You can run !threads to get a list of managed threads in
the process
OS Thread Id: 0x15f0 (106)
Child SP         IP               Call Site

000000000ed5df70 00000000771118ca [GCFrame: 000000000ed5df70]
000000000ed5e1c8 00000000771118ca [GCFrame: 000000000ed5e1c8]

000000000ed5e178 00000000771118ca [HelperMethodFrame: 000000000ed5e178] System.Threading.Monitor.Enter(System.Object)

000000000ed5e2d0 000007ff00243ce8 xxxxxx.Platform.Resource.ResourceManager.MetadataCache.GetObject(System.String, xxxxxx.Platform.AppFramework.Service.GSPState)

000000000ed5e340 000007ff00d40c8d xxxxxx.Platform.Resource.ResourceManager.DefaultValueManager.IsFormMobileForm(System.String)

000000000ed5e380 000007ff00d40ac0 xxxxxx.Platform.Resource.ResourceManager.DefaultValueManager.GetDefaultValueXmlExactly(System.String, System.String, System.String)

000000000ed5e3d0 000007ff00d409df xxxxxx.Platform.Engine.DataAccess.Common.DAEUtils.GetDefaultValueXml(System.String, System.String, System.String)

000000000ed5e420 000007ff00d40583 xxxxxx.Platform.Engine.DataAccess.GSPDataAccessEngine.NewObjectWithDefault(xxxxxx.Platform.Resource.Metadata.DataModel.GspDataModel, System.String, System.String)

000000000ed5e500 000007ff0091495d xxxxxx.Platform.BizComponent.JSProxy.BizComponentRESTFulService.InvokeInternal(System.String, System.Object[], System.Collections.Generic.Dictionary`2<Int32,System.Object> ByRef, Boolean ByRef)

000000000ed5e590 000007ff0023fe16 xxxxxx.Platform.BizComponent.JSProxy.BizComponentRESTFulService.Get(System.Collections.Generic.Dictionary`2<System.String,System.String>)

000000000ed5e610 000007ff002276d8 xxxxxx.Platform.WebFramework.RestFulService.RESTFulHandlerForWeb.ProcessRequest(System.Web.HttpContext)

000000000ed5e840 000007fef1eb5c25 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()

000000000ed5e900 000007fef1e8337a System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)

000000000ed5e9b0 000007fef1e97030 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)

000000000ed5ea70 000007fef1e82879 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)

000000000ed5ead0 000007fef1e8777c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)

000000000ed5eb50 000007fef1e86477 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)

000000000ed5ec80 000007fef25ed6dd DomainNeutralILStubClass.IL_STUB_COMtoCLR(Int64, Int32, Int32 ByRef)

000000000ed5efb8 000007fef993e93e [GCFrame: 000000000ed5efb8]

000000000ed5eff8 000007fef993e93e [ContextTransitionFrame: 000000000ed5eff8]

000000000ed5f038 000007fef993e93e [GCFrame: 000000000ed5f038]

000000000ed5f220 000007fef993e93e [ComMethodFrame: 000000000ed5f220]

下图为阻塞源头105号线程(被数据库返回阻塞),执行SQL对应的程序代码:

下图为线程阻塞的同步程序代码:

至此问题应该比较清楚了,执行SQL时间长的线程阻塞了其他线程,而线程同步锁的粒度太大(仅为了做数据缓存时,防止插入重复值,结果锁定了所有的元数据对象)。但是,该SQL脚本真的有问题吗?

【网络分析】:

将该SQL放到plsql中执行,客户端响应时间很长,经过对不同单据进行测试后发现,当LOB类型的数据超过16K以后问题就会出现,但该SQL在测试环境并没有问题。前面已经跟踪过数据库服务器,没有发现什么问题,因此目标开始转移到应用服务器到数据库服务器的网络环境上。

直接使用ADO.NET和JDBC分别直接访问数据库,均发现响应延迟问题。与客户方IT人员沟通了解到,从应用服务器到数据库经过了一个思科的防火墙。我们就分别在应用服务器、应用服务器端的交换机、数据库服务器端的交换机、数据库服务器,4个点进行网络抓包。经过对比发现,经过防火墙前后两个交换机的数据包明显有问题:有非常明显的包乱序、包数量不一致、window size递减的情况;并且所有经过防火墙的网络包,window size的值均从原来的64k篡改为16k;

如下图:

把这个情况反馈给思科的代理后,他们调整防火墙设置:好像是改为按流量控制,不是特别懂。

至此,问题彻底解决!

【性能诊断】十一、性能问题综合分析(案例2,windbg、wireshark)的更多相关文章

  1. 利用Oracle RUEI+EM12c进行应用的“端到端”性能诊断

    概述 我们知道,影响一个B/S应用性能的因素,粗略地说,有以下几个大的环节: 1. 客户端环节 2. 网络环节(可能包括WAN和LAN) 3. 应用及中间层环节 4. 数据库层环节 能够对各个环节的问 ...

  2. 性能计数器与profiler的组合性能诊断

    性能计数器和sql profiler都是常用的性能诊断工具和优化工具,最近和群友聊天发现很多人竟然不知道这两个可以“组合”使用,所以这篇算是一篇扫盲贴吧. 两种工具简述 通过计数器可以收集两部分内容: ...

  3. 60,000毫秒内对Linux的性能诊断效的方法

    转载于:http://www.itxuexiwang.com/a/liunxjishu/2016/0225/168.html?1456484140 60,000 毫秒内对 Linux 的性能诊断 当你 ...

  4. 使用vs自带的性能诊断工具

    visual studio是个强大的集成开发环境,内置了程序性能诊断工具.下面通过两段代码进行介绍. static void Main( string[] args) { Test1(); Test2 ...

  5. 关于UtilTimerStack类的使用--XWork2、Struts2内置性能诊断类

    关于UtilTimerStack类的使用--XWork2.Struts2内置性能诊断类 一.UtilTimerStack做什么用的? 这个本来是Xwork2(Struts2的核心)的相关的工具类,可以 ...

  6. 如何诊断windows性能问题

    直接使用perfmon中的性能诊断工具

  7. [翻译]60,000毫秒内对Linux进行性能诊断

    原文链接:http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html 原文作者:Brendan Gregg,L ...

  8. [C#] 网站程序ASP.NET的性能诊断 - CPU分析

    微软提供了标准的CLR性能分析类库 https://github.com/Microsoft/clrmd 这个类库是开源的代码.能够获取CLR runtime里面几乎所有的信息. 如何获取clrmd编 ...

  9. Oracle ADDM性能诊断利器及报告解读

    性能优化是一个永恒的话题,性能优化也是最具有价值,最值得花费精力深入研究的一个课题,因为资源是有限的,时间是有限的.在Oracle数据库中,随着Oracle功能的不断强大和完善,Oralce数据库在性 ...

  10. MySQL性能诊断与调优 转

    http://www.cnblogs.com/preftest/ http://www.highperfmysql.com/     BOOK LAMP 系统性能调优,第 3 部分: MySQL 服务 ...

随机推荐

  1. MooFest_二维树状数组

    Description Every year, Farmer John's N (1 <= N <= 20,000) cows attend "MooFest",a s ...

  2. JLOI 斯迈利的赌注

    直接高精度模拟,加上简单贪心 Program XJOI2263; ..] of longint; var a,b:arr; s1,s2:ansistring; i,j:longint; sum:int ...

  3. window.opener调用父窗体方法的用法

    应用实例:        function BindWindowCloss() {            $(window).bind('beforeunload', function () {    ...

  4. const char* <----- > string

    (1) const char*      <-----     string const char* const_txt_path=txt_path.c_str(); (2)  string  ...

  5. Ubuntu安装文泉驿-微米黑字体

    sudo apt-get install ttf-wqy-microhei #文泉驿-微米黑

  6. Sprint第二个冲刺(第五天)

    一.Sprint 计划会议: 容杰龙继续完善昨天的SQLite修改数据操作,待全部操作完善后交给炜杰进行布局规范和整合. 二.Sprint周期:   看板: 燃尽图:

  7. 徹底刪除atom

    rm -f ~/.atom rm -f /usr/local/bin/atom rm -f /usr/local/bin/apm rm -f /Applications/Atom.app rm -f ...

  8. UI组件(思维导图)

  9. PHP太怪了,in_array() ,strpos,

    PHP中在某个字符中查找另外一个字符串,是否存在,用的是strpos,此函数用法,经常很多人用反了,正确的用法是strpos(string,search),strstr等,前面是原字符串,后面是要在原 ...

  10. 利用mybatis_generator自动生成Dao、Model、Mapping相关文件

    技术交流群:233513714 http://blog.csdn.net/wyc_cs/article/details/9023117 http://www.cnblogs.com/smileberr ...