谁在死锁Mutex——用Windbg查找Mutex死锁所有者线程
By Ingo Rammer
I'm currently toying with the idea of creating a small frontend to SOS.DLL (CDB/WinDbg) and KD. This frontend should simplify some production debugging tasks based on crash dumps. (In my prototype, I'm already happily clicking myself through managed code threads, callstacks, parameters, locals and browsing objects' contents just like in Visual Studio, but solely based on crash dumps.)
A lot of debugging work is in general based on identifying a set of rather common issues. But even if you are tracking down a very common issue, it's still a tough call to work through 25 managed code callstacks to find out - for example - why a certain thread might be blocking a number of other threads. This was the rationale behind some of the features I'm thinking about implementing in the debugging frontend tool: to simplify and automate some manual tasks. But this tool is not (yet) ready for writing about, so lets look at the manual process for finding out whos blocking your thread.
For the analysis of thread blocks based on WaitHandles, I've found two kinds of possible solutions: one which is not 100% deterministic but works on crash dumps (but only for named synchronization objects) and another one which is deterministic but only works on the live system with a kernel debugger. I'll show you both approaches in this post.
First Approach - User Mode only, but works on Dumps
Note: This approach is not deterministic and not accurate but based on the automated equivalent of trial and error in the form of!DumpStackObjects
. It's essentially based on looking for WaitHandle references on a callstack and similar means to get the identifier of a named synchronization object. The only good news is that it works on dumps generated by ADPlus (and of course on live systems as well).
Ok. Let's get startet. Suppose you have a process in which several threads end up blocking others for periods longer than you would expect and you basically want to know which threads block on resources held by which other threads. You've created a -hang dump with ADPlus and are ready to examine the dump in WinDbg or CDB.
First, you will need to find out which threads are actually waiting for something. To do this, you first load SOS.DLL and then look at the states of your managed threads:
0:013> .loadby sos mscorwks // load the extension sos.dll in directory of mscorwks
0:013> !threads // list all managed threads
ThreadCount: 12
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
PreEmptive GC Alloc Lock
ID OSID ThreadOBJ State GC Context Domain Count APT Exception
0 1 b70 00180fd0 200a020 Enabled 00000000:00000000 0014c100 0 MTA
2 2 1b54 00152480 b220 Enabled 00000000:00000000 0014c100 0 MTA (Finalizer)
3 3 1ba0 0019a1c0 200b020 Enabled 00000000:00000000 0014c100 0 MTA
4 4 1f9c 0019c138 200b020 Enabled 00000000:00000000 0014c100 0 MTA
5 5 210 0019cd18 200b020 Enabled 00000000:00000000 0014c100 0 MTA
6 6 13e0 0019d8d0 200b020 Enabled 00000000:00000000 0014c100 0 MTA
7 7 17e0 0019e488 200b020 Enabled 00000000:00000000 0014c100 0 MTA
8 8 1204 0019f040 200b020 Enabled 00000000:00000000 0014c100 0 MTA
9 9 1ec0 0019fbf8 b020 Disabled 00000000:00000000 0014c100 0 MTA
10 a 1a48 001a07b0 b020 Disabled 00000000:00000000 0014c100 0 MTA
11 b 1558 001a1368 200b020 Enabled 00000000:00000000 0014c100 0 MTA
12 c 1d88 001a1f20 200b020 Enabled 00000000:00000000 0014c100 0 MTA
The threads with state 200a020 are currently waiting. As this does not necessarily mean that they are waiting for a WaitHandle (they could also just have called Thread.Sleep()), you now also need to find out which of these blocked threads are actually blocking on a WaitHandle.
To do this, you will have to look at the top managed code stackframe in each of the blocked threads. (Please note: You have to specify the native thread ID for the next command. This ID is contained in the first column of !threads, and is not the CLR thread id contained in the second column!)
0:007> ~0e !clrstack // list the managed call stack of thread 0 (CLR thread #1)
OS Thread Id: 0xb70 (0)
ESP EIP
0012f420 7c90eb94 [HelperMethodFrame: 0012f420] System.Threading.Thread.SleepInternal(Int32)
0012f474 793d80f5 System.Threading.Thread.Sleep(Int32)
0012f478 00ca00fb ResourceLock.Program.Main(System.String[])
0012f69c 79e88f63 [GCFrame: 0012f69c]
As you can see in this first example, the specified thread does not block on a WaitHandle, so let's continue our search. (Note: instead of manually using ~<THREAD_NUMBER>e !clrstack for each thread, you could also call ~*e !clrstack, do dump the complete managed code callstacks for all threads.)
0:007> ~7e !clrstack // list the managed call stack of thread 7 (also CLR thread #7)
OS Thread Id: 0x17e0 (7)
ESP EIP
011cf588 7c90eb94 [HelperMethodFrame_1OBJ: 011cf588] System.Threading.WaitHandle.WaitOneNative(Microsoft.Win32.SafeHandles.SafeWaitHandle, U
Int32, Boolean, Boolean)
011cf634 793d424e System.Threading.WaitHandle.WaitOne(Int64, Boolean)
011cf64c 793d4193 System.Threading.WaitHandle.WaitOne(Int32, Boolean)
011cf65c 793d420e System.Threading.WaitHandle.WaitOne()
011cf660 00ca0250 ResourceLock.Program.ThreadProc()
011cf6b4 793d7a7b System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
011cf6bc 793683dd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
011cf6d4 793d7b5c System.Threading.ThreadHelper.ThreadStart()
011cf8f8 79e88f63 [GCFrame: 011cf8f8]
Here, the top level call stack elements contain calls to WaitHandle.WaitOne() or WaitOneNative(). This thread is currently waiting for a synchronization object.
To retrieve details of the underlying sync object, you first have to look at the unmanaged callstack for the same thread. You will usually send a command like "kb 5" to the debugger after selecting the current thread with ~7 s.
0:007> ~7 s // select thread 7 if its not the current thread
eax=00000000 ebx=011cf350 ecx=79690098 edx=79690048 esi=00000000 edi=7ffde000
eip=7c90eb94 esp=011cf328 ebp=011cf3c4 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!KiFastSystemCallRet:
7c90eb94 c3 ret
0:007> kb 5 // print five stack frames
ChildEBP RetAddr Args to Child
011cf324 7c90e9ab 7c8094f2 00000001 011cf350 ntdll!KiFastSystemCallRet
011cf328 7c8094f2 00000001 011cf350 00000000 ntdll!ZwWaitForMultipleObjects+0xc
011cf3c4 79f8ead4 00000001 011cf614 00000001 KERNEL32!WaitForMultipleObjectsEx+0x12c
011cf42c 79f17522 00000001 011cf614 00000001 mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0x6f
011cf44c 79f17493 00000001 011cf614 00000001 mscorwks!Thread::DoAppropriateAptStateWait+0x3c
Please note: This output is what youll receive if you have debug symbols configured correctly. While its definitely not as nice as the one with symbols, you can still get the same information (minus the method names) even if you do not have debug symbols installed:
0:007> kb 5 // print five stack frames
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
011cf3c4 79f8ead4 00000001 011cf614 00000001 ntdll!KiFastSystemCallRet
011cf42c 79f17522 00000001 011cf614 00000001 mscorwks!StrongNameFreeBuffer+0x57fc
011cf44c 79f17493 00000001 011cf614 00000001 mscorwks!CreateAssemblyCache+0x9474
011cf4d0 79f1732f 00000001 011cf614 00000001 mscorwks!CreateAssemblyCache+0x93e5
011cf520 7a07b49c 00000001 011cf614 00000001 mscorwks!CreateAssemblyCache+0x9281
Here, I'll ignore the first three lines in any case, no matter if we have symbols or not. The next line is usually a call to WaitForMultipleObjectEx or a comparable function. This function is defined like this:
DWORD WaitForMultipleObjectsEx(
DWORD nCount,
const HANDLE* lpHandles,
BOOL bWaitAll,
DWORD dwMilliseconds,
BOOL bAlertable
);
In the third, fourth and fifth field of the output of kb 5, we can get the first three arguments to the method. We therefore know that nCount = 1 and lpHandles = 011cf614. This means that the .NET Framework has passed an array of handles which contains 1 (nCount) entry and starts at the location 011cf614 (lpHandles) to WaitForMultipleObjectsEx.
You can then call"dd <address> L<number_of_handles>"
(for example"dd 011cf614 L00000001"
) to the debugger to generate a memory dump to retrieve the list of handles.
0:007> dd 011cf614 L1 // dump 1 dword at location 011cf614
011cf614 00000684
In this command, the first field in each line contains the memory address and the subsequent fields (only one in our case) contain the handle to the object you are waiting for.
For each handle, you can then then call"!handle <handle> f"
, for example"handle 00000684 f"
to retrieve handle information:
0:007> !handle 00000684 f // dump detailed information (flags = f) for handle 684
Handle 684
Type Mutant
Attributes 0
GrantedAccess 0x1f0001:
Delete,ReadControl,WriteDac,WriteOwner,Synch
QueryState
HandleCount 4
PointerCount 8
Name \BaseNamedObjects\{1CA3A6E4-68E7-4847-9602-5669BC40D01E}
Object Specific Information
Mutex is Owned
In this case, you now know that the thread is waiting for a Mutex (Mutant is the kernel-level name for a Mutex) which has been named "{1CA3A6E4-68E7-4847-9602-5669BC40D01E}".
The next step is to find out who currently owns this mutex. This would be relatively straightforward in a live debug as we could use a kernel debugger to identify the owning thread (which I'll explain later). In a crash dump however, we are somewhat limited as we have to hunt for this information in a trial-and-error way. We are also limited to owner threads in the same process if the named mutex is owned by a thread in another process, we won't be able to get there with our single-process memory dump.
This trial-and-error information retrieval comes in the form of !DumpStackObjects (which can be abbreviated to !dso), an extension command which walks the stack of the selected thread and checks if each value it finds on the stack could be a managed object.
0:007> ~0e !dso // dump stack objects for thread #0
OS Thread Id: 0xb70 (0)
ESP/REG Object Name
0012f408 01273450 System.IO.StreamWriter
0012f428 01274768 System.Char[]
0012f478 01271b38 System.Object[] (System.String[])
0012f534 01271b38 System.Object[] (System.String[])
0012f6e0 01271b38 System.Object[] (System.String[])
0012f708 01271b38 System.Object[] (System.String[])
Theres not a lot going on in this thread. Certainly no Mutex here, let's go on:
0:007> ~4e !dso // dump stack objects for thread #4
OS Thread Id: 0x1f9c (4)
ESP/REG Object Name
00ecf748 01273b30 System.String Thread 4 sleeping
00ecf770 01273450 System.IO.StreamWriter
00ecf790 01273b64 System.Char[]
00ecf804 01273a40 System.Threading.ThreadStart
00ecf808 01273a40 System.Threading.ThreadStart
00ecf80c 01273a94 System.Threading.ThreadHelper
00ecf814 01271db8 System.String Thread
00ecf818 01273a94 System.Threading.ThreadHelper
00ecf81c 01271d98 System.Threading.ContextCallback
00ecf820 01273a94 System.Threading.ThreadHelper
00ecf82c 01273a94 System.Threading.ThreadHelper
00ecf834 01273ac8 System.Threading.ExecutionContext
00ecf844 01273ac8 System.Threading.ExecutionContext
00ecf848 01273a94 System.Threading.ThreadHelper
00ecf850 01273a94 System.Threading.ThreadHelper
00ecf914 01273aa8 System.Threading.ThreadStart
00ecfac0 01273aa8 System.Threading.ThreadStart
00ecfad0 01273aa8 System.Threading.ThreadStart
Thread #4 looks more interesting but contains no reference to any synchronization object, let's continue again:
0:007>~3e !dso // dump stack objects for thread #3
OS Thread Id: 0x1ba0 (3)
ESP/REG Object Name
00daf7c8 012739d8 System.String Thread 3 sleeping
00daf7f0 01273450 System.IO.StreamWriter
00daf810 01273a0c System.Char[]
00daf880 0127383c System.Threading.Mutex
00daf884 01271db8 System.String Thread
00daf888 01271db8 System.String Thread
00daf88c 01271db8 System.String Thread
00daf898 01271cb8 System.Threading.ThreadHelper
00daf89c 01271d98 System.Threading.ContextCallback
00daf8a0 01271cb8 System.Threading.ThreadHelper
00daf8ac 01271cb8 System.Threading.ThreadHelper
00daf8b4 01271d2c System.Threading.ExecutionContext
00daf8c4 01271d2c System.Threading.ExecutionContext
00daf8c8 01271cb8 System.Threading.ThreadHelper
00daf8d0 01271cb8 System.Threading.ThreadHelper
00daf994 01271ccc System.Threading.ThreadStart
00dafb40 01271ccc System.Threading.ThreadStart
00dafb50 01271ccc System.Threading.ThreadStart
At first glance, thread #3 looks innocent too. Only when examining it more closely, you'll find a Mutex as the fourth stack object. (Of course, this is now total guesswork as we don't know if the thread ever really acquired the mutex or if it is just looking at it. The thread might also have already released the mutex as well but I don't know if we can find an answer to these questions without looking at kernel memory.)
First, lets look if the method acquires a mutex at all. To do this, we first need to find out the method which owns this section of the stack. To do this, we take the value of ESP for our Mutex (which is the address on the stack which references the mutex) which is 00daf880 in our case and compare it to the managed code stack dump of the same thread:
0:007>~3 s // select thread #3
eax=7a38cf40 ebx=00000000 ecx=79ed75c0 edx=fbffffff esi=00000000 edi=00daf700
eip=7c90eb94 esp=00daf6d0 ebp=00daf728 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
ntdll!KiFastSystemCallRet:
7c90eb94 c3 ret
0:003> !clrstack // dump the managed code stack for this method
OS Thread Id: 0x1ba0 (3)
ESP EIP
00daf808 7c90eb94 [HelperMethodFrame: 00daf808] System.Threading.Thread.SleepInternal(Int32)
00daf85c 793d80f5 System.Threading.Thread.Sleep(Int32)
00daf860 00ca02c3 ResourceLock.Program.ThreadProc()
00daf8b4 793d7a7b System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
00daf8bc 793683dd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
00daf8d4 793d7b5c System.Threading.ThreadHelper.ThreadStart()
00dafaf8 79e88f63 [GCFrame: 00dafaf8]
In this stack dump, you can see that the stack address of the reference to the mutex (00daf880) lies between the third and the fourth frame's ESP. You can deduce that the object is created in the third frame, in the method ResourceLock.Program.ThreadProc().
To verify this, you can have a look at the IL code of this method. First you need to convert the Instruction Pointer (IP) to a method description. The IP is 00ca02c3 according to the output of !clrstack (second field in the stack frame for ThreadProc()).
0:003> !IP2MD 00ca02c3 // convert an instruction pointer (IP) to a method description (MethodDesc)
MethodDesc: 00912ff8
Method Name: ResourceLock.Program.ThreadProc()
Class: 009112c4
MethodTable: 00913010
mdToken: 06000003
Module: 00912c14
IsJitted: yes
m_CodeOrIL: 00ca0188
The resulting MethodDesc is 00912ff8. You can use this a parameter to !DumpIL to retrieve the IL code of this method:
0:003> !dumpil 00912ff8 // get the IL code for a method
ilAddr = 004020b4
IL_0000: nop
IL_0001: call System.Threading.Thread::get_CurrentThread
...
IL_004a: newobj System.Threading.Mutex::.ctor
IL_004f: stloc.2
IL_0050: ldloc.2
IL_0051: callvirt System.Threading.WaitHandle::WaitOne
IL_0044 to IL_51 roughly translate to the following C# fragment:
Mutex mtx = new Mutex (...);
mtx.WaitOne();
So, yes, the method definitely acquires at least one mutex at one point. But to find out if the the Mutex we've found on the stack really references the same named synchronization object the other thread is blocking on, we need to examine it more closely. First, let's have a look at the contents of the mutex by using !DumpObj (which we abbreviate to !do).
0:003> !do 0127383c // dump the mutex. Address is from ~3e !dso
Name: System.Threading.Mutex
MethodTable: 7910d018
EEClass: 7910cfa0
Size: 24(0x18) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790f9c18 4000184 4 System.Object 0 instance 00000000 __identity
790fe160 40005a6 c System.IntPtr 0 instance 1752 waitHandle
7910cf3c 40005a7 8 ...es.SafeWaitHandle 0 instance 012738d8 safeWaitHandle
79104f64 40005a8 10 System.Boolean 0 instance 1 hasThreadAffinity
790fe160 40005a9 984 System.IntPtr 0 shared static InvalidHandle
>> Domain:Value 0014c100:-1 <<
79104f64 40005ee 990 System.Boolean 0 shared static dummyBool
>> Domain:Value 0014c100:NotInit <<
7910d018 40005ef 14c ...m.Threading.Mutex 0 shared static s_ReservedMutex
>> Domain:Value 0014c100:NotInit <<
The second field is called waitHandle and contains the value 1752. This is the decimal handle to the underlying mutex. Lets convert this to hex (6d8) and have a look at the handle:
0:003> !handle 6d8 f // look at handle 6d8 (decimal 1752)
Handle 6d8
Type Mutant
Attributes 0
GrantedAccess 0x1f0001:
Delete,ReadControl,WriteDac,WriteOwner,Synch
QueryState
HandleCount 4
PointerCount 8
Name \BaseNamedObjects\{1CA3A6E4-68E7-4847-9602-5669BC40D01E}
Object Specific Information
Mutex is Owned
As you can see, thread #3 acquired the mutex with the name {1CA3A6E4-68E7-4847-9602-5669BC40D01E} using wait handle 6d8, and this is the exact same named mutex which thread #7 is waiting for on wait handle 684. It seems that we got lucky this time and could find the mutex in our crash dump. Whats left as an exercise is to dump the !clrstack of thread #3 and examine the top stackframes using !IP2MD and !DumpIL to find out why the mutex has not been released in time.
The Deterministic Second Approach - or: Now give me my Kernel Debugger!
If someone would have told me just two years ago that I'd ever start a kernel debugger, I wouldn't have believed him: after all, I'm not writing device drivers and that's all you need KD.EXE for and it's quite difficult to setup kernel debugging in the first place, right? Well, turns out that neither is the case.
If you only want to examine kernel structures on your local machine, you can start a kernel debugger as easily as KD.EXE /KL (and it's part of the same Debugging Tools for Windows package which contains WinDbg and CDB). Theres no need to setup a second machine and connect the debugger to it or anything like this. The only really important part is that contrary to WinDbg and CDB you absolutely have to have debug symbols configured correctly for what we are doing now. Usually its enough to just create a directory like c:\symbols, set _NT_SYMBOL_PATH to the matching entry SRV*c:\symbols*http://msdl.microsoft.com/download/symbols and start KD. The necessary symbols will be downloaded automatically behind the scenes whenever they are needed for the first time. (Note that this also means that KD will block from time to time while it's downloading symbols on demand.)
C:\debug>set _NT_SYMBOL_PATH=SRV*C:\symbols*http://msdl.microsoft.com/download/symbols
C:\debug>kd /KL Microsoft (R) Windows Debugger Version 6.6.0007.5
Copyright (c) Microsoft Corporation. All rights reserved. Connected to Windows XP 2600 x86 compatible target, ptr64 FALSE
Symbol search path is: SRV*C:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
*******************************************************************************
WARNING: Local kernel debugging requires booting with /debug to work optimally.
*******************************************************************************
Windows XP Kernel Version 2600 (Service Pack 2) UP Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp2_gdr.050301-1519
Kernel base = 0x804d7000 PsLoadedModuleList = 0x8055a420
Debug session time: Sat Aug 5 11:19:11.543 2006 (GMT+2)
System Uptime: 6 days 14:14:43.119
lkd>
Well, allright then. Let's start to hunt for the mutex. And don't forget: you can run WinDbg/CDB *and* KD at the same time to debug the user-mode and the kernel-mode parts side by side!
First, we need to get information about the process which we are debugging. The EXE in my example is called ResourceLock.exe, but you can use the !process command in KD to retrieve the necessary entry point for any process on your machine: (I'll use red color coding for KD commands and blue color coding for CDB commands)
lkd> !process 0 0 ResourceLock.exe // retrieve information for all processes running ResourceLock.exe
PROCESS 8900e020 SessionId: 0 Cid: 1cd0 Peb: 7ffde000 ParentCid: 12a0
DirBase: 0cb83000 ObjectTable: e72125c8 HandleCount: 122.
Image: ResourceLock.exe
The !process command can also be used to retrieve additional information. It accepts two numeric parameters: the process address and a list of flags. If you specify the process address we've retrieved above (8900e020) and the value 2 for flags, you'll get detailed information about the individual threads and their synchronization objects:
Important: If you are running WinDbg/CDB and KD side-by-side for the same process, you must switch WinDbg/CDB in to run mode (F5 or g) to retrieve this information.
lkd> !process 8900e020 2 // retrieve thread info for process
PROCESS 8900e020 SessionId: 0 Cid: 1cd0 Peb: 7ffde000 ParentCid: 12a0
DirBase: 0cb83000 ObjectTable: e72125c8 HandleCount: 122.
Image: ResourceLock.exe THREAD 89285020 Cid 1cd0.0b70 Teb: 7ffdd000 Win32Thread: e52fe420 WAIT: (DelayExecution) UserMode Alertable
89285110 NotificationTimer THREAD 8a02f280 Cid 1cd0.0268 Teb: 7ffdc000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Non-Alertable
891da5f8 SynchronizationEvent
8a13d6f8 SynchronizationEvent
89d50020 SynchronizationEvent THREAD 88f9fd58 Cid 1cd0.1b54 Teb: 7ffdb000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Non-Alertable
8a2fe310 NotificationEvent
8974b6f0 SynchronizationEvent THREAD 89d86da8 Cid 1cd0.1ba0 Teb: 7ffda000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
89d86e98 NotificationTimer THREAD 8902e020 Cid 1cd0.1f9c Teb: 7ffd9000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
8902e110 NotificationTimer THREAD 87b86020 Cid 1cd0.0210 Teb: 7ffd8000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
87b86110 NotificationTimer THREAD 8971b828 Cid 1cd0.13e0 Teb: 7ffd7000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
8971b918 NotificationTimer THREAD 87ba5020 Cid 1cd0.17e0 Teb: 7ffd6000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Alertable
89d48fc0 Mutant - owning thread 89d86da8 THREAD 88724da8 Cid 1cd0.1204 Teb: 7ffd5000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
88724e98 NotificationTimer THREAD 8869e548 Cid 1cd0.1ec0 Teb: 7ffd4000 Win32Thread: 00000000 READY
THREAD 88e28020 Cid 1cd0.1a48 Teb: 7ffaf000 Win32Thread: 00000000 READY
THREAD 88ab5020 Cid 1cd0.1558 Teb: 7ffae000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
88ab5110 NotificationTimer THREAD 88827360 Cid 1cd0.1d88 Teb: 7ffad000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Alertable
89d48fc0 Mutant - owning thread 89d86da8
Fantastic, isn't it? Everything we've been looking for with trial-and-error analysis with WinDbg/CDB is contained in this single output.
But let's just look at the three important threads for this list (the two which have a Mutant in their list of kernel objects, and the thread which is specified as the owner for this mutant):
THREAD 89d86da8 Cid 1cd0.1ba0 Teb: 7ffda000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
89d86e98 NotificationTimer
THREAD 87ba5020 Cid 1cd0.17e0 Teb: 7ffd6000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Alertable
89d48fc0 Mutant - owning thread 89d86da8
THREAD 88827360 Cid 1cd0.1d88 Teb: 7ffad000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Alertable
89d48fc0 Mutant - owning thread 89d86da8
WAIT: (DelayExecution) in the first line means that the thread is sleeping. The next line indicates the kernel object which the thread is waiting for. It is a NotificationTimer and its a safe bet to assume that this is the timer which has caused the DelayExecution in the first place. By the way: If you've forgotten to switch your user mode debugger into run mode - just like it happens to me all the time - your threads will be marked as WAIT: (Suspended) KernelMode Non-Alertable. In this case, just hit F5 or g in your user mode debugger and re-try the !process command.
The second and third thread in this list are both in WAIT: (UserRequest). The synchronization object is in both cases the Mutant which is currently owned by thread 89d86da8. This owner thread is the first thread in our list. To match this information back to our user-mode information, we need to look at the thread's Cid, which is 1cd0.1ba0 in our case. Of this Cid, the first part (1cd0) is the Process ID and the second part (1ba0) is the Thread ID.
If you then switch back to the user mode debugger and break into the application (CTRL+BREAK), you can find the corresponding CLR thread by using !threads:
0:013> !threads // list all CLR threads
ThreadCount: 12
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
PreEmptive GC Alloc Lock
ID OSID ThreadOBJ State GC Context Domain Count APT Exception
0 1 b70 00180fd0 200a020 Enabled 00000000:00000000 0014c100 0 MTA
2 2 1b54 00152480 b220 Enabled 00000000:00000000 0014c100 0 MTA (Finalizer)
3 3 1ba0 0019a1c0 200b020 Enabled 00000000:00000000 0014c100 0 MTA
4 4 1f9c 0019c138 200b020 Enabled 00000000:00000000 0014c100 0 MTA
5 5 210 0019cd18 200b020 Enabled 00000000:00000000 0014c100 0 MTA
6 6 13e0 0019d8d0 200b020 Enabled 00000000:00000000 0014c100 0 MTA
7 7 17e0 0019e488 200b020 Enabled 00000000:00000000 0014c100 0 MTA
8 8 1204 0019f040 200b020 Enabled 00000000:00000000 0014c100 0 MTA
9 9 1ec0 0019fbf8 b020 Disabled 00000000:00000000 0014c100 0 MTA
10 a 1a48 001a07b0 b020 Disabled 00000000:00000000 0014c100 0 MTA
11 b 1558 001a1368 200b020 Enabled 00000000:00000000 0014c100 0 MTA
12 c 1d88 001a1f20 200b020 Enabled 00000000:00000000 0014c100 0 MTA
The third column in the thread list is called OSID and contains the operating system thread id. If you look for the thread ID 1ba0 which we retrieved from the kernel debugger's Cid, you will find that thread #3 is the one who blocks the Mutex. This is the same information which we got from the first approach, but now its a 100% deterministic and a lot simpler as well. I just needed to let go of my fear of kernel debugging before venturing in this direction ...
转自:http://weblogs.thinktecture.com/ingo/2006/08/who-is-blocking-that-mutex---fun-with-windbg-cdb-and-kd.html
谁在死锁Mutex——用Windbg查找Mutex死锁所有者线程的更多相关文章
- Windbg找出死锁
使用Windbg找出死锁,解决生产环境中运行的软件不响应请求的问题 前言 本文介绍本人的一次使用Windbg分析dump文件找出死锁的过程,并重点介绍如何确定线程所等待的锁及判断是否出现了死锁. 对于 ...
- 使用Windbg找出死锁,解决生产环境中运行的软件不响应请求的问题
前言 本文介绍本人的一次使用Windbg分析dump文件找出死锁的过程,并重点介绍如何确定线程所等待的锁及判断是否出现了死锁. 对于如何安装及设置Windbg请参考:<使用Windbg和SoS扩 ...
- JAVA基础再回首(二十五)——Lock锁的使用、死锁问题、多线程生产者和消费者、线程池、匿名内部类使用多线程、定时器、面试题
JAVA基础再回首(二十五)--Lock锁的使用.死锁问题.多线程生产者和消费者.线程池.匿名内部类使用多线程.定时器.面试题 版权声明:转载必须注明本文转自程序猿杜鹏程的博客:http://blog ...
- (并发编程)RLock(与死锁现象),Semaphore,Even事件,线程Queue
一.死锁现象与递归锁所谓死锁: 是指两个或两个以上的进程或线程在执行过程中,因争夺资源而造成的一种互相等待的现象,若无外力作用,它们都将无法推进下去.此时称系统处于死锁状态或系统产生了死锁,这些永远在 ...
- Go死锁——当Channel遇上Mutex时
背景 用metux lock for循环,在for循环中又 向带缓冲的Channel 写数据时,千万要小心死锁! 最近,我在测试ws长链接网关,平均一个星期会遇到一次服务假死问题,因为并不是所有rou ...
- 一次查找sqlserver死锁的经历
查找bug是程序员的家常便饭,我身边的人喜欢让用户来重现问题.当然他们也会从正式服务器上下载错误log,然后尝试分析log,不过当错误不是那种不经思考就可识别的情况,他们就会将问题推向用户,甚至怪罪程 ...
- 利用windbg查找dictionary导致IIS占CPU100%案例分析(一)
一.背景 先说下windbg使用场景.各位coder在工作中或多或少都会遇到下面四种情况 1.本地代码好好的,放服务器上运行一段时间后,IIS服务突然占用 w3wp.exe CPU突然100% ,不得 ...
- 查找 SqlServer死锁
use master if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[sp_who_lock]') ) dr ...
- windbg查找Kernel32.dll基址
一.首先准备好一个程序,运行起来,用windbg进行附加调试,由于每个windows下的程序都会加载kernel32.dll,因此,找基址的过程是一样的: 二.查看PEB地址: 法一.r $peb ...
随机推荐
- ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint......
今天用Hibernate建立外键的时候发现没有建立 但是创建了这个字段 情景: user表有一字段role,role是role表id字段的外键 原因: user表中已经有记录了,而且有的记录role这 ...
- 使用React并做一个简单的to-do-list
1. 前言 说到React,我从一年之前就开始试着了解并且看了相关的入门教程,而且还买过一本<React:引领未来的用户界面开发框架 >拜读.React的轻量组件化的思想及其virtual ...
- 关于HTML面试题汇总之visibility
一.页面可见性(visibility) 主要提供两个属性,一个事件(都在document对象上):1. 属性: 1.1. hidden:获取或设置当前页面的可见性,boolean值: 1.2 ...
- 规划在sharepoint中使用安全组
简介 如果将权限级别分配给组而非单个用户,那么管理 SharePoint 网站用户其实很简单.SharePoint 组是一组单独的用户,它还可以包含 Active Directory 组.在 Acti ...
- 2016最新CocoaPods安装和错误解决 + 自己的经验
p.p1 { margin: 0.0px 0.0px 0.0px 0.0px; font: 20.0px Times; color: #333333; background-color: #fffff ...
- 在eclipse中安装上genymotion插件
1.安装genymotion-vbox,选择安装目录. 具体安装过程可见http://www.cnblogs.com/wuyudong/p/5601897.html 2.登录并创建模拟器 3. ...
- 你真的了解UINavigationController吗?
一:首先查看一下关于UINavigationController的定义 NS_CLASS_AVAILABLE_IOS(2_0) @interface UINavigationController : ...
- error=Error Domain=NSURLErrorDomain Code=-1003
我的情况:模拟器 可以返回数据 .真机不可以.我手机连接的同事的共享,我以为他的网段和后台的网段在同一个网段.一直在找错误,打开手机网络设置一看 ,原来不在同一网段.手机的网络必须要跟PC的在同一个 ...
- Android无线开发的几种常用技术(阿里巴巴资深工程师原创分享)
完整的开发一个android移动App需要经过从分解需求.架构设计到开发调试.测试.上线发布等多个阶段,在发布后还会有产品功能上的迭代演进,此外还会面对性能.安全.无线网络质量等多方面的问题. 移动A ...
- js confirm函数 删除提示
<a href="del.php" onclick="return confirm('是否将此留言信息删除?')">删除留言</a>