VMware 에서 실행되는 Windows VM이 시스템 행 현상이 있다고 하여 분석을 진행해 보았습니다.
먼저 !vm 명령을 싱행해서 메모리 상태를 확인해 보았는데 별다른 문제가 확인되지 않았습니다.
0: kd> !vm
Page File:
\??\C:\pagefile.sys
Current: 1048576 Kb Free Space:
807844 Kb
Minimum:
1048576 Kb Maximum: 10485500 Kb
Physical Memory: 4194156 ( 16776624 Kb)
Available Pages:
3479978 ( 13919912 Kb)
ResAvail Pages:
4063051 ( 16252204 Kb)
Locked IO Pages: 0 ( 0 Kb)
Free System PTEs: 4294989313 (17179957252 Kb)
Modified Pages: 13432 ( 53728 Kb)
Modified PF Pages: 12747 ( 50988 Kb)
Modified No Write
Pages: 0 ( 0 Kb)
NonPagedPool 0: 33 ( 132 Kb)
NonPagedPoolNx 0:
32181 ( 128724 Kb)
NonPagedPool 1: 0 ( 0 Kb)
NonPagedPoolNx 1: 0 ( 0 Kb)
NonPagedPool Usage: 168 ( 672 Kb)
NonPagedPoolNx
Usage: 40172 ( 160688 Kb)
NonPagedPool Max: 4294967296 (17179869184 Kb)
PagedPool 0: 79908 ( 319632 Kb)
PagedPool 1: 51553 ( 206212 Kb)
PagedPool 2: 0 ( 0 Kb)
PagedPool Usage:
131461 ( 525844 Kb)
PagedPool Maximum: 4160749568 (16642998272 Kb)
Processor Commit: 917 ( 3668 Kb)
Session Commit: 16179 ( 64716 Kb)
Syspart SharedCommit 0
Shared Commit: 51347 ( 205388 Kb)
Special Pool: 0 ( 0 Kb)
Kernel Stacks: 8889 ( 35556 Kb)
Pages For MDLs: 342 ( 1368 Kb)
Pages For AWE: 0 ( 0 Kb)
NonPagedPool Commit: 38599 ( 154396 Kb)
PagedPool Commit: 131461 ( 525844 Kb)
Driver Commit: 9397 ( 37588 Kb)
Boot Commit: 50253 ( 201012 Kb)
System PageTables: 997 ( 3988 Kb)
VAD/PageTable
Bitmaps: 6871 ( 27484 Kb)
ProcessLockedFilePages: 12 ( 48 Kb)
Pagefile Hash Pages: 206 ( 824 Kb)
Sum System Commit: 315470 ( 1261880 Kb)
Total Private: 473164 ( 1892656 Kb)
Misc/Transient
Commit: 4291 ( 17164 Kb)
Committed pages:
792925 ( 3171700 Kb)
Commit limit:
4456300 ( 17825200 Kb)
Storage에는 Pending 된 IO가 있는지 확인을 해 보았지만 Pending 된 것이 확인되지 않았습니다.
0: kd> !storunit
STORPORT Units:
==================
Product SCSI ID Object Extension Pnd Out Ct State
--------------------------------------------------------------------------------------
VMware Virtual di 0
0 0 ffffab06b03a4060 ffffab06b03a41b0 0
0 0 Working
VMware Virtual di 0
1 0 ffffab06b0399060 ffffab06b03991b0 0
0 0 Working
0: kd> !storunit
ffffab06b03a4060
DO: ffffab06b03a4060 Ext: ffffab06b03a41b0 Adapter: ffffab06b03cb1a0 Working
Vendor: VMware Product: Virtual disk SCSI ID: (0, 0, 0)
Claimed Enumerated
SlowLock: Free RemLock: 1
PageCount: 1
QueueTagList: ffffab06b03a42b0 Node 0 Outstanding: Head:
0000000000000000 Tail:
0000000000000000 Timeout: 0 (Ticking
Down)
Node 1 Outstanding: Head:
0000000000000000 Tail:
0000000000000000 Timeout: 0 (Ticking
Down)
DeviceQueue: ffffab06b03a4340 Depth: 64
Status: Not Frozen PauseCount: 0 BusyCount: 0
IO Gateway: Busy Count: 0 Pause Count: 0
Requests: Outstanding: 0 Device: 0
ByPass: 0
[Device-Queued Requests]
IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Bypass-Queued Requests]
IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Outstanding Requests]
IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Completed Requests]
IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
커널에서 동기화 객체를 기다리는 것이 있는지 확인해 보았지만 없었습니다.
0: kd> !locks
**** DUMP OF ALL RESOURCE
OBJECTS ****
KD: Scanning for held
locks.........
89274 total locks
그렇다면 CPU가 바빠서 발생하는 이슈일지 모른다고 생각해서 CPU의 상태를 확인해보았으니 IDLE 상태 였습니다.
0: kd> !cpuinfo
CP F/M/S Manufacturer MHz PRCB Signature MSR 8B Signature Features
0
6,79,1 GenuineIntel 2295 0b00002a00000000 311b3fff
1
6,79,1 GenuineIntel 2295 0b00002a00000000 311b3fff
2
6,79,1 GenuineIntel 2295 0b00002a00000000 311b3fff
3
6,79,1 GenuineIntel 2295 0b00002a00000000 311b3fff
Cached Update Signature
0b00002a00000000
Initial Update Signature 0b00002a00000000
0: kd> !running
Process PID Thread Id Pri Base Pri Next CPU CSwitches
User Kernel State Time Reason
======= ===
================ == === ======== ======== ========= ==== ===============
======= =============== ==============
Idle 0 fffff8035d3d5940 0
0 0 0
24448367 0 23h:56:06.250 Running 1d.00:48:53.796
Executive
Idle 0 ffffdf005f631000 0
0 0 1
20088217 0 23h:48:42.703 Running 3s.984 WrCalloutStack
Idle 0 ffffdf005f6b9000 0
0 0 2
18610602 0 23h:58:19.172 Running 1d.00:48:53.796
Executive
Idle 0 ffffdf005f741000 0
0 0 3
25365439 0 1d.00:00:03.219
Running 1d.00:48:53.796 Executive
Count: 4 | Show Unique
Stacks
0: kd> !mex.us -cpu
4 threads:
ffffdf005f6b9000 ffffdf005f741000 fffff8035d3d5940 ffffdf005f631000
fffff8035d223ff1
nt!PpmIdleGuestExecute+0x15
fffff8035d109eac
nt!PpmIdleExecuteTransition+0xcbc
fffff8035d10903a nt!PoIdle+0x33a
fffff8035d17afac nt!KiIdleLoop+0x2c
1 stack(s) with 4 threads
displayed (4 Total threads)
lpc의 상태를 확인해보니 svchost.exe가 explorer에 lpc message를 보내고 기다리는 것이 확인 됩니다.
0: kd> !lpcwait
Server Process
ffffab06`b8137800 Connection Port
ffffab06`b853c540
(\Device\HarddiskVolume1\Windows\explorer.exe)
Client Thread Client Wait Time Client Message Server Thread
================= ================= ================= =================
ffffab06`b1b28800 1:25.406 ffffce0e`5b04dcf0 (\Device\HarddiskVolume1\Windows\System32\svchost.exe)
1 thread waiting
svchost.exe의 Thread를 확인해보니 logoff를 처리하던 것으로 보입니다.
0: kd> kc
*** Stack trace for last set context - .thread/.cxr resets it
# Call Site
00 nt!KiSwapContext
01 nt!KiSwapThread
02 nt!KiCommitThreadWait
03 nt!KeWaitForSingleObject
04 nt!AlpcpSignalAndWait
05 nt!AlpcpReceiveSynchronousReply
06 nt!AlpcpProcessSynchronousRequest
07 nt!NtAlpcSendWaitReceivePort
08 nt!KiSystemServiceCopyEnd
09 ntdll!NtAlpcSendWaitReceivePort
0a RPCRT4!LRPC_BASE_CCALL::DoSendReceive
0b RPCRT4!I_RpcSendReceive
0c combase!CMessageCall::CallI_RpcSendReceive
0d combase!ThreadSendReceive
0e combase!CSyncClientCall::SwitchAptAndDispatchCall
0f combase!CSyncClientCall::SendReceive2
10 combase!SyncClientCallRetryContext::SendReceiveWithRetry
11 combase!CSyncClientCall::SendReceiveInRetryContext
12 combase!DefaultSendReceive
13 combase!CSyncClientCall::SendReceive
14 combase!CClientChannel::SendReceive
15 combase!NdrExtpProxySendReceive
16 RPCRT4!NdrpClientCall3
17 combase!ObjectStublessClient
18 combase!ObjectStubless
19 combase!RemoteReleaseRifRefHelper
1a combase!RemoteReleaseRifRef
1b combase!CStdMarshal::DisconnectCliIPIDs
1c combase!CStdMarshal::Disconnect
1d combase!CStdIdentity::{dtor}
1e combase!CStdIdentity::CInternalUnk::Release
1f netprofmsvc!ATL::IConnectionPointImpl<CImplINetworkListManager,&IID_INotifyNetworkGlobalCostEvents,ATL::CComDynamicUnkArray>::Unadvise
20 netprofmsvc!CImplINetworkListManager::UnadviseAndRemoveSessionCookie<&IID_INotifyNetworkGlobalCostEvents>
21 netprofmsvc!CImplINetworkListManager::PurgeClients
22 netprofmsvc!EnterPurgeClientsCallback
23 combase!EnterForCallback
24 combase!SwitchForCallback
25 combase!PerformCallback
26 combase!CObjectContext::InternalContextCallback
27 combase!CObjectContext::ContextCallback
28 combase!CContextSwitcher::ContextCallback
29 netprofmsvc!NetProfileManUserLogOff
2a ntdll!TppSimplepExecuteCallback
2b ntdll!TppWorkerThread
2c KERNEL32!BaseThreadInitThunk
2d ntdll!RtlUserThreadStart
Explorer는 파일을 닫는 중이었는데 백신 프로그램의 필터 드라이버인 TmXPFlt가 IO가 더 진행되지 못하게 막고 있었습니다.
0: kd> !us -p ffffab06b8137800
1 thread [stats]: ffffab06b8121080
fffff8035d17e576 nt!KiSwapContext+0x76
fffff8035d0d74fd nt!KiSwapThread+0x17d
fffff8035d0d6f9f nt!KiCommitThreadWait+0x14f
fffff8035d0d8d77 nt!KeWaitForSingleObject+0x377
fffff8022e82bce3 TmXPFlt+0x1bce3
fffff8022e814884 TmXPFlt+0x4884
fffff8022e814d65 TmXPFlt+0x4d65
fffff8022e814f65 TmXPFlt+0x4f65
fffff8022e4b3f8d TmPreFlt!TmpQueryFullName+0x1e85
fffff8022e4b40d8 TmPreFlt!TmpQueryFullName+0x1fd0
fffff8022c603d15 FLTMGR!FltpPerformPostCallbacks+0x2a5
fffff8022c603756 FLTMGR!FltpPassThroughCompletionWorker+0x76
fffff8022c605299 FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x239
fffff8022c603146 FLTMGR!FltpDispatch+0xb6
fffff8035d4e75a0 nt!IopCloseFile+0x150
fffff8035d5142c5 nt!ObCloseHandleTableEntry+0x245
fffff8035d43098b nt!NtClose+0xcb
fffff8035d187503 nt!KiSystemServiceCopyEnd+0x13
00007ffe863c5ce4 ntdll!NtClose+0x14
00007ffe83335aa2 KERNELBASE!CloseHandle+0x62
00007ffe82f02560 SHCORE!CGenericFileHandle::Release+0x160
00007ffe82f02082 SHCORE!CFileStream::~CFileStream+0xd2
00007ffe82f00d35 SHCORE!CFileStream::Release+0x35
00007ffe84b0d273 SHELL32!IconCacheSave+0x18965b
00007ffe84983a05 SHELL32!SHDesktopMessageLoop+0x45
00007ff694781468 Explorer!wWinMain+0x714
00007ff694810727 Explorer!__wmainCRTStartup+0x1c7
00007ffe83ce8364 KERNEL32!BaseThreadInitThunk+0x14
00007ffe8638e851 ntdll!RtlUserThreadStart+0x21
1 stack(s) with 1 threads displayed (1 Total threads)
분석 결과를 보면 사용자는 Logoff를 하려고 하였는데 백신 제품이 IO를 잡고 있어서 시스템이 멈춘 것으로 보인 증상이었습니다.