VMware 에서 실행되는 Windows VM이 시스템 행 현상이 있다고 하여 분석을 진행해 보았습니다.


먼저 !vm 명령을 싱행해서 메모리 상태를 확인해 보았는데 별다른 문제가 확인되지 않았습니다.

0: kd> !vm

Page File: \??\C:\pagefile.sys

  Current:   1048576 Kb  Free Space:    807844 Kb

  Minimum:   1048576 Kb  Maximum:     10485500 Kb

 

Physical Memory:          4194156 (   16776624 Kb)

Available Pages:          3479978 (   13919912 Kb)

ResAvail Pages:           4063051 (   16252204 Kb)

Locked IO Pages:                0 (          0 Kb)

Free System PTEs:      4294989313 (17179957252 Kb)

Modified Pages:             13432 (      53728 Kb)

Modified PF Pages:          12747 (      50988 Kb)

Modified No Write Pages:        0 (          0 Kb)

NonPagedPool    0:             33 (        132 Kb)

NonPagedPoolNx  0:          32181 (     128724 Kb)

NonPagedPool    1:              0 (          0 Kb)

NonPagedPoolNx  1:              0 (          0 Kb)

NonPagedPool Usage:           168 (        672 Kb)

NonPagedPoolNx Usage:       40172 (     160688 Kb)

NonPagedPool Max:      4294967296 (17179869184 Kb)

PagedPool  0:               79908 (     319632 Kb)

PagedPool  1:               51553 (     206212 Kb)

PagedPool  2:                   0 (          0 Kb)

PagedPool Usage:           131461 (     525844 Kb)

PagedPool Maximum:     4160749568 (16642998272 Kb)

Processor Commit:             917 (       3668 Kb)

Session Commit:             16179 (      64716 Kb)

Syspart SharedCommit 0

Shared Commit:              51347 (     205388 Kb)

Special Pool:                   0 (          0 Kb)

Kernel Stacks:               8889 (      35556 Kb)

Pages For MDLs:               342 (       1368 Kb)

Pages For AWE:                  0 (          0 Kb)

NonPagedPool Commit:        38599 (     154396 Kb)

PagedPool Commit:          131461 (     525844 Kb)

Driver Commit:               9397 (      37588 Kb)

Boot Commit:                50253 (     201012 Kb)

System PageTables:            997 (       3988 Kb)

VAD/PageTable Bitmaps:       6871 (      27484 Kb)

ProcessLockedFilePages:        12 (         48 Kb)

Pagefile Hash Pages:          206 (        824 Kb)

Sum System Commit:         315470 (    1261880 Kb)

Total Private:             473164 (    1892656 Kb)

Misc/Transient Commit:       4291 (      17164 Kb)

Committed pages:           792925 (    3171700 Kb)

Commit limit:             4456300 (   17825200 Kb)




Storage에는 Pending 된 IO가 있는지 확인을 해 보았지만 Pending 된 것이 확인되지 않았습니다.

0: kd> !storunit

STORPORT Units:

==================

Product                 SCSI ID  Object            Extension         Pnd Out Ct  State

--------------------------------------------------------------------------------------

VMware     Virtual di   0  0  0  ffffab06b03a4060  ffffab06b03a41b0    0   0  0  Working

VMware     Virtual di   0  1  0  ffffab06b0399060  ffffab06b03991b0    0   0  0  Working

 

0: kd> !storunit ffffab06b03a4060

   DO: ffffab06b03a4060   Ext: ffffab06b03a41b0   Adapter: ffffab06b03cb1a0   Working

   Vendor: VMware    Product: Virtual disk      SCSI ID: (0, 0, 0)

   Claimed Enumerated

   SlowLock: Free  RemLock: 1  PageCount: 1

   QueueTagList: ffffab06b03a42b0     Node 0 Outstanding: Head: 0000000000000000  Tail: 0000000000000000  Timeout: 0 (Ticking Down)

   Node 1 Outstanding: Head: 0000000000000000  Tail: 0000000000000000  Timeout: 0 (Ticking Down)

   DeviceQueue: ffffab06b03a4340  Depth: 64  Status: Not Frozen   PauseCount: 0  BusyCount: 0

   IO Gateway: Busy Count: 0  Pause Count: 0

   Requests: Outstanding: 0  Device: 0  ByPass: 0

 

 

[Device-Queued Requests]

 

IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout

-----------------------------------------------------------------------------------------------------------------------------------

 

 

[Bypass-Queued Requests]

 

IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout

-----------------------------------------------------------------------------------------------------------------------------------

 

 

[Outstanding Requests]

 

IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout

-----------------------------------------------------------------------------------------------------------------------------------

 

 

[Completed Requests]

 

IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout

-----------------------------------------------------------------------------------------------------------------------------------




커널에서 동기화 객체를 기다리는 것이 있는지 확인해 보았지만 없었습니다.

0: kd> !locks

**** DUMP OF ALL RESOURCE OBJECTS ****

KD: Scanning for held locks.........

89274 total locks




그렇다면 CPU가 바빠서 발생하는 이슈일지 모른다고 생각해서 CPU의 상태를 확인해보았으니 IDLE 상태 였습니다.

0: kd> !cpuinfo

CP  F/M/S Manufacturer  MHz PRCB Signature    MSR 8B Signature Features

 0  6,79,1 GenuineIntel 2295 0b00002a00000000                   311b3fff

 1  6,79,1 GenuineIntel 2295 0b00002a00000000                   311b3fff

 2  6,79,1 GenuineIntel 2295 0b00002a00000000                   311b3fff

 3  6,79,1 GenuineIntel 2295 0b00002a00000000                   311b3fff

                      Cached Update Signature 0b00002a00000000

                     Initial Update Signature 0b00002a00000000

0: kd> !running

Process PID Thread           Id Pri Base Pri Next CPU CSwitches User          Kernel State              Time Reason

======= === ================ == === ======== ======== ========= ==== =============== ======= =============== ==============

Idle      0 fffff8035d3d5940  0   0        0        0  24448367    0   23h:56:06.250 Running 1d.00:48:53.796 Executive

Idle      0 ffffdf005f631000  0   0        0        1  20088217    0   23h:48:42.703 Running          3s.984 WrCalloutStack

Idle      0 ffffdf005f6b9000  0   0        0        2  18610602    0   23h:58:19.172 Running 1d.00:48:53.796 Executive

Idle      0 ffffdf005f741000  0   0        0        3  25365439    0 1d.00:00:03.219 Running 1d.00:48:53.796 Executive

 

Count: 4 | Show Unique Stacks

 

0: kd> !mex.us -cpu

4 threads: ffffdf005f6b9000 ffffdf005f741000 fffff8035d3d5940 ffffdf005f631000

    fffff8035d223ff1 nt!PpmIdleGuestExecute+0x15

    fffff8035d109eac nt!PpmIdleExecuteTransition+0xcbc

    fffff8035d10903a nt!PoIdle+0x33a

    fffff8035d17afac nt!KiIdleLoop+0x2c

 

1 stack(s) with 4 threads displayed (4 Total threads)




lpc의 상태를 확인해보니 svchost.exe가 explorer에 lpc message를 보내고 기다리는 것이 확인 됩니다.

0: kd> !lpcwait

Server Process ffffab06`b8137800  Connection Port ffffab06`b853c540  (\Device\HarddiskVolume1\Windows\explorer.exe)

 

    Client Thread        Client Wait Time     Client Message       Server Thread   

    =================    =================    =================    =================

    ffffab06`b1b28800           1:25.406      ffffce0e`5b04dcf0                                                          (\Device\HarddiskVolume1\Windows\System32\svchost.exe)

 

    1 thread waiting




svchost.exe의 Thread를 확인해보니 logoff를 처리하던 것으로 보입니다.

0: kd> kc

  *** Stack trace for last set context - .thread/.cxr resets it

 # Call Site

00 nt!KiSwapContext

01 nt!KiSwapThread

02 nt!KiCommitThreadWait

03 nt!KeWaitForSingleObject

04 nt!AlpcpSignalAndWait

05 nt!AlpcpReceiveSynchronousReply

06 nt!AlpcpProcessSynchronousRequest

07 nt!NtAlpcSendWaitReceivePort

08 nt!KiSystemServiceCopyEnd

09 ntdll!NtAlpcSendWaitReceivePort

0a RPCRT4!LRPC_BASE_CCALL::DoSendReceive

0b RPCRT4!I_RpcSendReceive

0c combase!CMessageCall::CallI_RpcSendReceive

0d combase!ThreadSendReceive

0e combase!CSyncClientCall::SwitchAptAndDispatchCall

0f combase!CSyncClientCall::SendReceive2

10 combase!SyncClientCallRetryContext::SendReceiveWithRetry

11 combase!CSyncClientCall::SendReceiveInRetryContext

12 combase!DefaultSendReceive

13 combase!CSyncClientCall::SendReceive

14 combase!CClientChannel::SendReceive

15 combase!NdrExtpProxySendReceive

16 RPCRT4!NdrpClientCall3

17 combase!ObjectStublessClient

18 combase!ObjectStubless

19 combase!RemoteReleaseRifRefHelper

1a combase!RemoteReleaseRifRef

1b combase!CStdMarshal::DisconnectCliIPIDs

1c combase!CStdMarshal::Disconnect

1d combase!CStdIdentity::{dtor}

1e combase!CStdIdentity::CInternalUnk::Release

1f netprofmsvc!ATL::IConnectionPointImpl<CImplINetworkListManager,&IID_INotifyNetworkGlobalCostEvents,ATL::CComDynamicUnkArray>::Unadvise

20 netprofmsvc!CImplINetworkListManager::UnadviseAndRemoveSessionCookie<&IID_INotifyNetworkGlobalCostEvents>

21 netprofmsvc!CImplINetworkListManager::PurgeClients

22 netprofmsvc!EnterPurgeClientsCallback

23 combase!EnterForCallback

24 combase!SwitchForCallback

25 combase!PerformCallback

26 combase!CObjectContext::InternalContextCallback

27 combase!CObjectContext::ContextCallback

28 combase!CContextSwitcher::ContextCallback

29 netprofmsvc!NetProfileManUserLogOff

2a ntdll!TppSimplepExecuteCallback

2b ntdll!TppWorkerThread

2c KERNEL32!BaseThreadInitThunk

2d ntdll!RtlUserThreadStart




Explorer는 파일을 닫는 중이었는데 백신 프로그램의 필터 드라이버인 TmXPFlt가 IO가 더 진행되지 못하게 막고 있었습니다.

0: kd> !us -p ffffab06b8137800

1 thread [stats]: ffffab06b8121080

    fffff8035d17e576 nt!KiSwapContext+0x76

    fffff8035d0d74fd nt!KiSwapThread+0x17d

    fffff8035d0d6f9f nt!KiCommitThreadWait+0x14f

    fffff8035d0d8d77 nt!KeWaitForSingleObject+0x377

    fffff8022e82bce3 TmXPFlt+0x1bce3

    fffff8022e814884 TmXPFlt+0x4884

    fffff8022e814d65 TmXPFlt+0x4d65

    fffff8022e814f65 TmXPFlt+0x4f65

    fffff8022e4b3f8d TmPreFlt!TmpQueryFullName+0x1e85

    fffff8022e4b40d8 TmPreFlt!TmpQueryFullName+0x1fd0

    fffff8022c603d15 FLTMGR!FltpPerformPostCallbacks+0x2a5

    fffff8022c603756 FLTMGR!FltpPassThroughCompletionWorker+0x76

    fffff8022c605299 FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x239

    fffff8022c603146 FLTMGR!FltpDispatch+0xb6

    fffff8035d4e75a0 nt!IopCloseFile+0x150

    fffff8035d5142c5 nt!ObCloseHandleTableEntry+0x245

    fffff8035d43098b nt!NtClose+0xcb

    fffff8035d187503 nt!KiSystemServiceCopyEnd+0x13

    00007ffe863c5ce4 ntdll!NtClose+0x14

    00007ffe83335aa2 KERNELBASE!CloseHandle+0x62

    00007ffe82f02560 SHCORE!CGenericFileHandle::Release+0x160

    00007ffe82f02082 SHCORE!CFileStream::~CFileStream+0xd2

    00007ffe82f00d35 SHCORE!CFileStream::Release+0x35

    00007ffe84b0d273 SHELL32!IconCacheSave+0x18965b

    00007ffe84983a05 SHELL32!SHDesktopMessageLoop+0x45

    00007ff694781468 Explorer!wWinMain+0x714

    00007ff694810727 Explorer!__wmainCRTStartup+0x1c7

    00007ffe83ce8364 KERNEL32!BaseThreadInitThunk+0x14

    00007ffe8638e851 ntdll!RtlUserThreadStart+0x21


1 stack(s) with 1 threads displayed (1 Total threads)



분석 결과를 보면 사용자는 Logoff를 하려고 하였는데 백신 제품이 IO를 잡고 있어서 시스템이 멈춘 것으로 보인 증상이었습니다.


'Debugging' 카테고리의 다른 글

System Hang 분석  (0) 2018.11.24
Symbol Server 설정 (공유, HTTP)  (0) 2018.03.04
[디버거 명령]!Mex.p  (0) 2017.09.25
[디버깅 명령]!mex.help  (0) 2017.09.24

Windows Hang 관련 hotfix 입니다.

Windows Server 2012 and Windows Server 2012 R2
  System freezes during disk I/O in Windows
  https://support.microsoft.com/en-us/kb/2904100

The logon process hangs at the "Welcome" screen or the "Please wait for the User Profile Service" error message window
  https://support.microsoft.com/en-us/kb/3132080

  DNS server freezes and service restart fails during the service restart in Windows Server 2012 R2 or Windows Server 2012
  https://support.microsoft.com/en-us/kb/3134242

+ Recent posts