KiPageFault into BSOD when stepping over
I've been struggling for a long time with a weird bug check during kernel driver debugging. The stack trace looks like this:
1: kd> k
Child-SP          RetAddr           Call Site
ffffd000`20463d78 fffff800`1aa610ea nt!DbgBreakPointWithStatus
ffffd000`20463d80 fffff800`1aa609fb nt!KiBugCheckDebugBreak+0x12
ffffd000`20463de0 fffff800`1a9d8da4 nt!KeBugCheck2+0x8ab
ffffd000`204644f0 fffff800`1aa00b1f nt!KeBugCheckEx+0x104
ffffd000`20464530 fffff800`1a8c75ad nt! ?? ::FNODOBFM::`string'+0x1797f
ffffd000`204645d0 fffff800`1a9e2f2f nt!MmAccessFault+0x7ed
ffffd000`20464710 fffff800`002b92e3 nt!KiPageFault+0x12f
ffffd000`204648a0 fffff800`0117b41f Wdf01000!imp_WdfFdoInitQueryProperty+0x28
ffffd000`204648f0 fffff800`0118117f MyVolFlt!WdfFdoInitQueryProperty+0x5f [c:\program files (x86)\windows kits\8.1\include\wdf\kmdf\1.13\wdffdo.h @ 217]
ffffd000`20464940 fffff800`0027f55b MyVolFlt!MyVolFltEvtDeviceAdd+0x9f [c:\development\projects\kernelmode\myvolflt\driver.c @ 116]
ffffd000`20464bd0 fffff800`1a9539d9 Wdf01000!FxDriver::AddDevice+0xab
ffffd000`20464ff0 fffff800`1ace18ab nt!PpvUtilCallAddDevice+0x35
ffffd000`20465030 fffff800`1acdff9e nt!PnpCallAddDevice+0x63
ffffd000`204650b0 fffff800`1acdf2db nt!PipCallDriverAddDevice+0x6e2
ffffd000`20465250 fffff800`1ad14b89 nt!PipProcessDevNodeTree+0x1cf
ffffd000`204654d0 fffff800`1a97d0b8 nt!PiProcessReenumeration+0x91
ffffd000`20465520 fffff800`1a97cf2e nt!PnpDeviceActionWorker+0x168
ffffd000`204655d0 fffff800`1af93382 nt!PnpRequestDeviceAction+0x1da
ffffd000`20465610 fffff800`1af89022 nt!IopInitializeBootDrivers+0x83e
ffffd000`204658b0 fffff800`1af7794d nt!IoInitSystem+0x91e
ffffd000`204659d0 fffff800`1ad7bd09 nt!Phase1InitializationDiscard+0xe61
ffffd000`20465bd0 fffff800`1a9182e4 nt!Phase1Initialization+0x9
ffffd000`20465c00 fffff800`1a9df2c6 nt!PspSystemThreadStartup+0x58
ffffd000`20465c60 00000000`00000000 nt!KiStartSystemThread+0x16
Bug Check description:
1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: ffffe00020464c10, memory referenced.
...
Now let's see what this address is:
1: kd> !pool ffffe00020464c10
Pool page ffffe00020464c10 region is Nonpaged pool
ffffe00020464000 is not a valid large pool allocation, checking large session pool...
Unable to read large session pool table (Session data is not present in mini and kernel-only dumps)
ffffe00020464000 is not valid pool. Checking for freed (or corrupt) pool
Address ffffe00020464000 could not be read. It may be a freed, invalid or paged out page
1: kd> ? poi(DeviceInit)
Evaluate expression: -35183830610928 = ffffe000`20464c10
Wow, the faulting memory reference is actually DeviceInit! And it is located on the stack (because of the KMDF model).
And, sure enough, the IRQL is at PASSIVE level:
1: kd> !irql
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
The funniest thing so far is that if I set the bp after the call to WdfFdoInitQueryProperty, everything runs smoothly. So something goes wrong in the way the debugger interacts with the OS kernel.
I finally managed to figure out what was wrong. I would normally set my bp during the initial break-in sequence:
Connected to Windows 8 9600 x64 target at (Thu Jan 16 00:54:33.435 2014 (UTC + 2:00)), ptr64 TRUE
Kernel Debugger connection established.

************* Symbol Path validation summary **************
Response    Time (ms)    Location
Deferred                 cache*C:\Development\Tools\Symbols
Deferred                 srv*http://msdl.microsoft.com/download/symbols
Symbol search path is: cache*C:\Development\Tools\Symbols;srv*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 8 Kernel Version 9600 MP (1 procs) Free x64
Built by: 9600.16452.amd64fre.winblue_gdr.131030-1505
Machine Name:
Kernel base = 0xfffff800`5547e000 PsLoadedModuleList = 0xfffff800`55742990
System Uptime: 0 days 0:00:00.102
nt!DebugService2+0x5:
fffff800`555d28e5 cc int 3
kd> bp MyVolFltEvtDeviceAdd
kd> g
And here is what happens afterwards:
Unload module \SystemRoot\system32\mcupdate_GenuineIntel.dll at fffff800`1b200000
Unload module \SystemRoot\System32\drivers\werkernel.sys at fffff800`19ed5000
...
Unload module \SystemRoot\system32\DRIVERS\MyVolFlt.sys at fffff800`1b9ed000
nt!DebugService2+0x5:
fffff800`555d28e5 cc int 3
kd> k
 # Child-SP          RetAddr           Call Site
00 fffff800`573991a8 fffff800`55544361 nt!DebugService2+0x5
01 fffff800`573991b0 fffff800`555442ff nt!DbgLoadImageSymbols+0x45
02 fffff800`57399200 fffff800`55b76fc4 nt!DbgLoadImageSymbolsUnicode+0x2b
03 fffff800`57399240 fffff800`55b7684b nt!MiReloadBootLoadedDrivers+0x300
04 fffff800`573993c0 fffff800`55b6c091 nt!MiInitializeDriverImages+0x163
05 fffff800`57399470 fffff800`55b67299 nt!MiInitSystem+0x3d9
06 fffff800`57399500 fffff800`557e84ea nt!InitBootProcessor+0x301
07 fffff800`57399740 fffff800`557de1a3 nt!KiInitializeKernel+0x5a2
08 fffff800`57399ad0 00000000`00000000 nt!KiSystemStartup+0x193
It is unloading the boot-time drivers and reloading them at different start addresses! So when I set my breakpoint at MyVolFltEvtDeviceAdd, WinDbg inserts an int 3 instruction, and during module relocation that instruction is copied as is. My breakpoint therefore still hits, despite the code relocation. But this is where Windows and the debugger fall apart: they don't know about this breakpoint at its new address.
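The mechanism can be illustrated with a toy model. The sketch below is only a simulation of the idea, not how WinDbg or the Windows kernel are actually implemented: the `ToyDebugger` class, the `relocate` helper, and all addresses are hypothetical, and the "module" is just four NOP bytes in a dictionary.

```python
INT3 = 0xCC  # one-byte x86/x64 breakpoint opcode
NOP = 0x90


class ToyDebugger:
    """Hypothetical debugger that tracks breakpoints by absolute address."""

    def __init__(self):
        self.breakpoints = {}  # address -> original byte replaced by int 3

    def set_breakpoint(self, memory, address):
        # Remember the original byte, then patch int 3 into the code.
        self.breakpoints[address] = memory[address]
        memory[address] = INT3

    def owns(self, address):
        # True only if the debugger planted a breakpoint at this address.
        return address in self.breakpoints


def relocate(memory, old_base, new_base, size):
    """Move the image bytes verbatim to a new base (assumed behaviour)."""
    for offset in range(size):
        memory[new_base + offset] = memory.pop(old_base + offset)


# Hypothetical load addresses and function offset.
OLD_BASE, NEW_BASE, FUNC_OFFSET, IMAGE_SIZE = 0x1000, 0x5000, 0x2, 4
memory = {OLD_BASE + i: NOP for i in range(IMAGE_SIZE)}

debugger = ToyDebugger()
debugger.set_breakpoint(memory, OLD_BASE + FUNC_OFFSET)  # bp set too early
relocate(memory, OLD_BASE, NEW_BASE, IMAGE_SIZE)         # module is reloaded

new_address = NEW_BASE + FUNC_OFFSET
stale_int3 = memory[new_address] == INT3        # patched byte was copied
known_to_debugger = debugger.owns(new_address)  # table still has old address
print(f"int 3 at new address: {stale_int3}, known to debugger: {known_to_debugger}")
```

In this model the patched byte travels with the image, while the breakpoint table still refers to the old address, so the trap fires at a location the debugger cannot match to any breakpoint it owns.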
In order to set the breakpoint at the correct address, you must break on module load:
kd> sxe ld MyVolFlt
kd> sxe ud MyVolFlt
kd> sx
  ct - Create thread - ignore
  et - Exit thread - ignore
 cpr - Create process - ignore
 epr - Exit process - ignore
  ld - Load module - break
       (only break for myvolflt)
  ud - Unload module - break
       (only break for MyVolFlt)
And issue the bp command (bp MyVolFltEvtDeviceAdd in my case) only after the kernel has reloaded the boot-loaded drivers.