Debugging tips for NVIDIA driver crashes

I am trying to resolve an NVIDIA driver crash that freezes the display (no TDR, just a freeze). We get frequent nvlddmkm errors (every hour), but only occasionally do these cause the system to freeze (reboot required). Sometimes, however, a TDR is caught and the driver recovers.

I am generally getting 4 or 5 errors as follows:

Graphics Exception: Class 0x0 Subchannel 0x0 Mismatch
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000

Graphics Exception: ESR 0x4041b0=0x00
000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000

Graphics Exception: ESR 0x404000=0x80000002
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000

Graphics Exception: MISSING_MACRO_DATA
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000

Graphics Exception: ESR 0x404490=0x80000001
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000
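To see which of these messages actually line up with the freezes, it may help to tally them over time. The following is only a minimal sketch in Python, assuming the event text has been exported one message per line; `tally_exceptions` and `parse_esr` are hypothetical helper names, and the sketch makes no attempt to interpret what each ESR register or value means.

```python
import re
from collections import Counter

PREFIX = "Graphics Exception: "
# Matches lines such as "Graphics Exception: ESR 0x404490=0x80000001".
ESR_RE = re.compile(r"Graphics Exception: ESR (0x[0-9a-fA-F]+)=(0x[0-9a-fA-F]+)")


def tally_exceptions(lines):
    """Count each distinct Graphics Exception message in an iterable of lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith(PREFIX):
            continue  # skip the hex payload lines and blank lines
        counts[line[len(PREFIX):]] += 1
    return counts


def parse_esr(line):
    """Return (register, value) as integers for an ESR line, else None."""
    match = ESR_RE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


# Sample input copied from the messages listed above.
sample = [
    "Graphics Exception: Class 0x0 Subchannel 0x0 Mismatch",
    "Graphics Exception: ESR 0x4041b0=0x00",
    "Graphics Exception: ESR 0x404000=0x80000002",
    "Graphics Exception: MISSING_MACRO_DATA",
    "Graphics Exception: ESR 0x404490=0x80000001",
]

for message, count in tally_exceptions(sample).items():
    print(count, message)
```

Running the same tally over the events logged shortly before each freeze, and comparing it with the tally for periods where the driver recovered, should at least show whether one particular message only appears when the system hangs.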

This is running on a GTX 750 Ti, and I have tried the following driver versions:

375.63, 373.06, 372.90, 362.00, 361.91, 353.62

I also have three identical hardware setups running, and all of them experience the same issue.

I would like to know how I can go about debugging this issue, or at least get a better understanding of what the above errors mean.

The application is a mix of new and legacy OpenGL, so I’ve no doubt that something in the app is exercising functionality in the NVIDIA driver that causes the issue.

I doubt NVIDIA will be able to fix this without the exact setup, so I would like some debugging tips / tools that I could use to try and pin down what is going on.

I’ve had a look at WinDbg, but I’m not sure how best to use it to monitor the driver.

Any tips would be much appreciated.

The solution to my crazy video crashing headache.

I was having random issues with my new PC build. Windows 10 with the latest updates. New mobo, CPU, and RAM, but the same HDD, GPU, and PSU. I had the nvlddmkm error sometimes, and other times I would have no errors in the event log, but I would be playing a Half-Life 2 mod and the video would freeze while the audio continued normally. Another game would CTD. I used DDU and tried different driver versions going back a year, I tried tweaks found in the nvlddmkm sticky page and other forum posts (TDR), I analyzed crash dumps, checked Source console logs, researched errors, verified PSU voltage, ran memtest, and found nothing. I got a new SSD and re-installed Windows: same issue.

In my event log, I found a few random disk errors and traced them to my 3TB SATA data drive. I bought a new hdd, transferred my data, and disconnected this drive from my system, and all the issues were resolved.
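For anyone who wants to check their own System log for the same kind of thing, here is a rough sketch in Python that shells out to the Windows `wevtutil` tool. The provider names `disk` and `nvlddmkm` are assumptions about how these events are tagged on a given machine, and `build_query` / `recent_events` are hypothetical helper names, so treat this as a starting point rather than a definitive check.

```python
import subprocess
import sys


def build_query(provider, max_events=20):
    """Build a wevtutil command listing recent System-log events from one provider."""
    # XPath filter on the event's provider name (assumed name, adjust as needed).
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # read newest events first
        "/f:text",           # plain-text output instead of XML
    ]


def recent_events(provider, max_events=20):
    """Run the query and return wevtutil's text output (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_query(provider, max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a Windows machine, `print(recent_events("disk"))` followed by `print(recent_events("nvlddmkm"))` would show whether disk events appear around the same times as the driver errors.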

My notebook with a GTX 980M has the exact same issues as you’ve described in your reply.
What kind of disk errors did you get? I watched my event logs and didn’t find anything bad regarding my HDD.