Hi, I am hoping someone can help me figure out an issue I started having last week.
I have used VMware Workstation Pro on my Win 10 Pro machine for years with no problems. Last week, I also installed WSL2 and Docker Desktop for work. Since then, any time I play a game, my computer will BSOD and reboot. It doesn’t happen right away, but usually within 30 minutes of playing. Each time, the game freezes and the audio gets stuck on whatever sound was playing at the moment of the freeze. Once in a while the game will unfreeze and be playable again, but most of the time it ends in a BSOD.
I tried rolling drivers back twice using the DDU/manual install method but that hasn’t worked. I also uninstalled WSL2 to test if that is the issue and the problem went away so I know it is tied to WSL2 somehow.
My system is running some older but nice hardware from the Intel Sandy Bridge days.
I have also checked the crash dumps with WinDbg. The crashes happen in different places most of the time. The only consistent things are that a game is running when the crash occurs and that I always get a DPC WATCHDOG error, which makes me think the heavier load on the video card and WSL2 are somehow not playing well together.
I have another computer built about 5 years ago. It also has WSL2 and the same video card (1080) and it plays fine with no errors.
At this point, I am wondering if my older hardware just doesn’t want to work in this instance. Does anybody have insight into this type of problem?
Ok, I can try there. He probably thought WSL2 and this Linux forum had a connection and I am willing to explore at this point. :) Thanks for the direction!
Installing WSL2 also enables the bare metal virtualization layer of Windows 10, along with a lot of functionality that depends on it. That has side effects, for instance on your VMware Workstation installation, which now has to use a different (much slower) path to virtualize.
As you are not running anything inside WSL2 that would exercise the Nvidia GPU, this is not a problem of Nvidia functionality being triggered via WSL2. (There is actually support for doing that, as the OS exposes a dedicated interface into a WSL2 virtual machine; cf. Leveling up CUDA Performance on WSL2 with New Enhancements | NVIDIA Technical Blog.)
So, as others have said, installing WSL2 per se does not make this a Linux problem; your problem is most likely more tied to the Windows bare metal hypervisor layer coming with WSL2.
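One way to confirm that the Windows hypervisor is actually running is to look at the output of `systeminfo`, which reports "A hypervisor has been detected" in its Hyper-V Requirements section when one is active. Below is a minimal Python sketch under that assumption. The function name `hypervisor_detected` is made up for illustration, and the check relies on the English wording of the message, so it will not work on a localized Windows install.

```python
import platform
import subprocess

# Phrase printed by `systeminfo` on English-language Windows when a hypervisor is running.
HYPERVISOR_MARKER = "a hypervisor has been detected"


def hypervisor_detected(systeminfo_text: str) -> bool:
    """Return True if the given `systeminfo` output reports an active hypervisor."""
    return HYPERVISOR_MARKER in systeminfo_text.lower()


if __name__ == "__main__":
    if platform.system() == "Windows":
        # Run the built-in Windows tool and scan its text output.
        output = subprocess.run(
            ["systeminfo"], capture_output=True, text=True, check=True
        ).stdout
        print("Hypervisor running:", hypervisor_detected(output))
    else:
        print("This check only makes sense on Windows.")
```

Running it before and after removing WSL2 should show whether the hypervisor layer is what changed. Note that the same message also appears when Windows itself runs inside a virtual machine.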
DPC stands for Deferred Procedure Call (Introduction to DPC Objects - Windows drivers | Microsoft Docs). In essence, trouble there results from messy interaction between very low-level components of your Windows installation. Hence all the advice out there for Windows boils down to “fix up the software stack for all the low-level components and pray for the best”.
Really appreciate the reply. I started reading through the link you gave on DPCs and it makes sense with what I am seeing. It appears as if an ISR is not able to start for whatever reason and the system hangs. The bare metal virtualization you are talking about is Hyper-V, right? I had to enable a special option in my VMware VMs to get them to perform well after installing WSL2 and found out Hyper-V is really at work under the hood. This seems like a Microsoft issue more than anything, so I'm not sure I should even post this to another forum here. What do you think?
The product “Hyper-V” (along with WSL2 and quite a few other components in recent versions of Windows) builds on top of that bare metal hypervisor. In Windows 11 this layering is nicely visible in the “Windows Features” dialog; prior to that it was not as evident.
If I were in your situation, I’d pick one of the following two strategies:
A: do nothing, live with it
B: rebuild the complete software stack from scratch, starting with BIOS updates and BIOS defaults, then the OS and the very latest drivers. Leave out as much “invasive” tooling as you can (e.g. VPN software, antivirus tooling - except Microsoft Defender -, VMware products, VirtualBox, anything that goes kernel-level). Do that in stages; that might enable you to identify a trigger if the problem crops up again.
With strategy A you will continue to have problems, but waste no effort. Strategy B will cost you considerable time and effort, with unknown results, and it might not even be feasible for you if you depend on that machine. AFAICT, there is no other effective / efficient strategy.
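The staged approach in strategy B amounts to adding one component at a time and testing after each step. A rough Python sketch of that bookkeeping follows. The stage names and the `first_failing_stage` helper are hypothetical, and the pass/fail values have to come from your own test sessions (for example, an hour of gaming without a freeze).

```python
from typing import Optional, Sequence, Tuple

# Each entry: (stage name, True if the machine stayed stable after this stage).
StageResult = Tuple[str, bool]


def first_failing_stage(results: Sequence[StageResult]) -> Optional[str]:
    """Return the first stage after which the system became unstable, or None."""
    for name, stable in results:
        if not stable:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical log in installation order; these are not real results.
    log = [
        ("BIOS update + defaults", True),
        ("clean Windows install", True),
        ("latest GPU driver", True),
        ("WSL2 / hypervisor features", False),
    ]
    print("First unstable stage:", first_failing_stage(log))
```

The stage that first turns the result to unstable is the likely trigger, or at least the point where the bad interaction starts.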
Forget about any kind of forums or support. This is very specific to your machine and its currently installed software stack and configuration.
I think solution A is my best path forward at this point since I can’t spend a lot of time on trying to fix this. I have tested a newer computer running a similar stack with no problems so I will just wait until I do a new computer rebuild to do more thorough testing since I will have an idea of what to look for. I also have a workaround for now that keeps me from crashing so while not optimal, I can live with it.
Thanks for walking through the issue with me. The detailed thoughts you gave were of great help. I think you just saved me countless hours! :)