DXGI_ERROR_DEVICE_REMOVED because of DXGI_ERROR_DEVICE_HUNG

Hi there,

We have a number of users experiencing driver errors with our GPU path tracer on RTX 3060 cards. The path tracer builds quite a big command list (we use DirectX 12), and sometimes after executing it we get the “device removed” error (DXGI_ERROR_DEVICE_REMOVED). The reported removal reason is that the device hung (DXGI_ERROR_DEVICE_HUNG). The Windows event log shows errors like “Display driver nvlddmkm stopped responding and has successfully recovered.”
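As an illustration of how the removal reason can be read, here is a minimal sketch, not the actual application code. It maps the value returned by `ID3D12Device::GetDeviceRemovedReason()` to a readable name. The constants repeat the DXGI error values from `winerror.h` so the snippet compiles without the Windows SDK; the names `removedReasonName` and the `k...` constants are hypothetical helpers made up for this example.

```cpp
#include <cstdint>
#include <string_view>

// Values mirror the DXGI error codes declared in winerror.h. They are
// redefined here (assumption: for illustration only) so the sketch builds
// without the Windows SDK headers.
constexpr std::uint32_t kOk                  = 0x00000000u; // S_OK: device not removed
constexpr std::uint32_t kInvalidCall         = 0x887A0001u; // DXGI_ERROR_INVALID_CALL
constexpr std::uint32_t kDeviceRemoved       = 0x887A0005u; // DXGI_ERROR_DEVICE_REMOVED
constexpr std::uint32_t kDeviceHung          = 0x887A0006u; // DXGI_ERROR_DEVICE_HUNG
constexpr std::uint32_t kDeviceReset         = 0x887A0007u; // DXGI_ERROR_DEVICE_RESET
constexpr std::uint32_t kDriverInternalError = 0x887A0020u; // DXGI_ERROR_DRIVER_INTERNAL_ERROR

// Hypothetical helper: turns a removal reason into a name suitable for logs.
constexpr std::string_view removedReasonName(std::uint32_t hr) {
    switch (hr) {
        case kOk:                  return "S_OK (device not removed)";
        case kInvalidCall:         return "DXGI_ERROR_INVALID_CALL";
        case kDeviceRemoved:       return "DXGI_ERROR_DEVICE_REMOVED";
        case kDeviceHung:          return "DXGI_ERROR_DEVICE_HUNG";
        case kDeviceReset:         return "DXGI_ERROR_DEVICE_RESET";
        case kDriverInternalError: return "DXGI_ERROR_DRIVER_INTERNAL_ERROR";
        default:                   return "unknown HRESULT";
    }
}
```

On Windows the input would come from something like `static_cast<std::uint32_t>(device->GetDeviceRemovedReason())`, called after a submit or present call has failed with DXGI_ERROR_DEVICE_REMOVED.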

We handle device-lost errors according to the MSDN recommendations, and sometimes that works. But if the error happens several times during a render (usually after 3-4 device re-creations), we cannot create the device anymore, because it disappears from the device list returned by DirectX.
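The recovery flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the real implementation: `RecoveryHooks`, `recoverDevice` and `RecoveryResult` are hypothetical names, the adapter is identified by a plain 64-bit id standing in for the adapter LUID, and the two callbacks stand in for the real DXGI/D3D12 calls (for example `IDXGIFactory1::EnumAdapters1` and `D3D12CreateDevice`).

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

enum class RecoveryResult { Recovered, AdapterMissing, GaveUp };

// Hypothetical hooks wrapping the platform calls, so the policy itself
// stays independent of the Windows SDK.
struct RecoveryHooks {
    // Returns the ids of the adapters that are currently enumerated.
    std::function<std::vector<std::uint64_t>()> listAdapters;
    // Tries to re-create the device on the given adapter; true on success.
    std::function<bool(std::uint64_t)> recreateDevice;
};

// Re-enumerates adapters before every attempt, so the "adapter vanished
// from the list" case is reported separately from "re-creation failed".
RecoveryResult recoverDevice(const RecoveryHooks& hooks,
                             std::uint64_t adapterId,
                             int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        const std::vector<std::uint64_t> adapters = hooks.listAdapters();
        const bool present =
            std::find(adapters.begin(), adapters.end(), adapterId) != adapters.end();
        if (!present) {
            return RecoveryResult::AdapterMissing;
        }
        if (hooks.recreateDevice(adapterId)) {
            return RecoveryResult::Recovered;
        }
    }
    return RecoveryResult::GaveUp;
}
```

Distinguishing `AdapterMissing` from `GaveUp` makes it possible to log exactly when the adapter stops being enumerated, which is the symptom described here.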

This happens with the current drivers (536.23) and with older ones (around 531), but it didn’t happen about a year ago when we implemented this feature. I can’t really tell what the driver version was back then, though.

Unfortunately, there is no way I can provide a simple project that reproduces the problem, but if you could download and install the software, I would provide a sample scene and instructions on how to crash the driver. This requires a software license, so please contact me directly via the email address in my profile.

If there is a way to enable extra logging and get more information, or if you need any logs or dumps, I am more than ready to help.

Thank you :)

Hi there @appsforlife, welcome to the NVIDIA developer forums.

Does your company have some form of developer relations with NVIDIA already? If not I might be hard-pressed to find someone who would download and debug your app for you.

But let’s try to see if we can still help somehow.

Do you have a stack trace that would possibly narrow it down to a bit more than just nvlddmkm?
What SDKs or APIs are you using to build your path tracer? It is probably not Assembly?

Maybe with that information we can take it further.

Thanks!