In a Vulkan compute application I am working on, I call vkQueueSubmit on my GPU's compute queue with a command buffer that performs a large dispatch of a fairly sizable shader. A subsequent call to vkQueueWaitIdle returns VK_ERROR_DEVICE_LOST, and calling vkWaitForFences with the fence passed to vkQueueSubmit always returns VK_NOT_READY. If I comment out a call to one large function in the shader's entry point (large in the sense that it substantially grows the program's control-flow graph), the error does not occur. This leads me to believe that something about the shader is breaking the driver. One possibility is that the shader overflows the stack on the GPU cores, since the program requires fairly sizable stack frames; it could also be something else.
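To be explicit about how I am reading the return codes, here is a minimal sketch. It is not my actual code: describe_wait_result is a hypothetical helper, and the constants are local copies of the VkResult values from the Vulkan headers so that the snippet compiles without the SDK. As far as I can tell from the spec, VK_NOT_READY is the "not signaled yet" result documented for vkGetFenceStatus, while vkWaitForFences documents VK_TIMEOUT when its timeout expires, so the sketch handles both.

```c
/* Local copies of the few VkResult values involved here (assumption: these
 * mirror the values in vulkan_core.h; real code should include
 * <vulkan/vulkan.h> and use VkResult directly). */
enum {
    SKETCH_VK_SUCCESS           = 0,
    SKETCH_VK_NOT_READY         = 1,
    SKETCH_VK_TIMEOUT           = 2,
    SKETCH_VK_ERROR_DEVICE_LOST = -4
};

/* Hypothetical helper: turn the result of a fence wait or fence status
 * query into a short description of what it means for the submission. */
const char *describe_wait_result(int result)
{
    switch (result) {
    case SKETCH_VK_SUCCESS:
        return "work finished";           /* fence is signaled */
    case SKETCH_VK_NOT_READY:
        return "fence not signaled yet";  /* status query, still pending */
    case SKETCH_VK_TIMEOUT:
        return "wait timed out";          /* timed wait expired first */
    case SKETCH_VK_ERROR_DEVICE_LOST:
        return "device lost";             /* logical device is unusable */
    default:
        return "unexpected result";
    }
}
```

In my case the queue wait ends up in the "device lost" branch, while the fence never reaches "work finished".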
Here are copies of my nvidia-bug-report.log.gz, vktrace_out.vktrace, and the compiled SPIR-V shader: