mwelsh
November 12, 2020, 3:07am
When running a certain sequence of Vulkan commands, the app will hang. If the process is not terminated, the graphics driver will hang/become unstable.
Here is a RenderDoc capture of the issue. This also causes RenderDoc to hang/be unstable in the same way on my machine:
vulkan-nvidia-hang.zip (9.4 KB)
RenderDoc v1.10
After bisecting driver versions:
GeForce Game Ready Driver 456.38 (Sep 17) and later hang. The issue persists on the latest driver, 457.30.
452.06 and earlier do not have the issue.
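For anyone who wants to warn users or pick a different backend on affected drivers, here is a minimal sketch of a version check based on the bisect result above. The function names and the constant are hypothetical, it assumes the version is available as a `major.minor` string such as the one shown in the NVIDIA control panel, and it treats every release from 456.38 onward as suspect because no fixed driver is known at the time of writing.

```rust
/// First driver release observed to hang in the bisect above.
/// Assumption: there is no known fixed release, so no upper bound is checked.
const FIRST_BAD: (u32, u32) = (456, 38);

/// Parses a driver version string such as "457.30" into (major, minor).
/// Returns None if the string is not in `major.minor` form.
fn parse_driver_version(s: &str) -> Option<(u32, u32)> {
    let (major, minor) = s.trim().split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// True if the version is at or after the first release seen to hang.
/// Unparseable strings are treated as not suspect.
fn is_suspect_driver(s: &str) -> bool {
    match parse_driver_version(s) {
        Some(version) => version >= FIRST_BAD,
        None => false,
    }
}
```

With these helpers, `is_suspect_driver("457.30")` is true and `is_suspect_driver("452.06")` is false, matching the versions listed above.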
Windows 10 Pro 64-Bit
NVIDIA GeForce RTX 2080 Ti
Vulkan SDK 1.2.154.1
Full Nvidia control panel system info dump:
GitHub issue with full reproducer project and info:
opened 10:13PM - 13 Oct 20 UTC
external: driver-bug
api: vulkan
**Description**
It's a little hard to turn this into a small repro case so please forgive the vagueness.
After submitting a frame to wgpu while using the Vulkan backend, Vulkan seems to become unstable, which manifests in a few ways:
- A hang of the application
- Graphics device crashes and PC dies (this happened to me a few times)
- The submit seemingly returns okay but nothing actually happened, and the device will become lost the next time we try to draw a frame
Our application has two ways to reproduce the bug:
- When rendering to a window, we repeatedly submit frames as a typical game would. This often just locks up but sometimes will gracefully give you an error about the device being lost.
- Rendering one single frame to a texture, saved to disk.
In this second case, we perform the following sequence of events:
- Create a texture
- Draw a frame to a command encoder
- Submit the command encoder to the queue
- Copy the texture buffer to disk, much like the capture example in wgpu-rs
The submit appears to return okay, but the texture is completely empty when we'd expect to see some graphics in it. The application then freezes when dropping `wgpu::Instance` (at least for me on Windows; this seems to vary). For reference, the image it spits out should be identical to [this one](https://raw.githubusercontent.com/ruffle-rs/visual-tests/master/tests/drawing_api/fills_and_lines/actual.png).
I've taken a trace of this single-frame capture and had to manually close the toml as the recording can't finish. The trace seems to freeze when played back, and I'm unable to get RenderDoc to play nice and show anything from it.
This worked for us in the past; I think as recently as 24 days ago I was running this without issues. The same code, unchanged, no longer works today.
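To make the "completely empty" symptom checkable in code, here is a minimal sketch of the readback side of the sequence above. It is not the code from Ruffle or from the wgpu-rs capture example. The helper names are hypothetical, and the constant is hard-coded to the value of `wgpu::COPY_BYTES_PER_ROW_ALIGNMENT` (256) so the sketch has no dependencies. It assumes the mapped buffer holds one padded row per texture row.

```rust
/// Row alignment wgpu requires for texture-to-buffer copies.
/// Assumption: hard-coded copy of wgpu::COPY_BYTES_PER_ROW_ALIGNMENT.
const COPY_BYTES_PER_ROW_ALIGNMENT: usize = 256;

/// Bytes per row after rounding up to the required alignment.
fn padded_bytes_per_row(width: usize, bytes_per_pixel: usize) -> usize {
    let unpadded = width * bytes_per_pixel;
    let align = COPY_BYTES_PER_ROW_ALIGNMENT;
    let padding = (align - unpadded % align) % align;
    unpadded + padding
}

/// Copies the pixel bytes out of a padded readback buffer, dropping row padding.
/// Rows that are missing or shorter than one unpadded row are skipped.
fn strip_row_padding(
    padded: &[u8],
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
) -> Vec<u8> {
    let unpadded = width * bytes_per_pixel;
    let stride = padded_bytes_per_row(width, bytes_per_pixel);
    let mut out = Vec::with_capacity(unpadded * height);
    if stride == 0 {
        return out; // zero-width image: nothing to copy
    }
    for row in padded.chunks(stride).take(height) {
        if let Some(pixels) = row.get(..unpadded) {
            out.extend_from_slice(pixels);
        }
    }
    out
}

/// True if every byte is zero, i.e. nothing appears to have been drawn.
fn is_all_zero(pixels: &[u8]) -> bool {
    pixels.iter().all(|&byte| byte == 0)
}
```

Running `is_all_zero` on the stripped pixels before saving the image would let the exporter report the failed draw explicitly instead of silently writing an empty file.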
**Repro steps**
I haven't been able to create a minimal reproducible example, but you can see it in our project with the following steps:
- Grab [this swf](https://github.com/ruffle-rs/visual-tests/raw/master/tests/drawing_api/fills_and_lines/test.swf)
- Clone [Ruffle](https://github.com/ruffle-rs/ruffle/)
- `cargo run --package=ruffle_desktop -- test.swf` if you want to see it visually, with multiple frames
- `cargo run --package=exporter -- test.swf` if you want to see the single frame saved to a texture on disk
You can apply this commit to reduce the amount of rendering done to the bare minimum that still crashes, with that particular swf: https://github.com/Dinnerbone/ruffle/commit/b4f173dbbc9db0128cd2c591d71efbd49e024852
**Expected vs observed behavior**
I expect to either get an error describing how we're using wgpu wrong, or for it to work :D
**Extra materials**
- [A trace.zip of saving one frame to a texture](https://github.com/gfx-rs/wgpu/files/5374447/trace.zip)
**Platform**
Reproduced on Windows.
Only affects the Vulkan backend. We're seeing some instability with DX12 but aren't certain it's related yet.
Reproduced on wgpu 0.6 and https://github.com/gfx-rs/wgpu-rs/commit/e3eadca8c626beb9a1c25c359b0e20f6fdef00c4