VK_ERROR_DEVICE_LOST in many game titles

Lately I have been getting VK_ERROR_DEVICE_LOST errors with hangs in games running under Wine on Gentoo Linux with GNOME Desktop 3.36. It started after a recent driver update, on versions newer than 440.100.


Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

This bug probably has to do with DXVK. Please check the GitHub issue: Final Fantasy XIV: VK_ERROR_DEVICE_LOST · Issue #1791 · doitsujin/dxvk · GitHub. Multiple people have been experiencing this bug on every driver version after 440.100.

…but nobody seems to have reported it here; at least, the search didn’t bring up anything related.
nvidia-bug-report.log?
hardware setup?
reproduction steps?
games affected?
I guess @amrits would like to play some new games.

I’ve emailed my nvidia-bug-report.log.gz to linux-bugs@nvidia.com.

That should be in the nvidia-bug-report.log.gz

Play the games long enough. Or, specifically, test with DXVK 1.7.1, which lacks this workaround: [dxvk] Only use half of the DEVICE_LOCAL | HOST_VISIBLE heap on Nvidia · doitsujin/dxvk@16a51f3 · GitHub
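For anyone wondering what that workaround amounts to, here is a rough sketch of the idea in Python. It is not DXVK's actual code (DXVK is written in C++); the `MemoryType` class and `heap_budgets` helper are made up for illustration. The only things taken from elsewhere are the Vulkan memory property bit values and the NVIDIA PCI vendor ID. The assumption is that any heap backing a memory type that is both DEVICE_LOCAL and HOST_VISIBLE gets its usable budget cut in half on NVIDIA.

```python
from dataclasses import dataclass

# Bit values from the Vulkan specification (VkMemoryPropertyFlagBits).
DEVICE_LOCAL = 0x1
HOST_VISIBLE = 0x2

NVIDIA_VENDOR_ID = 0x10DE  # PCI vendor ID for NVIDIA


@dataclass
class MemoryType:
    """Hypothetical stand-in for a Vulkan memory type entry."""
    heap_index: int
    property_flags: int


def heap_budgets(vendor_id, heap_sizes, memory_types):
    """Return a per-heap allocation budget in bytes (illustrative helper)."""
    budgets = list(heap_sizes)
    if vendor_id != NVIDIA_VENDOR_ID:
        return budgets  # other vendors keep the full heap size
    wanted = DEVICE_LOCAL | HOST_VISIBLE
    for mem_type in memory_types:
        if (mem_type.property_flags & wanted) == wanted:
            # Only allow half of the heap behind this memory type to be used.
            budgets[mem_type.heap_index] = heap_sizes[mem_type.heap_index] // 2
    return budgets
```

With a small DEVICE_LOCAL | HOST_VISIBLE heap (sizes vary by card), the application would then stop allocating from it well before the driver reports it as full.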

Here is a list of games people have reported to the DXVK bug tracker that crash (I would provide links to them but I am restricted to 3 links per post):

Ghostrunner
Assassin’s Creed Odyssey
Final Fantasy XIV
World of Warcraft and Thief 2014
Unigine Superposition

Final Fantasy XIV is really fun!

I’ve attached my DXVK log for the crash to this report. I am not sure how helpful it will be.

ffxiv_dx11_d3d11.log (61.5 KB)

Edit: Correction: DXVK 1.7.2 is the release that introduced the workaround. Therefore, testing with DXVK 1.7.1 should reproduce this crash much more frequently.

Edit: Some more debugging info.

Working versions:
440.100

Broken versions:
450.66
450.80.02
455.45.01
460.27.4

I guess you then already did all you could. Now you can only keep bugging.

There’s not really anything I can do but wait for Nvidia to release a new driver and hope it’s fixed, and that hasn’t been working well so far. I’m sure the gaming community would very much appreciate it if Nvidia actually looked into this bug and got it fixed.

Still broken in 460.32.3.

ffxiv_dx11_d3d11.log (15.2 KB)
ffxiv_dx11_dxgi.log (6.3 KB)

Still broken 460.39.0.

ffxiv_dx11_d3d11.log (21.4 KB)
ffxiv_dx11_dxgi.log (6.3 KB)

Note that the mpv project has also been encountering this problem when using Vulkan acceleration for decoding video. What a coincidence, right?

https://github.com/mpv-player/mpv/issues/8360

Nice find. The XID 31 points to a memory management bug. Was this also reported in the logs when playing games?
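If it helps with checking, here is a small Python sketch for pulling Xid entries out of kernel log text (for example, the output of `dmesg` saved to a file). The regular expression assumes the usual `NVRM: Xid (<device>): <code>, ...` shape of the driver's kernel messages, and the sample line is made up for illustration, not taken from anyone's log in this thread.

```python
import re

# Assumed shape of an NVIDIA kernel log entry: "NVRM: Xid (<device>): <code>, ..."
XID_PATTERN = re.compile(r"NVRM: Xid \((?P<device>[^)]*)\): (?P<code>\d+)")


def find_xids(log_text):
    """Return (device, xid_code) pairs found in kernel log text."""
    return [
        (match.group("device"), int(match.group("code")))
        for match in XID_PATTERN.finditer(log_text)
    ]


# Illustrative sample line only; real entries carry more fields after the code.
sample = "[ 1234.567890] NVRM: Xid (PCI:0000:01:00): 31, pid=4242, Ch 00000018\n"
print(find_xids(sample))
```

An empty result for the time of the crash would suggest the driver did not log an Xid for it.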

Unfortunately, nothing shows up in my logs for the crash in DXVK, but considering FFXIV seems to trigger a lot of resolution changes, I wouldn’t be surprised if they’re related.

DXVK:

info:  Setting display mode: 1920x1080@60
info:  Setting display mode: 1920x1080@60
info:  Setting display mode: 1920x1080@60
info:  Setting display mode: 1920x1080@60
info:  Setting display mode: 1920x1080@60
info:  Setting display mode: 1920x1080@60
err:   DxvkSubmissionQueue: Failed to sync fence: VK_ERROR_DEVICE_LOST
err:   DxvkSubmissionQueue: Command submission failed: VK_ERROR_DEVICE_LOST
info:  Setting display mode: 1920x1080@60

MPV

[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] vk->QueueSubmit(cmd->queue, 1, &sinfo, cmd->fence): VK_ERROR_DEVICE_LOST
[vo/gpu/vulkan/libplacebo] Failed holding swapchain image for presentation
[vo/gpu] Failed presenting frame!
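To compare logs like the two excerpts above across driver versions, something along these lines could tally the VK_ERROR_DEVICE_LOST lines by the message that precedes the error code. This is just a sketch; `count_device_lost` is a made-up helper, and it only assumes that the error name appears literally in the line, as it does in both the DXVK and mpv output.

```python
from collections import Counter


def count_device_lost(lines):
    """Count VK_ERROR_DEVICE_LOST lines, keyed by the text before the error code."""
    counts = Counter()
    for line in lines:
        if "VK_ERROR_DEVICE_LOST" in line:
            # Keep the part before the error name and drop the trailing colon.
            source = line.split("VK_ERROR_DEVICE_LOST")[0].strip().rstrip(":").strip()
            counts[source] += 1
    return counts


# Usage: feed it the lines of a saved log file, for example
# with open("ffxiv_dx11_d3d11.log") as log_file:
#     print(count_device_lost(log_file))
```

For the DXVK excerpt this would give one count each for the "Failed to sync fence" and "Command submission failed" messages, and for the mpv excerpt a single key for the repeated QueueSubmit line.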

I’ve noticed that the more I alt-tab or switch away from the game window, the more often the bug occurs.

I tried the Vulkan Beta Driver at the suggestion of one of the users in the GitHub bug report. It still crashes, sadly.

Still broken. 455.50.04

ffxiv_dx11_d3d11.log (4.7 KB)
ffxiv_dx11_dxgi.log (6.4 KB)

Beta Driver. Still Broken. 455.50.07

ffxiv_dx11_d3d11.log (4.7 KB)
ffxiv_dx11_dxgi.log (6.7 KB)

According to this PR you should try 455.50.10

Fixed a bug with the host-visible device-local memory heap, where if an allocation failed due to space constraints, it could cause the application to crash on future Vulkan function calls

Yeah the DXVK dev already let me know in the GitHub Issue thread.

Hello buggy drivers my old friend… it’s time to crash again. Broken 455.50.10.

ffxiv_dx11_d3d11.log (5.1 KB)
ffxiv_dx11_dxgi.log (6.7 KB)

Broken. 465.19.01.

ffxiv_dx11_d3d11.log (4.7 KB)
ffxiv_dx11_dxgi.log (6.7 KB)

2021-04-01: I’ve downgraded all the way to 440.100 again and I’ll be testing that for a good month or so.

Prime offload involved?
https://forums.developer.nvidia.com/t/doom-2016-vulkan-renderer-is-broken-since-440-drivers-optimus/160332/8?u=generix