Vulkan apps crash or freeze on VT switch

Miro256 · April 30, 2023, 8:56pm

When I switch VT (virtual terminal), Vulkan applications either crash or freeze. More annoyingly, this also happens when I lock my screen via light-locker (which I believe switches VT). I am on Debian Testing, driver 525.105.17, NVIDIA RTX 3070.

The simplest app to test is “vkcube”, which comes from the package “vulkan-tools”. After VT switch, it crashes with an assertion failure (seems that some Vulkan call receives VK_ERROR_DEVICE_LOST).

Another app is Chromium with hardware acceleration enabled. Switching VT freezes it for about 15 seconds, after which it seemingly recovers (according to the console output, its GPU process crashes).

Next I tried some Vulkan games on Steam running via Proton, such as Doom Eternal. They all freeze indefinitely.

OpenGL apps do not suffer from this problem. Because of that and the fact I’ve not been able to find a single Vulkan app that survives a VT switch, I believe this may be an NVIDIA driver bug.

nvidia-bug-report.log.gz (304.2 KB)

EDIT: Some additional information that may or may not be helpful:

I only have one GPU (no integrated one).
The console seems to be driven by efifb. Perhaps there is some bad interaction between efifb and nvidia-driver.

aplattner · May 1, 2023, 7:16pm

Unlike OpenGL, which recovers from device mode switches behind the scenes in the driver, Vulkan handles them by returning VK_ERROR_DEVICE_LOST. Support for that is hit and miss in Vulkan apps in general, and I think vkcube is no exception. The version of vkcube that I have just hangs, so crashing on an assertion sounds like an improvement. What it’s supposed to do is recreate its device and continue, but it sounds like that’s not implemented.
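To make the "recreate its device and continue" behaviour concrete, here is a minimal, language-neutral sketch in Python. It does not use any real Vulkan binding: `DeviceLost`, `create_device`, `draw_frame`, and `run_frames` are hypothetical stand-ins, with `DeviceLost` playing the role of a Vulkan call returning VK_ERROR_DEVICE_LOST. A real application would also have to rebuild its swapchain, pipelines, and buffers along with the device.

```python
class DeviceLost(Exception):
    """Hypothetical stand-in for a Vulkan call returning VK_ERROR_DEVICE_LOST."""


def run_frames(create_device, draw_frame, frame_count, max_recoveries=3):
    """Render frame_count frames, recreating the device when it is lost.

    create_device and draw_frame are caller-supplied callables (assumptions
    for this sketch, not part of any real API). Returns the number of
    recoveries that were needed.
    """
    device = create_device()
    recoveries = 0
    frame = 0
    while frame < frame_count:
        try:
            draw_frame(device, frame)
        except DeviceLost:
            if recoveries >= max_recoveries:
                raise  # give up instead of looping forever
            recoveries += 1
            # Everything tied to the old device is gone, so start over
            # with a fresh one and retry the same frame.
            device = create_device()
            continue
        frame += 1
    return recoveries
```

An app that lacks a loop of this shape has nowhere to go after the error, which would be consistent with the hangs and assertion failures described above.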

You can avoid generating VK_ERROR_DEVICE_LOST on VT switches by enabling the NVreg_PreserveVideoMemoryAllocations=1 parameter to the nvidia kernel module. Please note that if that option is enabled, then suspending the system needs to go through the nvidia-suspend.service and nvidia-resume.service systemd units in order for the suspend sequence to work properly. From your bug report log, it looks like those services aren’t installed:

____________________________________________

/usr/bin/systemctl status nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service nvidia-powerd.service

____________________________________________
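For anyone trying this: the parameter is usually set with a line such as "options nvidia NVreg_PreserveVideoMemoryAllocations=1" in a file under /etc/modprobe.d/ (the file name is up to you, and an initramfs rebuild may be needed depending on the distribution). The sketch below is one way to check whether the loaded module picked it up, assuming the driver exposes its parameters as "Name: value" lines in /proc/driver/nvidia/params. The helper names `parse_nvidia_params` and `preserve_enabled` are made up for this example.

```python
from pathlib import Path

# Assumed location of the driver's parameter dump.
PARAMS_PATH = Path("/proc/driver/nvidia/params")


def parse_nvidia_params(text):
    """Parse 'Name: value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            params[key.strip()] = value.strip()
    return params


def preserve_enabled(text):
    """Return True if PreserveVideoMemoryAllocations is reported as 1."""
    return parse_nvidia_params(text).get("PreserveVideoMemoryAllocations") == "1"


if __name__ == "__main__":
    if PARAMS_PATH.exists():
        print("preserved:", preserve_enabled(PARAMS_PATH.read_text()))
    else:
        print(f"{PARAMS_PATH} not found; is the nvidia module loaded?")
```

If this reports that the parameter is not enabled after a reboot, the modprobe option was probably not applied to the module that actually got loaded.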

Miro256 · May 1, 2023, 9:19pm

Thank you very much for your response! It is unfortunate that not even Chromium handles errors properly.

Anyway, I can confirm that the NVreg_PreserveVideoMemoryAllocations=1 parameter allows all of the applications to continue gracefully.

Interesting, I can’t find these services anywhere… It could be that Debian hasn’t packaged them. But oh well, I think I can live without suspend/resume for now.

By the way, I’ve chanced upon your post in “216303 – Commit ee7a69aa38d87a3bbced7b8245c732c05ed0c6ec broke legacy frame buffer with NVIDIA”, where you mention:

I’m looking at making the NVIDIA driver install its own framebuffer console in order to work around this problem, but that will take a little while to develop and get it into production.

Would this also solve the problem (along with others such as slow VT switching and low resolution)? Or would VK_ERROR_DEVICE_LOST still be generated?

aplattner · May 1, 2023, 9:51pm

No, that’s unrelated. The reason VT switching loses the device is because the X driver has to assume that when it’s not on the active VT, the system could suspend at any time. If video memory contents are lost during suspend, then the driver needs to recover anything that was in video memory after resume. For Vulkan, that necessitates a VK_ERROR_DEVICE_LOST. If video memory preservation is enabled, then the X driver knows that memory contents will stay where they are and can allow applications to just continue on as if nothing happened.

Whether the framebuffer console is driven by DRM or some other framebuffer console driver doesn’t make a difference here.
