X Server crash with invalid Vulkan swapchain surface format

If you create a VkDebugUtilsMessengerEXT and the pfnUserCallback is null, the validation layers do warn about this. However, it looks like the driver does not perform a null check on the function pointer and blindly follows it, causing an X server crash. Basic example:

VkDebugUtilsMessengerEXT debug_messenger = {};
// Only sType is set; pfnUserCallback stays null (zero-initialized), which is invalid usage.
VkDebugUtilsMessengerCreateInfoEXT debug_create_info = { VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT };
// Extension entry points have to be looked up at runtime.
auto vkCreateDebugUtilsMessengerEXT = reinterpret_cast<PFN_vkCreateDebugUtilsMessengerEXT>(vkGetInstanceProcAddr(instance, "vkCreateDebugUtilsMessengerEXT"));
vkCreateDebugUtilsMessengerEXT(instance, &debug_create_info, nullptr, &debug_messenger);

While this is invalid, it should not crash the X server. The 460 drivers on Windows do not misbehave in any way in this situation, so this appears to be specific to Linux.

Can you please run sudo nvidia-bug-report.sh after reproducing the X server crash and attach the bug report log?

Also, do you have a test app you could send me that reproduces this?

Ah yes I have that report and forgot to attach it. However, after a bit more investigation I don’t think it’s an invalid debug messenger callback causing the crash but an invalid swap chain creation. I’ll figure out exactly what conditions cause it and post a full reproduction program.

Awesome, thanks! In the meantime, the crash dump in the bug report log could still be useful.

Alright, I have a program which will reliably crash the X server. I was mistaken: it wasn't a null pointer to a debug messenger callback function, that was a red herring. It happens when creating a swapchain with an invalid surfaceFormat. The Vulkan validation layers do warn that the surface format is invalid. The full source code has a comment at the place which causes the crash.

This uses xcb, Vulkan 1.2.162, NVIDIA X server 450 and drivers 460.
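
For anyone hitting the same thing before a fixed driver is available, an application-side guard against the invalid surfaceFormat case can be sketched roughly as below. This is only an illustration, not the reproduction program: SurfaceFormat is a hypothetical stand-in for VkSurfaceFormatKHR so the snippet compiles without the Vulkan SDK, and is_supported and choose_surface_format are made-up helper names. In a real program the supported list would presumably come from vkGetPhysicalDeviceSurfaceFormatsKHR, and the chosen value would go into imageFormat and imageColorSpace of VkSwapchainCreateInfoKHR.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for VkSurfaceFormatKHR (assumption, for illustration only).
struct SurfaceFormat {
    std::int32_t format;      // corresponds to a VkFormat value
    std::int32_t colorSpace;  // corresponds to a VkColorSpaceKHR value
};

// Returns true if `wanted` appears in the list of formats reported for the surface.
bool is_supported(const std::vector<SurfaceFormat>& supported,
                  const SurfaceFormat& wanted) {
    for (const SurfaceFormat& f : supported) {
        if (f.format == wanted.format && f.colorSpace == wanted.colorSpace) {
            return true;
        }
    }
    return false;
}

// Returns `wanted` if it is supported, otherwise falls back to the first
// reported format. Returns std::nullopt if the surface reports no formats.
std::optional<SurfaceFormat> choose_surface_format(
        const std::vector<SurfaceFormat>& supported,
        const SurfaceFormat& wanted) {
    if (supported.empty()) {
        return std::nullopt;
    }
    if (is_supported(supported, wanted)) {
        return wanted;
    }
    return supported.front();
}
```

With a check like this, a format that the surface does not report never reaches swapchain creation, so the application stays within valid usage regardless of how the driver handles the invalid case.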

The same program using Win32 on Windows will crash with a divide-by-zero error within nvogl.dll, which still isn't perfect behavior but doesn't completely kill the window manager. It's probably safer on Windows because the drivers run in user space there.

nvidia-bug-report.log.gz (441.0 KB)

Awesome, thanks! This reproduced the problem right away and I filed internal bug number 3256934.

Thanks for your report. This issue will be fixed in our next driver release.
