presentKHR still blocks on windows even when using VK_KHR_present_wait

mizvekov · April 24, 2023, 11:40pm

NVIDIA’s Vulkan drivers have a long-standing piece of technical debt where presentKHR blocks until the image is presented, which can be a very long time in FIFO mode.
Even if that is a valid implementation, it’s a tough pill to swallow, since the spec doesn’t even allow a timeout parameter for that call.

The VK_KHR_present_wait extension is supposed to provide a vkWaitForPresentKHR function which can be used to wait for presentation.

The NVidia Windows Vulkan drivers claim to implement that extension, but according to my tests, the blocking behavior on presentKHR is still there, which makes the utility of this extension a bit dubious.

Is this behavior as intended, or is that a problem on my end?

I attach the demo which I used for my tests, which is just vkcube.cpp extended to support the extension.

First patch is the main implementation, second patch adds printing of some metrics, and I also attach the full source code for convenience.

The diff is against the demo in SDK version 1.3.243.0.

FWIW the same demo works as expected when tested on a WINE Linux setup with Intel graphics.

On NVIDIA on Windows it prints metrics which suggest all the waiting is happening in presentKHR:
present:13.488300ms waitForPresent:0.059500ms
On Wine on Linux with Intel graphics it prints metrics which suggest no blocking in presentKHR, with all the waiting happening in waitForPresent:
present:0.151900ms waitForPresent:30.070200ms
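For readers who want to collect comparable numbers, here is a minimal sketch of how per-call timings like these could be gathered. It is not the code from the attached patches; `elapsedMs` and `formatMetrics` are hypothetical helper names, and the output format simply mirrors the lines above.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical helper: runs `call` once and returns the wall-clock time it took, in milliseconds.
template <class F>
double elapsedMs(F&& call) {
    const auto start = std::chrono::steady_clock::now();
    call();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Hypothetical helper: formats two timings in the same style as the metrics quoted in this post.
std::string formatMetrics(double presentMs, double waitForPresentMs) {
    char buf[96];
    std::snprintf(buf, sizeof buf, "present:%fms waitForPresent:%fms",
                  presentMs, waitForPresentMs);
    return buf;
}
```

In a render loop, the calls to vkQueuePresentKHR and vkWaitForPresentKHR would each be wrapped in a lambda passed to `elapsedMs`, and the two results passed to `formatMetrics`.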

The affected setup is a Windows 11 machine with an RTX 3060 connected to a PHL 499P9 monitor.

Thanks.
commit-420063b (8.2 KB)
commit-678d8b8 (1.7 KB)
cube.cpp (138.0 KB)

MarkusHoHo · April 25, 2023, 1:48pm

Hello @mizvekov, thanks for pulling the issue from Discord! And welcome to the NVIDIA developer forums.

Let’s see if we can get some feedback.

mizvekov · September 13, 2023, 2:43pm

Ping, in case this was forgotten about.

Thanks.

dark_sylinc · April 16, 2025, 5:52pm

I’m not from NVIDIA.

And sorry for the necropost, I just want to point out several things for anyone who bumps into this post (like I just did):

The NVidia Windows Vulkan drivers claim to implement that extension, but according to my tests, the blocking behavior on presentKHR is still there, which makes the utility of this extension a bit dubious.

Is this behavior as intended, or is that a problem on my end?

This behavior is allowed by the standard. The spec says: “Calls to vkQueuePresentKHR may block, but must return in finite time”.

Originally drivers were intended to block in vkAcquireNextImageKHR which has a timeout value like you said, but due to various technicalities, this information is not known until vkQueuePresentKHR so drivers chose to block there on certain OSes.

Mesa RADV on X11 on Linux also waits on vkQueuePresentKHR.

The VK_KHR_present_wait extension is supposed to provide a vkWaitForPresentKHR function which can be used to wait for presentation.

The NVidia Windows Vulkan drivers claim to implement that extension, but according to my tests, the blocking behavior on presentKHR is still there, which makes the utility of this extension a bit dubious.

That is not what VK_KHR_present_wait is for. vkQueuePresentKHR blocks until at least one swapchain image has become available again (e.g. you’ve created a swapchain with 4 images and you’ve submitted all 4 of them).

This extension allows you to wait until a specific present request (identified by its present ID) has actually been presented, to prevent the CPU from getting too far ahead of the GPU and thus reduce latency. This is useful when you don’t want to be more than e.g. 1 or 2 frames behind but VkSwapchainCreateInfoKHR::minImageCount is substantially larger.

While you can use VkFences to keep the CPU from getting too far ahead, a VkFence lets you know when the GPU is done rendering frame N, not when frame N has been presented. This gap between work done and presentation increases latency.
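To make that pacing idea concrete, here is a minimal sketch of the bookkeeping involved. `PresentPacer` is a hypothetical helper, not part of Vulkan or of the vkcube demo. It is written against a generic wait callback so it compiles without the Vulkan headers; in a real renderer the callback is assumed to wrap vkWaitForPresentKHR(device, swapchain, presentId, timeout), and the IDs it hands out are assumed to be passed to vkQueuePresentKHR through a VkPresentIdKHR structure (VK_KHR_present_id).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper: limits how many presents the CPU may queue ahead of the display.
class PresentPacer {
public:
    // wait(presentId, timeoutNs) should return true once that present has been shown.
    // Assumption: in real code this wraps
    //   vkWaitForPresentKHR(device, swapchain, presentId, timeoutNs) == VK_SUCCESS
    using WaitFn = std::function<bool(std::uint64_t, std::uint64_t)>;

    PresentPacer(std::uint64_t maxFramesAhead, WaitFn wait)
        : maxFramesAhead_(maxFramesAhead), wait_(std::move(wait)) {
        assert(maxFramesAhead_ > 0);
        assert(wait_);
    }

    // ID to place in VkPresentIdKHR::pPresentIds for the next vkQueuePresentKHR.
    // IDs start at 1 and strictly increase; 0 means "no present ID" in VK_KHR_present_id.
    std::uint64_t nextPresentId() { return ++lastQueuedId_; }

    // Call after queuing a present. Waits until at most maxFramesAhead_ presents
    // are still outstanding; returns true without waiting during the first frames.
    bool throttle(std::uint64_t timeoutNs) {
        if (lastQueuedId_ <= maxFramesAhead_) return true;
        return wait_(lastQueuedId_ - maxFramesAhead_, timeoutNs);
    }

private:
    std::uint64_t maxFramesAhead_;
    std::uint64_t lastQueuedId_ = 0;
    WaitFn wait_;
};
```

With maxFramesAhead set to 1 or 2, the CPU stays close to what is on screen even if the swapchain has more images than that, which is the latency benefit described above. Whether the wait itself or vkQueuePresentKHR ends up absorbing the time is up to the driver.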

mizvekov · April 21, 2025, 5:28pm

Thanks for the response.

I think the problem is that for a swapchain with a very small number of images (i.e. 2), vkWaitForPresentKHR on the NVIDIA driver does not do anything useful: it just returns immediately. On other drivers, even those which normally block on vkQueuePresentKHR, it waits until the point where you can call present without blocking at all.

I think the vkcube demo I provided does a nice job of showing the problem.
It’s a shame a demonstration of this extension was never incorporated there.

If the behavior of the NVIDIA driver is allowed, I think it is still an issue that there is no way to query the implementation about whether it’s going to behave that way.
