Hello. I’m running KDE Plasma 5 via Wayland on Arch Linux x64, using a GTX 1070 Ti.
Recently, I updated to the latest NVIDIA driver (545.29.02-4).
Ever since, a Vulkan-based application I’ve been developing has been completely unusable: it is constantly frozen and “not responding”, and if I click on a button drawn via imgui, it takes a minute or longer to react, if it reacts at all. I have made zero changes to my code. This occurs with both a clean build of my app and an older build compiled before the driver update.
It appears to be caused by an NVIDIA driver bug.
In my attempts to debug this issue, I’ve paused my application in the debugger many times, and have noticed that it always (100% of the time) pauses in the same function: vkQueuePresentKHR. This indicates that the vast majority of my application’s runtime is being spent in this function call, which is unusual.
If I place a breakpoint on my application’s call to this function, and another on the code that immediately follows it, I can see that it takes about 7 seconds or more (!!!) for vkQueuePresentKHR to return, which is extremely slow for a present call, and very much seems to be the cause of my extreme performance issues.
Running perf on my application shows that it is spending the vast majority of its time in a single kernel function: _nv040303rm. Googling this gives no results, but the “nv” prefix leads me to believe this is a function from the NVIDIA driver.
I’d like to note that this happens with both debug and release builds of my application, and it happens with or without the Vulkan validation layers enabled. Weirdly, however, it does not seem to occur in other Vulkan-based applications, such as games run via Proton + DXVK. So it must be something my application in particular is doing that triggers this bug. However, even commenting out all of my update/rendering code, so that my application literally only renders a clear color and nothing else, still seems to cause this to happen.
In addition to all of this, ever since the driver update, I’ve also been experiencing various “flickering” issues while using XWayland applications, such as Spotify, Discord, or Visual Studio Code.
It appears as though the contents of the applications’ windows are being presented to the screen, despite the rendering not yet actually being complete, resulting in various visual anomalies, such as some of the contents of the window not being shown on-screen.
I initially was experiencing these issues constantly, to the point where it was very frustrating. Now, for whatever reason, I’m suddenly unable to reproduce them. I’ll update this post with a link to a screen recording showing the issue if I manage to reproduce it again.
UPDATE: I managed to reproduce it: https://www.youtube.com/watch?v=wHVYkRuwYXc. Pay attention to the Spotify window around the 5 and 7 second marks.
These issues do not seem to occur at all in native Wayland applications. When this was happening often at first, forcing the Wayland backend in Electron-based apps, such as Visual Studio Code, completely resolved the flickering issues. Unfortunately, not all apps have a native Wayland version available.
Putting all of this together, it appears to me that both of these problems are caused by a presentation-related bug that was introduced in the latest NVIDIA driver.
I am not sure whether this forum post is the proper way to report a possible bug. I will be happy to provide logs or any other assistance in debugging this issue if needed.