455.23.04 driver produces very large shader caches - results in constant cache invalidation

I’m a bit tired of complaining about this all the time through various channels, so I’m just going to quote an email that I sent two months ago to linux-bugs@nvidia.com describing the bug in detail:

Hi.

I’ve recently noticed that Vulkan Developer driver 450.56.02, at least when compiling shaders for Vulkan apps running through the Wine compatibility layer with DXVK, seems to generate GLCache shader caches that are far larger than those generated by mainline drivers, and that these caches get invalidated all the time even though the driver version stays the same.

For instance, with Overwatch installed through the Lutris application, with a pre-filled DXVK state cache of roughly 39 thousand valid shader pipeline entries, and launched with the __GL_SHADER_DISK_CACHE_PATH environment variable set to give it a separate shader cache location, the size of GLCache under Nvidia driver 450.57 is 86.6 MB, and it persists just fine through multiple launches of the app.

However, under the 450.56.02 and even the 450.56.06 driver, the size of GLCache under the same conditions is almost 10 times that - 833 MB. And that shader cache does not persist - it gets invalidated by the driver with every launch of the app and recompiled again, which I find unacceptable, as compiling so many shaders takes a lot of time, during which the game isn’t playable.
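To reproduce the size comparison above, the on-disk size of a GLCache directory can be measured with a short script. This is only a rough sketch: the helper names `dir_size_bytes` and `to_mb` are made up for illustration, and the directory you pass in should be whatever you set __GL_SHADER_DISK_CACHE_PATH to.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, recursively.

    Returns 0 if the directory does not exist or is empty.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def to_mb(n_bytes):
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return n_bytes / 1_000_000
```

Calling `to_mb(dir_size_bytes(cache_dir))` after the game has finished compiling shaders, once per driver version, gives numbers comparable to the ones quoted above.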

However, if I raise the size limit at which the Nvidia driver invalidates its shader cache by setting the __GL_SHADER_DISK_CACHE_SIZE environment variable to 1000000000, the shader cache doesn’t get invalidated anymore, although its size of course still stays rather large.
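For anyone who wants to try this workaround, here is a minimal sketch of launching a program with both environment variables set. The two variable names and the value 1000000000 are taken from the text above; the function names `shader_cache_env` and `launch`, and any path or command you pass in, are hypothetical and only for illustration.

```python
import os
import subprocess


def shader_cache_env(cache_path, cache_size_bytes=1_000_000_000, base_env=None):
    """Return a copy of the environment with the shader cache variables set.

    cache_path       -> __GL_SHADER_DISK_CACHE_PATH (separate cache location)
    cache_size_bytes -> __GL_SHADER_DISK_CACHE_SIZE (value used in the report)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["__GL_SHADER_DISK_CACHE_PATH"] = cache_path
    env["__GL_SHADER_DISK_CACHE_SIZE"] = str(cache_size_bytes)
    return env


def launch(command, cache_path, cache_size_bytes=1_000_000_000):
    """Run command (a list of arguments) with the modified environment."""
    # Make sure the cache directory exists before the program starts.
    os.makedirs(cache_path, exist_ok=True)
    env = shader_cache_env(cache_path, cache_size_bytes)
    return subprocess.run(command, env=env, check=False)
```

The same effect can be had by setting the two variables in whatever launcher or shell is already used to start the game; the script only makes the settings explicit and leaves the rest of the environment untouched.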

I understand that the Vulkan Developer driver is not a stable driver, but it was my impression that changes made in Vulkan Developer drivers eventually get picked up by the stable driver series, so I hope this issue gets resolved before that happens.

Despite my reporting this to Nvidia, as I predicted, this issue is now present in the mainline 455.23.04 driver and now affects many Linux Overwatch players.
This is not even the first time this happened - this exact issue was first noticed with the early 440.xx drivers almost a year ago, and after similar complaining, the change that caused it was simply reverted in a later release. Was Nvidia expecting it to just go away all by itself without properly addressing it???
And I don’t even think this issue affects just Overwatch: there are quite a few D3D games with lots of shaders that work well with DXVK/Wine, and iirc Quake Champions is one of them. Not that this issue affecting only one application would make it less important.
I represent the Lutris development team and this bug has been incredibly frustrating to deal with, as it creates a very negative experience for our new users and floods our support channels with help requests for a problem that is not even ours.
I sincerely hope that this was just a very clumsy oversight, that this bug gets properly fixed in a future 455.xx driver release and will not come back randomly in a year, and that Lutris won’t have to add a permanent global hack to work around Nvidia bugs, possibly causing other unforeseen issues.

I play Overwatch regularly with an Nvidia graphics card on Linux, so I have to deal with this issue frequently.

I hope Nvidia addresses this issue but at this point I will most likely switch to AMD because of Nvidia’s lackluster support for the Linux community.

I also encounter this issue. I’d be more than happy to personally provide testing to help resolve the issue. My GPU is a GTX 1080.

I’m running an Arch Linux system, and it takes several minutes for the cache to fully warm up each and every time.

Based on the description, I experienced this with Shadow of the Tomb Raider. It would happen daily and would take 11+ minutes to recache the shaders.

With a 2080 Ti, games like Overwatch take an extremely long time to load shaders on Linux. For as much as my card costs, I shouldn’t be stuttering so much in the first several minutes of a game.

I am also affected by this bug. Please give it some attention.

Affected too :-/ I will revert to an old release, but that’s just a workaround.

Same issue here!

The increase in observed shader disk cache usage is intended, as additional shader information is being stored to improve application runtime performance.

A change increasing the default size of the shader disk cache will be made available in the next driver release series (after 455). Moving forward we will continue to optimize our shader disk cache usage for performance as well as size.
