Non-existent shared VRAM on NVIDIA Linux drivers

Has anyone tested this with a desktop environment that only uses Vulkan (if such a DE even exists)? It could be an OpenGL-only problem, since vulkaninfo seems to recognize system RAM as a secondary memory heap.

Gnome is Vulkan.

The application has to implement Vulkan heap usage on its own, which is different from the automatic use of host memory (which apparently works for any application on non-NVIDIA hardware…).
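
To make that concrete, here is a minimal sketch of what that opt-in looks like from the application side. It assumes an already-created VkPhysicalDevice and VkDevice; the helper function names are made up for illustration, and only the vk* calls are real Vulkan API. The application queries the available memory types and, when a device-local (VRAM) allocation fails, retries with a host-visible type in system RAM, which is exactly the fallback that does not happen automatically.

// Sketch: falling back to host-visible (system) memory when VRAM runs out.
// Helper names are illustrative; only the vk* calls are real Vulkan API.
#include <vulkan/vulkan.h>
#include <stdint.h>

// Find a memory type index matching typeBits and the requested property
// flags, or UINT32_MAX if no such type exists.
static uint32_t find_memory_type(VkPhysicalDevice phys, uint32_t typeBits,
                                 VkMemoryPropertyFlags wanted)
{
    VkPhysicalDeviceMemoryProperties props;
    vkGetPhysicalDeviceMemoryProperties(phys, &props);
    for (uint32_t i = 0; i < props.memoryTypeCount; i++) {
        if ((typeBits & (1u << i)) &&
            (props.memoryTypes[i].propertyFlags & wanted) == wanted)
            return i;
    }
    return UINT32_MAX;
}

// Try VRAM (DEVICE_LOCAL) first; if the allocation fails with an
// out-of-device-memory error, retry in host-visible system memory,
// which the GPU reaches over the PCIe bus.
static VkResult allocate_with_fallback(VkPhysicalDevice phys, VkDevice dev,
                                       VkMemoryRequirements reqs,
                                       VkDeviceMemory *out)
{
    VkMemoryAllocateInfo info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .allocationSize = reqs.size,
    };

    uint32_t idx = find_memory_type(phys, reqs.memoryTypeBits,
                                    VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
    if (idx != UINT32_MAX) {
        info.memoryTypeIndex = idx;
        VkResult r = vkAllocateMemory(dev, &info, NULL, out);
        if (r == VK_SUCCESS)
            return r;   /* got real VRAM */
        if (r != VK_ERROR_OUT_OF_DEVICE_MEMORY)
            return r;   /* some other failure, don't mask it */
    }

    /* VRAM is full (or no device-local type matched): retry in system RAM. */
    idx = find_memory_type(phys, reqs.memoryTypeBits,
                           VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
                           VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
    if (idx == UINT32_MAX)
        return VK_ERROR_OUT_OF_DEVICE_MEMORY;

    info.memoryTypeIndex = idx;
    return vkAllocateMemory(dev, &info, NULL, out);
}

A real renderer would typically go further and watch budgets via the VK_EXT_memory_budget extension so it can migrate data before allocations start failing, which is exactly the bookkeeping this thread complains is pushed onto every application.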

1 Like

Guys, try this config file. It has nothing to do with shared VRAM, but it could save a lot of VRAM on your machine (KDE Wayland here):

{"profiles":[{"name":"save-vram","settings":[{"key":"GLVidHeapReuseRatio","value":0}]}],"rules":[{"pattern":{"feature":"true","matches":""},"profile":"save-vram"}]}

Put it at /etc/nvidia/nvidia-application-profiles-rc.d/vram or ~/.nv/nvidia-application-profiles-rc, then restart the program or just reboot the computer.

Happy birthday to this thread!
It’s now been two years without a single acknowledgement of this issue from Nvidia.

9 Likes

Has anyone managed to find any official documentation on what the value of this GLVidHeapReuseRatio key exactly means? A quick web search shows that people put either 0 or 1 as the value, and I wonder what the difference is…

Thanks!

NVIDIA officially explained this parameter in their release note, where they said:

Added a new application profile key, “GLVidHeapReuseRatio”, to control the amount of memory OpenGL may hold for later reuse, as well as some application profiles for several Wayland compositors using the new key to work around issues with excessive video memory usage.

They have set it to 1 for popular window managers (e.g. kwin, at /etc/nvidia/nvidia-application-profiles-rc). It has also been found useful for other, less popular window managers (such as niri, see issue here).

Things used to work fine with my 8GB 4060. After I bought a 4K 160Hz screen, my VRAM frequently filled up and ParaView (a 3D analysis tool) crashed. I found it helpful to set this parameter globally in my situation.

This high-level announcement does not explain what exactly the property means: a ratio of what to what?

“They”? Who do you mean? I haven’t seen any official Nvidia recommendation: if you are aware of such, please do post a link. Thanks! :)

The only thing I’ve managed to find is this GitHub comment, which sets it to 1 in the attached file, but does not explain why it should be 1 and not 0 or, for example, 221.

In your post you set it to 0: could you explain why?

Many thanks!! :)

UPDATE:
I started reading the whole discussion on GitHub and found this, which quotes some documentation:

This specifies the maximum percentage of video memory OpenGL will hold in a per-process pool for later reuse. For example, a value of 10 would allow an OpenGL process to hold up to 10% of video memory in its pool.

“They”? Who do you mean?

I mean NVIDIA officially. The config file /etc/nvidia/nvidia-application-profiles-rc is installed together with the NVIDIA driver (at least on NixOS), and in this config they set GLVidHeapReuseRatio for kwin.

In your post you set it to 0: could you explain why?

No, I can’t explain it. Actually, I’m not familiar with GPU and OpenGL stuff. I just blindly set it to zero and found it solved my problem. I haven’t even tested the difference between setting it to 1 or 0.

1 Like

Ah, that explains it :) Thanks! (Apparently this file is not present in Debian DC packages.)

See also the GitHub comment I linked in the update to my previous post: it quotes a description of the parameter.

UPDATE: this comment explains where this description comes from (/usr/share/nvidia/nvidia-application-profiles-${VERSION}-key-documentation)

I ran into this issue daily with my RTX 3070: playing a simple game would eventually lead to a full desktop crash. First Discord’s UI would stop responding, then Firefox would crash, then a few other apps, then kwin. Exiting the game would let me use the system again, but everything I had open was gone.

Got tired of waiting for a fix and switched to AMD. Now I can play games while recording with OBS, take screenshots, watch YouTube, and chat on Discord, all at the same time, without worrying about crashes.

5 Likes

This issue is still present. If only NVIDIA would implement shared VRAM in its drivers. Oh, what a world that would be.

2 Likes

This thread was started Jul/19/2023 and there hasn’t been so much as an acknowledgement from NVIDIA … Happy birthday +1 to the thread … it would be nice if this got fixed before the 580 drivers put me in legacy mode … betcha that whiz-bang AI could sort that out ….

3 Likes

It’s way beyond time to address this

DirectX12 performance is terrible on Linux

[575.64] NVRM Out of memory error causes dGPU to not be usable after some time

2 Likes

Any progress on this issue? I think it’s a much bigger problem than the DX12 performance issue.

No one from the dev team has even posted on this thread yet.

4 Likes

“betcha that whiz-bang AI could sort that out ….”

xD

Stop being poor, buy a GPU with more VRAM (or switch to AMD). That’s their answer.

It looks like NVIDIA fanboys can’t handle the truth and have flagged my post as inappropriate.
But it doesn’t matter how I express it: NVIDIA’s (lack of) action speaks more loudly than my post. If they don’t fix their broken drivers, it will make lots of capable “low budget” GPUs unusable and push you toward pricier GPUs… or AMD… or consoles…

4 Likes

It is 14 Oct 2025 and we are still seeing Linux vs. Windows feature-support discrimination … so unprofessional of NVIDIA.

1 Like

There must be a technical reason why this is, and it’s obviously a hard nut for NVIDIA to crack. I honestly don’t think they are just ignoring this.

The driver CAN use system RAM, but it can’t handle VRAM OOM situations very well (or at all) and just lets applications crash instead.

I think this is rather a question of fixing memory leaks and race conditions in the memory manager.

The driver obviously can use what people most likely call “shared memory”:

  • It can use shadow memory to temporarily free vmem and load it back in from the shadow if needed. This is most likely used only for allocations that need high bandwidth, and it’s the calling application which must allow this because it knows the performance constraints best. The driver cannot just wildly guess here.
  • It can use system memory directly over PCI bus. This is slow for several reasons (not only because the bus is slower). The driver doesn’t do that automatically, it is the calling application which needs to explicitly ask for that because it knows the performance implications best.

Still, there are probably allocations where the driver could implicitly use shadow buffers itself. But using real shared memory (accessing system memory over the PCI bus only) is probably not what people want: it’s slow. And you probably can’t just migrate memory back and forth like swap memory; that would be inefficient. Also, I don’t think PCI devices can just randomly access any physical system memory location at will: There’s IOMMU. There’s DMA. There’s BAR. There are translation tables (applications do not know where the physical allocation is placed in memory; the driver needs to track and orchestrate that). All of these only provide limited, temporarily fixed windows into system memory - so you cannot just use system memory to expand video memory and access it at random. This needs to be a coordinated process, and that’s why shadow memory exists (it keeps a handle to “non-present” memory, and then the GPU can copy it back to a known location via the PCI bus and window adjustment after it has marked some other memory as “non-present” and cheaply discarded it without going over the bus).

I think the primary problem here is broken, racy and incomplete memory management - not “missing” shared memory. dmesg logs during high memory pressure tell a lot about what’s going wrong: assertions fail, allocations leak (especially after playing games with raytracing, there’s a big difference between how much memory is actually in known allocations and what the driver reports as free), null pointers, NOT_PRESENT exceptions… We can at least see progress here: the driver now logs these problems and doesn’t just crash the system, and most likely doesn’t crash the game either (instead, textures and geometry go missing first, until the game itself cannot handle the situation).

From that perspective, the drivers have become pretty stable. But their internal memory management is still a mess full of races and leaks.

If you ask me, that is mostly a result of porting the driver to newer memory-management models like DRM, KMS, and GBM (or whatever the various subsystems are called, or whichever ones actually involve memory management) while NVIDIA still needs to keep compatibility with Xorg memory management and especially with the shared code base of the Windows driver (and Windows has a very different memory-management model).

2 Likes