Nvidia driver fails to use system RAM when VRAM is full, leading to crashes

The Nvidia driver fails to use system RAM when VRAM is full, leading to crashes or performance problems. This applies to gaming, not CUDA or similar workloads, where memory sharing is properly supported on Linux (as far as I can tell).
This is a technology that has been supported by all UEFI-compatible GPU drivers and hardware providers, including Intel and AMD, for the past ten years, with the exception of Nvidia's Linux driver. Even Nvidia's Windows driver supports GTT memory sharing, yet the Linux driver doesn't.

This is not a new issue. It has been reported many times over many years.
Here is a non-exhaustive list:

https://forums.developer.nvidia.com/t/vram-allocation-issues/239678

This is very obviously a duplicate issue. I called your phone support line about it and was told to create a post in the forum in order to obtain support. So here I am.

The sysmem fallback is a nightmare. Instead of fighting the driver, I started shrinking the models. LoRA Lens uses SVD math to keep the weights small enough to stay in the dedicated VRAM. No sharing needed, no crashes. It’s the only way to get stability on Linux/Windows right now.

I didn’t mention AI or LLMs. What are you talking about? Of course it’s related, but it’s also a total non sequitur.

I can confirm that on my GeForce 750 Ti (a 2 GB card) on Windows, I could run any workload I wanted and it offloaded things to system memory seamlessly. It has done this since 2014, when I put this system together.

On the same system but running Arch Linux (any Wayland DE) with the latest 580 drivers, I could barely open a few browsers and terminals before running out of GPU memory, at which point things would crash or stop working normally. I included far more detail in one of the threads you linked, "Non-existent shared VRAM on NVIDIA Linux drivers".
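
For anyone who wants to watch dedicated VRAM fill up while opening applications, here is a minimal sketch, not a definitive tool. It assumes `nvidia-smi` is on the PATH and that it reports plain integers for both fields (some setups print `[N/A]` instead, which this does not handle). The helper names `parse_memory` and `vram_usage` are made up for illustration.

```python
import subprocess

# Ask nvidia-smi for used and total dedicated VRAM in MiB, one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory(csv_text):
    """Turn 'used, total' lines (MiB) into a list of (used, total) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def vram_usage():
    """Run nvidia-smi once and return (used, total) in MiB for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory(result.stdout)
```

Calling `vram_usage()` every second or so while opening browser windows should show the used figure climbing toward the card's total. This only reports dedicated VRAM usage; it says nothing about whether the driver spills into system memory.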

I ended up spending almost 2 full weeks troubleshooting as much as I could. I wrote an 8,500-word blog post, made videos, and opened issues on GitHub, which got a reply from NVIDIA but no resolution.

All in all, I still use the same system, but I have since switched to an AMD GPU (RX 480) and the problem went away entirely. I’m only posting this to confirm that it very strongly appears to be a problem with NVIDIA's drivers and not AMD's.
