Non-existent shared VRAM on NVIDIA Linux drivers

I have the same issue with my RTX 3070 and the 555 driver. Playing games causes xwayland to crash with

[drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object

I bought an AMD card yesterday. I think this will fix it ;)

I have an NVIDIA 2080 Ti with NVIDIA 555.58.2 and the same problem when I run a game with high VRAM usage. XWayland crashes with the same error.

@omarhanykasban706 @Schaufelmeister.Storch @florian.richer

In which games did you guys experience crashes of XWayland? I want to try those games on my 3070 Max-Q and file a bug report on the NVIDIA open modules GitHub page.

OUTBRK on very high settings. The game is not optimized, so it makes XWayland crash very often.

Since I can’t edit this thread, I will add some clarifications in this reply. What the NVIDIA Linux drivers are missing is GTT (graphics translation table)/GART (graphics address remapping table), which is something like the shared GPU memory that exists on Windows. GTT/GART is the technical name for it in the Linux kernel.

GTT has been in the Linux kernel for 12 years or more, and the AMD (amdgpu) and Intel (i915) drivers have implemented it for a long time. It’s baffling that NVIDIA has been unable to implement GTT in their drivers. This is not a brand-new feature in constant development like Wayland.

Edit: The GTT feature from the Linux Kernel is not the same as nvidia-smi -gtt (GPU Target Temperature)


Note for experienced people: If you have an AMD GPU, you can run radeontop -d - -l 1 to get something like this:

Dumping to -, line limit 1.
1720222198.786508: bus 01, gpu 0.00%, ee 0.00%, vgt 0.00%, ta 0.00%, sx 0.00%, sh 0.00%, spi 0.00%, sc 0.00%, pa 0.00%, db 0.00%, cb 0.00%, vram 14.08% 574.87mb, gtt 7.33% 289.45mb, mclk 20.00% 0.300ghz, sclk 18.21% 0.214ghz

At the end, you can see this: vram 14.08% 574.87mb, gtt 7.33% 289.45mb. vram is obviously the used VRAM of the GPU and gtt is the system RAM being used by the GPU.
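If you want to log those two numbers over time instead of reading them by eye, here is a minimal Python sketch that pulls vram and gtt out of one radeontop dump line. It assumes the line layout shown above (name, percentage, then a value ending in "mb"); the function name parse_radeontop_line is my own, not part of radeontop.

```python
import re

def parse_radeontop_line(line: str) -> dict:
    """Extract VRAM and GTT usage from a single radeontop dump line.

    Assumes fields look like 'vram 14.08% 574.87mb', as in the dump above.
    """
    usage = {}
    for name in ("vram", "gtt"):
        match = re.search(rf"\b{name} ([\d.]+)% ([\d.]+)mb", line)
        if match:
            usage[name] = {
                "percent": float(match.group(1)),
                "mb": float(match.group(2)),
            }
    return usage

# The dump line quoted earlier in this post.
SAMPLE = (
    "1720222198.786508: bus 01, gpu 0.00%, ee 0.00%, vgt 0.00%, ta 0.00%, "
    "sx 0.00%, sh 0.00%, spi 0.00%, sc 0.00%, pa 0.00%, db 0.00%, cb 0.00%, "
    "vram 14.08% 574.87mb, gtt 7.33% 289.45mb, mclk 20.00% 0.300ghz, "
    "sclk 18.21% 0.214ghz"
)

print(parse_radeontop_line(SAMPLE))
```

A gtt value above zero means the GPU is using system RAM, which is the behavior this thread is asking NVIDIA for.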

If you have an integrated Intel GPU, open htop (to monitor your RAM usage) and run memtest_vulkan (Release v0.5.0 - Tune behavior with large PCIe BARs · GpuZelenograd/memtest_vulkan · GitHub) on the Intel GPU. You will see your RAM usage increase in less than 1 second. This also applies to AMD users, but I would recommend using the script from this repository instead: GitHub - T-X/linux-amdgpu-radeon-vram-swapping-test: Linux amdgpu Radeon VRAM Swapping Test
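For AMD users who would rather not install anything, the amdgpu driver also exposes VRAM and GTT counters in sysfs as mem_info_* files. Below is a rough Python sketch that reads them. The path /sys/class/drm/card0/device is an assumption (your GPU may be card1 or another index), and read_amdgpu_mem is just a name I picked.

```python
from pathlib import Path

def read_amdgpu_mem(device_dir: str = "/sys/class/drm/card0/device") -> dict:
    """Read amdgpu memory counters (reported in bytes) and return them in MiB.

    The default path is an assumption; the card index differs between systems.
    Counters that are missing (e.g. on a non-AMD GPU) are simply skipped.
    """
    result = {}
    for key in ("vram_used", "vram_total", "gtt_used", "gtt_total"):
        counter = Path(device_dir) / f"mem_info_{key}"
        if counter.is_file():
            result[key] = int(counter.read_text().strip()) / (1024 * 1024)
    return result

if __name__ == "__main__":
    for key, mib in read_amdgpu_mem().items():
        print(f"{key}: {mib:.1f} MiB")
```

While a VRAM stress test is running, gtt_used should climb as the driver moves buffers into system RAM.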

In which games did you guys experience crashes of XWayland?

Pretty much any VRAM intensive game will cause a crash after a while. Try Ready or Not with maximum textures, that one seems to crash quite fast.

Could someone from NVIDIA please tell us if this issue is going to be addressed or not? (including an ETA if possible)

It’s a major problem when working with applications that need a lot of VRAM (e.g. Unreal Engine), and my 3060 isn’t cutting it anymore. So I have to decide whether to get a 16GB 40 series card (if this issue gets sorted out) or a 24GB AMD 7900XTX (which apparently gives even more headroom, given that their driver supports shared memory).

What happened to the VRAM management with the 555 drivers? Sometimes Xwayland starts to consume too much VRAM and crashes every open Xwayland app. And I’m not even playing anything really intensive, I’m just using Krita.

[26484.710101] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
[26484.710166] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
[26484.741552] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
[26484.741628] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
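To see how often this happens and on which GPU ID, the kernel log can be scanned for that exact message. This is a minimal Python sketch, assuming the message format shown in the lines above; count_alloc_failures is a hypothetical helper name, and in practice you would feed it the text of dmesg or journalctl -k.

```python
import re
from collections import Counter

# Matches the nvidia-drm allocation failure message quoted in this thread.
PATTERN = re.compile(
    r"\[GPU ID (0x[0-9a-fA-F]+)\] Failed to allocate NVKMS memory for GEM object"
)

def count_alloc_failures(log_text: str) -> Counter:
    """Count NVKMS allocation failures per GPU ID in kernel log text."""
    return Counter(PATTERN.findall(log_text))

# Two of the log lines from the post above.
SAMPLE = """\
[26484.710101] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
[26484.710166] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
"""

for gpu_id, count in count_alloc_failures(SAMPLE).items():
    print(f"GPU {gpu_id}: {count} failed allocations")
```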

For me it happens in Elite Dangerous. I can speed up getting to an XWayland crash by watching youtube videos on my secondary monitor. I now switched to an AMD Radeon RX 7900 XTX and this fixed all issues I ever had with desktop linux.
I really hope nvidia will address this issue soon. I still have an RTX 3060 TI built into my Laptop.

@Fijxu thanks for being so dedicated and researching stuff. Since this issue is related to the proprietary NVIDIA driver, I don’t think opening a GitHub issue in the open-gpu-kernel-modules repository will do much.

I don’t play PC games and I’m still seeing related errors on my work laptop.

[13150.718760] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[13150.720663] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[13177.944113] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[13177.944149] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object

It has a 3050Ti with 4GB VRAM. Right now, nvidia-smi shows around 3GB of VRAM in use, with Firefox using over 1GB of it. Maybe that’s a Firefox bug?

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02              Driver Version: 555.58.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   55C    P8              7W /   35W |    3087MiB /   4096MiB |     13%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      4311      G   /usr/bin/kwin_wayland                         484MiB |
|    0   N/A  N/A      4438      G   /usr/bin/maliit-keyboard                      374MiB |
|    0   N/A  N/A      4496      G   /usr/bin/kded6                                  1MiB |
|    0   N/A  N/A      4505      G   /usr/bin/plasmashell                          208MiB |
|    0   N/A  N/A      4553      G   /usr/libexec/kactivitymanagerd                  1MiB |
|    0   N/A  N/A      4598      G   ...6/polkit-kde-authentication-agent-1          1MiB |
|    0   N/A  N/A      4599      G   /usr/libexec/org_kde_powerdevil                 1MiB |
|    0   N/A  N/A      4600      G   /usr/libexec/xdg-desktop-portal-kde             1MiB |
|    0   N/A  N/A      4751      G   kdeconnectd                                     1MiB |
|    0   N/A  N/A      4911      G   /usr/libexec/DiscoverNotifier                   1MiB |
|    0   N/A  N/A      5379      G   /usr/bin/kwalletd6                              1MiB |
|    0   N/A  N/A      5745      G   /usr/libexec/baloorunner                        1MiB |
|    0   N/A  N/A      7469      G   /usr/lib64/firefox/firefox                   1285MiB |
|    0   N/A  N/A      8039      G   ...bin/plasma-browser-integration-host          1MiB |
|    0   N/A  N/A      8136      G   /usr/bin/konsole                                1MiB |
|    0   N/A  N/A     98509      G   /usr/bin/Xwayland                               2MiB |
|    0   N/A  N/A    100804      G   ...erProcess --variations-seed-version        288MiB |
+-----------------------------------------------------------------------------------------+

Eventually the memory usage gets high enough that xwayland crashes and brings down half the system with it.

I’ve got 64GB system RAM, over half of which is unused…

My personal laptop is a Framework 16 with AMD graphics and I don’t experience this issue on there. Both are running the same OS (Fedora 40) with mostly the same apps (Firefox and VS Code being the two I use the most).

On my KDE system, I noticed that maliit-keyboard was using 374 MB VRAM according to nvidia-smi. If you don’t use the onscreen keyboard, you can disable it in System Settings → Keyboard → Virtual Keyboard, which will free up some VRAM. Not a fix to this shared VRAM issue of course, but until it’s fixed, every little bit of VRAM you can free up helps a bit.
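To find which processes are worth closing, the process table printed by nvidia-smi can be sorted by memory. Here is a rough Python sketch that parses rows in the layout of the table posted earlier in this thread. The regex is my own guess at that layout (it may need adjusting for other driver versions), and top_vram_processes is a hypothetical helper name.

```python
import re

# One process row: | GPU  GI  CI  PID  Type  Process name  <N>MiB |
ROW = re.compile(
    r"^\|\s+(\d+)\s+\S+\s+\S+\s+(\d+)\s+(\S+)\s+(.+?)\s+(\d+)MiB\s+\|$"
)

def top_vram_processes(smi_text: str, limit: int = 5) -> list:
    """Return (pid, process name, MiB) tuples, largest VRAM user first."""
    rows = []
    for line in smi_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append((int(match.group(2)), match.group(4), int(match.group(5))))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]

# Three rows copied from the table posted above.
SAMPLE = """\
|    0   N/A  N/A      4311      G   /usr/bin/kwin_wayland                         484MiB |
|    0   N/A  N/A      4438      G   /usr/bin/maliit-keyboard                      374MiB |
|    0   N/A  N/A      7469      G   /usr/lib64/firefox/firefox                   1285MiB |
"""

for pid, name, mib in top_vram_processes(SAMPLE):
    print(f"{mib:>6} MiB  pid {pid:<7} {name}")
```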

Is this even getting addressed? I believe NVIDIA has said that their GPUs support spilling memory into system RAM when needed, but indeed “glxinfo -B” reports total VRAM == dedicated VRAM.

Is that an issue with the drivers? I am uncertain. Ratchet & Clank Rift Apart crashes, complaining that it has run out of memory on my 1050 Ti, but it runs just fine on Windows at low settings.

Is everyone here delusional, and the drivers do in fact support memory spilling into system RAM? Is it that the game crashes because the driver, while supporting it, fails to report it to whatever the game is using, resulting in the crash?

Is it the game’s fault and should be reported to them? Or is it the driver’s? If so, can we expect a fix in the proprietary driver? Let alone the one with the open kernel modules? Honestly the build I am describing is fine for budget low settings couch 1080p gaming, but this issue is a major obstacle.
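For anyone who wants to repeat the glxinfo check on their own machine, here is a small Python sketch that compares the two numbers. It assumes the output of glxinfo -B contains lines labelled "Dedicated video memory" and "Total available memory" with values in MB, which is how the GL_NVX_gpu_memory_info block is usually printed. The function names are hypothetical, and treating "total greater than dedicated" as a sign of shared memory is only the reading used in this post, not something NVIDIA has confirmed.

```python
import re

def parse_glx_memory(glxinfo_text: str) -> dict:
    """Pull dedicated and total memory (in MB) out of `glxinfo -B` output.

    Assumes lines such as 'Dedicated video memory: <N> MB'.
    """
    fields = {}
    labels = (
        ("Dedicated video memory", "dedicated_mb"),
        ("Total available memory", "total_mb"),
    )
    for label, key in labels:
        match = re.search(rf"{label}:\s*(\d+)\s*MB", glxinfo_text)
        if match:
            fields[key] = int(match.group(1))
    return fields

def reports_shared_memory(fields: dict) -> bool:
    """True if the driver advertises more total memory than dedicated VRAM."""
    return fields.get("total_mb", 0) > fields.get("dedicated_mb", 0)
```

To use it, pass in the text captured from glxinfo -B, for example via subprocess.run(["glxinfo", "-B"], capture_output=True, text=True).stdout.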

I don’t know why, but I think this is the answer: [SOLVED] shared video memory in wine???

Code:
Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce 7300 SE/7200 GS"
Option "AllowSHMPixmaps" "0"
EndSection

“AllowSHMPixmaps”

this allows NVIDIA drivers to run more efficiently while using shared memory

Could any NVIDIA devs comment on this, please?

I monitor memory using this:

nvidia-smi --query-gpu=pstate,utilization.gpu,memory.free,memory.used --format=csv -l 5
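If you want a warning before VRAM actually runs out, the same query can be wrapped in a small script. This is only a sketch: it uses the csv,noheader,nounits format so the values are plain MiB numbers, the 256 MiB threshold is an arbitrary value I picked, and the function names are hypothetical.

```python
import csv
import io
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.free,memory.used",
    "--format=csv,noheader,nounits",
]

def parse_free_used(csv_text: str) -> list:
    """Parse 'free, used' rows (MiB) from the query above, one tuple per GPU."""
    readings = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        try:
            readings.append((int(record[0]), int(record[1])))
        except (IndexError, ValueError):
            continue  # skip blank lines and non-numeric values such as [N/A]
    return readings

def nearly_full(free_mib: int, threshold_mib: int = 256) -> bool:
    """Arbitrary threshold (an assumption): warn when free VRAM drops below it."""
    return free_mib < threshold_mib

if __name__ == "__main__" and shutil.which("nvidia-smi"):
    output = subprocess.run(QUERY, capture_output=True, text=True).stdout
    for free_mib, used_mib in parse_free_used(output):
        flag = "  <-- nearly full" if nearly_full(free_mib) else ""
        print(f"free {free_mib} MiB, used {used_mib} MiB{flag}")
```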

I observe similar issues in the following scenarios:

  • When free VRAM reaches 0, KWin (the KDE window manager) crashes or glitches.
  • Diablo V becomes slow. I have to switch from High => Low => High settings. I notice free VRAM comes back to 1 GB. Then the game fills it and the same issue happens after a while.
  • CUDA applications report out of memory for a few MB that could not be allocated.

All these facts point to the same cause: the NVIDIA driver (mine: 560.28.03) is not able to offload VRAM when it’s getting full, as is done on Windows, from my understanding.

If only we could at least have an official explanation of the issue:

  • Is the feature supported: Yes / No
  • If Yes, is it a configuration issue: Yes / No
  • If Yes, what are the steps to configure it

Thank you

This already defaults to off (0) as per Nvidia’s documentation: Appendix B. X Config Options

Option "AllowSHMPixmaps" "boolean"

Default: off (shared memory pixmaps are not allowed).

What is the state of this issue? I have a GTX 1060 (6GB version) and there are many games I cannot play because they fill my VRAM and crash, even though I have 32 GB of RAM with plenty of it free.

I’m waiting for the RTX 5000 series to upgrade, but if the 5000 series has this issue on Linux too, I will have to switch to AMD.

I’ve just encountered it once or twice, completely crashing my wayland session.

Nvidia please acknowledge this and give us at least a timeline. It’s a pretty awful issue and without this being resolved things will not progress.
I’m on a 3080, and planning to upgrade to 5090, so yeah… I’d like to keep using my system without crashing, please.

I can also confirm this issue.

I am running a 3070 on Nobara (Fedora) 40 with the 565.57.01 nvidia driver.
When I play certain games, they tend to show massive lags after some time. Then it runs fine for a while until a lot of applications crash on my pc. Usually the game crashes and Spotify too for some reason.
In journalctl I also get the error [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate NVKMS memory for GEM object
Followed by a lot of crash reports:

Nov 06 23:00:06 nobara gnome-shell[4660]: X Wayland crashed; attempting to recover
Nov 06 23:00:06 nobara gnome-shell[4660]: Connection to xwayland lost
Nov 06 23:00:06 nobara steam[191724]: X connection to :0 broken (explicit kill or server shutdown).
Nov 06 23:00:06 nobara steam[191724]: X connection to :0 broken (explicit kill or server shutdown).
Nov 06 23:00:06 nobara steam[191724]: X connection to :0 broken (explicit kill or server shutdown).
Nov 06 23:00:06 nobara steam[191724]: X connection to :0 broken (explicit kill or server shutdown).
Nov 06 23:00:06 nobara steam[191724]: X connection to :0 broken (explicit kill or server shutdown).
Nov 06 23:00:06 nobara gnome-shell[4660]: WL: error in client communication (pid 4660)
Nov 06 23:00:06 nobara gnome-shell[5233]: (EE) failed to dispatch Wayland events: Protocol error
Nov 06 23:00:06 nobara gnome-shell[5233]: XWAYLAND: wp_linux_drm_syncobj_surface_v1#97: error 3: Release or Acquire point set but no buffer attached
Nov 06 23:00:06 nobara gnome-shell[5233]: Error getting buffer

Games where this happened for me were Elite Dangerous, Star Citizen and Phasmophobia.

I’ve since updated to 565 with open kernel modules and the issue has now occurred in BG3.

Honestly, how has this not been addressed yet? It’s not an issue on AMD side.
From what I can tell, it seems like the driver just doesn’t handle VRAM exhaustion at all anymore (since 555?) under Wayland when running Xwayland applications. An application can be running at ~100 fps when the issue occurs, and then suddenly the entire system slows down to a crawl, with even the DE running at ~5 fps.

Can we at least get some acknowledgment of the issue and maybe a target driver version for a fix? It’s extremely obnoxious, especially for the people with lower vram (which is most nvidia cards because you refuse to put a reasonable amount of vram on a card).

Same issue here. I am using a 4060 laptop. When VRAM gets nearly full, the game frame rate decreases dramatically.