nvidia-smi uses all of RAM and swap

Hello.

Ran into this myself. My system was slow and my RAM was completely consumed. I finally realized the cause was nvidia-smi; another tool was calling it in a loop, which flooded RAM.

To verify, I made sure the looping tool was dead, then ran nvidia-smi directly, and watched the RAM get consumed.

Wanna watch? https://www.youtube.com/watch?v=zU1gfNk4kH0

Best part (not shown): subsequent runs don’t reuse whatever nvidia-smi cached; each run starts fresh, pegging a CPU core while it fills all RAM.
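If you would rather get numbers than watch a video, here is a rough Python sketch of the check described above: it starts a command and samples MemAvailable from /proc/meminfo until the command exits. The names parse_meminfo, read_available_kib and watch_command are made up for this post, and the 60-second timeout is an arbitrary guess, there only so a runaway process gets killed before RAM is gone.

```python
import subprocess
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: number}; most fields are in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_available_kib(path="/proc/meminfo"):
    """Return the current MemAvailable value (0 if the field is missing)."""
    with open(path) as f:
        return parse_meminfo(f.read()).get("MemAvailable", 0)


def watch_command(cmd, interval=0.5, timeout=60.0, read_mem=read_available_kib):
    """Run cmd and sample available memory until it exits or the timeout hits.

    Returns (available at start, lowest value seen while running).
    """
    start = lowest = read_mem()
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    deadline = time.monotonic() + timeout
    try:
        while proc.poll() is None and time.monotonic() < deadline:
            lowest = min(lowest, read_mem())
            time.sleep(interval)
    finally:
        if proc.poll() is None:
            proc.kill()  # stop a runaway process before it exhausts RAM
        proc.wait()
    return start, lowest


# Example (not run here):
#   start, lowest = watch_command(["nvidia-smi"])
#   print(f"MemAvailable dropped by {start - lowest} kB while nvidia-smi ran")
```

A large drop between the two numbers, while nothing else on the machine is busy, would match what I saw. Other processes allocating memory at the same time will skew the result, so treat it as a rough indicator only.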

zoey@Clippy:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 24.10
Release:	24.10
Codename:	oracular
zoey@Clippy:~$ nvidia-smi
Thu Oct 10 22:08:23 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0  On |                  N/A |
|  0%   51C    P8             21W /  350W |      75MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A    118489      G   /usr/bin/gnome-shell                           68MiB |
+-----------------------------------------------------------------------------------------+

Running sudo chmod o-w /var/run/nvidia-persistenced/socket, as suggested on developer.nvidia.com above, makes nvidia-smi respond instantly without consuming everything in sight. However, I strongly dislike magical fixes and would love a solution that doesn’t require me to fiddle with file permissions like this to work around a bug.
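To see whether a machine is in the state that the chmod above changes, a small Python sketch like the following can report whether the "other" write bit is set on the socket. The helper names is_world_writable and describe are made up for this post. It only inspects permissions; it does not explain why clearing that bit helps, which I still don't know.

```python
import os
import stat

# Path taken from the chmod command above.
SOCKET_PATH = "/var/run/nvidia-persistenced/socket"


def is_world_writable(path):
    """True if the 'other' write bit is set, i.e. the bit that chmod o-w clears."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWOTH)


def describe(path):
    """Return an ls-style permission string followed by the path."""
    mode = os.stat(path).st_mode
    return f"{stat.filemode(mode)} {path}"


# Example (not run here):
#   print(describe(SOCKET_PATH))
#   print("world-writable:", is_world_writable(SOCKET_PATH))
```

If is_world_writable returns True before the chmod and False after, the workaround has been applied. Whether it persists after the persistence daemon restarts or the machine reboots is something I have not checked.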
