High Xorg VRAM usage

This is what I get when I run nvidia-smi. Is this normal? I am starting out with AI and have read that more VRAM allows a bigger batch size, which helps lower training time. So 1.4GB of VRAM already in use out of 8GB total looks kinda sketchy. If this is unusual, what should I do?

P.S. I do have a 4K monitor set to 60FPS and thought that might’ve been the reason. I set it to 1080p60 and rebooted, but it still used ~1GB+ VRAM.

─┬─[ pts/3 0 21-04-10 11:45:22 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> nvidia-smi
Sat Apr 10 11:45:29 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3070    Off  | 00000000:09:00.0  On |                  N/A |
| 41%   47C    P8    16W / 220W |   1532MiB /  7979MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2269      G   /usr/lib/xorg/Xorg                102MiB |
|    0   N/A  N/A      3164      G   /usr/lib/xorg/Xorg                917MiB |
|    0   N/A  N/A      3741      G   /usr/bin/gnome-shell              267MiB |
|    0   N/A  N/A    297921      G   ...AAAAAAAAA= --shared-files      226MiB |
|    0   N/A  N/A    304816      G   gnome-control-center                3MiB |
|    0   N/A  N/A    304931      G   /usr/bin/nvidia-settings            0MiB |
+-----------------------------------------------------------------------------+
─┬─[ pts/3 2 21-04-10 11:48:31 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> apt list --installed | grep nvidia | grep driver

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

nvidia-driver-460/focal,now 460.67-1pop0~1616430777~20.04~71e1ad1 amd64 [installed]
─┬─[ pts/3 0 21-04-10 11:48:46 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> apt list --installed | grep nvidia | grep xorg

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

xserver-xorg-video-nvidia-460/focal,now 460.67-1pop0~1616430777~20.04~71e1ad1 amd64 [installed,automatic]
─┬─[ pts/3 0 21-04-10 11:45:29 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> uname -a
Linux flameboi 5.11.0-7612-generic #13~1617215757~20.04~97a8d1a-Ubuntu SMP Thu Apr 1 21:15:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
─┬─[ pts/3 0 21-04-10 11:46:01 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> lsb_release -a
No LSB modules are available.
Distributor ID:	Pop
Description:	Pop!_OS 20.04 LTS
Release:	20.04
Codename:	focal
─┬─[ pts/3 0 21-04-10 11:53:46 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> apt-cache show xserver-xorg | grep Version
Version: 1:7.7+19ubuntu14
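
To see at a glance which programs hold the VRAM, the per-process rows of the nvidia-smi table can be totalled by process name. The following is only a rough sketch: it parses the text layout shown above (driver 460.67), which may differ in other driver versions, and the function name `vram_by_process` is made up for this example.

```python
import re
from collections import defaultdict

# Matches one process row of the table as printed above, for example:
# |    0   N/A  N/A      3164      G   /usr/lib/xorg/Xorg                917MiB |
# Assumption: columns are GPU, GI, CI, PID, Type, name, memory in MiB.
ROW = re.compile(
    r"^\|\s+\d+\s+\S+\s+\S+\s+(\d+)\s+\S+\s+(.+?)\s+(\d+)MiB\s+\|$"
)


def vram_by_process(table_text):
    """Return {process name: total MiB} for every process row found."""
    totals = defaultdict(int)
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            _pid, name, mib = match.groups()
            totals[name] += int(mib)
    return dict(totals)


# Process rows copied from the first nvidia-smi output in this thread.
sample = """\
|    0   N/A  N/A      2269      G   /usr/lib/xorg/Xorg                102MiB |
|    0   N/A  N/A      3164      G   /usr/lib/xorg/Xorg                917MiB |
|    0   N/A  N/A      3741      G   /usr/bin/gnome-shell              267MiB |
|    0   N/A  N/A    297921      G   ...AAAAAAAAA= --shared-files      226MiB |
|    0   N/A  N/A    304816      G   gnome-control-center                3MiB |
|    0   N/A  N/A    304931      G   /usr/bin/nvidia-settings            0MiB |
"""

usage = vram_by_process(sample)
for name, mib in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"{mib:5d} MiB  {name}")
print(f"{sum(usage.values()):5d} MiB  total of listed processes")
```

For the sample above, the two Xorg rows add up to 1019 MiB and all listed processes to 1515 MiB, a little under the 1532MiB shown in the Memory-Usage column. On a live system the text could be captured with `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` and passed to the same function.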

That’s really absurdly high vmem usage. Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

I wasn’t aware of nvidia-bug-report.sh before, sorry for not attaching it in the first place. But here it is. nvidia-bug-report.log.gz (425.7 KB)

You have some scaling set, 5120x2880->3840x2160, though I don’t think that should have the impact on vmem usage you’re experiencing.
The monitors are being requeried in rapid succession; I think I’ve seen this before on Pop!_OS. IIRC, it was some plugin/config service that had to be disabled.

It most probably doesn’t, but at this point I’m not sure. As I previously mentioned, I set the monitor resolution to 1080p60 and rebooted. A while after getting back into my regular workflow, the VRAM usage was back to its “normal high” of ~1.3GiB.

Also, I have the resolution set to 3840x2160 and fractional scaling at 150% in gnome-control-center. Is that why my scaled resolution is 5120x2880? And is this a normal thing? I saw the same resolution in OBS too; running xrandr --current and xdpyinfo | grep dimensions confirms that it is indeed 5120x2880 pixels. But my monitor’s menu shows that it is receiving input at 3840x2160@60 (which is also the optimal resolution, as the menu suggests). Maybe the scaling from 5120x2880@60 to 3840x2160@60 is causing higher than normal VRAM usage?
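
The 5120x2880 figure can be reproduced with a little arithmetic. The sketch below assumes that GNOME on X11 implements fractional scaling by rendering the desktop at the next whole-number scale and letting RandR scale the result down to the panel’s native mode; that assumption matches the 5120x2880->3840x2160 reported here, but treat it as an illustration only. The function names are made up for this example, and 4 bytes per pixel is also an assumption.

```python
import math


def x11_fractional_render_size(native_w, native_h, scale):
    """Estimate the rendered desktop size for a fractional scale factor.

    Assumption: the desktop is laid out at native/scale logical pixels,
    rendered at ceil(scale) and then downscaled to the native mode.
    """
    integer_scale = math.ceil(scale)
    logical_w = round(native_w / scale)
    logical_h = round(native_h / scale)
    return logical_w * integer_scale, logical_h * integer_scale


def buffer_mib(width, height, bytes_per_pixel=4):
    """Size of one uncompressed buffer in MiB (assumes 32-bit pixels)."""
    return width * height * bytes_per_pixel / 2**20


width, height = x11_fractional_render_size(3840, 2160, 1.5)
print(width, height)                    # 5120 2880
print(buffer_mib(3840, 2160))           # 31.640625 (native 4K buffer)
print(buffer_mib(width, height))        # 56.25 (scaled desktop buffer)
```

Under these assumptions a single full-screen buffer grows from about 32 MiB to about 56 MiB. A compositor keeps several such buffers plus one surface per window, so real usage is some multiple of this, but one buffer on its own is far from the ~1GiB seen in nvidia-smi.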

Talking about plugins, I do have one gnome extension at the moment, which shows network speed. But I have reinstalled the OS several times (on the same hardware, always Pop!_OS 20.04 LTS), and on a few of those installs I did not add said gnome extension and still saw the same high VRAM usage. Based on that experience I’m 99% sure it’s not a memory leak from that particular gnome extension, and I do not have any other extension installed (installed by the user, that is; there are some pre-installed ones like the pop-shell tiling extension etc). So I’m not sure where to look. If you do remember it at a later point in time, please let me know. TIA! ;)

I checked the extensions and, as of now, this is what I have installed and/or enabled. See any familiar names causing problems?

I found the thread https://forums.developer.nvidia.com/t/high-cpu-usage-on-xorg-when-the-external-monitor-is-plugged-in/169173
Though the symptom in that thread was high CPU usage on hybrid graphics, the underlying behavior was the same: fast monitor requeries. Unfortunately, no solution was found, only that it was Pop!_OS specific.
You could try disabling the power daemon and plugin, which was my last guess in that thread.
Fractional scaling is Ubuntu specific; I don’t have experience with how well it works these days or with its impact on performance/vmem usage.

From the mentioned thread, I boiled the problem down to the System76 Power gnome extension (which was active in the screenshot above). I disabled it and rebooted, and the problem still persists.

─┬─[ pts/0 0 21-04-10 20:32:03 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> nvidia-smi
Sat Apr 10 20:32:04 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3070    Off  | 00000000:09:00.0  On |                  N/A |
| 41%   42C    P8    16W / 220W |   1073MiB /  7979MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2285      G   /usr/lib/xorg/Xorg                102MiB |
|    0   N/A  N/A      3295      G   /usr/lib/xorg/Xorg                349MiB |
|    0   N/A  N/A      4116      G   /usr/bin/gnome-shell              404MiB |
|    0   N/A  N/A      5368      G   ...AAAAAAAAA= --shared-files      202MiB |
+-----------------------------------------------------------------------------+

─┬─[ pts/0 0 21-04-10 20:32:04 ]
 ├─[ flameboi: atheistd ▶ /home/atheistd ]
 ╰─> uptime
 20:33:53 up 2 min,  1 user,  load average: 0.44, 0.49, 0.22

I do not know what you mean by that. Could you point me to an article either explaining “fast monitor requeries” or describing how to prevent them? TIA!

Edit: My setup is a desktop with a Ryzen 9 (no iGPU) and an RTX 3070, so I don’t think the System76 Power package would cause any issues, as there is only one display and only one GPU.

By monitor requeries I mean that something is calling xrandr in a loop. This can be seen in the Xorg logs (it can also be caused by a broken cable, but that has other effects as well):

Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): connected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): Internal TMDS
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): 600.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-1: disconnected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-1: 2670.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-2: disconnected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-3: disconnected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-3: 2670.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-4: disconnected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-5: disconnected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-5: 2670.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-6: disconnected
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-6: Internal TMDS
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-6: 165.0 MHz maximum pixel clock
Apr 10 10:59:09 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): connected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): Internal TMDS
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): 600.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-1: disconnected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-1: 2670.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-2: disconnected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-3: disconnected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-3: 2670.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-4: disconnected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-5: disconnected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-5: 2670.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-6: disconnected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-6: Internal TMDS
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): DFP-6: 165.0 MHz maximum pixel clock
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): connected
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): Internal TMDS
Apr 10 10:59:10 flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): 600.0 MHz maximum pixel clock
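
One way to check a longer log for this pattern is to count how often the same connector is reported as connected within one second. This is a minimal sketch that assumes the syslog-style line format of the excerpt above; the function name `probes_per_second` is made up for this example, and how many repeats count as abnormal is a judgment call.

```python
import re
from collections import Counter

# Matches lines such as:
# Apr 10 10:59:09 flameboi ...: (--) NVIDIA(GPU-0): BenQ EL2870U (DFP-0): connected
# Assumption: each probe logs one "(DFP-n): connected" line per attached monitor.
PROBE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s.*\((DFP-\d+)\): connected$"
)


def probes_per_second(lines):
    """Count 'connected' reports per (timestamp, connector) pair."""
    counts = Counter()
    for line in lines:
        match = PROBE.match(line.rstrip())
        if match:
            counts[match.groups()] += 1
    return counts


prefix = "flameboi /usr/lib/gdm3/gdm-x-session[3164]: (--) NVIDIA(GPU-0):"
sample_log = [
    f"Apr 10 10:59:09 {prefix} BenQ EL2870U (DFP-0): connected",
    f"Apr 10 10:59:09 {prefix} DFP-1: disconnected",
    f"Apr 10 10:59:10 {prefix} BenQ EL2870U (DFP-0): connected",
    f"Apr 10 10:59:10 {prefix} BenQ EL2870U (DFP-0): Internal TMDS",
    f"Apr 10 10:59:10 {prefix} BenQ EL2870U (DFP-0): connected",
]

counts = probes_per_second(sample_log)
for (timestamp, connector), n in sorted(counts.items()):
    print(timestamp, connector, n)
```

With the sample lines, DFP-0 is reported once at 10:59:09 and twice at 10:59:10, the same repetition visible in the excerpt. A monitor that is probed only at login or on hotplug should not show up several times per second for long stretches.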

Oh okay. Thanks anyways. I’ll try out another distro once I need to re-install my OS again. :)

Is there any way I could impose a restriction on Xorg to use only a limited amount of VRAM?

@generix, I nuked my Pop install and switched to Ubuntu 20.04 and here’s my nvidia-smi output. Turns out it wasn’t Pop specific. I did a # apt upgrade -y and rebooted to get the latest NV drivers, but it still didn’t help.

Edit: VRAM usage just went up to 1076MiB from startup to replying to this post. :(

Hi

I’m seeing high VRAM usage on Ubuntu 20.04 as well.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti  Off  | 00000000:01:00.0  On |                  N/A |
| 42%   33C    P0     2W /  52W |   1749MiB /  1999MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1142      G   /usr/lib/xorg/Xorg                 54MiB |
|    0   N/A  N/A      1863      G   /usr/lib/xorg/Xorg                592MiB |
|    0   N/A  N/A      1992      G   /usr/bin/gnome-shell              118MiB |
|    0   N/A  N/A      3289      G   /usr/lib/firefox/firefox          947MiB |
|    0   N/A  N/A      3443      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3582      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5346      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A      5700      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A      6204      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     25989      G   /usr/bin/nvidia-settings            0MiB |
+-----------------------------------------------------------------------------+

DP-1 connected primary 5120x1440+0+0 (normal left inverted right x axis y axis) 1mm x 1mm

Same issue here; it just fills the VRAM. nvidia-bug-report.log.gz (1.2 MB)

This bug has existed for years…

Hi @thefirst1322, I have the same issue, and it is not specific to PopOS or NVIDIA hardware – I’ve been able to replicate it with Ubuntu and Arch on both NVIDIA and AMD hardware. The problem is that you are running a very high resolution – 4K is a high resolution to begin with, and with xrandr fractional scaling you are actually running at 5120x2880. It is therefore not surprising that VRAM usage is high. Either switch to a lower-resolution screen, or try KDE. In any case, with 8GB of VRAM I don’t think you need to worry yet.

I am experiencing exactly the same issue. I have 2 notebooks running Ubuntu 20.04, both at a resolution of 1920x1080, and Xorg uses 24G of virtual memory on start. Switching back to the open source, non-NVIDIA driver does not show this issue, but I need the NVIDIA driver for our AI development. I believe the resolution is not the only factor.

Xorg is using ~2.2GB of VRAM for me on PopOS 20.04. It’s incredibly frustrating.

This is still an issue for me on Ubuntu 22.04.1 (RTX 2060, driver version 515.76). Any suggestions please? I’d like to give as much VRAM as possible to CUDA, not Xorg.

Same problem here. Xorg takes 1900MB of VRAM on my Ubuntu 22.04 with the 525.60.11 driver. 4K screen, no scaling.