Memory leak in gnome-shell for the gdm user [all driver versions]

Starting with Gnome 3.16 the login screen keeps Gnome Shell around after logging in. This login screen instance, running under the “gdm” user, hangs around forever. Something in the NVIDIA driver is leaking memory when this background gnome-shell is running. Open source drivers do not exhibit a memory leak.

Steps to reproduce:

  1. Gnome 3.16 or higher
  2. NVIDIA binary driver 340.93 or 352.41 or higher
  3. Log in as a user.
  4. Leave system logged in for at least 24 hours.
  5. After 24 hours, check memory usage for the gnome-shell process for the gdm user.
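For anyone who wants to log step 5 rather than eyeball `top`, here is a minimal sketch that reads the resident memory from `/proc`. The helper names `parse_status` and `rss_for` are made up for illustration, and it assumes the greeter runs as a user literally called `gdm` (some distros name it differently).

```python
import os
import pwd


def parse_status(text):
    """Extract name, real UID and resident set size (KiB) from the
    contents of a /proc/<pid>/status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    # Kernel threads have no VmRSS line, so fall back to 0.
    rss = fields.get("VmRSS", "0 kB").split()[0]
    return {
        "name": fields.get("Name", ""),
        # The Uid line holds real, effective, saved and fs UIDs; take the real one.
        "uid": int(fields.get("Uid", "-1").split()[0]),
        "rss_kib": int(rss),
    }


def rss_for(user="gdm", name="gnome-shell", proc="/proc"):
    """Return {pid: rss_kib} for processes called `name` owned by `user`."""
    uid = pwd.getpwnam(user).pw_uid  # raises KeyError if the user does not exist
    result = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "status")) as fh:
                info = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if info["uid"] == uid and info["name"] == name:
            result[int(entry)] = info["rss_kib"]
    return result
```

Calling `rss_for()` once an hour (from cron or a loop) and appending the result to a file should be enough to show whether the greeter instance keeps growing.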

The memory usage of gnome-shell for gdm continues to increase without stopping. Here is a snippet from “top” showing the largest memory processes on a system with only 5 days of uptime. (This occurs on systems without VirtualBox running, so it is not related to VirtualBox.)

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1045 gdm       20   0 3349472 1.800g  21360 S   0.0 23.1   0:55.67 gnome-shell 
 2131 mcronen+  20   0 4161124 1.755g 1.686g S  14.6 22.5   1143:42 VirtualBox  
 1636 mcronen+  20   0 1450460 541252   4548 S   0.7  6.6   9:53.12 goa-daemon  
25395 mcronen+  20   0 1485700 440156 111644 S   1.0  5.4   1:59.93 firefox     
 1589 mcronen+  20   0 1979372 389232  50256 S   0.7  4.8  63:49.54 gnome-shell
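Note that the RES column above mixes units: plain numbers are KiB, while values like `1.800g` have been scaled by `top`. A rough sketch for putting the rows on one scale follows, assuming `top` scales by powers of 1024 and uses the default column order shown above. The function names are hypothetical.

```python
# Multipliers from top's scaled suffixes back to KiB (assumes 1024-based scaling).
UNITS = {"m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}


def res_to_kib(field):
    """Convert a RES value such as '541252' or '1.800g' to KiB."""
    suffix = field[-1].lower()
    if suffix in UNITS:
        return round(float(field[:-1]) * UNITS[suffix])
    return int(field)


def parse_top_rows(text):
    """Return (user, command, res_kib) tuples from default-layout top rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header and anything that is not a full 12-column row.
        if len(parts) < 12 or not parts[0].isdigit():
            continue
        rows.append((parts[1], " ".join(parts[11:]), res_to_kib(parts[5])))
    return rows
```

On the snippet above this puts the gdm instance at roughly 1.8 GiB resident, against 389232 KiB for the logged-in user's gnome-shell.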

340.93 bug report
352.41 bug report

Other users have reported the same problem.

I asked about this a month ago. Someone suggested that the gdm instance of gnome-shell leaks the wallpaper used in the greeter again every 10 minutes or so.
The wallpaper used in the greeter changes color throughout the day, and GNOME is not freeing the memory when rotating it.

I think this is mostly cairo and mutter/clutter’s fault.
Cairo has many leaks (some of which are already fixed in git).

Expect the user instance to grow in memory as well, since they disabled GC in gnome-shell, but this is likely why the greeter instance of gnome-shell leaks.

I’m afraid that is not the case. On systems using the Intel driver the gnome-shell gdm instance does not leak memory. Over the course of the same time period (5 days) the Intel system memory usage is nearly identical to when it was first booted.

On Fedora’s theme the greeter wallpaper does not change anyway.

I will go ahead and add a 355.11 bug report as I didn’t include it, but it is also affected.

The 355.11 system has only been up for 2 days and the memory leak has allowed the gnome-shell gdm instance to surpass the logged in user’s gnome-shell usage. Initially the resident memory was under 200MB.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12851 michael   20   0 2409448 802016 125072 S  24.9  4.9 220:34.84 firefox     
 1429 gdm       20   0 1985064 461164  73380 S   0.0  2.8   0:20.16 gnome-shell 
 2091 michael   20   0 1930364 293580  92740 S   0.0  1.8   3:11.61 gnome-shell 
13196 michael   20   0 1207412 278364  92368 S   0.0  1.7   1:16.29 thunderbird

355.11 bug report

How on Earth can a user space application leak memory in the [NVIDIA] driver? I cannot even imagine that.

It doesn’t leak memory in the NVIDIA driver. It leaks memory while using the NVIDIA proprietary GLX library.
The NVIDIA driver makes the compositor cache pixmaps. GNOME expects them to be freed, but they aren’t. Cairo can free them manually, but cairo is horrible in itself.
The pitfalls of caching are:

  1. Caching items and then not using the cache, which ends up in fragmentation and multiple versions of the same items getting cached.
  2. Not purging obsolete items via GC routines.
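To make the two pitfalls concrete, here is a generic sketch of a cache that avoids both: it looks items up by key before creating them, so no duplicates accumulate, and it evicts the least recently used entry once a limit is reached. This is only an illustration. `BoundedCache` is a made-up class and says nothing about how the NVIDIA GLX library, cairo or mutter actually manage pixmaps.

```python
from collections import OrderedDict


class BoundedCache:
    """Least-recently-used cache with a hard limit on the number of entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def get_or_create(self, key, factory):
        # Pitfall 1: always consult the cache first, so the same item
        # is never stored twice under the same key.
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        value = factory()
        self._items[key] = value
        # Pitfall 2: purge obsolete entries instead of growing forever.
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # drop the least recently used
        return value

    def __len__(self):
        return len(self._items)
```

With a limit of, say, two entries, requesting the same wallpaper key repeatedly creates it once, and a rotating set of keys never holds more than two items at a time.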

Most of this can be worked around on GNOME’s side, but the GNOME developers are not interested in mitigating X11 issues since their main concern is Wayland.

Wow, just wow.

I mean it’s so f*cked up no wonder people refuse to use Linux. X11 is broken, wayland is totally incomplete, let’s have some fun!

Linux is like Airbus.

I have 20 years experience in Linux. The reason people refuse to use Linux is because it is free.

  1. Windows users don’t understand the concept of free.
  2. Free means no paid support. People want to pay for support instead of community support which relies on people from different stacks of the desktop trying to work around each other’s changes.

The Linux kernel itself is rock solid. It is a more mature and better written product than the Windows kernel. Why is that? Because it supports paid products as well (Android, Tizen, etc.).
However as you climb up towards the desktop, you get caught in chaotic activity where everyone is doing just their own thing and sometimes at an extremely high development speed. Qt and gtk+ are good examples here and those are just the toolkits.
Now go up even higher and what do you end up with? Gnome/KDE. Free products. The developers cater for their own needs because they are doing it for free and they cannot test many configurations (not enough money to do so).
Had the desktop been a paid product, you would have seen fewer visible bugs.
The Linux desktop was put in a situation where it was expected to compete with Windows/OSX. To do that, they had to innovate faster than they could maintain, and thus we ended up with frustrating bugs. People, and not only developers, are very feature hungry and that is biting us in the neck.
OK, offloading compositing to the GPU did make the desktop faster (look at how KDE4 was a big success). However, we ended up with a desktop written in JS/cairo/clutter (gnome-shell) and plasma5 (also JS), which basically exposed limitations in X11 and the xorg drivers.
Wayland is mostly ready and it does work around the X11 limitations, but it is not yet implemented in the proprietary NVIDIA package.
Try to understand what I am saying. Linux was perfect for the desktop in the gnome2 and kde 2/3/4 days. But we asked for too much from our current stack.
Most of your nvidia bugs will disappear when X11 is gone.
Nvidia’s stack is very x11ish. The free drivers are more lenient but it will happen eventually. Nvidia needs to implement wayland first and kwin, mutter/clutter need to support nvidia’s implementation.

Has anyone at NVIDIA investigated or created a bug to track this? Thanks.

Plasmashell 2 is also leaking memory; maybe that too is the NVIDIA driver. But I cannot confirm, since I haven’t had time to look into it more deeply.

The 358.16 driver also exhibits the memory leak in gnome-shell.

358.16 bug report

If people noticed, there might be more of an outcry, but it appears people are either oblivious to their memory disappearing or don’t leave their computers on for more than 24 hours.

Reverting this patch in gnome-shell

and applying the patches in to gjs seems to have a massive effect.

There are still leaks but it stays in the 300 to 350MB range with those patches.

(no gnome-shell extensions installed).

Bump… I also often see high CPU usage from the gdm process; it rises above 20%. It seems to happen at random times. For example, it can rise while I am using my browser (Google Chrome), but when I press “Shift+Esc” to bring up the browser’s task manager, no process there costs that much CPU, only 1 or 2 points, so it is definitely not a Google Chrome or “docky” problem. (I also crashed Docky a few minutes ago…)

As for memory leaks, I don’t know how to look at that; it seems to eat about 350 MB according to my searches for gdm processes in htop.

These high CPU usages happen very randomly for me. The weird thing is that I even want to switch away from GNOME or KDE and try something like MATE, but it is still, like Cinnamon, forked from the same GNOME code base, and I know that Cinnamon lags even more than MATE or GNOME… KDE now uses sddm as its display manager because the one they used recently gave flickering problems on the NVIDIA binary drivers. The only setup that works nicely for me is Ubuntu with Unity + lightdm; it feels much faster and more responsive than any gnome-shell or my recent KDE experience.

I think that is why the Ubuntu developers started Unity and use lightdm instead of gdm, sddm or something else: they work around these X11 bugs, which may also be related to using the NVIDIA binary blob ;(

I have started using Sabayon 16 on a long-term basis and have been using it for more than 2 months at this point. Unfortunately I don’t want to use Ubuntu until 16.04, the next LTS, comes out, and SUSE Leap does not look like a mature distro to me, because they are not even aware that KDE should use sddm or anything other than what they use now, and it kills me with these NVIDIA tearing and flickering desktop bugs ;( I also use Manjaro, but they too are not aware of this terrible thing. I understand that I could fix it myself, but wtf, I am a bit lazy and just want a nice rolling distro that updates frequently and doesn’t have this pain, so I installed Sabayon on my new dedicated HDD and left the other distros installed on 5 or more dedicated HDDs ;D So for me, booting another distro is a matter of switching HDDs (I don’t like to mess with GRUB partitioning or chainloading anything).

The problem is that the Unity libs combine badly when you, for example, install GNOME alongside… in Ubuntu… I have not tried to use Unity on other distros. I think it’s not worth the effort because I like the default desktop experience.

Anyway, I think we should wait until the Linux world adopts the new Wayland protocol and programs land some support for it in their code bases. And I still have a little hope that NVIDIA will share some stuff with the nouveau project or open source their driver ;D

Well, at least I have no TDRs here on Linux. I left the Windows world about half a year ago and it’s fine; I really don’t want to install any edition of Windows anymore.

If anyone in this thread is affected by the memory leak problem and has the technical savvy to apply a patch and recompile gnome-shell, could you please give the attached patch a try?

Gnome 3.18.5 was released with the patch included and I can confirm the leak has been fixed. Thanks, Aaron.