Gnome-Shell crashes and rendering problems on XServer with 430.26

I’ve installed the nvidia driver the recommended way, following the RPM Fusion howto: https://rpmfusion.org/Howto/NVIDIA
The last 10 times I resumed from hibernate, a lot of characters and symbols weren’t rendered, starting right at the lock screen, and gnome-shell crashed as well.

Setup:

  • fedora 30
  • gnome 3.32.2
  • GeForce GTX 1060 6GB/PCIe/SSE2
  • kernel: 5.1.15-300.fc30.x86_64
  • nvidia driver: 430.26

Steps to reproduce:

  • go into hibernate (I’ve configured my power settings so that this happens when I press the power button on my tower)
  • wake up from hibernate by pressing the power button again
  • see a lock screen with missing characters and buttons (I guess gnome-shell has already crashed at this point)
  • unlock as normal
  • get notified that gnome-shell crashed
  • see even more rendering problems
  • restart gnome-shell to get back to normal

After unlock and before gnome-shell restart (bad):

After gnome-shell restart (normal):

I’ve also tested this with the nouveau driver. There the screen stays black when I trigger the wake-up, and I can only do a hard restart.

PS: The “_old” bug report was generated right after the crash, but then I noticed that one should first run

startx -- -logverbose 6

So the one without “old” in the filename was generated after running that command.

My swap is now 1.5x my RAM (before it was 0.5x). I have 16 GB installed.
The problem remains, but the rendering artifacts are a little different now.
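
For anyone wanting to compare their own swap-to-RAM ratio, here is a minimal sketch that reads the `MemTotal` and `SwapTotal` fields from `/proc/meminfo`. The function name `swap_to_ram_ratio` is made up for this example, and the ratio alone does not prove that hibernation will fit; it is only a quick way to state the numbers.

```shell
#!/bin/sh
# Sketch: print swap size as a multiple of RAM.
# Reads a /proc/meminfo-style file (default: /proc/meminfo).
swap_to_ram_ratio() {
    meminfo="${1:-/proc/meminfo}"
    LC_ALL=C awk '
        $1 == "MemTotal:"  { mem = $2 }    # value in kB
        $1 == "SwapTotal:" { swap = $2 }   # value in kB
        END {
            if (mem <= 0) exit 1           # no usable MemTotal line
            printf "%.2f\n", swap / mem
        }
    ' "$meminfo"
}
```

Calling `swap_to_ram_ratio` without arguments reports the ratio for the running system.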



nvidia-bug-report.log.gz (717 KB)
nvidia-bug-report.log.old.gz (738 KB)


Hello, why can’t I get any support here?
I thought this was the official place to ask for driver support.

It is the right place to ask, but as for getting support, YMMV.
This is partially a known problem with the proprietary nvidia driver. The known part is that on context switches such as a VT switch or suspend/resume, FBOs (background images etc.) get destroyed, so the WM has to take care of rebuilding them. That used to work fine with Gnome but, in my own experience, no longer does. I haven’t cared enough so far to check whether this is a regression in Gnome 3.32 or in the nvidia driver. The gnome-shell crash is surely a Gnome bug.


Ok, thanks for your reply.
My observation is that gnome-shell no longer crashes in this situation.
However, the rendering artifacts remain.
So restarting gnome-shell is the one thing I can do for now, until someone from nvidia is keen enough to investigate this further…

Was pondering the new driver to see what was added and found this:
https://download.nvidia.com/XFree86/Linux-x86_64/430.09/README/powermanagement.html
Seems it was silently added and no one noticed.

Oh, that’s interesting. Thanks for the link.
However the mentioned directories containing the config files are missing on my system.

There is no NVIDIA_GLX-1.0 directory in my docs folder.
Only nvidia-persistenced and nvidia-settings.

Where can I get these files instead?

Download the latest 430 .run installer:
https://http.download.nvidia.com/XFree86/Linux-x86_64/
and just uncompress it using the -x option. In the created directory, there are the needed systemd units and suspend-script:
nvidia-suspend.service
nvidia-resume.service
nvidia-hibernate.service
nvidia-sleep.sh
Also report to your distro or the driver repo maintainer to add it.
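
After extracting, a quick way to confirm that the four files are actually there is a small check like the sketch below. The function name `check_pm_files` is made up for this example, and it assumes the files sit at the top level of the extracted directory, which may differ between driver versions.

```shell
#!/bin/sh
# Sketch: report which power-management files exist in an extracted
# installer directory. Returns non-zero if any of them is missing.
check_pm_files() {
    dir="$1"
    missing=0
    for f in nvidia-suspend.service nvidia-resume.service \
             nvidia-hibernate.service nvidia-sleep.sh; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Typical use would be to run the installer with `-x` and then call `check_pm_files` on the directory it created.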

Hello. I came from another topic (about artefacts after sleeping https://devtalk.nvidia.com/default/topic/1057103/linux/artefacts-after-sleeping-very-very-old-issue-again-/post/5359980/#5359980).

I extracted the .run file, then copied nvidia-sleep.sh to /usr/bin and registered the systemd services.

Well:

  1. My machine can’t go to sleep when “nvidia-suspend.service” is enabled
  2. Enabling “nvidia-resume.service” really helps to fix the artefacts on desktop shortcuts, but doesn’t help with applications like viber or latte-dock (etc.)

(Currently I’m on KDE; as I noted in the linked topic, I saw the same behavior in Linux Mint 18.3 MATE, which is GTK-based.)

(Also, I went back to driver 390 from the official repo, because with driver v430 from the .run file I got a black screen instead of the login window and couldn’t fix it.)

Hi, thanks for backing this.

Same here regarding your point 1: my machine gets stuck while entering hibernate.
The screen turns off, but the machine stays awake… that’s it.

Did you also pass the mentioned kernel module parameters to the nvidia module?
I created an extra conf for this.
However, I’m still using /tmp, which doesn’t have the recommended filesystem, although there is enough free space on it. This might be related, but creating an extra partition with a different filesystem is too much of a hack for me right now…

If there is no other solution I’m going to stay with it as it is for now.
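
To see what filesystem a directory such as /tmp actually lives on and how much space it has, something like the following sketch can help. The function name `describe_dir` is made up for this example; it relies on GNU `stat -f` and on `df -Pk`, and it does not judge whether the result is suitable for the driver.

```shell
#!/bin/sh
# Sketch: print filesystem type and available space for a directory
# (default: /tmp). Needs GNU coreutils stat for the -f/-c options.
describe_dir() {
    dir="${1:-/tmp}"
    fstype=$(stat -f -c '%T' "$dir") || return 1
    # df -P prints one header line, then one data line; column 4 is
    # the available space, in 1024-byte blocks because of -k.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }') || return 1
    printf '%s: type=%s, available=%s KiB\n' "$dir" "$fstype" "$avail_kb"
}
```

Running `describe_dir /tmp` and `describe_dir /var/tmp` side by side makes it easy to compare candidate locations.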

No, I didn’t do it manually. However, if you look at the .service files, you can see that some parameters are passed by the systemd services.


So, I got updates from the KDE neon repo and the NVIDIA Settings app was updated to version 418; after that the artefacts on the desktop came back, even though nvidia-suspend.service is enabled.

OK, I completely updated the nvidia subsystem from the PPA (ubuntu-drivers) to version 430.

$ nvidia-smi
Sat Jul 13 10:04:24 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 930MX       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0    N/A /  N/A |    310MiB /  2004MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1063      G   /usr/lib/xorg/Xorg                           194MiB |
|    0      1500      G   /usr/bin/kwin_x11                             29MiB |
|    0      1507      G   /usr/bin/krunner                               1MiB |
|    0      1510      G   /usr/bin/plasmashell                          53MiB |
|    0      1551      G   /usr/bin/latte-dock                           21MiB |
|    0      1557      G   ...Installed/WizNote-2.7.5-x86_64.AppImage     5MiB |
+-----------------------------------------------------------------------------+

I extracted the .run file (also v430) and registered the needed systemd services with my little script:

#!/bin/bash

NVDIR="/home/r3d9u11/Downloads/NVIDIA-Linux-x86_64-430.34"
SDDIR="/lib/systemd/system"

# Copy the unit file from the extracted directory (unless it is already
# installed) and enable it; abort the whole script on any failure.
function reg_service() {
    if [ ! -f "$SDDIR/$1.service" ] ; then
        sudo cp "$NVDIR/$1.service" "$SDDIR/$1.service" || exit 1
    fi

    sudo systemctl enable "$1.service" || exit 1
}

#--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

reg_service nvidia-hibernate
reg_service nvidia-suspend
reg_service nvidia-resume

#--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

# Install the helper script; /usr/bin is root-owned, so sudo is needed here too.
if [ ! -f "/usr/bin/nvidia-sleep.sh" ] ; then
    sudo cp "$NVDIR/nvidia-sleep.sh" "/usr/bin/"
fi

#--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

echo ""
echo "  AVAILABLE NVIDIA SERVICES:"
echo ""

ls /lib/systemd/system/ | grep nvidia

echo ""
echo "  ENABLED NVIDIA SERVICES:"
echo ""

systemctl list-unit-files | grep enable | grep nvidia

echo ""
echo ""

Everything ran without errors, and all services were successfully registered:

$ systemctl list-unit-files | grep enable | grep nvidia
nvidia-hibernate.service                                         enabled        
nvidia-resume.service                                            enabled        
nvidia-suspend.service                                           enabled

And now, with driver v430, my machine goes to sleep normally.
However, there are still artefacts on my desktop and in applications after suspending.
Looks like the nvidia services do nothing ;-(

At the moment I see only one way to avoid artefacts on the desktop: reload all affected applications after suspend (Plasma, Viber, etc.). In that case waking up takes more time. Or just use nouveau, but I’m afraid of losing performance.

I also tested this briefly, with success. On my system, I had to change the path of pidof in the script. Nevertheless, on suspend it would just freeze, and REISUB was needed. This seems to be because (as the doc says) a switch to a VT is needed, and that switch doesn’t work. If I switch to a VT manually and then suspend, suspend/resume works fine and the screen corruption is gone when switching back to Gnome.
Without the module parameter set properly it doesn’t work; the corruption is still there.
So in my case, what’s left is to modify the script so that switching to a VT and back works.
Note: I’m also using tmpfs.

Edit: use ‘cat /proc/driver/nvidia/params’ to check whether the parameter is set correctly.
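
To pull a single entry out of that file instead of reading the whole dump, a small helper like the sketch below may be handy. The function name `nvidia_param` is made up for this example, and it assumes the file consists of `Name: value` lines; as far as I can tell the names there appear without the `NVreg_` prefix, but check your own output.

```shell
#!/bin/sh
# Sketch: print the value of one entry from /proc/driver/nvidia/params.
# Assumes "Name: value" lines; returns non-zero if the name is absent.
nvidia_param() {
    name="$1"
    file="${2:-/proc/driver/nvidia/params}"
    awk -v key="${name}:" '
        $1 == key {
            sub(/^[^:]*:[ \t]*/, "")   # strip "Name: " and keep the value
            print
            found = 1
        }
        END { exit !found }
    ' "$file"
}
```

For example, `nvidia_param SomeParam` (a placeholder name) prints the current value, or fails if the loaded module does not know that parameter.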

Thanks for letting us know.
How about hibernate? Did you encounter similar results?

Don’t know, I’m running Gentoo and I didn’t compile hibernating in because I never liked that concept.

Intermediate result: I’ve taken a look at how the script determines the VT on which Xorg is currently running, and it’s quite a naive approach that probably doesn’t work on most modern systems. So there’s still some brain work to do.

I’ve now changed the nvidia-sleep.sh script to simply use fgconsole to determine the currently active VT for switching back and forth, and changed the path of chvt to match my system. Now it works for me:

#!/bin/bash

RUN_DIR="/var/run/nvidia-sleep"
XORG_VT_FILE="${RUN_DIR}"/Xorg.vt_number

case "$1" in
    suspend|hibernate)
        /bin/mkdir -p "${RUN_DIR}"
        /usr/bin/fgconsole > "${XORG_VT_FILE}"
        # Switch away from the current VT before suspending; bail out
        # with chvt's own exit status if the switch fails.
        /usr/bin/chvt 63 || exit $?
        /bin/echo "$1" > /proc/driver/nvidia/suspend
        exit $?
        ;;
    resume)
        /bin/echo "$1" > /proc/driver/nvidia/suspend 
        #
        # If the active VT was recorded at suspend time,
        # attempt to switch back to it.
        #
        if [[ -f "${XORG_VT_FILE}" ]]; then
            XORG_VT=$(cat "${XORG_VT_FILE}")
            /bin/rm "${XORG_VT_FILE}"
            /usr/bin/chvt "${XORG_VT}"
        fi
        exit 0
        ;;
    *)
        exit 1
esac
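
The save/restore part of the script above can be tried without touching the real console by making the commands and the state directory overridable. This is only a sketch: the function names `save_vt` and `restore_vt` and the `FGCONSOLE` and `CHVT` variables are placeholders introduced here, not part of NVIDIA’s script, and the write to /proc/driver/nvidia/suspend is deliberately left out.

```shell
#!/bin/sh
# Sketch: VT save/restore logic with overridable commands, for dry runs.
FGCONSOLE="${FGCONSOLE:-/usr/bin/fgconsole}"
CHVT="${CHVT:-/usr/bin/chvt}"
RUN_DIR="${RUN_DIR:-/var/run/nvidia-sleep}"

# Record the number of the currently active VT.
save_vt() {
    mkdir -p "${RUN_DIR}" && "${FGCONSOLE}" > "${RUN_DIR}/Xorg.vt_number"
}

# Switch back to the recorded VT, if one was recorded, and forget it.
restore_vt() {
    vt_file="${RUN_DIR}/Xorg.vt_number"
    [ -f "${vt_file}" ] || return 0
    vt=$(cat "${vt_file}")
    rm -f "${vt_file}"
    "${CHVT}" "${vt}"
}
```

Pointing `FGCONSOLE` and `CHVT` at harmless stand-ins (for example shell functions that just print their arguments) shows which VT would be recorded and restored.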

Thanks for your help and hint about *.service files in .run archive!
Seems like I’ve had success with the proprietary nvidia drivers and KDE Plasma (X11 backend).

Step by step what I did:

  1. I switched to a clean console session (Ctrl+Alt+F2)
  2. stopped the GUI session:
sudo service sddm stop
  3. disabled the graphical session:
sudo systemctl set-default multi-user.target
  4. removed all previous nvidia stuff:
sudo apt purge nvidia* bumblebee* primus*
sudo apt autoremove
sudo rm /etc/modprobe.d/*nvidia*
sudo rm /etc/modprobe.d/*nouveau*
sudo rm /etc/modprobe.d/*bumblebee*
sudo update-initramfs -u
reboot
  5. installed nvidia 396:
sudo apt install nvidia-driver-396
  6. enabled the nvidia-resume service from the extracted *.run package (I used the latest .run package, v430):
sudo systemctl enable nvidia-resume.service
  7. re-enabled the graphical session:
sudo systemctl set-default graphical.target
  8. I also needed to run
rm ~/.Xauthority
for my current user
  9. rebooted again and got profit


upd:

after several reboots the artefacts came back :D


upd2:

I enabled nvidia-suspend.service and then told my system to go to sleep.

Going to sleep failed, and I saw the login window instead.

After that I disabled nvidia-suspend.service, and the machine went to sleep normally.

Since that procedure the artefacts haven’t come back.


There is a simple solution for shortcuts https://bugs.kde.org/show_bug.cgi?id=364766#c67 :D