Nvidia disappears on Ubuntu (multiboot side effect ?)

Hi

I have quite a strange problem with nvidia on linux. I wanted to believe it was an accident, but it happened twice…

I have a lenovo P50 laptop with two graphic devices :

  • intel chipset
  • NVIDIA Quadro M2000M

I have 3 bootable partitions with 3 OS :

  • ubuntu studio for live performance (21.10)
  • ubuntu “classic” for everyday usage (20.04)
  • windows 10

In ubuntu studio I need the nvidia driver, so right after installing the OS I run :

ubuntu-drivers autoinstall
reboot

What happens :

  • ubuntu studio with the nvidia driver works fine for days. I did not apply any update on this OS.
  • I use the other OSes (ubuntu classic and windows, possibly with system updates on those)
  • when I come back to ubuntu studio after a few days… nvidia is gone !

lspci shows no nvidia device
lsmod shows no nvidia module
nvidia-settings : ERROR: Unable to load info from any available system + empty app
nvidia-driver package is still installed
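
The first two checks above can be told apart mechanically. Here is a minimal sketch; the function name classify_nvidia_state is made up for illustration, and it only looks at the text of the lspci and lsmod outputs:

```shell
#!/bin/sh
# Classify the symptom from the text of `lspci` and `lsmod`.
# $1: output of lspci, $2: output of lsmod
# (classify_nvidia_state is a hypothetical helper, not an existing tool)
classify_nvidia_state() {
    lspci_out=$1
    lsmod_out=$2
    if ! printf '%s\n' "$lspci_out" | grep -qi 'nvidia'; then
        echo "device-missing"       # GPU not visible on the PCI bus
    elif ! printf '%s\n' "$lsmod_out" | grep -q '^nvidia '; then
        echo "module-not-loaded"    # GPU visible, but no nvidia kernel module
    else
        echo "ok"
    fi
}

# Typical call on the affected system:
# classify_nvidia_state "$(lspci)" "$(lsmod)"
```

"device-missing" points at firmware or power management, while "module-not-loaded" points at the driver installation for the running kernel.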

The first time I reinstalled ubuntu studio from scratch.

The second time I managed to recover nvidia with the following commands :

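sudo rm /etc/X11/xorg.conf
sudo apt install --reinstall nvidia-prime
sudo prime-select nvidia
sudo update-initramfs -u
sudo reboot
# no nvidia at boot

# quote the pattern so the shell does not expand it against local files
sudo apt remove 'nvidia-driver*'
sudo ubuntu-drivers autoinstall
sudo reboot
# nvidia at boot !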

I wonder if the “classic” ubuntu updates could mess with grub and have an effect on my ubuntu studio ? I tried to launch “os-prober; update-grub” on classic ubuntu to reproduce the case, but nvidia was still OK on ubuntu studio.

Thanks for any help or idea on this !

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Also, please post the output of
grep 10de /lib/udev/rules.d/*

Here is the file, and the output of the command :

/lib/udev/rules.d/71-nvidia.rules:SUBSYSTEM=="pci", ATTRS{vendor}=="0x10de", DRIVERS=="nvidia", TAG+="seat", TAG+="master-of-seat"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x03[0-9]*", TEST=="power/control", ATTR{power/control}="auto"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", TEST=="power/control", ATTR{power/control}="auto"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", TEST=="power/control", ATTR{power/control}="auto"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", TEST=="power/control", ATTR{power/control}="auto"

nvidia-bug-report.log.gz (360.1 KB)

When the log was created, everything was working, and apparently on the previous boot as well. Please create a new log once the nvidia driver fails.

Yes everything is working fine now.

I cannot make it fail on purpose, as I don’t understand the problem… and it can take weeks for the problem to come back… But I will come back here and post when it happens !

Thanks !

Hi

The same issue reappeared, after a system update on ubuntu studio.

  • HDMI output does not work anymore.
  • lsmod shows no nvidia module
  • nvidia-settings : ERROR: Unable to load info from any available system + empty app
  • nvidia-driver package is still installed

This time however nvidia still shows in lspci output :

01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M2000M] (rev a2)

grep 10de /lib/udev/rules.d/* :

/lib/udev/rules.d/71-nvidia.rules:SUBSYSTEM=="pci", ATTRS{vendor}=="0x10de", DRIVERS=="nvidia", TAG+="seat", TAG+="master-of-seat"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x03[0-9]*", TEST=="power/control", ATTR{power/control}="auto"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", TEST=="power/control", ATTR{power/control}="auto"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", TEST=="power/control", ATTR{power/control}="auto"
/lib/udev/rules.d/71-nvidia.rules:ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", TEST=="power/control", ATTR{power/control}="auto"

I have attached the nvidia-bug-report log.

Thanks for any help,

nvidia-bug-report.log.gz (78.4 KB)

Something went terribly wrong with the last update.

avril 03 13:09:22 ubstud kernel: NVRM: API mismatch: the client has the version 470.103.01, but
NVRM: this kernel module has the version 470.86. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.

This is the last sign of the driver trying to load.
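
The message means the userspace libraries (470.103.01) and the kernel module (470.86) come from different driver versions. A minimal sketch for comparing the two by hand follows. The function names are made up for illustration, and the package name nvidia-utils-470 in the usage comment assumes the 470 driver branch:

```shell
#!/bin/sh
# Strip the Ubuntu packaging suffix: 470.103.01-0ubuntu0.21.10.1 -> 470.103.01
# (upstream_version and check_api_match are hypothetical helpers)
upstream_version() {
    printf '%s\n' "${1%%-*}"
}

# $1: kernel module version, $2: userspace package version
# Returns 0 when both carry the same upstream version, 1 otherwise.
check_api_match() {
    module_ver=$1
    userspace_ver=$(upstream_version "$2")
    if [ "$module_ver" = "$userspace_ver" ]; then
        echo "match: $module_ver"
    else
        echo "mismatch: module $module_ver, userspace $userspace_ver"
        return 1
    fi
}

# Typical call. modinfo reads the module file on disk for the running kernel;
# for the module that is actually loaded, read /sys/module/nvidia/version instead.
# check_api_match "$(modinfo -F version nvidia)" \
#     "$(dpkg-query -W -f='${Version}' nvidia-utils-470)"
```

A mismatch after an update usually means the libraries were upgraded while the module for the booted kernel was not rebuilt or replaced.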

Take a look at the output of dkms status (and maybe paste it here) to see which kernels the driver is installed for (a status of “added” is not enough).
Make sure the kernel headers for your running kernel are installed. The package is usually named linux-headers-$(uname -r). I ask because the log shows no sign of the driver module being compiled.
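
Both points can also be checked directly on disk. This is a rough sketch, assuming the usual Ubuntu locations (/lib/modules/<release> for modules, /usr/src/linux-headers-<release> for headers). The function name and the optional root prefix argument are made up for illustration:

```shell
#!/bin/sh
# Report whether a kernel release has an nvidia module and headers on disk.
# $1: kernel release, e.g. the output of `uname -r`
# $2: optional root prefix (hypothetical, only useful for testing)
check_kernel_nvidia() {
    rel=$1
    root=${2:-}
    # Matches nvidia.ko as well as compressed variants such as nvidia.ko.zst
    if [ -n "$(find "$root/lib/modules/$rel" -name 'nvidia.ko*' 2>/dev/null)" ]; then
        echo "module: present"
    else
        echo "module: missing"
    fi
    if [ -d "$root/usr/src/linux-headers-$rel" ]; then
        echo "headers: present"
    else
        echo "headers: missing"
    fi
}

# Typical call for the running kernel:
# check_kernel_nvidia "$(uname -r)"
```

If the module is missing while the headers are present, the problem is not a failed build for lack of headers.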

Thanks for your answer,

If I understand correctly, the update installed the new kernel without compiling the nvidia module ?

Below are the command results. I am currently running the 5.13.0-22 kernel, and the headers seem to be installed as far as I understand.

# dkms status
v4l2loopback, 0.12.5, 5.13.0-19-generic, x86_64: installed
v4l2loopback, 0.12.5, 5.13.0-22-generic, x86_64: installed
v4l2loopback, 0.12.5, 5.13.0-22-lowlatency, x86_64: installed
v4l2loopback, 0.12.5, 5.13.0-30-generic, x86_64: installed
v4l2loopback, 0.12.5, 5.13.0-30-lowlatency, x86_64: installed
v4l2loopback, 0.12.5, 5.13.0-39-generic, x86_64: installed
v4l2loopback, 0.12.5, 5.13.0-39-lowlatency, x86_64: installed

# uname -a
Linux ubstud 5.13.0-22-lowlatency #22-Ubuntu SMP PREEMPT Fri Nov 5 15:40:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


# apt list --installed | grep linux-headers | grep 5.13.0-22
linux-headers-5.13.0-22-lowlatency/impish-updates,impish-security,now 5.13.0-22.22 amd64  [installé, automatique]
linux-headers-5.13.0-22/impish-updates,impish-updates,impish-security,impish-security,now 5.13.0-22.22 all  [installé, automatique]

Strangely, I have rebooted my laptop since the last nvidia-bug-report, and the log no longer shows the NVRM messages you mentioned…

nvidia-bug-report-2.log.gz (64.6 KB)

When a new version of the nvidia driver is installed, the driver modules for all installed kernel versions get removed. Usually they only get reinstalled for the currently running kernel.

I don’t know why the driver didn’t get installed even though you have the headers installed.
Any reason you run 5.13.0-22 and not 5.13.0-39?

Please post the output of dpkg -l |grep nvidia.

Here is the result :

dpkg -l |grep nvidia
ii  libnvidia-cfg1-470:amd64                      470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-470                          470.103.01-0ubuntu0.21.10.1                 all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-470:amd64                   470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA libcompute package
ii  libnvidia-compute-470:i386                    470.103.01-0ubuntu0.21.10.1                 i386         NVIDIA libcompute package
ii  libnvidia-decode-470:amd64                    470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-470:i386                     470.103.01-0ubuntu0.21.10.1                 i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-egl-wayland1:amd64                  1:1.1.7-2build1                             amd64        Wayland EGL External Platform library -- shared library
ii  libnvidia-encode-470:amd64                    470.103.01-0ubuntu0.21.10.1                 amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-470:i386                     470.103.01-0ubuntu0.21.10.1                 i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-470:amd64                     470.103.01-0ubuntu0.21.10.1                 amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-470:amd64                      470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-470:i386                       470.103.01-0ubuntu0.21.10.1                 i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-470:amd64                        470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-470:i386                         470.103.01-0ubuntu0.21.10.1                 i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-ifr1-470:amd64                      470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA OpenGL-based Inband Frame Readback runtime library
ii  libnvidia-ifr1-470:i386                       470.103.01-0ubuntu0.21.10.1                 i386         NVIDIA OpenGL-based Inband Frame Readback runtime library
rc  linux-modules-nvidia-470-5.13.0-21-lowlatency 5.13.0-21.21+1                              amd64        Linux kernel nvidia modules for version 5.13.0-21
rc  linux-modules-nvidia-470-5.13.0-22-lowlatency 5.13.0-22.22+2                              amd64        Linux kernel nvidia modules for version 5.13.0-22
ii  linux-modules-nvidia-470-5.13.0-39-lowlatency 5.13.0-39.44                                amd64        Linux kernel nvidia modules for version 5.13.0-39
ii  linux-modules-nvidia-470-lowlatency           5.13.0-39.44                                amd64        Extra drivers for nvidia-470 for the lowlatency flavour
rc  linux-objects-nvidia-470-5.13.0-21-lowlatency 5.13.0-21.21+1                              amd64        Linux kernel nvidia modules for version 5.13.0-21 (objects)
ii  linux-objects-nvidia-470-5.13.0-22-lowlatency 5.13.0-22.22+2                              amd64        Linux kernel nvidia modules for version 5.13.0-22 (objects)
ii  linux-objects-nvidia-470-5.13.0-39-lowlatency 5.13.0-39.44                                amd64        Linux kernel nvidia modules for version 5.13.0-39 (objects)
ii  linux-signatures-nvidia-5.13.0-22-lowlatency  5.13.0-22.22+2                              amd64        Linux kernel signatures for nvidia modules for version 5.13.0-22-lowlatency
ii  linux-signatures-nvidia-5.13.0-39-lowlatency  5.13.0-39.44                                amd64        Linux kernel signatures for nvidia modules for version 5.13.0-39-lowlatency
ii  nvidia-compute-utils-470                      470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA compute utilities
ii  nvidia-driver-470                             470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-470                      470.103.01-0ubuntu0.21.10.1                 amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-470                      470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA kernel source package
ii  nvidia-prime                                  0.8.17.1                                    all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                               470.57.01-0ubuntu3.1~0.21.10.1              amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-470                              470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA driver support binaries
ii  screen-resolution-extra                       0.18.1                                      all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-470                 470.103.01-0ubuntu0.21.10.1                 amd64        NVIDIA binary Xorg driver

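It seems that more is installed for 5.13.0-39 than for 5.13.0-22 : the linux-modules-nvidia package is in state ii (installed) for -39, but only rc (removed, configuration files remaining) for -22.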

# dpkg -l |grep nvidia | grep 5.13.0-22
rc  linux-modules-nvidia-470-5.13.0-22-lowlatency 5.13.0-22.22+2                              amd64        Linux kernel nvidia modules for version 5.13.0-22
ii  linux-objects-nvidia-470-5.13.0-22-lowlatency 5.13.0-22.22+2                              amd64        Linux kernel nvidia modules for version 5.13.0-22 (objects)
ii  linux-signatures-nvidia-5.13.0-22-lowlatency  5.13.0-22.22+2                              amd64        Linux kernel signatures for nvidia modules for version 5.13.0-22-lowlatency

# dpkg -l |grep nvidia | grep 5.13.0-39
ii  linux-modules-nvidia-470-5.13.0-39-lowlatency 5.13.0-39.44                                amd64        Linux kernel nvidia modules for version 5.13.0-39
ii  linux-modules-nvidia-470-lowlatency           5.13.0-39.44                                amd64        Extra drivers for nvidia-470 for the lowlatency flavour
ii  linux-objects-nvidia-470-5.13.0-39-lowlatency 5.13.0-39.44                                amd64        Linux kernel nvidia modules for version 5.13.0-39 (objects)
ii  linux-signatures-nvidia-5.13.0-39-lowlatency  5.13.0-39.44                                amd64        Linux kernel signatures for nvidia modules for version 5.13.0-39-lowlatency
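
The same comparison can be made for every kernel at once by filtering the dpkg -l output. A small sketch; the function name is made up for illustration:

```shell
#!/bin/sh
# Read `dpkg -l` output on stdin and print "<state> <package>" for each
# per-kernel linux-modules-nvidia package.
# ii = installed, rc = removed with configuration files remaining.
# (nvidia_module_packages is a hypothetical helper)
nvidia_module_packages() {
    awk '$2 ~ /^linux-modules-nvidia-[0-9]+-[0-9]/ { print $1, $2 }'
}

# Typical call:
# dpkg -l | nvidia_module_packages
```

Any kernel that shows up here as rc, or not at all, has no prebuilt nvidia module installed.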

Any reason you run 5.13.0-22 and not 5.13.0-39?

No…

I have 3 bootable partitions with 3 OS :

  • ubuntu studio for live performance (21.10) (where this problem occurs since update)
  • ubuntu “classic” for everyday usage (20.04)
  • windows 10

In grub.cfg I can see the following “menuentry” items :

/boot/vmlinuz-5.13.0-39-lowlatency # ubuntu studio, march 2022
/boot/vmlinuz-5.13.0-30-lowlatency # ubuntu studio, feb 2022
/boot/vmlinuz-5.13.0-22-lowlatency # ubuntu studio, november 2021
/EFI/Microsoft/Boot/bootmgfw.efi  # windows
/boot/vmlinuz-5.11.0-46-generic   # other ubuntu install
/boot/vmlinuz-5.11.0-44-generic   # other ubuntu install

But when I actually get to the grub screen at boot, the only choices I have for ubuntu studio are :
5.13.0-22
5.13.0-21 (no longer installed : if I choose this one, the boot fails)
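
To compare what a given grub.cfg offers with what the boot menu really shows, the kernel versions can be pulled out of the file. A minimal sketch, assuming the usual generated format where each entry has a "linux" (or "linuxefi") line pointing at a vmlinuz-<version> file. The function name is made up for illustration:

```shell
#!/bin/sh
# Read a grub.cfg on stdin and print the distinct kernel versions it references.
# (grub_cfg_kernels is a hypothetical helper)
grub_cfg_kernels() {
    sed -n 's/^[[:space:]]*linux[^[:space:]]*[[:space:]]\{1,\}[^[:space:]]*vmlinuz-\([^[:space:]]*\).*/\1/p' | sort -u
}

# Typical call, to be run on each installed system in turn:
# grub_cfg_kernels < /boot/grub/grub.cfg
```

If the versions printed for one install do not match the menu shown at boot, the menu is probably being built from another install's grub.cfg.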

I suspect a problem with the grub update during system updates, something like “ubuntu studio lets the other ubuntu install deal with grub”…

Thanks again for your help !

I launched this on ubuntu studio :

os-prober
update-grub

and… nothing changed in the grub menu, I still have the old kernels. I guess I just did what the system update did.

I booted the other ubuntu partition and launched the same commands, and bingo : the new kernels were detected, and they now appear at boot. By default ubuntu studio now boots 5.13.0-39-lowlatency and nvidia-settings is working.

I still do not understand exactly what happens, but it seems that only the regular ubuntu can handle grub.

  • ubuntu has /boot on sda7
  • ubuntu studio has /boot on sda8
  • the boot partition UEFI is on sda2

Anyway, even if you don’t have a perfect explanation for this, a big thanks for your help. I now know what to do (run update-grub on the other ubuntu after each update) !
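
One possible explanation, not confirmed in this thread : on Ubuntu UEFI installs, the small grub.cfg on the EFI partition contains a search.fs_uuid line naming the filesystem whose /boot/grub/grub.cfg is loaded, and both ubuntu installs may share that file. A sketch for checking which install it points to follows. The function name is made up, and the path /boot/efi/EFI/ubuntu/grub.cfg in the usage comment assumes the default mount point of the EFI partition :

```shell
#!/bin/sh
# $1: path to the stub grub.cfg on the EFI partition
# $2: UUID of the filesystem that holds this install's /boot
# Prints "yes" when the stub points at this install, "no (...)" otherwise.
# (grub_reads_this_install is a hypothetical helper)
grub_reads_this_install() {
    stub_uuid=$(awk '$1 == "search.fs_uuid" { print $2; exit }' "$1")
    if [ "$stub_uuid" = "$2" ]; then
        echo "yes"
    else
        echo "no (stub points to $stub_uuid)"
    fi
}

# Typical call (reading the EFI partition may require root):
# grub_reads_this_install /boot/efi/EFI/ubuntu/grub.cfg \
#     "$(findmnt -no UUID -T /boot)"
```

If this prints "no" on ubuntu studio, then update-grub there writes a grub.cfg that is never read at boot, which would match what I observed.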