NVIDIA driver 460 installation fails on GeForce GTX 970 (Ubuntu 20.04): endless login loop and missing packages

After upgrading to Ubuntu 20.04 and installing the NVIDIA driver, I can't log in because of an endless login loop.
(LightDM, GTX 970, kernel 5.4.0-64)

Before the OS upgrade, OpenGL and Vulkan programs worked without any problems under Ubuntu 14.04 with driver 364.

After upgrading to Ubuntu 20.04, I tested three drivers: nvidia-390, 450, and 460 (460 is the one recommended by ubuntu-drivers).
Commands used: sudo ubuntu-drivers autoinstall (for 460), sudo apt install nvidia-xxx (for 450 and 390)

< Issue 1 > Endless login loop (except xfce4)

If I choose Unity, GNOME 3, Ubuntu, …, every login attempt gets stuck in the endless login loop.
Only with XFCE4 does the login succeed, but the driver doesn't seem to be installed properly:

$ glxinfo | grep "direct rendering"

X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  151 (GLX)
  Minor opcode of failed request:  24 (X_GLXCreateNewContext)
  Value in failed request:  0x0
  Serial number of failed request:  99
  Current serial number in output stream:  100

< Issue 2 > Missing package(s)

When I installed driver 460, the nvidia-dkms package was not installed.
When I installed drivers 390 and 450, the linux-modules-nvidia packages were not installed.

To sum up:

  1. nvidia-driver-460
    linux-modules-nvidia-460-5.4.0-64-generic (OK)
    linux-modules-nvidia-460-generic (OK)
    nvidia-dkms package (Not built)

  2. nvidia-driver-450
    linux-modules-nvidia-450-5.4.0-64-generic (Not built)
    linux-modules-nvidia-450-generic (Not built)
    nvidia-dkms package (OK)

  3. nvidia-driver-390
    linux-modules-nvidia-390-5.4.0-64-generic (Not built)
    linux-modules-nvidia-390-generic (Not built)
    nvidia-dkms package (OK)

So I had to install the missing packages myself afterwards (sudo apt install xxx).

I tested with two kernel versions (5.4.0-62 and 5.4.0-64), and the results were the same.
For the test, I reinstalled the build essentials, and purged and reinstalled the related packages (dkms, lightdm, and all the GNOME packages).

I attached the bug reports for all three cases. The nvidia-smi output is also attached.

thanks,
jaden.

nvidia-bug-report_v460p.log.gz (343.2 KB)
nvidia-bug-report_v450.log.gz (359.5 KB)
nvidia-bug-report_v390.log.gz (195.0 KB)
nvidia-smi.txt (1.5 KB)

Which Ubuntu version did you run prior to upgrading?

Ubuntu 14.04 with driver 364.

I upgraded to 16.04, 18.04, and 20.04 in succession, over SSH.

Did you check whether everything was working when running 18.04? 16.04->18.04 was a crucial update: it changed the OpenGL client library layout to glvnd. According to the logs, the X server part looks fine. Please post the output of
ls -l /usr/lib/libGL*

I didn't check on 18.04.
(Upgrading this way went fine on another computer (Lenovo B570 laptop, nvidia-390), so I didn't expect this problem…)

There are no files matching "/usr/lib/libGL*".
Instead, I attached the output for "/usr/lib/x86_64-linux-gnu/libGL*":
libgl.txt (3.6 KB)
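
A note on why the original glob finds nothing: on 64-bit Ubuntu 20.04 the GL client libraries live in the multiarch directory rather than directly in /usr/lib, and with glvnd the libGL.so.1 front end dispatches to a vendor library such as libGLX_nvidia.so.0. Below is a rough sketch for listing which copies the dynamic linker cache knows about. It assumes the usual `ldconfig -p` line layout (`name (flags) => path`), and the helper name `lib_paths` is made up for illustration.

```shell
# lib_paths NAME
# Read `ldconfig -p` style output on stdin and print the resolved path
# of every cache entry whose library name is exactly NAME.
lib_paths() {
    awk -v name="$1" '$1 == name { print $NF }'
}

# Typical use on the affected machine (not executed here):
#   ldconfig -p | lib_paths libGL.so.1
#   ldconfig -p | lib_paths libGLX_nvidia.so.0
```

If the second query prints nothing, the NVIDIA GLX vendor library is probably not registered with the linker, which would be worth ruling out before looking elsewhere.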

The library setup looks fine. The only thing I could come up with is that the libraries might be compiled for some older version of Ubuntu. Did you check /etc/apt/sources.list and /etc/apt/sources.list.d/* for leftover repos from old Ubuntu versions?
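
One way to do that check without reading every file by eye is sketched below. It prints the active (non-commented) `deb` / `deb-src` entries whose suite does not start with the expected release codename. The function name `list_stale_repos` is hypothetical, and the sketch only handles the classic one-line source format, not deb822 `.sources` files.

```shell
# list_stale_repos CODENAME FILE...
# Print active "deb"/"deb-src" lines whose suite field does not start
# with CODENAME (e.g. "focal" for Ubuntu 20.04), prefixed by the file name.
list_stale_repos() {
    codename=$1
    shift
    awk -v cn="$codename" '
        $1 == "deb" || $1 == "deb-src" {
            i = 2
            # Skip an optional [option=value ...] block before the URL.
            if ($i ~ /^\[/) { while (i <= NF && $i !~ /\]$/) i++; i++ }
            suite = $(i + 1)            # the field right after the URL
            if (index(suite, cn) != 1) print FILENAME ": " $0
        }
    ' "$@"
}

# Typical use (not executed here):
#   list_stale_repos "$(lsb_release -cs)" /etc/apt/sources.list /etc/apt/sources.list.d/*.list
```

Restricting the glob to `*.list` matters: apt ignores backup files such as sources.list.distUpgrade, so entries found only there are harmless.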

1. Source list
Among the active source entries (those not commented out), there are some from the old 18.04.
source.list_from_18.04.txt (1.6 KB)

(graphics-drivers PPA: I tried installing the driver on 18.04 and it failed, but since 20.04 was the target version, I continued the OS upgrade.)

So… after removing the old source entries, can I try reinstalling the driver?

2. Old libraries
One more question: I have two GL libraries that are very old:
→ libglee and libglew 1.13

Do I need to delete these packages?

lrwxrwxrwx 1 root root 9 Sep 4 2013 libGLee.a -> libglee.a
lrwxrwxrwx 1 root root 10 Sep 4 2013 libGLee.so -> libglee.so

lrwxrwxrwx 1 root root 17 Nov 10 2015 libGLEW.so.1.13 -> libGLEW.so.1.13.0
-rw-r--r-- 1 root root 514176 Nov 10 2015 libGLEW.so.1.13.0
lrwxrwxrwx 1 root root 16 Jan 12 2019 libGLEW.so.2.1 -> libGLEW.so.2.1.0
-rw-r--r-- 1 root root 669584 Jan 12 2019 libGLEW.so.2.1.0

[ Update ]
Sorry, I was mistaken…
I think my source lists are OK…
All the 18.04 entries come from the file “sources.list.distUpgrade”, which is only related to the OS upgrade.
So my entire list is as follows:
repo_lists.txt (1.4 KB)

BTW, which packages are missing seems to depend on the command used.
For the same driver 460, if the ‘ubuntu-drivers’ command is used, nvidia-dkms is missing from the dependency list; with ‘apt install’, the linux-modules-nvidia packages are missing.
I’m not sure whether this is relevant…

$ sudo ubuntu-drivers autoinstall
(…)
The following additional packages will be installed:
(…)
libxxf86vm1:i386 linux-modules-nvidia-460-5.4.0-64-generic linux-modules-nvidia-460-generic mesa-vulkan-drivers:i386 nvidia-compute-utils-460 nvidia-driver-460 nvidia-kernel-common-460 nvidia-kernel-source-460 nvidia-prime nvidia-settings nvidia-utils-460 screen-resolution-extra xserver-xorg-video-nvidia-460
0 upgraded, 68 newly installed, 0 to remove and 0 not upgraded.

$ sudo apt install nvidia-driver-460
(…)
The following additional packages will be installed:
(…)
libxxf86vm1:i386 mesa-vulkan-drivers:i386 nvidia-compute-utils-460 nvidia-dkms-460 nvidia-driver-460 nvidia-kernel-common-460 nvidia-kernel-source-460 nvidia-prime nvidia-settings nvidia-utils-460 screen-resolution-extra xserver-xorg-video-nvidia-460
0 upgraded, 67 newly installed, 0 to remove and 0 not upgraded.
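
As far as I understand Ubuntu's packaging, the two sets of packages are alternative ways of providing the same kernel module: linux-modules-nvidia-* ships modules prebuilt for a specific kernel, while nvidia-dkms-* builds them locally through DKMS. If that is right, having only one of the two is expected rather than a fault. The sketch below reports which of them is present for a given driver series and kernel. The function names are made up for illustration, and the package naming pattern is taken from the listings above.

```shell
# expected_module_packages SERIES KERNEL_RELEASE
# Print the two package names that can each provide the kernel module.
expected_module_packages() {
    printf '%s\n' "linux-modules-nvidia-$1-$2" "nvidia-dkms-$1"
}

# is_installed PACKAGE
# Succeed only if dpkg reports the package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

# report_module_packages SERIES KERNEL_RELEASE
# Print one "name: installed" / "name: not installed" line per package.
report_module_packages() {
    for pkg in $(expected_module_packages "$1" "$2"); do
        if is_installed "$pkg"; then
            echo "$pkg: installed"
        else
            echo "$pkg: not installed"
        fi
    done
}

# Typical use (not executed here):
#   report_module_packages 460 "$(uname -r)"
#   dkms status            # shows whether a DKMS build exists for this kernel
#   lsmod | grep nvidia    # shows whether the module is actually loaded
```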

ubuntu_drivers_log.txt (29.9 KB)

apt_install_log.txt (21.5 KB)

Maybe try to clean your deps:
sudo apt clean
sudo apt update
sudo apt dist-upgrade

I tried, but it made no difference…
The bug report and Xorg.log files are attached…

nvidia-bug-report.log.gz (332.8 KB)
Xorg.0.log (54.6 KB)

** BTW, I’m not sure if this is relevant…
I heard that the Ubuntu tty layout has changed to something like:
– tty: Ctrl+Alt+F3 ~ F6, GUI: Ctrl+Alt+F1 ~ F2

Even though my machine’s OS has been upgraded to 20.04, it still has the old-style tty layout:
– tty: Ctrl+Alt+F1 ~ F6, GUI: Ctrl+Alt+F7

I guess you still have lightdm running; please try switching to gdm.
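
To see which display manager is currently the default before switching, something like the following may help. It is only a sketch: it assumes the Debian/Ubuntu convention of recording the default in /etc/X11/default-display-manager, and `current_display_manager` is a hypothetical helper name.

```shell
# current_display_manager [FILE]
# Print the basename of the display manager binary recorded in FILE
# (default: /etc/X11/default-display-manager, the Debian/Ubuntu convention).
current_display_manager() {
    file=${1:-/etc/X11/default-display-manager}
    [ -r "$file" ] || return 1
    # The file holds a single path such as /usr/sbin/lightdm.
    basename "$(head -n 1 "$file")"
}

# Typical use (not executed here):
#   current_display_manager
#   sudo apt install gdm3          # if it is not installed yet
#   sudo dpkg-reconfigure gdm3     # choose gdm3 in the dialog, then reboot
```

As far as I know, LightDM puts the graphical session on VT7 while gdm3 uses VT1 and VT2, which would account for the old-style tty layout described above.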

One modprobe config file from an old package caused the problem:
/etc/modprobe.d/virtualgl.conf

This file contains the line:

 options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=1002 NVreg_DeviceFileMode=0660

There seems to be some value mismatch after the OS upgrade,
because that package was installed under 14.04 (I will install a new version later).
So after I removed the modprobe config file (and the old package),
both LightDM and GDM3 worked perfectly, without any login loop!
Anyway…
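
For anyone hitting the same loop after a long chain of upgrades: a leftover file in /etc/modprobe.d can still pass options to the nvidia kernel module. With NVreg_DeviceFileGID=1002 and mode 0660, the /dev/nvidia* nodes would presumably be accessible only to root and members of group 1002, so it may be worth checking whether that group still exists and whether your user belongs to it. A minimal sketch follows; the helper names `find_nvidia_options` and `device_file_gid` are made up for illustration.

```shell
# find_nvidia_options DIR
# Print every active "options nvidia..." line found in DIR/*.conf,
# prefixed with the file it came from. Commented lines are skipped.
find_nvidia_options() {
    for f in "$1"/*.conf; do
        [ -f "$f" ] || continue
        awk '$1 == "options" && $2 ~ /^nvidia/ { print FILENAME ": " $0 }' "$f"
    done
}

# device_file_gid LINE
# Extract the NVreg_DeviceFileGID value from an options line, if present.
device_file_gid() {
    printf '%s\n' "$1" | sed -n 's/.*NVreg_DeviceFileGID=\([0-9][0-9]*\).*/\1/p'
}

# Typical use (not executed here):
#   find_nvidia_options /etc/modprobe.d
#   getent group 1002      # does the configured group still exist?
#   id -G                  # is the current user a member of it?
#   ls -l /dev/nvidia*     # actual owner, group and mode of the device nodes
```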

thank you so much for your help!!!