Issues installing 550 drivers under Ubuntu 23.10 mantic

I’m trying to install nvidia-driver-550 (550.40.07-0ubuntu0~gpu23.10.1) on a relatively unremarkable Ubuntu 23.10 box via the graphics-drivers PPA (the currently installed version is 535 from the same repo):

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-550

I’m using the standard Ubuntu-packaged kernel, 6.5.0-15-generic:

# uname -a
Linux gaia 6.5.0-15-generic #15-Ubuntu SMP PREEMPT_DYNAMIC Tue Jan  9 17:03:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

The install results in a DKMS build error:

...
Loading new nvidia-550.40.07 DKMS files...
Building for 6.5.0-15-generic
Building for architecture x86_64
Building initial module for 6.5.0-15-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-dkms-550.0.crash'
Error! Bad return status for module build on kernel: 6.5.0-15-generic (x86_64)
Consult /var/lib/dkms/nvidia/550.40.07/build/make.log for more information.
dpkg: error processing package nvidia-dkms-550 (--configure):
 installed nvidia-dkms-550 package post-installation script subprocess returned error exit status 10
Setting up libnvidia-encode-550:amd64 (550.40.07-0ubuntu0~gpu23.10.1) ...
Setting up libnvidia-encode-550:i386 (550.40.07-0ubuntu0~gpu23.10.1) ...
dpkg: dependency problems prevent configuration of nvidia-driver-550:
 nvidia-driver-550 depends on nvidia-dkms-550 (<= 550.40.07-1); however:
  Package nvidia-dkms-550 is not configured yet.
 nvidia-driver-550 depends on nvidia-dkms-550 (>= 550.40.07); however:
  Package nvidia-dkms-550 is not configured yet.

dpkg: error processing package nvidia-driver-550 (--configure):
 dependency problems - leaving unconfigured
No apport report written because the error message indicates its a followup error from a previous failure.

The make.log’s contents are:

# cat /var/lib/dkms/nvidia/550.40.07/build/make.log
DKMS make.log for nvidia-550.40.07 for kernel 6.5.0-15-generic (x86_64)
Tue Jan 30 01:28:47 PM MST 2024
make[1]: Entering directory '/usr/src/linux-headers-6.5.0-15-generic'
make --no-print-directory -C /usr/src/linux-headers-6.5.0-15-generic \
-f /usr/src/linux-headers-6.5.0-15-generic/Makefile modules
warning: the compiler differs from the one used to build the kernel
  The kernel was built by: x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0
  You are using:           cc (Ubuntu 13.2.0-4ubuntu3) 13.2.0
make -f ./scripts/Makefile.build obj=/var/lib/dkms/nvidia/550.40.07/build need-builtin=1 need-modorder=1
/var/lib/dkms/nvidia/550.40.07/build/Kbuild:233: /var/lib/dkms/nvidia/550.40.07/build/header-presence-tests.mk: No such file or directory
make[3]: *** No rule to make target '/var/lib/dkms/nvidia/550.40.07/build/header-presence-tests.mk'.  Stop.
make[2]: *** [/usr/src/linux-headers-6.5.0-15-generic/Makefile:2037: /var/lib/dkms/nvidia/550.40.07/build] Error 2
make[1]: *** [Makefile:234: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.5.0-15-generic'
make: *** [Makefile:85: modules] Error 2
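In a log like this, the cascade of "Error 2" lines only reports that a sub-make failed; the root cause is usually the first "No such file" or "No rule to make target" line. A minimal sketch for pulling that line out of a DKMS make.log follows. The function name `first_make_error` and the pattern list are my own assumptions, not part of DKMS.

```shell
# Print the first line (with its line number) in a make.log that looks like
# a root cause rather than a follow-on "Error 2" from a parent make.
# The pattern list is a guess at common failure messages, not exhaustive.
first_make_error() {
    log="$1"
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    grep -n -m1 -E 'No such file or directory|No rule to make target|error:' "$log"
}

# Usage (path taken from the dpkg output above):
#   first_make_error /var/lib/dkms/nvidia/550.40.07/build/make.log
```

The match is case-sensitive on purpose, so the trailing "Error 2" lines from the parent makes are not reported.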

The crash log is similar:

root@gaia:~# cat /var/crash/nvidia-dkms-550.0.crash
ProblemType: Package
DKMSBuildLog:
 DKMS make.log for nvidia-550.40.07 for kernel 6.5.0-15-generic (x86_64)
 Tue Jan 30 01:10:46 PM MST 2024
 make[1]: Entering directory '/usr/src/linux-headers-6.5.0-15-generic'
 make --no-print-directory -C /usr/src/linux-headers-6.5.0-15-generic \
 -f /usr/src/linux-headers-6.5.0-15-generic/Makefile modules
 warning: the compiler differs from the one used to build the kernel
   The kernel was built by: x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0
   You are using:           cc (Ubuntu 13.2.0-4ubuntu3) 13.2.0
 make -f ./scripts/Makefile.build obj=/var/lib/dkms/nvidia/550.40.07/build need-builtin=1 need-modorder=1
 /var/lib/dkms/nvidia/550.40.07/build/Kbuild:233: /var/lib/dkms/nvidia/550.40.07/build/header-presence-tests.mk: No such file or directory
 make[3]: *** No rule to make target '/var/lib/dkms/nvidia/550.40.07/build/header-presence-tests.mk'.  Stop.
 make[2]: *** [/usr/src/linux-headers-6.5.0-15-generic/Makefile:2037: /var/lib/dkms/nvidia/550.40.07/build] Error 2
 make[1]: *** [Makefile:234: __sub-make] Error 2
 make[1]: Leaving directory '/usr/src/linux-headers-6.5.0-15-generic'
 make: *** [Makefile:85: modules] Error 2
DKMSKernelVersion: 6.5.0-15-generic
Date: Tue Jan 30 13:10:47 2024
Package: nvidia-dkms-550 550.40.07-0ubuntu0~gpu23.10.1
PackageVersion: 550.40.07-0ubuntu0~gpu23.10.1
SourcePackage: nvidia-graphics-drivers-550
Title: nvidia-dkms-550 550.40.07-0ubuntu0~gpu23.10.1: nvidia kernel module failed to build

The only reference to header-presence-tests.mk is the include, so I don’t think it’s supposed to be generated:

root@gaia:~# rg "header-presence-tests.mk" /usr/src/nvidia-550.40.07
/usr/src/nvidia-550.40.07/Kbuild
233:include $(src)/header-presence-tests.mk

I did find this link, which suggests that the missing header-presence-tests.mk is present in the open kernel modules package, but that’s a different package from the proprietary one.

Checking that directory confirms that the file is indeed missing:

root@gaia:/var/lib/dkms# ls -l /var/lib/dkms/nvidia/550.40.07/build/
total 328
drwxr-xr-x 3 root root   4096 Jan 30 13:28 common
-rwxr-xr-x 1 root root 265207 Jan 17 11:34 conftest.sh
-rw-r--r-- 1 root root    922 Jan 17 11:34 count-lines.mk
-rw-r--r-- 1 root root   1195 Jan 29 00:07 dkms.conf
-rw-r--r-- 1 root root   9359 Jan 18 01:06 Kbuild
-rw-r--r-- 1 root root   4874 Jan 18 01:06 Makefile
-rw-r--r-- 1 root root   1143 Jan 30 13:28 make.log
drwxr-xr-x 5 root root   4096 Jan 30 13:28 nvidia
drwxr-xr-x 2 root root   4096 Jan 30 13:28 nvidia-drm
drwxr-xr-x 2 root root   4096 Jan 30 13:28 nvidia-modeset
drwxr-xr-x 2 root root   4096 Jan 30 13:28 nvidia-peermem
drwxr-xr-x 3 root root  12288 Jan 30 13:28 nvidia-uvm
drwxr-xr-x 2 root root   4096 Jan 30 13:28 patches
root@gaia:/var/lib/dkms# ls -l /var/lib/dkms/nvidia/550.40.07/
total 4
drwxr-xr-x 9 root root 4096 Jan 30 13:28 build
lrwxrwxrwx 1 root root   25 Jan 30 13:28 source -> /usr/src/nvidia-550.40.07
root@gaia:/var/lib/dkms# ls -l /usr/src/nvidia-550.40.07
total 324
drwxr-xr-x 3 root root   4096 Jan 30 13:28 common
-rwxr-xr-x 1 root root 265207 Jan 17 11:34 conftest.sh
-rw-r--r-- 1 root root    922 Jan 17 11:34 count-lines.mk
-rw-r--r-- 1 root root   1195 Jan 29 00:07 dkms.conf
-rw-r--r-- 1 root root   9359 Jan 18 01:06 Kbuild
-rw-r--r-- 1 root root   4874 Jan 18 01:06 Makefile
drwxr-xr-x 5 root root   4096 Jan 30 13:28 nvidia
drwxr-xr-x 2 root root   4096 Jan 30 13:28 nvidia-drm
drwxr-xr-x 2 root root   4096 Jan 30 13:28 nvidia-modeset
drwxr-xr-x 2 root root   4096 Jan 30 13:28 nvidia-peermem
drwxr-xr-x 3 root root  12288 Jan 30 13:28 nvidia-uvm
drwxr-xr-x 2 root root   4096 Jan 30 13:28 patches
root@gaia:/var/lib/dkms# find /usr/src/nvidia-550.40.07 -name "header-presence-tests.mk"
(no result found)
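The manual check above (grep for the include, then look for the file) can be generalised to every plain `include $(src)/...` line in the Kbuild. This is only a sketch: the function name `missing_kbuild_includes` is hypothetical, and it skips includes whose path contains further make variables, since those cannot be resolved without running make.

```shell
# List files referenced by "include $(src)/<path>" lines in <dir>/Kbuild
# that do not exist under <dir>. Lines using "-include" are ignored because
# make tolerates those being absent.
missing_kbuild_includes() {
    src_dir="$1"
    sed -n 's|^[[:space:]]*include[[:space:]][[:space:]]*\$(src)/\([^[:space:]]*\).*$|\1|p' \
        "$src_dir/Kbuild" |
    while IFS= read -r rel; do
        case "$rel" in
            *'$('*) continue ;;   # path still contains a make variable; skip
        esac
        [ -e "$src_dir/$rel" ] || printf '%s\n' "$rel"
    done
}

# Usage:
#   missing_kbuild_includes /usr/src/nvidia-550.40.07
```

An empty result would mean every statically named include is present in the source tree.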

Running an RTX 3060, FWIW.

❯ nvidia-smi
Tue Jan 30 13:38:37 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05             Driver Version: 535.154.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        Off | 00000000:0A:00.0  On |                  N/A |
| 46%   50C    P3              43W / 170W |   2035MiB / 12288MiB |     20%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Any suggestions on how to proceed? I haven’t found much reference to this issue elsewhere.

FWIW, just manually downloading that file into the src directory and then reconfiguring seems to complete the build. Maybe it just got missed during packaging?

# wget "https://fossies.org/linux/misc/NVIDIA-open-gpu-kernel-modules-550.40.07.tar.gz/open-gpu-kernel-modules-550.40.07/kernel-open/header-presence-tests.mk?m=t" -O /usr/src/nvidia-550.40.07/header-presence-tests.mk
root@gaia:~# dpkg --configure -a
Setting up nvidia-dkms-550 (550.40.07-0ubuntu0~gpu23.10.1) ...
update-initramfs: deferring update (trigger activated)
INFO:Enable nvidia
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/dell_latitude
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/lenovo_thinkpad
Removing old nvidia-550.40.07 DKMS files...
Deleting module nvidia-550.40.07 completely from the DKMS tree.
Loading new nvidia-550.40.07 DKMS files...
Building for 6.5.0-15-generic
Building for architecture x86_64
Building initial module for 6.5.0-15-generic
Done.

nvidia.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.5.0-15-generic/updates/dkms/

nvidia-modeset.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.5.0-15-generic/updates/dkms/

nvidia-drm.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.5.0-15-generic/updates/dkms/

nvidia-uvm.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.5.0-15-generic/updates/dkms/

nvidia-peermem.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.5.0-15-generic/updates/dkms/
depmod...
Setting up nvidia-driver-550 (550.40.07-0ubuntu0~gpu23.10.1) ...
Processing triggers for initramfs-tools (0.142ubuntu15.1) ...
update-initramfs: Generating /boot/initrd.img-6.5.0-15-generic
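One caveat with the wget workaround above: a web mirror can return an HTML page instead of the raw file, and make would then fail on it in a less obvious way. A cautious variant is to download to a temporary location first and only copy the file into the source tree once it looks like a makefile fragment. The helper name `install_mk_fragment` and its checks are my own assumptions, a sketch rather than a packaged fix.

```shell
# Copy a downloaded makefile fragment into a DKMS source tree, refusing
# files that are empty or that look like an HTML page.
install_mk_fragment() {
    fetched="$1"    # path of the downloaded file
    dest_dir="$2"   # e.g. /usr/src/nvidia-550.40.07
    name="$3"       # e.g. header-presence-tests.mk
    [ -s "$fetched" ] || { echo "empty or missing: $fetched" >&2; return 1; }
    if grep -qi -E '<html|<!doctype' "$fetched"; then
        echo "looks like HTML, not a makefile: $fetched" >&2
        return 1
    fi
    cp "$fetched" "$dest_dir/$name" && chmod 644 "$dest_dir/$name"
}

# Usage, as root, after downloading to /tmp:
#   install_mk_fragment /tmp/header-presence-tests.mk /usr/src/nvidia-550.40.07 header-presence-tests.mk
#   dpkg --configure -a
```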

After a reboot, 550 is loaded and running just fine:

❯ nvidia-smi
Tue Jan 30 13:48:43 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.40.07              Driver Version: 550.40.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:0A:00.0  On |                  N/A |
| 51%   52C    P0             48W /  170W |     599MiB /  12288MiB |      8%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
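For scripting the same post-reboot check, the driver version can be pulled out of the nvidia-smi banner. The sketch below reads the table on stdin, so it also works on saved output; the function name `driver_version_from_smi` is hypothetical.

```shell
# Read nvidia-smi's default table output on stdin and print the value of
# the "Driver Version" field from the first line that has one.
driver_version_from_smi() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage:
#   nvidia-smi | driver_version_from_smi
```

If parsing the table feels fragile, `nvidia-smi --query-gpu=driver_version --format=csv,noheader` should report the same value directly, and `cat /proc/driver/nvidia/version` shows which kernel module is actually loaded.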

Looks like the maintainer of the graphics-drivers PPA noticed as well and has updated the packages to include the file.

I’ve also installed the same driver on the same Ubuntu version, and some of my windows now look like this:

I tried the driver downloaded from the NVIDIA page but it gives me the same result.

Here is my nvidia-smi output:

Thu Feb  1 09:18:22 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.40.07              Driver Version: 550.40.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   45C    P0             43W /  285W |     876MiB /  16376MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     14263      G   /usr/lib/xorg/Xorg                            256MiB |
|    0   N/A  N/A     14448      G   /usr/bin/gnome-shell                          161MiB |
|    0   N/A  N/A     16469      G   /usr/lib/firefox-esr/firefox-esr              212MiB |
|    0   N/A  N/A     17533      G   /usr/bin/kgx                                  232MiB |
+-----------------------------------------------------------------------------------------+

And my NVIDIA bug report:
nvidia-bug-report.log.gz (763.2 KB)

Known bug in 550 beta, has already been reported.

For your information, I have managed to install nvidia-driver-550 (550.40.07) on a desktop computer with an RTX 4070 SUPER. Configuration: Ubuntu 22.04, kernel 6.5.0-18-generic #18~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC.

Could you please elaborate on this? Did you install the drivers by downloading them from the NVIDIA site, or from the PPA with:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-550

Thanks