Cannot get the nvidia driver to work

nvidia-bug-report.log.gz (213.8 KB)

I cannot get the NVIDIA driver to work on Ubuntu 22.04 LTS. Installing the driver, either through Ubuntu's drivers GUI or through the PPA, and then selecting “nvidia” with prime-select causes a black screen.

sudo ubuntu-drivers list

outputs

dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-535-server-open
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-545-open
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-535
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-470
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-535-open
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-535-server
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-450-server
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-470-server
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
dpkg: warning: version 'unsigned-5.8.0-050800' has bad syntax: version number does not start with digit
nvidia-driver-545
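
The dpkg warnings repeat before every entry and bury the actual driver list. A minimal sketch for hiding them, assuming every warning line starts with the literal prefix `dpkg: warning:` as in the output above. `drop_dpkg_warnings` is a hypothetical helper name, not part of ubuntu-drivers.

```shell
# Hypothetical helper: drop dpkg warning lines, keep everything else.
drop_dpkg_warnings() {
    grep -v '^dpkg: warning:'
}

# Intended use (2>&1 merges stderr in case the warnings arrive there):
#   sudo ubuntu-drivers list 2>&1 | drop_dpkg_warnings
```

This only hides the noise. The package carrying the version string 'unsigned-5.8.0-050800' is still installed and would need to be dealt with separately.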

First of all, delete /etc/X11/xorg.conf. It’s not suitable for Optimus laptops.

Second, you made quite a mess by mixing .run file installs and distro installs.
Now you have ended up with a version mismatch between the userspace and kernel components of the driver:

Apr 30 19:12:11 mark2 kernel: NVRM: API mismatch: the client has the version 535.171.04, but
NVRM: this kernel module has the version 470.239.06. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
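
The two versions in that message can be pulled out of a log mechanically. A rough sketch, assuming the message keeps the exact wording shown above. `nvrm_mismatch_versions` is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print the two
# versions named in an "NVRM: API mismatch" message.
nvrm_mismatch_versions() {
    # Strip the "NVRM:" prefixes, then join the wrapped lines so that both
    # version numbers end up on a single line.
    sed 's/NVRM: *//g' | { tr '\n' ' '; echo; } |
        sed -n 's/.*client has the version \([0-9.]*[0-9]\).*kernel module has the version \([0-9.]*[0-9]\).*/client=\1 kernel=\2/p'
}

# Intended use:
#   sudo dmesg | nvrm_mismatch_versions
```

For the log above this should print client=535.171.04 kernel=470.239.06, i.e. the userspace libraries are newer than the kernel module that got loaded.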

1: execute the .run file installer with the --uninstall parameter.
2: sudo apt purge 'nvidia*' 'libnvidia*'
3: sudo apt install nvidia-driver-550
4: reboot

If it does not work create a new bug report and post the output of dpkg -l | grep nvidia
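
To see at a glance whether the installed packages all belong to one driver version, the dpkg listing can be reduced to its distinct versions. This is only a sketch: `nvidia_branch_versions` is a hypothetical helper, and it assumes the usual Ubuntu naming where driver packages carry a three-digit branch number (for example nvidia-dkms-550).

```shell
# Hypothetical helper: read `dpkg -l` text on stdin and print each distinct
# upstream version among installed, branch-numbered NVIDIA packages.
nvidia_branch_versions() {
    awk '$1 == "ii" && $2 ~ /nvidia/ && $2 ~ /-[0-9][0-9][0-9]([-:]|$)/ {
             split($3, v, "-")   # drop the Debian revision after the first "-"
             print v[1]
         }' | sort -u
}

# Intended use:
#   dpkg -l | nvidia_branch_versions
```

A single line of output suggests the packaged components agree with each other. Files left behind by a .run installer are not tracked by dpkg, so this check cannot see them.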

@Mart Thank you for the response!

  1. I deleted the xorg.conf file.
  2. Using the --uninstall parameter shows the error “There is no NVIDIA driver currently installed”. I used the 470.239.06 .run file. I still went through the remaining steps and rebooted.
  3. nvidia-smi still returns
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

nvidia-bug-report.log.gz (164.7 KB)

ii  libnvidia-cfg1-550:amd64                                    550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-550                                        550.78-0ubuntu0~gpu22.04.1                                     all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-550:amd64                                 550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA libcompute package
ii  libnvidia-compute-550:i386                                  550.78-0ubuntu0~gpu22.04.1                                     i386         NVIDIA libcompute package
ii  libnvidia-decode-550:amd64                                  550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-550:i386                                   550.78-0ubuntu0~gpu22.04.1                                     i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-550:amd64                                  550.78-0ubuntu0~gpu22.04.1                                     amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-550:i386                                   550.78-0ubuntu0~gpu22.04.1                                     i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-550:amd64                                   550.78-0ubuntu0~gpu22.04.1                                     amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-550:amd64                                    550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-550:i386                                     550.78-0ubuntu0~gpu22.04.1                                     i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-550:amd64                                      550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-550:i386                                       550.78-0ubuntu0~gpu22.04.1                                     i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  nvidia-compute-utils-550                                    550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA compute utilities
ii  nvidia-dkms-550                                             550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA DKMS package
ii  nvidia-driver-550                                           550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA driver metapackage
ii  nvidia-firmware-550-550.78                                  550.78-0ubuntu0~gpu22.04.1                                     amd64        Firmware files used by the kernel module
ii  nvidia-kernel-common-550                                    550.78-0ubuntu0~gpu22.04.1                                     amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-550                                    550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA kernel source package
ii  nvidia-prime                                                0.8.17.1                                                       all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                                             510.47.03-0ubuntu1                                             amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-550                                            550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA driver support binaries
ii  screen-resolution-extra                                     0.18.2                                                         all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-550                               550.78-0ubuntu0~gpu22.04.1                                     amd64        NVIDIA binary Xorg driver
  4. sudo prime-select query returns on-demand
  5. ubuntu-drivers devices returns
== /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0 ==
modalias : pci:v000010DEd00001F99sv0000103Csd000087B1bc03sc00i00
vendor   : NVIDIA Corporation
model    : TU117M
manual_install: True
driver   : nvidia-driver-535-open - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-535 - distro non-free recommended
driver   : nvidia-driver-545 - distro non-free
driver   : nvidia-driver-535-server - distro non-free
driver   : nvidia-driver-535-server-open - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-545-open - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

  6. I still end up getting the same output as before with sudo ubuntu-drivers list

Installation looks good now.
No errors to see, but the driver does not load.

Things that come to mind:
1: The initramfs was not updated - try: sudo update-initramfs -u -k all and reboot.
2: The driver is blacklisted: show the output of grep -R nvidia /etc/modprobe.d/

/etc/modprobe.d/evdi.conf:softdep evdi pre: nvidia_drm amdgpu 
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:# This file was generated by nvidia-driver-550
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:options nvidia-drm modeset=1

Most likely unrelated, but there is an error in your kernel boot parameters:

[ 0.030103] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.5.0-26-generic root=UUID=7d980f33-23ab-40df-8c21-34c820598727 ro resume=UUID=7d980f33-23ab-40df-8c21-34c820598727 resume_offset= 16123904 amdgpu.exp_hw_support=1
[ 0.030193] Unknown kernel command line parameters "16123904 BOOT_IMAGE=/boot/vmlinuz-6.5.0-26-generic", will be passed to user space.

resume_offset= 16123904 - the space after the equal sign needs to be removed.
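
A stray space like this can be spotted by looking for parameters that end in a bare equal sign. A small sketch, with `dangling_kernel_params` as a hypothetical helper name:

```shell
# Hypothetical helper: read a kernel command line on stdin and print every
# parameter that ends in "=" (its value was split off by whitespace).
dangling_kernel_params() {
    tr -s ' \t' '\n\n' | grep '=$'
}

# Intended use:
#   dangling_kernel_params < /proc/cmdline
```

A parameter that was deliberately given an empty value would be flagged too, so treat the output as a hint rather than proof.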
Why do you need resume_offset and amdgpu.exp_hw_support?

I don’t really know why the driver does not load.

What happens if you do sudo modprobe -vv nvidia ?

Did you try sudo prime-select nvidia - reboot?

I removed all the parameters and it’s just blank now.

With prime-select set to intel, the output of sudo modprobe -vv nvidia is

modprobe: INFO: ../libkmod/libkmod.c:367 kmod_set_log_fn() custom logging function 0x5b81b9b55830 registered
modprobe: ERROR: ../libkmod/libkmod-module.c:838 kmod_module_insert_module() could not find module by name='off'
modprobe: ERROR: could not insert 'off': Unknown symbol in module, or unknown parameter (see dmesg)
modprobe: INFO: ../libkmod/libkmod.c:334 kmod_unref() context 0x5b81ba1ad460 released

After switching to nvidia with prime-select and rebooting, I only see a black screen on both the laptop display and the external monitor. After switching to a console, the output of sudo modprobe -vv nvidia is

modprobe: INFO: ../libkmod/libkmod.c:367 kmod_set_log_fn() custom logging function 0x5d655d80a830 registered
modprobe: INFO: ../libkmod/libkmod.c:334 kmod_unref() context 0x5d655eb2c430 released

I suspect there is still a configuration file somewhere that turns off the nvidia module.
Check all the possible directories in which modprobe looks for config files:
grep -R nvidia /lib/modprobe.d/ /usr/local/lib/modprobe.d /run/modprobe.d /etc/modprobe.d

entries like alias nvidia off would produce the error we are seeing.
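
The grep output can be narrowed down to just the lines that disable the driver. A minimal sketch, assuming the input has the file:line format that grep -R prints. `nvidia_disablers` is a hypothetical helper name, and the module list in the pattern (nvidia, nvidia-drm, nvidia-modeset, nvidia-uvm) is an assumption about which modules matter here.

```shell
# Hypothetical helper: read `grep -R nvidia <dirs>` output on stdin and keep
# only the lines that blacklist an nvidia module or alias it to "off".
# "blacklist nvidiafb" is deliberately not matched.
nvidia_disablers() {
    grep -E ':[[:space:]]*(blacklist|alias)[[:space:]]+nvidia([-_](drm|modeset|uvm))?([[:space:]]+off)?[[:space:]]*$'
}

# Intended use:
#   grep -R nvidia /lib/modprobe.d/ /etc/modprobe.d 2>/dev/null | nvidia_disablers
```

Empty output means none of the searched files turns the module off. Any line it prints names the file to look at.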

@Mart Should I do this in prime-select intel or prime-select nvidia mode?

When prime-select query says intel, the output is

/lib/modprobe.d/nvidia-kms.conf:# This file was generated by nvidia-prime
/lib/modprobe.d/nvidia-kms.conf:options nvidia-drm modeset=1
/lib/modprobe.d/nvidia-graphics-drivers.conf:options nvidia-drm modeset=1
/lib/modprobe.d/blacklist-nvidia.conf:# This file was generated by nvidia-prime
/lib/modprobe.d/blacklist-nvidia.conf:blacklist nvidia
/lib/modprobe.d/blacklist-nvidia.conf:blacklist nvidia-drm
/lib/modprobe.d/blacklist-nvidia.conf:blacklist nvidia-modeset
/lib/modprobe.d/blacklist-nvidia.conf:alias nvidia off
/lib/modprobe.d/blacklist-nvidia.conf:alias nvidia-drm off
/lib/modprobe.d/blacklist-nvidia.conf:alias nvidia-modeset off
grep: /usr/local/lib/modprobe.d: No such file or directory
grep: /run/modprobe.d: No such file or directory
/etc/modprobe.d/evdi.conf:softdep evdi pre: nvidia_drm amdgpu 
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:# This file was generated by nvidia-driver-550
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:options nvidia-drm modeset=1

I see the alias nvidia off, but isn’t that expected when prime-select is set to intel?

If you choose intel with prime-select, then these files are generated.
When selecting on-demand or nvidia the files should be removed.

Edit: to be precise… the ones that turn off, or blacklist the nvidia modules.
Edit 2: /etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb ← this one should still be there.

@Mart This is with nvidia selected on prime-select

/lib/modprobe.d/nvidia-kms.conf:# This file was generated by nvidia-prime
/lib/modprobe.d/nvidia-kms.conf:options nvidia-drm modeset=1
/lib/modprobe.d/nvidia-graphics-drivers.conf:options nvidia-drm modeset=1
grep: /usr/local/lib/modprobe.d: No such file or directory
grep: /run/modprobe.d: No such file or directory
/etc/modprobe.d/evdi.conf:softdep evdi pre: nvidia_drm amdgpu 
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:# This file was generated by nvidia-driver-550
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:options nvidia-drm modeset=1

Does this look right?

Yes.
Can you modprobe the module now?

@Mart Do you mean the “sudo modprobe -vv nvidia” ? If so the output is very similar to

Looks like it’s loading.
You can check with lsmod | grep nvidia

So if you prime-select nvidia and reboot, does it work now?
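
The lsmod check can be made a little stricter so that it only reports the driver's own modules and not, for example, nouveau lines that merely mention nvidia. A sketch, with `loaded_nvidia_modules` as a hypothetical helper name and the module list as an assumption:

```shell
# Hypothetical helper: read `lsmod` text on stdin and print the names of
# loaded modules that belong to the proprietary driver.
loaded_nvidia_modules() {
    awk '$1 ~ /^nvidia(_drm|_modeset|_uvm)?$/ { print $1 }'
}

# Intended use:
#   lsmod | loaded_nvidia_modules
```

No output means none of these modules is currently loaded.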

I can’t boot into the GUI if I select Nvidia under prime-select.

Please do a fresh boot after selecting nvidia and from the console create a new bug report. Then switch to intel, or on-demand to upload it.

nvidia-bug-report.log.gz (363.5 KB)
This is the new bug report.

Ok, at least the nvidia driver now loads fine without errors.
Could you please run sudo prime-select on-demand. Reboot and then post the output of:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxinfo|grep vendor
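
For repeated tests, the two environment variables can be wrapped in a small function so they apply to a single command only. `nv_offload` is a hypothetical name chosen for this sketch.

```shell
# Hypothetical wrapper: run one command with the two PRIME render offload
# variables set for that command only.
nv_offload() {
    env __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia "$@"
}

# Intended use:
#   nv_offload glxinfo | grep vendor
```

glxinfo needs a running graphical session, so this should be run from a terminal inside the desktop rather than from a text console.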

It says Error: unable to open display

nvidia-smi works though

The needed amdgpu driver is installed but not loading. Did you blacklist it? If so, please revert.
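
One way to confirm which kernel driver each GPU is actually bound to is to summarise the output of lspci -k. This is a rough sketch that assumes the usual lspci -k layout (a device line followed by indented "Kernel driver in use:" lines). `gpu_kernel_drivers` is a hypothetical helper name.

```shell
# Hypothetical helper: read `lspci -k` text on stdin and print, for every
# graphics device, its PCI address and the kernel driver bound to it.
gpu_kernel_drivers() {
    awk '
        function emit() {
            if (dev != "") print dev " -> " (drv == "" ? "(none)" : drv)
        }
        # A line starting with a hex digit opens a new PCI device entry.
        /^[0-9a-fA-F]/ {
            emit(); dev = ""; drv = ""
            if ($0 ~ /VGA compatible controller|3D controller|Display controller/) dev = $1
            next
        }
        # Indented detail line of the current graphics device.
        dev != "" && /Kernel driver in use:/ { drv = $NF }
        END { emit() }
    '
}

# Intended use:
#   lspci -k | gpu_kernel_drivers
```

A device reported with (none) has no driver bound. For the integrated GPU that would match the symptom of amdgpu being installed but not loading.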