Driver failed to load on Ubuntu 24 LTS

Hi NVIDIA,
I installed the driver with sudo ubuntu-drivers --gpgpu install

my video card:

lspci -k | grep -EA3 'VGA|3D|Display':

2d:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
Subsystem: LeadTek Research Inc. GA104 [GeForce RTX 3070]
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
2d:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)

sudo lshw -C display:

*-display
description: VGA compatible controller
product: GA104 [GeForce RTX 3070]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:2d:00.0
logical name: /dev/fb0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller cap_list fb
configuration: depth=32 latency=0 resolution=1920,1080
resources: iomemory:38ff0-38fef iomemory:38ff0-38fef memory:b9000000-b9ffffff memory:38ffe0000000-38ffefffffff memory:38fff0000000-38fff1ffffff ioport:8000(size=128) memory:ba080000-ba0fffff

but I got:
nvidia-smi:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

nvitop:

NVML ERROR: Driver Not Loaded

What I have done for troubleshooting:

lsmod | grep nvidia → blank output

lsb_release -a :

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble

sudo dmesg | grep -i nvidia:

[ 7.369542] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:2c/0000:2c:00.0/0000:2d:00.1/sound/card1/input9
[ 7.369914] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:2c/0000:2c:00.0/0000:2d:00.1/sound/card1/input10
[ 7.370132] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:2c/0000:2c:00.0/0000:2d:00.1/sound/card1/input11
[ 7.370345] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:2c/0000:2c:00.0/0000:2d:00.1/sound/card1/input12

cat /proc/driver/nvidia/version

cat: /proc/driver/nvidia/version: No such file or directory

dkms status

nvidia/570.86.15, 6.8.0-52-generic, x86_64: installed

rmmod nvidia

rmmod: ERROR: Module nvidia is not currently loaded

sudo modprobe nvidia

modprobe: ERROR: could not insert 'nvidia': Key was rejected by service

This is my log: nvidia-bug-report.log.gz (212.6 KB)

After reinstalling the driver several times, the problem is still there.
I have turned Secure Boot off on my HP Z8G4. Please help.

you have both nvidia and nouveau loaded:

$ wget -qO- https://forums.developer.nvidia.com/uploads/short-url/8SDJ0lXid5O5yLk6gxyQdeW573K.gz |gunzip |grep nouveau
/usr/share/vulkan/icd.d/nouveau_icd.x86_64.json
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

Although nvidia seems to be in control of the card, try blacklisting nouveau: maybe it will help. As root:

# echo 'blacklist nouveau' >/etc/modprobe.d/blacklist-nouveau.conf

...and reboot just in case.

Disable secure boot...
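
For context on why Secure Boot matters here: "Key was rejected by service" is the message modprobe shows when the kernel refuses a module's signature. With Secure Boot enforcing, a DKMS-built nvidia module usually loads only if the key that signed it has been enrolled as a Machine Owner Key (MOK). A rough sketch of that check, assuming mokutil and modinfo are installed; the helper name signature_hint is made up for illustration:

```shell
# Hypothetical helper: summarise whether a rejected signing key is plausible.
#   $1 = output of `mokutil --sb-state`
#   $2 = output of `modinfo -F signer nvidia` (empty when the module is unsigned)
signature_hint() {
    case "$1" in
        *"SecureBoot enabled"*)
            if [ -z "$2" ]; then
                echo "unsigned"        # an enforcing kernel is expected to refuse it
            else
                echo "signed-by: $2"   # compare against `mokutil --list-enrolled`
            fi
            ;;
        *)
            echo "not-enforcing"       # a signature problem is an unlikely cause
            ;;
    esac
}
```

It could be called as signature_hint "$(mokutil --sb-state)" "$(modinfo -F signer nvidia)". A "signed-by" answer is only a pointer: whether that key is actually trusted still has to be checked by hand.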


thx for reply!

cat blacklist.conf:

# This file lists those modules which we don't want to be loaded by
# alias expansion, usually so some other driver will be loaded for the
# device instead.

# evbug is a debug tool that should be loaded explicitly
blacklist evbug

# these drivers are very simple, the HID drivers are usually preferred
blacklist usbmouse
blacklist usbkbd

# replaced by e100
blacklist eepro100

# replaced by tulip
blacklist de4x5

# causes no end of confusion by creating unexpected network interfaces
blacklist eth1394

# snd_intel8x0m can interfere with snd_intel8x0, doesn't seem to support much
# hardware on its own (Ubuntu bug #2011, #6810)
blacklist snd_intel8x0m

# Conflicts with dvb driver (which is better for handling this device)
blacklist snd_aw2

# replaced by p54pci
blacklist prism54

# replaced by b43 and ssb.
blacklist bcm43xx

# most apps now use garmin usb driver directly (Ubuntu: #114565)
blacklist garmin_gps

# replaced by asus-laptop (Ubuntu: #184721)
blacklist asus_acpi

# low-quality, just noise when being used for sound playback, causes
# hangs at desktop session start (Ubuntu: #246969)
blacklist snd_pcsp

# ugly and loud noise, getting on everyone's nerves; this should be done by a
# nice pulseaudio bing (Ubuntu: #77010)
blacklist pcspkr

# EDAC driver for amd76x clashes with the agp driver preventing the aperture
# from being initialised (Ubuntu: #297750). Blacklist so that the driver
# continues to build and is installable for the few cases where its
# really needed.
blacklist amd76x_edac

blacklist nouveau
options nouveau modeset=0

So I thought I already had nouveau disabled.
Should I add another entry in blacklist-nouveau.conf?

hmm, that's really strange... I assume this blacklist.conf file is located in the /etc/modprobe.d/ folder, correct? Blacklisting it twice is unlikely to change anything, I guess, but it's so strange that I'm not sure of anything at this point...
Is it possible that these logs were gathered without rebooting after the installation of the Nvidia drivers? That's the only explanation I can think of...
Also maybe comment out the options nouveau modeset=0 line: not sure if that's how modprobe works, but maybe defining options for it overrides blacklisting...
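
To see where a module is actually blacklisted, the config directory can be searched directly. modprobe merges every *.conf file it finds there, so a second blacklist line in another file should not add anything. A minimal sketch, assuming the usual /etc/modprobe.d location; the helper name blacklisted_in is hypothetical:

```shell
# Hypothetical helper: list config files that actively blacklist a module.
#   $1 = module name, $2 = config directory (defaults to /etc/modprobe.d)
blacklisted_in() {
    # -l prints only the names of files containing a matching, uncommented line;
    # "no match" is a valid answer, so grep's non-zero status is ignored.
    grep -lE "^[[:space:]]*blacklist[[:space:]]+$1([[:space:]]|\$)" \
        "${2:-/etc/modprobe.d}"/*.conf 2>/dev/null || true
}
```

Running blacklisted_in nouveau prints the matching files, and modprobe --showconfig | grep nouveau shows the merged configuration. On Ubuntu the initramfs keeps its own copy of these files, so sudo update-initramfs -u followed by a reboot is probably needed before an edit takes effect at early boot.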

modinfo nvidiafb

filename:       /lib/modules/6.8.0-52-generic/kernel/drivers/video/fbdev/nvidia/nvidiafb.ko.zst
license:        GPL
description:    Framebuffer driver for nVidia graphics chipset
author:         Antonino Daplas
srcversion:     D6BF27984135B42B3EE8FFE
alias:          pci:v000010DEd*sv*sd*bc03sc*i*
depends:        fb_ddc,i2c-algo-bit,vgastate
retpoline:      Y
intree:         Y
name:           nvidiafb
vermagic:       6.8.0-52-generic SMP preempt mod_unload modversions
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        54:DF:B5:D8:C8:13:EA:72:4F:9B:B6:37:32:A2:D8:9E:DA:58:91:9F
sig_hashalgo:   sha512
signature:      87:F6:12:40:AE:73:BA:ED:AD:FC:97:B8:76:95:D4:92:9A:F3:94:16:
                CB:72:BF:BC:E1:64:C2:7F:BD:4F:E8:02:4B:3D:5E:23:5E:BF:D4:3A:
                63:09:39:25:8C:23:2F:21:02:C9:FB:88:29:CD:5E:EA:AE:66:15:C8:
                36:B1:3C:D6:AB:F1:CD:7A:C8:B7:1E:37:FD:50:F7:F6:A6:A3:FE:BA:
                73:33:67:92:39:4A:F4:8A:27:04:26:9C:63:7B:90:A5:15:5F:ED:7A:
                17:E6:0C:01:10:EA:03:B8:1E:43:67:43:17:DF:52:3C:57:64:7E:C2:
                46:9E:B2:FB:7A:2A:35:01:F3:47:85:DB:7A:70:6F:72:78:32:AD:8F:
                1E:19:5C:97:07:B5:B3:1A:C6:A8:C6:B6:51:10:CC:F0:FA:F8:96:82:
                27:B0:48:C4:65:AF:78:76:33:D1:B2:73:BC:0A:5F:58:8E:E7:E1:87:
                E3:F9:CC:5F:C6:09:D2:EC:17:7D:5F:47:CD:C8:F7:CF:1B:E2:E3:3E:
                0F:35:E6:2D:03:56:C7:00:C8:DB:1C:DD:C3:ED:36:58:A2:73:A6:E2:
                B9:B6:B3:C7:CE:7D:DC:A0:4C:33:90:0B:B9:6C:10:11:42:C4:95:03:
                49:90:64:38:61:46:CB:D5:61:3F:E5:BF:4F:CE:67:07:06:65:EE:D0:
                77:A4:20:D8:C1:33:07:60:CC:15:55:82:A5:9D:FA:30:3C:C9:FD:F9:
                DB:5E:8F:A4:BD:1E:73:63:C2:BC:48:7A:91:E3:71:66:E5:A9:1A:12:
                11:5A:DC:8A:DE:9F:08:B3:E1:9A:33:36:87:61:64:B2:25:7E:CD:CA:
                6E:B7:C7:73:30:57:94:AF:24:96:62:AA:FA:AE:F8:5A:CA:73:21:C8:
                7F:5F:EE:26:82:DD:9D:AF:82:A7:07:8C:24:EF:A5:B2:3F:60:0D:A0:
                C7:C0:E7:DC:2E:6A:96:B6:67:5B:01:8A:59:C8:40:D6:15:51:3E:03:
                A6:D2:5A:CB:37:81:63:AB:B7:03:F9:A5:BE:E2:B4:FF:CF:F1:7E:A5:
                CD:93:47:87:F5:3F:AF:14:7F:50:B2:72:4F:C7:21:91:38:35:F9:8A:
                FA:C9:EC:D9:6D:6C:5A:FA:89:D5:E2:10:C2:BC:47:49:88:B2:19:15:
                1F:E3:34:28:A1:90:23:1A:B0:27:06:F8:32:E8:29:67:F6:5F:3B:01:
                84:17:A7:F2:A2:C8:7B:51:02:CF:3D:B8:08:2C:3C:99:0F:BA:C5:25:
                64:2F:F6:F0:9B:B0:02:74:8A:16:84:3B:3D:D4:78:76:C8:E0:7D:F7:
                DE:9F:42:23:C4:75:92:60:FA:76:B5:E2
parm:           flatpanel:Enables experimental flat panel support for some chipsets. (0=disabled, 1=enabled, -1=autodetect) (default=-1) (int)
parm:           fpdither:Enables dithering of flat panel for 6 bits panels. (0=disabled, 1=enabled, -1=autodetect) (default=-1) (int)
parm:           hwcur:Enables hardware cursor implementation. (0 or 1=enabled) (default=0) (int)
parm:           noaccel:Disables hardware acceleration. (0 or 1=disable) (default=0) (int)
parm:           noscale:Disables screen scaling. (0 or 1=disable) (default=0, do scaling) (int)
parm:           paneltweak:Tweak display settings for flatpanels. (default=0, no tweaks) (int)
parm:           forceCRTC:Forces usage of a particular CRTC in case autodetection fails. (0 or 1) (default=autodetect) (int)
parm:           vram:amount of framebuffer memory to remap in MiB(default=0 - remap entire memory) (int)
parm:           mode_option:Specify initial video mode (charp)
parm:           bpp:pixel width in bits(default=8) (int)
parm:           reverse_i2c:reverse port assignment of the i2c bus (int)
parm:           nomtrr:Disables MTRR support (0 or 1=disabled) (default=0) (bool)

And you are right, Secure Boot is still active on Ubuntu:
mokutil --sb-state

SecureBoot enabled

Google told me to use sudo mokutil --set-sbat-policy delete to disable it. Should I proceed with it? Please give me your advice, and thanks in advance.

I have no idea about that either, but let me reboot again and collect nvidia-bug-report.log.gz for you.

please do comment-out options nouveau modeset=0 also before doing so! ;-)

You should be able to disable Secure boot in the bios too.


I think disabling secure-boot may only be done from UEFI/BIOS settings...
[@shelter was faster than me ;-) ]

now:
sudo mokutil --set-sbat-policy delete → blank output
options nouveau modeset=0 was commented out in /etc/modprobe.d/blacklist.conf
reboot

and I got
mokutil --sb-state:

SecureBoot enabled

I changed the boot option in the BIOS from "disable legacy support and enable secure boot" to "disable legacy support and disable secure boot", but Secure Boot seems to keep working.
It's an HP Z8G4. Should I do anything more about Secure Boot, or clear anything?

update:
latest bug report:
nvidia-bug-report.log.gz (212.6 KB)

$ wget -qO- https://forums.developer.nvidia.com/uploads/short-url/8SDJ0lXid5O5yLk6gxyQdeW573K.gz |gunzip |grep -i nouveau
  * Nouveau is running: any attempt to disable it will not take effect until after a reboot.
/usr/share/vulkan/icd.d/nouveau_icd.x86_64.json
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

I'm officially lost...
Does lsmod |grep -i nouveau confirm that it is indeed loaded?

idk, as you can see, it outputs nothing for me

ah, so this info in the log file must have been stale (probably it included installer logs or something...).
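
Part of the confusion may come from lspci -k itself: its "Kernel modules:" line lists modules that could drive the device, not modules that are loaded (the bound one is shown on a "Kernel driver in use:" line when present). lsmod is the more direct source. A small sketch of an exact-name check on lsmod output; the helper name is_loaded is hypothetical:

```shell
# Hypothetical helper: succeed if module $1 is listed in lsmod-style text on stdin.
# Matching the first column exactly avoids grep hits on the "Used by" column.
is_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

For example: lsmod | is_loaded nouveau && echo loaded || echo "not loaded". Note that lsmod prints module names with underscores (nvidia_drm, not nvidia-drm).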

I think I solved this problem: Secure Boot was not actually being disabled in the BIOS.
I updated the BIOS with the online flash tool from HP, and then I got:
sudo mokutil --sb-state

SecureBoot disabled
Platform is in Setup Mode

Then nvitop, nvidia-smi and nvcc came back to life.

thanks for your help @shelter and @morgwai666 sincerely!
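
For anyone landing here with the same symptoms, a quick way to confirm the fix after changing firmware settings is to look for /proc/driver/nvidia/version, which only exists while the nvidia kernel module is loaded (it was missing earlier in this thread). A minimal sketch; the helper name nvidia_driver_status and its optional procfs-root argument are assumptions made for illustration and testing:

```shell
# Hypothetical helper: report whether the NVIDIA kernel driver is up.
#   $1 = root of procfs (defaults to /proc; overridable for testing)
nvidia_driver_status() (
    proc=${1:-/proc}
    if [ -r "$proc/driver/nvidia/version" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
)
```

Together with mokutil --sb-state reporting "SecureBoot disabled", a "loaded" result suggests the module signature is no longer being rejected.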
