I have tried multiple Ubuntu versions and the driver never loads, even though the same setup works on all my other NVIDIA builds.
Is the A40 the problem? Any ideas?
[ 2.021946] nvidia: loading out-of-tree module taints kernel.
[ 2.021955] nvidia: module license 'NVIDIA' taints kernel.
[ 2.021956] Disabling lock debugging due to kernel taint
[ 2.022125] kvm: Nested Virtualization enabled
[ 2.022145] SVM: kvm: Nested Paging enabled
[ 2.025494] Huh? What family is it: 0x19?!
[ 2.034825] pcieport 0000:00:1c.3: pciehp: Failed to check link status
[ 2.040325] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 2.042761] pcieport 0000:00:1c.1: pciehp: Failed to check link status
[ 2.050764] pcieport 0000:00:1c.2: pciehp: Failed to check link status
[ 2.053052] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[ 2.054912] NVRM: request_mem_region failed for 0M @ 0x0. This can
NVRM: occur when a driver such as rivatv is loaded and claims
NVRM: ownership of the device's registers.
[ 2.055355] nvidia: probe of 0000:01:00.0 failed with error -1
[ 2.055370] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 2.055370] NVRM: None of the NVIDIA devices were initialized.
[ 2.055964] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[ 2.074192] audit: type=1400 audit(1624618208.032:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=596 comm="apparmor_parser"
[ 2.074196] audit: type=1400 audit(1624618208.032:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=596 comm="apparmor_parser"
[ 2.074735] audit: type=1400 audit(1624618208.032:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=598 comm="apparmor_parser"
[ 2.074738] audit: type=1400 audit(1624618208.036:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=598 comm="apparmor_parser"
[ 2.074739] audit: type=1400 audit(1624618208.036:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=598 comm="apparmor_parser"
[ 2.074785] audit: type=1400 audit(1624618208.036:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=591 comm="apparmor_parser"
[ 2.075746] audit: type=1400 audit(1624618208.036:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=590 comm="apparmor_parser"
[ 2.076467] audit: type=1400 audit(1624618208.036:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=601 comm="apparmor_parser"
[ 2.076663] audit: type=1400 audit(1624618208.036:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cups-browsed" pid=592 comm="apparmor_parser"
[ 2.356483] Huh? What family is it: 0x19?!
[ 2.411622] Huh? What family is it: 0x19?!
[ 2.463269] Huh? What family is it: 0x19?!
[ 2.531640] Huh? What family is it: 0x19?!
[ 2.711943] Huh? What family is it: 0x19?!
[ 2.716228] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[ 2.717653] NVRM: request_mem_region failed for 0M @ 0x0. This can
NVRM: occur when a driver such as rivatv is loaded and claims
NVRM: ownership of the device's registers.
[ 2.717986] nvidia: probe of 0000:01:00.0 failed with error -1
[ 2.718003] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 2.718003] NVRM: None of the NVIDIA devices were initialized.
[ 2.718226] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
k@u2104serv01:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
k@u2104serv01:~$ sudo lspci -s 01:00 -v
01:00.0 3D controller: NVIDIA Corporation GA102GL [RTX A40] (rev a1)
Subsystem: NVIDIA Corporation Device 145a
Physical Slot: 0
Flags: fast devsel, IRQ 16
Memory at (32-bit, non-prefetchable) [disabled]
Memory at (64-bit, prefetchable) [disabled]
Memory at (64-bit, prefetchable) [disabled]
Capabilities: [60] Power Management version 3
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [c8] MSI-X: Enable- Count=6 Masked-
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
k@u2104serv01:~$ lspci -n -s 01:00
01:00.0 0302: 10de:2235 (rev a1)
k@u2104serv01:~$ grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
k@u2104serv01:~$ grep nouv /etc/modprobe.d/* /lib/modprobe.d/*
/lib/modprobe.d/nvidia-graphics-drivers.conf:blacklist nouveau
/lib/modprobe.d/nvidia-graphics-drivers.conf:blacklist lbm-nouveau
/lib/modprobe.d/nvidia-graphics-drivers.conf:alias nouveau off
/lib/modprobe.d/nvidia-graphics-drivers.conf:alias lbm-nouveau off
k@u2104serv01:~$ sudo modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': No such device
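In case it helps with diagnosis: the `request_mem_region failed for 0M @ 0x0` message and the `[disabled]` memory regions in the lspci output seem to suggest that the card's BARs never got addresses assigned. Below is a rough sketch for checking that from sysfs. It is not output from my machine. It assumes the usual sysfs `resource` file layout (one `start end flags` line per region, in zero-padded hex, with the first six lines being BAR0 to BAR5) and that an unassigned BAR shows a zero start address with non-zero flags. The function name `unassigned_bars` is made up for illustration.

```shell
#!/bin/sh
# Print the BARs of a PCI device that exist but have no address assigned.
# Assumption: each line of the sysfs "resource" file is "start end flags"
# in zero-padded hex, and lines 1-6 correspond to BAR0..BAR5.
unassigned_bars() {
    resource_file="$1"
    zero=0x0000000000000000
    bar=0
    while [ "$bar" -lt 6 ] && read -r start end flags; do
        # Non-zero flags mean the BAR exists on the device; a zero start
        # address is taken here to mean it was never given an address.
        if [ "$flags" != "$zero" ] && [ "$start" = "$zero" ]; then
            echo "BAR$bar"
        fi
        bar=$((bar + 1))
    done < "$resource_file"
}

# Device address 0000:01:00.0 is taken from the lspci output above.
dev=/sys/bus/pci/devices/0000:01:00.0/resource
if [ -r "$dev" ]; then
    unassigned_bars "$dev"
fi
```

If this prints any BAR names for the A40, the driver has no register window to map, which would be consistent with the probe failing with error -1.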
nvidia-bug-report.log.gz (725.1 KB)