I have a machine that works well with the NVIDIA 460.86 driver.
Recently I needed to run something with CUDA 11.5, so I decided to upgrade the NVIDIA driver to 510.60.02 via ubuntu-drivers. After many attempts it still does not work: nvidia-smi only reports “No devices were found”.
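In dmesg I found the messages below. It looks like something may be wrong with snd_hda_intel, and NVRM also reports "RmInitAdapter failed" for both GPUs (0000:01:00.0 and 0000:06:00.0):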
[ 2.728502] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[ 2.738022] snd_hda_intel 0000:01:00.1: Disabling MSI
[ 2.738028] snd_hda_intel 0000:01:00.1: Handle vga_switcheroo audio client
[ 2.738163] snd_hda_intel 0000:06:00.1: Disabling MSI
[ 2.738167] snd_hda_intel 0000:06:00.1: Handle vga_switcheroo audio client
[ 2.742174] snd_hda_intel 0000:00:1f.3: no codecs found!
[ 2.769305] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input13
[ 2.769364] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input14
[ 2.769414] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input15
[ 2.769464] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input16
[ 2.769511] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input17
[ 2.769556] input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input18
[ 2.769603] input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input19
[ 2.771480] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input20
[ 2.771634] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input21
[ 2.771774] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input22
[ 2.771870] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input23
[ 2.771933] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input24
[ 2.772066] input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input25
[ 2.772221] input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci0000:00/0000:00:01.1/0000:06:00.1/sound/card2/input26
[ 2.779522] nvidia 0000:06:00.0: enabling device (0000 -> 0003)
[ 2.779630] nvidia 0000:06:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 2.822503] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.60.02 Wed Mar 16 11:24:05 UTC 2022
[ 2.847713] usb 1-10: reset high-speed USB device number 4 using xhci_hcd
[ 2.849613] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 510.60.02 Wed Mar 16 11:17:28 UTC 2022
[ 2.851203] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[ 2.871504] loop9: detected capacity change from 0 to 126760
[ 2.872835] intel_tcc_cooling: Programmable TCC Offset detected
[ 2.876060] intel_rapl_common: Found RAPL domain package
[ 2.876061] intel_rapl_common: Found RAPL domain core
[ 2.876063] intel_rapl_common: Found RAPL domain dram
[ 2.876065] intel_rapl_common: RAPL package-0 domain package locked by BIOS
[ 2.876069] intel_rapl_common: RAPL package-0 domain dram locked by BIOS
[ 2.998090] mt7601u 1-10:1.0: ASIC revision: 76010001 MAC revision: 76010500
[ 3.029930] mt7601u 1-10:1.0: EEPROM ver:0d fae:00
[ 3.243942] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 3.244144] usbcore: registered new interface driver mt7601u
[ 3.245823] mt7601u 1-10:1.0 wlx1cbfce8b566c: renamed from wlan0
[ 3.599635] loop10: detected capacity change from 0 to 509456
[ 4.051841] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1463)
[ 4.051970] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 4.052327] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[ 4.052509] [drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[ 4.052608] [drm] [nvidia-drm] [GPU ID 0x00000600] Loading driver
[ 4.319615] loop11: detected capacity change from 0 to 133552
[ 5.191284] NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x26:0x56:1463)
[ 5.191311] NVRM: GPU 0000:06:00.0: rm_init_adapter failed, device minor number 1
[ 5.191381] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000600] Failed to allocate NvKmsKapiDevice
[ 5.191526] [drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000600] Failed to register device
[ 5.194701] nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
[ 5.196445] nvidia-uvm: Loaded the UVM driver, major device number 506.
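For anyone who wants to pull just the failing lines out of a log like this, here is a rough Python sketch. The function name `find_nvrm_failures` and the regular expression are my own assumptions, written against the message format shown in the log above; this is not an NVIDIA tool, and other driver versions may word the message differently.

```python
import re

# Pattern assumed from the NVRM lines in the log above, for example:
#   NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1463)
NVRM_FAIL = re.compile(
    r"NVRM: GPU "
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]): "
    r"RmInitAdapter failed! \((?P<code>[^)]*)\)"
)


def find_nvrm_failures(dmesg_text):
    """Return (pci_address, error_code) pairs, one per RmInitAdapter failure."""
    return [(m.group("pci"), m.group("code")) for m in NVRM_FAIL.finditer(dmesg_text)]


# Two lines copied from the log above, used here as sample input.
sample = (
    "[ 4.051841] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1463)\n"
    "[ 5.191284] NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x26:0x56:1463)\n"
)

for pci, code in find_nvrm_failures(sample):
    print(f"{pci} failed to initialize: {code}")
```

To run it on a real machine, the `sample` string could be replaced with the text of a saved dmesg dump. It only extracts the PCI address and the error tuple; it does not interpret what the code means.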
Please help.
Thanks very much.
nvidia-bug-report.log.gz (249.4 KB)