nvidia-smi shows "No devices were found" after installing NVIDIA-Linux-x86_64-535.86.05.run on Ubuntu 20.04 with an RTX 3060

nvidia-bug-report.log.gz (386.4 KB)

There were no errors while installing NVIDIA-Linux-x86_64-535.86.05.run. The nvidia-bug-report.log is attached. Thanks in advance.

→ An alternate method of installing the NVIDIA driver was detected. (This is usually a package provided by your distributor.) A driver installed via that method may integrate better with your system than a driver installed by nvidia-installer.

You installed the .run file version over a distro package version.
This usually creates a mess.
Now clean everything up. Uninstall the 535.86.05 .run file driver first, then purge the distro packages:

nvidia-installer --uninstall
sudo apt purge '*nvidia*'

Then do a clean install of the nvidia driver.
It’s generally recommended to use the distro package:
sudo apt install nvidia-driver-XXX

Remove the nomodeset kernel parameter.
Reboot.
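
After the reboot, it can help to confirm that nomodeset is really gone. The following is a minimal sketch, assuming a Linux system that exposes the boot parameters in /proc/cmdline; the function names has_kernel_param and read_cmdline are made up for illustration and are not part of any NVIDIA tool.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str = "nomodeset") -> bool:
    """Return True if `param` appears as a whole token on the kernel command line."""
    # Split on whitespace so that e.g. "xnomodeset" or "nomodeset=1" does not match.
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the parameters the running kernel was booted with (Linux only)."""
    return Path(path).read_text()
```

On the affected machine, has_kernel_param(read_cmdline()) should return False once the parameter has been removed and the system rebooted.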

I’m also having the same error with my 3060 on Ubuntu 22.04.3 LTS :( Yesterday everything was working fine, but today I woke up to nvidia-smi showing the “No devices were found” message. I tried reverting to the 525 driver, to no avail. I really hope this gets fixed fairly quickly.

Attached is my nvidia-bug-report.log.gz
nvidia-bug-report.log.gz (138.6 KB)

I just wonder how you got the idea of having the same error?

From your log:

ago 08 09:03:10 centurysturgeon kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 535.86.05 Fri Jul 14 20:20:58 UTC 2023
ago 08 09:03:10 centurysturgeon kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
ago 08 09:03:11 centurysturgeon kernel: NVRM: GPU at PCI:0000:01:00: GPU-9fbb641f-6585-c828-db61-e5edc648780c
ago 08 09:03:11 centurysturgeon kernel: NVRM: Xid (PCI:0000:01:00): 62, pid=‘’, name=, 0000(0000) 00000000 00000000
ago 08 09:03:59 centurysturgeon kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0x65:1470)
ago 08 09:03:59 centurysturgeon kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

According to this:
https://docs.nvidia.com/deploy/xid-errors/index.html

Xid 62 = Hardware error, Driver error, or Thermal issue.
As reverting to a working driver didn’t help, I’d suspect something hardware related.
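
To check a log for these messages without reading it line by line, here is a minimal sketch. It assumes the kernel messages have been saved as text (for example from the extracted nvidia-bug-report.log) and that Xid lines follow the "NVRM: Xid (PCI:...): <code>" shape seen in the excerpt above; the names find_xids and init_failed are hypothetical helpers, not part of any NVIDIA tooling.

```python
import re

# Matches lines such as "NVRM: Xid (PCI:0000:01:00): 62, pid=..."
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+)")


def find_xids(log_text: str) -> list[tuple[str, int]]:
    """Return (pci_address, xid_code) pairs found in kernel log text."""
    return [(m.group(1), int(m.group(2))) for m in XID_RE.finditer(log_text)]


def init_failed(log_text: str) -> bool:
    """Return True if the driver reported that adapter initialization failed."""
    return "RmInitAdapter failed" in log_text
```

The Xid codes it returns can then be looked up in the Xid table linked above.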

Hello Mart, thank you so much for your reply. This is the very first graphics card I’ve owned, so I apologize if I come across as ignorant in certain regards.

So yesterday my computer was running fine, but today when I ran the nvidia-smi command it said it didn’t find any device. So I ran “sudo lshw -C display” to see if the system detected the card (which it did, model and everything), so I assumed it was a driver issue.

Next, I ran “dkms status”, which showed driver version 535 instead of the version nvidia-smi had reported before (525 or something around that number). I figured the driver might be the issue, since that had happened to me in the past, so I purged the packages with “sudo apt purge nvidia* libnvidia*” and reinstalled the 525 version with “sudo apt-get install nvidia-driver-525”, which also didn’t work.

At this point, I suspected the hardware might be the issue, and following your reply, I borrowed my friend’s PC to test the GPU. I plugged it in and tested it by running Cyberpunk on high settings and GeForce Experience (both showed the card working correctly).

So at this point, I’m not sure how to proceed. But I’m open to any advice on how to fix it :)

From the hardware side I’d do:

Make sure the PCI slot is dust free.
Make sure the card is well seated and the cables well connected.
Make sure the RAM is ok, by running memtest.

Maybe look for a BIOS update.
Did the kernel also get upgraded when the card stopped working? I’d try an older or a newer kernel.

Searching this forum for the

ago 08 09:03:59 centurysturgeon kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0x65:1470)
ago 08 09:03:59 centurysturgeon kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
ago 08 09:03:59 centurysturgeon kernel: [drm:nv_drm_load [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice

error, I found this:

Looks very similar.

Hi Mart, I was just about to go through everything you said. But when I ran the nvidia-smi command, suddenly everything seemed to be just fine!

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        Off | 00000000:01:00.0 Off |                  N/A |
|  0%   32C    P8               9W / 170W |     20MiB / 12288MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1112      G   /usr/lib/xorg/Xorg                            9MiB |
|    0   N/A  N/A      1368      G   /usr/bin/gnome-shell                          3MiB |
+---------------------------------------------------------------------------------------+

I think some cable must’ve been loose or something along the way. Thank you so much for all your help!

Hi!
I have a similar problem. I just installed Ubuntu 22.04 on a VM that has a GeForce RTX 3090 attached as GPU.

I installed the NVIDIA driver with:

sudo apt install nvidia-driver-535-server

and then rebooted. Here is the lshw -c video output:

*-display
description: VGA compatible controller
product: SVGA II Adapter
vendor: VMware
physical id: f
bus info: pci@0000:00:0f.0
logical name: /dev/fb0
version: 00
width: 32 bits
clock: 33MHz
capabilities: vga_controller bus_master cap_list rom fb
configuration: depth=32 driver=vmwgfx latency=64 resolution=1176,885
resources: irq:16 ioport:1070(size=16) memory:e8000000-efffffff memory:fe000000-fe7fffff memory:c0000-dffff
*-display
description: VGA compatible controller
product: GA102 [GeForce RTX 3090]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:03:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list
configuration: driver=nvidia latency=248
resources: irq:18 memory:fc000000-fcffffff memory:d0000000-dfffffff memory:e4000000-e5ffffff ioport:4000(size=128)

Here is the dpkg -l | grep nvidia output as well:

ii libnvidia-cfg1-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA binary OpenGL/GLX configuration library
ii libnvidia-common-535-server 535.161.08-0ubuntu2.22.04.1 all Shared files used by the NVIDIA libraries
ii libnvidia-compute-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA libcompute package
ii libnvidia-decode-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 NVENC Video Encoding runtime library
ii libnvidia-extra-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 Extra libraries for the NVIDIA Server Driver
ii libnvidia-fbc1-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-gl-535-server:amd64 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii nvidia-compute-utils-535-server 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA compute utilities
ii nvidia-dkms-535-server 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA DKMS package
ii nvidia-driver-535-server 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA Server Driver metapackage
ii nvidia-firmware-535-server-535.161.08 535.161.08-0ubuntu2.22.04.1 amd64 Firmware files used by the kernel module
ii nvidia-kernel-common-535-server 535.161.08-0ubuntu2.22.04.1 amd64 Shared files used with the kernel module
ii nvidia-kernel-source-535-server 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA kernel source package
ii nvidia-utils-535-server 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA Server Driver support binaries
ii xserver-xorg-video-nvidia-535-server 535.161.08-0ubuntu2.22.04.1 amd64 NVIDIA binary Xorg driver

But nvidia-smi prints “No devices were found”.
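
One thing a package list like the one above can rule out is a mix of driver versions left over from an earlier install. The following is a minimal sketch, assuming the text comes from dpkg -l | grep nvidia with the usual "status name version arch description" column order; installed_versions and versions_consistent are hypothetical helper names.

```python
def installed_versions(dpkg_output: str) -> dict[str, str]:
    """Map each installed ("ii") package name to its version string."""
    packages = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Columns: status, package name, version, architecture, description...
        if len(fields) >= 3 and fields[0] == "ii":
            packages[fields[1]] = fields[2]
    return packages


def versions_consistent(packages: dict[str, str]) -> bool:
    """Return True if all listed packages share a single version string."""
    return len(set(packages.values())) <= 1
```

A consistent list only rules out one cause; it does not by itself explain why nvidia-smi finds no device.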

Here is the nvidia-bug-report.sh output.
nvidia-bug-report.log.gz (155.4 KB)

Thanks in advance.