RTX 2080ti -- No devices found when running nvidia-smi

Hello,

I recently reached out to support, and they directed me to this forum for Linux support, as I am running Ubuntu.

As the topic suggests I am no longer able to detect my RTX 2080ti through nvidia-smi and I’m hoping to get assistance troubleshooting. It had been a little while since I last used it but I was not having any issue before. I was previously running headless drivers as I want to use the GPU for my deep learning workloads only, but through my troubleshooting I have ditched headless drivers for now.

I’ve attempted the following:

  • Installing various driver versions
  • Removing and reinserting the GPU to make sure it is seated properly
  • Reinstalling the drivers
  • A fresh install of Ubuntu with freshly installed drivers

Throughout these troubleshooting attempts I’ve had varying degrees of success, from not being able to boot properly to being able to boot but still getting the “No devices found” error when running nvidia-smi. Even with a fresh install of Ubuntu, a fresh driver install, and a reseated GPU, I still get “No devices found” when I run nvidia-smi.

Currently I am running:

  • Ubuntu 18.04.4 LTS
  • Kernel version: 4.15.0-96-generic
  • NVIDIA driver metapackage from nvidia-driver-435 (proprietary) installed through Ubuntu Software & Updates

Here is the output of lspci | grep -i nvidia:

01:00.0 VGA compatible controller: NVIDIA Corporation GV102 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)
01:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

nvidia-bug-report.log (1.9 MB)

I’ve attached the bug report generated by running nvidia-bug-report.sh with the system in its current state. One thing of note that I haven’t been able to find a solution to is that the log is reporting:

NVRM: GPU 0000:01:00.0: RmInitAdapter failed!

Please let me know if there is any more information I can provide. I appreciate any help or suggestions!

Most likely broken. Since you already reseated the card in its slot, I can only advise checking whether it works in another system and pursuing an RMA if possible.

Thanks for confirming my suspicions. Unfortunately I do not have another system to test in, so I will pursue an RMA.

Hi, I’m having the same issue with my RTX 2080Ti. I’ve tried rebooting the server and reseating the card, but neither worked.
Here is my bug report:
nvidia-bug-report.log.gz (1.7 MB)
System info:

OS: Ubuntu 18.04
Driver Version: 450.102.04
GPUs: 1 x RTX2080Ti

nvidia-smi gives:

No devices were found

lspci | grep -i nvidia gives:

01:00.0 VGA compatible controller: NVIDIA Corporation GV102 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)
01:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

dmesg | grep NVRM gives:

[ 2619.071087] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0x65:1224)
[ 2619.071108] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
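
For anyone comparing logs across machines, the RmInitAdapter lines can be pulled out of a longer dmesg dump programmatically. The following is only a minimal sketch: the function name find_init_failures and the regular expression are illustrative assumptions based on the two log lines above, not part of any NVIDIA tool, and the fields in parentheses are extracted as-is without interpreting them.

```python
import re

# Matches the failure line in the format seen in the dmesg output above.
# The parenthesised code is optional, since some logs print the message without it.
NVRM_FAIL = re.compile(
    r"NVRM: GPU (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]): "
    r"RmInitAdapter failed!(?: \((?P<code>[^)]*)\))?"
)


def find_init_failures(dmesg_text):
    """Return (pci_address, code_or_None) for every RmInitAdapter failure found.

    Hypothetical helper: the code string is returned verbatim, not decoded.
    """
    return [(m.group("pci"), m.group("code")) for m in NVRM_FAIL.finditer(dmesg_text)]


# Sample input copied from the dmesg lines quoted in this post.
sample = (
    "[ 2619.071087] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0x65:1224)\n"
    "[ 2619.071108] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0\n"
)
print(find_init_failures(sample))
```

Only the first sample line matches, because the second one reports rm_init_adapter rather than RmInitAdapter; the pattern can be loosened if both should be counted.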

sudo lshw -C display gives:

*-display
description: VGA compatible controller
product: GV102
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
configuration: driver=nvidia latency=0
resources: irq:16 memory:74000000-74ffffff memory:60000000-6fffffff memory:72000000-73ffffff ioport:3000(size=128) memory:c0000-dffff

lspci -v -s $(lspci | grep ' VGA ' | cut -d" " -f 1) gives:

01:00.0 VGA compatible controller: NVIDIA Corporation GV102 (rev a1) (prog-if 00 [VGA controller])
Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3770
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at 74000000 (32-bit, non-prefetchable) [size=16M]
Memory at 60000000 (64-bit, prefetchable) [size=256M]
Memory at 72000000 (64-bit, prefetchable) [size=32M]
I/O ports at 3000 [size=128]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities:
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

Can anyone help me with this? Thanks in advance!