I have an Ubuntu Server 20.04.3 LTS (kernel 5.4.0) VM with a 3070 passed through via ESXi. It had been running stably for months with the v470 drivers until a few days ago, when the driver stopped recognizing the card (nvidia-smi produces “No devices were found”).
What I have tried:
- Spun up a Windows 10 VM and passed through the GPU, installed the driver and used it without issues, so it does not appear to be hardware.
- Tested another GPU in the system with the same results.
- Spun up 2 more Ubuntu Server VMs (one BIOS install and one EFI install) with the same results (No devices were found).
- Purged and reinstalled the driver + CUDA many times, trying 450, 470, 470-server, and now 510, with the same results each time.
- Updated the motherboard BIOS to the latest v2.3 (Supermicro H12SSL-CT)
- Reinstalled ESXi 7.0 on the host.
Would appreciate any help to solve this! I have attached my bug report as well.
I have seen some mentions that this dmesg output could mean the GPU has failed, but it works fine with Windows VMs and in other machines.
nvidia-smi
No devices were found
sudo lspci |grep -i nv
03:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
dmesg (after running nvidia-smi)
[ 1606.332778] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x26:0x56:1463)
[ 1606.332912] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[ 1607.004207] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x26:0x56:1463)
[ 1607.004349] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
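In case it helps anyone comparing logs, here is a rough sketch of how the repeated NVRM failures above could be condensed into one line per GPU and error code. The function name nvrm_failures is made up for illustration, and the pattern assumes the message format is exactly as in the lines above; other driver versions may word it differently.

```shell
# Hypothetical helper (not part of any NVIDIA tool): reads kernel log text on
# stdin and prints "<PCI address> <error code> count=<N>" for every distinct
# "RmInitAdapter failed!" message.
nvrm_failures() {
  # Keep only the PCI address and the parenthesised code from matching lines.
  sed -nE 's/.*NVRM: GPU ([0-9a-fA-F:.]+): RmInitAdapter failed! \(([^)]*)\).*/\1 \2/p' |
    sort | uniq -c |
    awk '{print $2, $3, "count=" $1}'
}

# Usage (reading the kernel log may require root):
#   sudo dmesg | nvrm_failures
```

Fed the four dmesg lines above, this should print a single line, 0000:03:00.0 0x26:0x56:1463 count=2, which makes it easier to see that the same code repeats on every attempt.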
cat /etc/modprobe.d/blacklist-nvidia-nouveau.conf
blacklist nouveau
options nouveau modeset=0
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
GCC version: gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Fri_Dec_17_18:16:03_PST_2021
Cuda compilation tools, release 11.6, V11.6.55
Build cuda_11.6.r11.6/compiler.30794723_0
nvidia-bug-report.log.gz (664.4 KB)