I have an Ubuntu 20.04 VM under ESXi with a Quadro P400 passed through, and for a couple of years it worked beautifully for Plex transcoding. It recently stopped working, I suspect because automatic updates are configured for this VM, and the output of nvidia-smi is now simply “No devices were found”. After hours of troubleshooting, I decided to spin up a Debian 12 VM and start fresh following this guide: https://phoenixnap.com/kb/nvidia-drivers-debian
Unfortunately, the fresh install doesn’t work either, and the result is the same. Here is what I see on the Debian 12 VM:
lspci -nn | egrep -i "3d|display|vga"
00:0f.0 VGA compatible controller [0300]: VMware SVGA II Adapter [15ad:0405]
13:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1)
nvidia-detect
Detected NVIDIA GPUs:
13:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1)
Checking card: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
Your card is supported by all driver versions.
Your card is also supported by the Tesla 470 drivers series.
It is recommended to install the
nvidia-driver
lsmod | grep nvidia
nvidia_uvm 1540096 0
nvidia_drm 77824 0
nvidia_modeset 1314816 1 nvidia_drm
video 65536 1 nvidia_modeset
nvidia 56795136 2 nvidia_uvm,nvidia_modeset
drm_kms_helper 212992 4 vmwgfx,nvidia_drm
drm 614400 8 vmwgfx,drm_kms_helper,nvidia,drm_ttm_helper,nvidia_drm,ttm
sudo dmesg | grep -i nvidia
[ 2.130300] nvidia: loading out-of-tree module taints kernel.
[ 2.130313] nvidia: module license 'NVIDIA' taints kernel.
[ 2.251731] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 2.447507] nvidia-nvlink: Nvlink Core is being initialized, major device number 245
[ 2.456455] nvidia 0000:13:00.0: enabling device (0000 -> 0003)
[ 2.457208] nvidia 0000:13:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 2.581646] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 535.216.01 Tue Sep 17 16:54:04 UTC 2024
[ 2.601294] audit: type=1400 audit(1743860140.695:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=490 comm="apparmor_parser"
[ 2.601298] audit: type=1400 audit(1743860140.695:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=490 comm="apparmor_parser"
[ 2.963338] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 535.216.01 Tue Sep 17 16:46:49 UTC 2024
[ 3.309517] [drm] [nvidia-drm] [GPU ID 0x00001300] Loading driver
[ 3.309520] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:13:00.0 on minor 1
[ 26.703549] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[ 26.737104] nvidia-uvm: Loaded the UVM driver, major device number 243.
nvidia-smi
No devices were found
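One gap in the output above: the dmesg filter matches only lines that contain "nvidia", while the driver logs many of its messages under the NVRM tag alone, so any such line would have been dropped. A minimal sketch of a wider filter follows; the function name filter_gpu_log is just a hypothetical helper for this post, not part of any NVIDIA tool.

```shell
# Hypothetical helper (the name is made up for this post): keep kernel log
# lines that mention either the module name ("nvidia") or the driver's
# NVRM tag, case-insensitively. Reads from stdin, writes matches to stdout.
filter_gpu_log() {
    grep -iE 'nvidia|NVRM'
}

# Intended use on the VM (root is needed to read the kernel log):
#   sudo dmesg | filter_gpu_log
```

If this prints NVRM lines that the original grep missed, those are probably the most relevant part of the log to post here.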
Could this be hardware-related even though the host sees the GPU?
Logs:
nvidia-bug-report.log (870.9 KB)

