My hardware is an HP Z2 G4 with an i5-8500, 64 GB of RAM, and a PNY GeForce RTX 4060 Ti 16 GB. The host machine is running ESXi 8.0.3, updated from 8.0.1 as part of an attempt to fix this problem. The guest VM is running Ubuntu 22.04 LTS.
I configured GPU passthrough using the following guide…
I built the driver from this specific version, using the open kernel modules: ./NVIDIA-Linux-x86_64-535.183.01.run -m=kernel-open
nvidia-smi reports: No devices were found
The device is visible as follows:
lspci -nnk | grep -A 3 -i vga
00:0f.0 VGA compatible controller [0300]: VMware SVGA II Adapter [15ad:0405]
Subsystem: VMware SVGA II Adapter [15ad:0405]
Kernel driver in use: vmwgfx
Kernel modules: vmwgfx
02:00.0 Serial Attached SCSI controller [0107]: VMware PVSCSI SCSI Controller [15ad:07c0] (rev 02)
02:05.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2803] (rev a1)
DeviceName: pciPassthru1
Subsystem: PNY Device [196e:13ec]
Kernel driver in use: nvidia
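Because grep -A 3 stops three lines after each match, the output above ends before any "Kernel modules" line for the NVIDIA device. As a minimal sketch, the full block for that device could be pulled out with a small filter like the one below. The function name nvidia_pci_blocks is hypothetical, and the filter assumes the usual lspci -nnk layout in which each device starts on an unindented line and its details are indented; 10de is the vendor ID shown for the GPU above.

```shell
# Print the complete lspci block for every device whose vendor ID is 10de,
# reading `lspci -nnk` output on stdin.
# Assumption: device lines are unindented and detail lines are indented.
# nvidia_pci_blocks is a hypothetical helper name.
nvidia_pci_blocks() {
    awk '
        /^[^ \t]/ { show = ($0 ~ /\[10de:/) }   # unindented line starts a new device
        show                                     # print lines while inside a match
    '
}
```

It would be used as: sudo lspci -nnk | nvidia_pci_blocks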
IOMMU is on:
sudo dmesg | grep -i dmar
[sudo] password for gabor:
[ 0.138913] DMAR: IOMMU enabled
But I don’t see any IOMMU groups, which seems like a major problem.
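For anyone wanting to reproduce the check, here is a minimal sketch of how IOMMU groups can be listed. It assumes the standard sysfs layout /sys/kernel/iommu_groups/<group>/devices/<pci-address>; the function name list_iommu_groups and its optional root argument are hypothetical and only there for illustration.

```shell
# List IOMMU groups and the PCI devices in each one.
# Assumption: groups live under /sys/kernel/iommu_groups/<n>/devices/<addr>.
# The optional first argument overrides that root (useful for testing).
# list_iommu_groups is a hypothetical helper name.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    found=0
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # the glob matched nothing
        group="${dev%/devices/*}"          # strip "/devices/<addr>"
        echo "group ${group##*/}: ${dev##*/}"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no IOMMU groups found under $root"
}
```

Calling list_iommu_groups with no argument inspects the running system. An empty result inside a guest may simply mean the hypervisor is not exposing a virtual IOMMU to the VM; that is an assumption about this setup, not something the output above confirms.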
The other seemingly important diagnostic information I found is in the /var/log/hostd.log file on the host machine. The following messages appear when I run nvidia-smi:
NvidiaVgpuInfo: Failed to open nvidia library
NvidiaDeviceGroupInfo: vgpuInfo not available.
I hope someone can help me sort this out. The bug report file (nvidia-bug-report.log.gz) should be attached.