We have a Dell R730 running VMware ESXi 6.5.0 (build 13932383) with a Tesla M10 GPU installed.
Yesterday a colleague installed the NVIDIA ESX VIB on it and configured GPU passthrough.
This morning the host crashed with a purple screen of death (PSOD) showing the following:
IOMMU Fault detected for (vmgfx1/nvidia)
NOTE: Backtrace likely does not yield the culprit.
I opened a case with VMware support, who responded with the following:
We had a vmkernel panic with IOMMU fault.
The IOMMU fault happened because the PCI device (0000:06:00.0 which is the nvidia graphics pcie device) trying to access the memory address (IOaddr: 0x6055b0d000) via DMA operation which nvidia device (vmgfx1/nvidia) is NOT intended to access the memory and IOMMU unit faulted the illegal memory access and panic the system.
The illegal DMA memory access may be caused by buggy nvidia driver or nvidia firmware running inside the card.
Kindly check with you NVIDIA if there is further update on the driver and firmware that can be performed.
We don’t have an active NVIDIA support contract, so I’m hoping someone here has experienced something similar and found a solution.