I am trying to set up a vGPU cluster using Tesla V100 32GB GPUs on several HP ProLiant DL380 Gen10 servers running ESXi 6.7u3. So far I have been able to set up the vSAN and install the NVIDIA .vib version 11.2, and can successfully run nvidia-smi on each host.
I am now attempting to configure a VM. I have tried adding a single V100D-16C vGPU to a VM and installing the 450.89 GRID driver with and without DKMS, but I cannot seem to load the kernel module. dmesg tells me that the "PCI I/O region assigned to your device is invalid: NVRM …", followed by the supposed address of the GPU. I get the same error on Debian Buster and on Ubuntu Server bionic and focal.
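In case it helps anyone compare output across the three distros, here is a minimal sketch of how the driver messages can be filtered out of the kernel log. The helper names `nvrm_lines` and `read_dmesg` are made up for illustration, and it only assumes that the driver tags its messages with "NVRM", as in the error quoted above.

```python
import subprocess
from typing import List


def nvrm_lines(log_text: str) -> List[str]:
    """Return only the kernel log lines that mention NVRM (hypothetical helper)."""
    return [line for line in log_text.splitlines() if "NVRM" in line]


def read_dmesg() -> str:
    """Capture the kernel ring buffer as text; this may require root."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Inside the guest, `print("\n".join(nvrm_lines(read_dmesg())))` should then print just the NVRM lines, which are easier to paste here than the full log.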
I have checked the hidden settings in the HP BIOS, and "PCI Express 64-bit BAR Support" is enabled. ECC should be working fine on a "C" style vGPU.
Is there anything I am missing?