P40 and gpumodeswitch

The documentation seems to indicate that the P40 does not require a gpumodeswitch. However, after installing the NVIDIA GRID VIB, I see the following (dmesg):

2017-11-07T00:15:17.689Z cpu15:69668)NVRM: loading NVIDIA UNIX x86_64 Kernel Module 384.73 Mon Aug 21 15:16:25 PDT 2017
2017-11-07T00:15:17.689Z cpu15:69668)
2017-11-07T00:15:17.689Z cpu15:69668)Device: 191: Registered driver 'nvidia' from 91
2017-11-07T00:15:17.690Z cpu15:69668)Mod: 4968: Initialization of nvidia succeeded with module ID 91.
2017-11-07T00:15:17.690Z cpu15:69668)nvidia loaded successfully.
2017-11-07T00:15:17.691Z cpu13:66219)IOMMU: 2176: Device 0000:3b:00.0 placed in new domain 0x4304cc3e8af0.
2017-11-07T00:15:17.691Z cpu13:66219)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:3b:00.0 in IOMMU domain 0x4304cc3e8af0.
2017-11-07T00:15:17.691Z cpu13:66219)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2017-11-07T00:15:17.691Z cpu13:66219)NVRM: This is a 64-bit BAR mapped above 16 TB by the system
NVRM: BIOS or the VMware ESXi kernel. This PCI I/O region assigned
NVRM: to your NVIDIA device is not supported by the kernel.
NVRM: BAR1 is 32768M @ 0x3820$

This is with vSphere 6.5 Enterprise Plus. I am unable to install the gpumodeswitch VIB to even try it out…
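For reference, VIB installation on an ESXi host is done with esxcli. A minimal sketch of the usual sequence follows; the datastore path and VIB filename are placeholders, so substitute the actual file you downloaded:

```shell
# Put the host in maintenance mode before installing the driver VIB
esxcli system maintenanceMode set --enable true

# Install the NVIDIA VIB (path and filename below are placeholders)
esxcli software vib install -v /vmfs/volumes/datastore1/NVIDIA-VMware_ESXi_6.5_Host_Driver.vib

# Confirm the VIB is registered, then reboot the host
esxcli software vib list | grep -i nvidia
```

After the reboot, take the host out of maintenance mode with `esxcli system maintenanceMode set --enable false`.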

Am I missing a step on the install?

Check your BIOS settings. You need to modify the memory-mapped I/O settings so the system can map the card's large BAR (the log above shows a 32 GB BAR1)…

Which hardware do you use? Check with the OEM.

Thank you for pointing me in the right direction. On a Dell R740xd server, I had to do the following in the BIOS -> Integrated Devices section:

SR-IOV Global Enable -> Enabled (Default was Disabled)
Memory Mapped I/O Base -> 512 GB (Default was 56 TB)

I appear to be on my way: nvidia-smi is returning values now.
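For anyone else verifying the same fix, a quick check on the ESXi host (assuming the driver VIB is already installed) might look like this:

```shell
# Confirm the nvidia kernel module is loaded
esxcli system module list | grep nvidia

# Query the GPU; this should now return card details instead of an error
nvidia-smi

# Re-check the vmkernel log for NVRM/BAR messages after the BIOS change
grep NVRM /var/log/vmkernel.log | tail
```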

Perfect. Thanks for sharing. I'm sure other customers will run into the same issue with this new hardware. I will try to get in contact with Dell to make this the default configuration for GPU-enabled systems.



Thank you for this; I had exactly the same problem with a Dell R740 and an NVIDIA Tesla P40.

Thank you for sharing. I had the same problem and solved it with the BIOS changes above.
After changing:
SR-IOV Global Enable to Enabled
Memory Mapped I/O Base to 512 GB
The server recognizes the GPU card and the nvidia-smi command works fine.
But… when trying to power on a VM with a shared PCI device, I get the following error:
could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vgpu 'grid_p40-2q'

Has anyone of you encountered this issue?

You need to disable ECC on the P40 first.
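For anyone hitting the same vGPU plugin error, ECC can be checked and disabled with nvidia-smi on the host; a reboot is required before the new mode takes effect:

```shell
# Check the current ECC mode for each GPU
nvidia-smi --query-gpu=index,name,ecc.mode.current --format=csv

# Disable ECC on all GPUs (add -i <id> to target a single GPU)
nvidia-smi -e 0

# Reboot the host so the ECC mode change takes effect
```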

Yeah… you're right, got it now. Trying…
Thank you