430.30 GRID vGPU driver fails to load on Linux (Tesla T4 x2)

Guest:
$ lspci | grep -i nvidia
02:02.0 VGA compatible controller: NVIDIA Corporation Device 1eb8 (rev a1)

$ tail /var/log/nvidia-installer.log
ERROR: Unable to load the 'nvidia-drm' kernel module.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

$ dmesg | tail
[ 585.162303] nvidia-nvlink: Nvlink Core is being initialized, major device number 241
[ 585.163241] vgaarb: device changed decodes: PCI:0000:02:02.0,olddecodes=none,decodes=none:owns=none
[ 585.163387] NVRM: The NVIDIA GPU 0000:02:02.0 (PCI ID: 10de:1eb8)
NVRM: installed in this system is not supported by the
NVRM: NVIDIA 430.30 driver release.
NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
NVRM: in this release's README, available on the operating system
NVRM: specific graphics driver download page at www.nvidia.com.
[ 585.163715] nvidia: probe of 0000:02:02.0 failed with error -1
[ 585.163733] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 585.163734] NVRM: None of the NVIDIA devices were initialized.
[ 585.163946] nvidia-nvlink: Unregistered the Nvlink Core, major device number 241

$ uname -a
Linux <REMOVED> 3.10.0-957.12.1.el7.x86_64 #1 SMP Tue Apr 23 12:06:18 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: …/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-36.0.1) (GCC)

Host:
uname -a
VMkernel <REMOVED> 6.7.0 #1 SMP Release build-11675023 Jan 7 2019 19:29:34 x86_64 x86_64 x86_64 ESXi

esxcli software vib list | grep -i nvidia
NVIDIA-VMware_ESXi_6.7_Host_Driver 430.27-1OEM.670.0.0.8169922 NVIDIA VMwareAccepted 2019-07-22

nvidia-smi

Thu Jul 25 17:35:33 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.27       Driver Version: 430.27       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:05:00.0 Off |                    0 |
| N/A   46C    P8    18W /  70W |   7648MiB / 15359MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            On   | 00000000:07:00.0 Off |                    0 |
| N/A   49C    P8    17W /  70W |     75MiB / 15359MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0   2111564    C+G   <REMOVED>                                   7566MiB |
+-----------------------------------------------------------------------------+

nvidia-smi vgpu
Thu Jul 25 17:36:07 2019

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.27                 Driver Version: 430.27                    |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  Tesla T4                   | 00000000:05:00.0             |   0%       |
|      3251642093  GRID T4-8A     | 2111565  <REMOVED>           |      0%    |
+---------------------------------+------------------------------+------------+
|   1  Tesla T4                   | 00000000:07:00.0             |   0%       |
+---------------------------------+------------------------------+------------+

PowerCLI:
( get-vmhost ).ExtensionData.Config | select GraphicsInfo, SharedPassthruGpuTypes

GraphicsInfo                     SharedPassthruGpuTypes
------------                     ----------------------
{NVIDIATesla T4, NVIDIATesla T4} {grid_t4-8q, grid_t4-8c, grid_t4-8a, grid_t4-4q…}

( get-vmhost ).ExtensionData.Config.GraphicsConfig

HostDefaultGraphicsType SharedPassthruAssignmentPolicy DeviceType
----------------------- ------------------------------ ----------
sharedDirect            performance                    {0000:05:00.0, 0000:07:00.0}

( ( $myvm | get-view ).Config.Hardware.Device | where-object Key -eq 13000 ).Backing

Vgpu
----
grid_t4-8a

Apparently the Linux guest driver does not like the ‘a’ profile (grid_t4-8a). I switched the VM to a ‘q’ profile and the driver now loads, although I have not finished testing it fully.
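
In case it helps anyone else, here is a rough shell sketch for flagging this before building a Linux guest. It only encodes what I saw in this thread: the ‘a’ profile failed and the ‘q’ profile loaded. The function names vgpu_series and check_linux_profile are made up for illustration, and the rule that ‘a’ is the only problematic series is an assumption, not something taken from NVIDIA documentation.

```shell
#!/bin/sh
# Hypothetical helper: print the series letter of a vGPU profile name,
# e.g. grid_t4-8a -> a, grid_t4-8q -> q.
vgpu_series() {
    profile=$1
    # Keep only the part after the last '-' (grid_t4-8a -> 8a).
    suffix=${profile##*-}
    # Drop the leading frame-buffer size digits and lower-case the letter.
    printf '%s\n' "$suffix" | sed 's/^[0-9]*//' | tr 'A-Z' 'a-z'
}

# Hypothetical check, based only on the observation in this post:
# the 'a' profile did not load on a Linux guest, the 'q' profile did.
check_linux_profile() {
    case "$(vgpu_series "$1")" in
        a)
            echo "warn: $1 is an 'a' profile; the Linux guest driver failed to load with it here"
            return 1
            ;;
        *)
            echo "ok: $1"
            return 0
            ;;
    esac
}

# Example: the profile from the PowerCLI output above, then the replacement.
check_linux_profile grid_t4-8a || echo "pick a different profile for this VM"
check_linux_profile grid_t4-8q
```

The profile string can be taken from the Vgpu backing value shown in the PowerCLI output above.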