M60 on ESXi: No Profiles

I’ve installed the VIB that shows up in the licensing portal, but the profiles aren’t showing up in the VM settings. We’re running ESXi 6.5.0, build 7967591 (and View 7.2). The VIB appears to be installed properly. When I run:

esxcli software vib list | grep -i nvidia

I get:

NVIDIA-VMware_ESXi_6.5_Host_Driver 410.68-1OEM.650.0.0.4598673 NVIDIA VMwareAccepted 2018-12-04

However, when I try to run:

nvidia-smi

I get:

Failed to initialize NVML: Unknown Error

I’ve verified the card is in graphics mode (as opposed to compute).

I’ve seen a few comments suggesting that I may have the wrong VIB, but it is the only one offered in the licensing portal. Does anybody have any ideas?

Hi,

You should run dmesg on your host to figure out what the issue is. I suspect a BIOS issue with MMIO. Which server are we talking about? A Dell R740?

Regards

Simon


Hi Simon, thanks for responding.

To answer your question, the server is a Cisco C240-M4SX.

After I posted this message I ran across:

And as an experiment I enabled DirectPath, and at that point the profiles started showing up in the VM settings. After disabling DirectPath again (and after rebooting) I get the following:

dmesg | grep -i nvidia

VMB: 323:    name: /NVIDIA_V.v00
2018-12-05T18:28:48.612Z cpu0:65536)VisorFSTar: 1982: NVIDIA_V.v00 for 0x482d082 bytes
2018-12-05T18:29:03.431Z cpu13:66178)Loading module nvidia ...
2018-12-05T18:29:03.450Z cpu13:66178)Elf: 2043: module nvidia has license NVIDIA
2018-12-05T18:29:03.862Z cpu13:66178)NVRM: loading NVIDIA UNIX x86_64 Kernel Module  410.68  Sat Oct 13 22:59:52 CDT 2018
2018-12-05T18:29:03.862Z cpu13:66178)Device: 191: Registered driver 'nvidia' from 20
2018-12-05T18:29:03.863Z cpu13:66178)Mod: 4968: Initialization of nvidia succeeded with module ID 20.
2018-12-05T18:29:03.863Z cpu13:66178)nvidia loaded successfully.
2018-12-05T18:29:46.532Z cpu30:67021)Starting service nvidia-init
2018-12-05T18:29:46.532Z cpu30:67021)Activating Jumpstart plugin nvidia-init.
2018-12-05T18:29:46.555Z cpu3:68278)ALERT: NVIDIA: module load failed during VIB install/upgrade.
2018-12-05T18:29:46.564Z cpu0:68279)NVIDIA: Starting vGPU Services.
2018-12-05T18:29:46.578Z cpu1:68282)NVIDIA: Starting Xorg service.
2018-12-05T18:29:48.051Z cpu20:68427)NVIDIA: Starting the DCGM node engine.
2018-12-05T18:29:54.596Z cpu26:67021)Jumpstart plugin nvidia-init activated.

lspci | grep NVIDIA

0000:8f:00.0 Display controller: NVIDIA Corporation NVIDIATesla M60 [vmgfx0]
0000:90:00.0 Display controller: NVIDIA Corporation NVIDIATesla M60 [vmgfx1]

Xorg tries to start, and then stops.
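For anyone comparing against their own host: in the log above, the line that matters is the ALERT from nvidia-init, which is easy to miss among the "loaded successfully" messages. A minimal sketch for pulling out just those lines, assuming a POSIX shell with grep (the function name nvidia_alerts is made up for illustration, not part of any NVIDIA or VMware tooling):

```shell
# Hypothetical helper: read kernel log text on stdin and print only
# the lines that mention NVIDIA and are flagged as ALERT.
nvidia_alerts() {
    grep -i 'nvidia' | grep 'ALERT'
}

# Intended use on the host (not run here):
#   dmesg | nvidia_alerts
```

If this prints nothing, the module probably loaded cleanly and the problem is more likely elsewhere (for example the BIOS/MMIO settings mentioned earlier).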

If anybody runs into this on ESXi 6.5, it turns out the problem is that the VIB is a little too big for ESXi. Apply this patch:

As I understand it, this should not be a problem on 6.7.

Hello sschaber,
I have the same problem. I ran dmesg on my host (ESXi 7.0 on a Dell R740 server) and I get:
ALERT: NVIDIA: module load failed during VIB install/upgrade.
I want to virtualize a Quadro RTX 6000 GPU.
In my vCenter environment the graphics card is available, but it shows no memory.
Can you help me?