Unable to run CUDA Sample Code

I’m not sure if this is the right place for this question, but I was directed to the vGPU forums from the Dev forums.

I have a P40 (GRID 5.2, Windows driver 386.09) in a Dell R740xd, running VMware ESXi 6.5 and Horizon 7.1.

I am able to compile the sample CUDA code fine in C++ (Windows 10, Visual Studio 2015), but at runtime I get cudaErrorInsufficientDriver (35).
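As far as I understand, error 35 generally means the installed driver supports an older CUDA version than the runtime the program was built against. A minimal sketch of a check that could confirm or rule that out is shown here. It only uses `cudaDriverGetVersion`, `cudaRuntimeGetVersion`, and `cudaGetDeviceCount` from the CUDA runtime API; the `printVersion` helper is a hypothetical name of my own, and I am assuming the version queries still succeed when the driver is too old.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: CUDA encodes versions as 1000 * major + 10 * minor,
// so 9000 is printed as "9.0".
static void printVersion(const char* label, int version) {
    std::printf("%s: %d.%d\n", label, version / 1000, (version % 1000) / 10);
}

int main() {
    int driverVersion = 0;   // highest CUDA version the installed driver supports
    int runtimeVersion = 0;  // CUDA runtime version this binary was built against

    cudaError_t err = cudaDriverGetVersion(&driverVersion);
    if (err != cudaSuccess) {
        std::printf("cudaDriverGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    err = cudaRuntimeGetVersion(&runtimeVersion);
    if (err != cudaSuccess) {
        std::printf("cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printVersion("Driver supports CUDA", driverVersion);
    printVersion("Runtime built for CUDA", runtimeVersion);

    if (driverVersion == 0) {
        // The runtime reports 0 when it cannot find an installed driver.
        std::printf("No CUDA driver was detected.\n");
    } else if (driverVersion < runtimeVersion) {
        std::printf("Driver is older than the runtime; this would explain error 35.\n");
    }

    // Same kind of call the samples make early on; reports the raw error code.
    int deviceCount = 0;
    err = cudaGetDeviceCount(&deviceCount);
    std::printf("cudaGetDeviceCount: %d device(s), error %d (%s)\n",
                deviceCount, static_cast<int>(err), cudaGetErrorString(err));
    return 0;
}
```

This can be built with `nvcc` from the same toolkit used for the samples (for example `nvcc version_check.cu -o version_check`, where the file name is arbitrary). If the driver version printed is lower than the runtime version, building against an older toolkit or moving to a newer GRID driver would be the things to try.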

Does anyone know if there is a setting that I’m missing to make this work?

The system seems to recognize the GPU, and with TensorFlow I am able to use the GPUs for processing.