Virtual GPU using GRID and Accelerated Computing?


We are currently running a VMware server with an NVIDIA Tesla M6.

I have successfully configured a single virtual machine (Ubuntu Server 16.04) on that server by passing the NVIDIA card through to that one VM. CUDA and cuDNN are both working.

Unfortunately, this means the GPU card can only be used by a single VM, which kinda reduces its usefulness for us.

I am aware of the NVIDIA GRID software, which can ‘cut up’ the card and let the GPU be used by multiple VMs, but it seems to be aimed at using the card as a graphics unit for faster/better graphics in VDI environments.

Does anybody know if GRID and vGPUs can be used in Accelerated Computing setups, so that multiple VMs could use the card at the same time, but independently of each other?


Currently, GRID supports CUDA/compute only when the vGPU profile in use is one of the “largest” profiles, which effectively assign an entire GPU (i.e. the entire M6 in your case) to a single VM. Conceptually, that is no different from PCI passthrough. There are probably some manageability benefits, but it still requires a whole GPU per VM.

This may change in the future, of course.
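
To illustrate the “largest profile” rule above, here is a minimal sketch that checks whether a vGPU profile name covers the whole card. It assumes profile names follow the pattern board, dash, framebuffer size in GB, type letter (for example `M6-8Q`), and that the M6 has 8 GB of memory; both are assumptions to verify against your GRID documentation. The function names are made up for this example and are not part of any NVIDIA tool.

```python
import re


def profile_framebuffer_gb(profile: str) -> int:
    """Return the framebuffer size (GB) encoded in a profile name.

    Assumes names look like 'M6-8Q': board, dash, size in GB, type letter.
    """
    match = re.fullmatch(r"[A-Za-z0-9]+-(\d+)[A-Za-z]", profile.strip())
    if match is None:
        raise ValueError(f"unrecognised vGPU profile name: {profile!r}")
    return int(match.group(1))


def is_full_gpu_profile(profile: str, gpu_memory_gb: int) -> bool:
    """True if the profile assigns the GPU's entire framebuffer to one VM."""
    return profile_framebuffer_gb(profile) == gpu_memory_gb


if __name__ == "__main__":
    M6_MEMORY_GB = 8  # assumption: Tesla M6 with 8 GB of memory
    for name in ("M6-2Q", "M6-4Q", "M6-8Q"):
        print(name, is_full_gpu_profile(name, M6_MEMORY_GB))
```

Under these assumptions only the 8 GB profile passes the check, which matches the point above: CUDA is available only when one VM gets the whole GPU.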