My opinion: NVIDIA and the commercial hypervisor vendors have escalated the cost of vGPU technology unbelievably, and NVIDIA support is unusable for me. That pushed me to accept the challenge of solving the problem with my own simplified (NVIDIA-unsupported) Xen-based virtualization stack. After a few days of experiments (to make it compatible with GTX/Quadro), vGPU can now run with any Xen (a few public XenServer patches plus some more for the vgpu ioreq_server; Citrix stopped distributing the vgpu sources as of XenServer 7.5), any compatible NVIDIA GTX or Quadro (a small 42-line "magic" script daemon and a one-line change in the driver install script, with no need to patch any NVIDIA files), any Linux kernel and any Linux distribution. I will not publish how: challenge yourself.
I used my notebook (i7-4710HQ/C220, VT-d aware, with Intel graphics as the display frontend and an NVIDIA GTX860M as the virtualization backend) to demonstrate a Xen server with vGPU virtualization (Dom0: Xen 4.10.1 with Fedora 28 and kernel 4.16; Windows DomU: Win2008r2 with CUDA enabled, running GPU-Z, CUDA-Z and Unigine Heaven; Linux DomU: CentOS 7.5 with CUDA enabled, running the nvenc Video Codec SDK sample):
UPDATES:
Verified with Grid 4.6 (license-less), 6.1 …
Verified with Grid/vGPU SW 6.2 - xen 4.11, dom0 - Fedora 28/kernel 4.16 (kernels 4.17-4.19 do not work with xen)
Verified with vGPU SW 7.1 - xen 4.11, dom0 - Fedora 28/kernel 4.16/4.20, xen 4.11.1 - Fedora 29/kernel 4.20 (with Quadro K2200)
Verified with vGPU SW 8.0 - xen 4.12, dom0 - Fedora 29/kernel 5.0 (with Quadro K2200)
Verified with vGPU SW 9.0 - xen 4.12, dom0 - Fedora 29/kernel 5.0 (with Quadro K2200)