Support request for combined vGPU types in one VM

It would be really nice if, instead of having to provision a separate master VM and create a separate catalog for each GPU type, we could support them all and simply change which vGPU profile is used (e.g., K260Q, K240Q, K220Q, K200) by changing the allocated type in XenServer (e.g., via XenCenter). Is that possible, or are there too many driver and configuration issues involved?

If you want to change the GPU profile type assigned to a particular VM, you can already do this from the dom0 console or XenCenter. First shut down the target VM, reassign a new profile type, and then just reboot it. There is no need to change anything with the driver inside the VM.
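Roughly, the dom0 side of that looks like the following – I'm sketching this from memory, so treat the VM name, profile name, and exact field names as placeholders to check against the User Guide for your release:

    # Sketch only: VM name, profile name, and field names are placeholders.
    vm_uuid=$(xe vm-list name-label='GRID-Win7-01' --minimal)   # hypothetical VM name
    xe vm-shutdown uuid="$vm_uuid"

    # Find the VM's current vGPU and the GPU group it sits in
    vgpu_uuid=$(xe vgpu-list vm-uuid="$vm_uuid" --minimal)
    gpu_group=$(xe vgpu-param-get uuid="$vgpu_uuid" param-name=gpu-group-uuid)

    # Look up the UUID of the profile you want to move to (e.g. K120Q)
    type_uuid=$(xe vgpu-type-list model-name='GRID K120Q' --minimal)

    # A vGPU's type can't be edited in place, so drop it and re-create it
    xe vgpu-destroy uuid="$vgpu_uuid"
    xe vgpu-create vm-uuid="$vm_uuid" gpu-group-uuid="$gpu_group" vgpu-type-uuid="$type_uuid"

    xe vm-start uuid="$vm_uuid"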

More details on the commands can be found in the NVIDIA GRID vGPU User Guide. You can also script this up for groups of VMs.
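For a group of VMs, something along these lines would do it – the name prefix and target profile here are just examples:

    # Sketch: apply the same destroy/re-create steps to every VM whose
    # name-label starts with a made-up prefix, switching them all to K140Q.
    new_type=$(xe vgpu-type-list model-name='GRID K140Q' --minimal)
    for vm_uuid in $(xe vm-list is-control-domain=false --minimal | tr ',' ' '); do
        name=$(xe vm-param-get uuid="$vm_uuid" param-name=name-label)
        case "$name" in
            GRID-POOL-*)                                  # hypothetical naming scheme
                old_vgpu=$(xe vgpu-list vm-uuid="$vm_uuid" --minimal)
                [ -n "$old_vgpu" ] || continue            # skip VMs with no vGPU attached
                xe vm-shutdown uuid="$vm_uuid"
                gpu_group=$(xe vgpu-param-get uuid="$old_vgpu" param-name=gpu-group-uuid)
                xe vgpu-destroy uuid="$old_vgpu"
                xe vgpu-create vm-uuid="$vm_uuid" gpu-group-uuid="$gpu_group" \
                    vgpu-type-uuid="$new_type"
                xe vm-start uuid="$vm_uuid"
                ;;
        esac
    done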

That doesn’t seem to work if you have a catalog built with a specific GPU instance and then later try to switch the GPU setting. I’m aware you can change it on the VM, but when the image and catalog are built with a particular instance (e.g., K140Q), it does not work to change the GPU type on the VM via XenCenter (e.g., to K120Q). Am I still missing a step somewhere?

Tobias, are you looking for the ability to change the vGPU type within the catalog and have that change propagate to all the VMs as they next restart?

If so, I would say this is more likely a feature enhancement for XenDesktop. It’s an interesting feature that I can see obvious value in, but I don’t know how much work would be involved in writing it into XenDesktop.

For individual VMs you can change the GPU type as long as the VM is shut down. I do this all the time with my VMs that are built using MCS, but you must shut them down first.

Hi, Jason:
No, I wanted to change just one individual VM that was indeed created from a catalog, without changing anything in the catalog itself – only alter the GPU type in that one VM.

I am even more perplexed now. I swear I tried this (probably back when the technical preview came out) and it did not work. I just tried it again: I shut down a VM from a K140Q-associated catalog, changed it to a K120Q profile, rebooted the VM, and – voila – it does work. Is this something new with the new driver? However, when I ran a benchmark just now, it runs many times slower than a "stock" K120Q VM with its own catalog, so now I wonder if there isn’t perhaps extra overhead or something else that creates this discrepancy? And, yes, these are all images built with MCS. I take it you do not see this issue? In fact, all video and screen operations in general are slower, and frequent temporary freezing of the screen takes place every so often (10-30 seconds or so).

I tried it again and got the same effect – it seems to kick in and out of doing the processing, stalling at times for many seconds between displaying frames (the GPU-Util value drops to zero in the nvidia-smi output). What is also very interesting is that I tried this the other way around, namely changing the GPU setting on a K120Q VM to K140Q, and this time it performed as well as the "stock" VM created from the K140Q master. Bizarre. It’s completely reproducible, BTW.
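In case it helps anyone trying to reproduce this, I’m watching the stalls from dom0 with something like the line below (standard nvidia-smi query flags; I’m assuming the nvidia-smi build that ships with your GRID driver supports them):

    # Log per-GPU utilisation once a second; if the on-screen stalls line up
    # with GPU-Util dropping to 0%, the work is stopping host-side rather
    # than just rendering slowly inside the guest.
    nvidia-smi --query-gpu=timestamp,name,utilization.gpu,memory.used \
               --format=csv -l 1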

So, is the "secret" that all changes have to be made to the catalog’s VM and then reprovision new VMs to make this work correctly, perhaps (hence, the performance issue)?

What am I doing wrong, if anything?

Has anyone else been able to replicate this?

Any change here? It would be great to have the ability to use one master VM as the base to create multiple catalogs of VMs using MCS, each with a different vGPU profile. Same software, just a different vGPU config. Then update one image and apply it to multiple catalogs. The driver is the same, so it shouldn’t be an issue inside the VM, right?

Strange that I dropped off this thread for some reason, but anyway…

Tobias,

Once the VM is built, you should be able to change the vGPU profile on an individual VM without issue. It’s absolutely supported and shouldn’t carry a performance hit. I do it all the time! We definitely need to look at why the performance is degraded in your case.

jstroebel,

With MCS (or VMware’s Linked Clones) I’d approach your requirements today with multiple snapshots: build a single VM, take snapshot 1 with the K260Q profile, change the vGPU to K240Q, then snapshot again. PVS may be a better option for you in the scenario described, as you want the same disk image but different virtual hardware configurations…
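Roughly, from dom0, that snapshot-per-profile flow looks like this – the master’s name and snapshot labels are placeholders, and the profile swap itself is the same shutdown/reassign step discussed earlier in the thread:

    master=$(xe vm-list name-label='GRID-Master' --minimal)   # hypothetical master VM
    # Snapshot 1 while the master carries the K260Q profile
    xe vm-snapshot uuid="$master" new-name-label='master-K260Q'
    # ...swap the master's vGPU to K240Q (shut down, reassign the profile,
    # boot once so the guest re-detects the hardware), then:
    xe vm-snapshot uuid="$master" new-name-label='master-K240Q'
    # Point each MCS catalog at the snapshot that matches its intended profile.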

Hi Jason,

I have a similar issue to this thread relating to changing the vGPU type.
I am using XenServer 3.5, a GRID K1 GPU, XenDesktop 7.6, and PVS 7.6 with PvD.

I can successfully change the K1 vGPU type on a VM and it picks up the new type following a reboot. But when I use PVS I get a BSOD about 2 minutes after logging in; I can see that NVCplsetupeng.exe is running just before the BSOD. The master image was built with a K120Q vGPU.

Have you managed to use PVS and successfully change the vGPU type?
My next test is to try the PVS image again, but without PvD.

Thanks
Jason

What I believe you’re seeing there is plug & play detecting a hardware change and attempting to update itself accordingly.

After you changed the vGPU profile, did you update the master image before pushing it out to all the other VMs, or did you just boot the VMs from the PVS master?

Whether you’re using PVS or MCS, you’ll need to boot the master image at least once so it detects the changed hardware and completes the driver refresh before you push it out to the VMs. It shouldn’t add more than a minute to the process.