We’re currently running the numbers to see if a VDI deployment would be viable within our company. I’m having a bit of trouble getting information on how people deploy AutoCAD using Citrix technologies…
Do you need to use Pooled or Assigned VDI? Does it work well enough with Hosted Shared desktops? etc.
Does vSGA provide enough performance for AutoCAD and similar applications? Or do you need to go to vGPU?
I vaguely understand the concept of the GRID profiles (e.g. 140Q), but is there any concept of overprovisioning with vGPU? If you have a K2 and you assign the 260Q profile to your Pooled VDIs, does that mean you only get 8 (or is it 4) users per host? How does this work?
Given the amount of back and forth needed to sort out all the answers, I am asking your local Solution Architect to join in and continue this with us offline. It would be great to circle back once we have the answers sorted out and post them here, though.
Derek Thorslund gave a GTC talk featuring some Autodesk case studies, which Victoria has posted along with some others that look relevant: https://gridforums.nvidia.com/default/topic/11/session-recordings-graphics-virtualization-summit-at-gtc-2014/
So, can someone explain to me how the scaling on these GRID cards works? Say I choose the K2 and have 2 of them in a host, let’s say XenServer with vGPU for example.
From what I’ve read, it looks like I set a profile. If I were to use the new one, K220Q, I read that it scales to 16 sessions per card, so a total of 32 on the host.
Does that mean the 33rd desktop running on this host gets no acceleration? In effect, this isn’t actually the same type of virtualisation you get on a CPU, but more just dividing a beefy card into smaller chunks and then passing them individually to a group of VMs?
How does this work on XenApp? Do I still split my card into K220Qs, with each “sub-card” applied to a VM, so that I get x number of users per Hosted Shared desktop, and hypothetically 8 K260Q VMs per host?
Mongie, in short yes, but the limiting resource when dividing a GPU is its memory: 4GB per GPU. You must choose a profile for each GPU, but it does not have to match across GPUs. So on a K2, if you choose K220Q for the first GPU, you can use K240Q on the second GPU, but you can’t blend profiles on a single GPU. You can also select a GPU for pass-through, either to a single very high-end user who needs that level of GPU, or to give a XenApp server a complete GPU and all its RAM. I would let the apps and current workstation builds define the starting point for VDI with GPU, using a profile that is as good as or better than the current user experience. For XenApp you would pass the GPU through to the Windows Server 2012 R2 guest and then allow XenApp to share it out. So the final density will vary greatly depending on the deployment choices you make for the GPUs available on that host. And yes, additional guests without GPU acceleration can also be deployed to increase density, assuming CPU, storage, and other resources are not a constraint.
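As a rough illustration of the arithmetic described above, here is a minimal Python sketch that divides each GPU's 4GB by a profile's frame buffer size. The frame buffer sizes (512MB for K220Q, 1GB for K240Q, 2GB for K260Q) and the function names are assumptions made for illustration, not figures from this thread, so check them against the current GRID documentation. With those values it reproduces the numbers quoted here: 16 K220Q sessions per K2 card and 4 K260Q per card.

```python
# Assumed frame buffer per vGPU profile, in MB (illustrative values).
PROFILE_FRAMEBUFFER_MB = {"K220Q": 512, "K240Q": 1024, "K260Q": 2048}

GPU_MEMORY_MB = 4096   # each physical GPU on a K2 has 4GB
GPUS_PER_K2 = 2        # a K2 card carries two physical GPUs


def vgpus_per_gpu(profile):
    """Number of vGPUs of one profile that fit on a single physical GPU."""
    return GPU_MEMORY_MB // PROFILE_FRAMEBUFFER_MB[profile]


def host_capacity(profiles_by_gpu):
    """Total GPU-backed VMs on a host, given one choice per physical GPU.

    Each entry is a profile name, or "passthrough" to dedicate the whole
    GPU to a single VM (for example a XenApp server).
    """
    total = 0
    for profile in profiles_by_gpu:
        total += 1 if profile == "passthrough" else vgpus_per_gpu(profile)
    return total


# Two K2 cards = four physical GPUs, all set to K220Q.
print(host_capacity(["K220Q"] * (2 * GPUS_PER_K2)))
# Mixed host: a different profile per GPU, plus one GPU passed through.
print(host_capacity(["K220Q", "K240Q", "K260Q", "passthrough"]))
```

This only models the memory split. It says nothing about how many XenApp users a passed-through GPU can serve, which still has to be found by testing.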
You can use either.
vSGA is limited to DirectX 9 and OpenGL 2.1; if your applications need anything higher than that, they will not work. Beyond that, you need to get into testing and assessing how many users you can load onto the system, just as you would for any other virtualisation solution.
No, it’s a hard limit. Profiles are defined to give a guaranteed user experience when the GPUs are at 100% load. Given the nature of many GPU-accelerated applications, they will fully utilise the GPU where possible.
Consider the profile as a guaranteed minimum level of performance.
4 per GRID K2 card, so if you can install 2 cards, you could have 8 x 260Q profiles.
You may find the Solution Guide: GRID vGPU Scalability for AutoCAD useful in helping to understand how to assess for density.
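To make the "hard limit" point concrete, here is a small hypothetical Python model of a fixed pool of vGPU slots with no overcommit. The class and method names are invented for illustration, and the sketch does not model what the hypervisor actually does with a VM that cannot get a vGPU; it only shows that the pool never hands out more slots than the profile allows.

```python
class GpuPool:
    """Fixed pool of vGPU slots; models a hard limit with no overcommit."""

    def __init__(self, slots):
        self.slots = slots
        self.assigned = []

    def start_vm(self, vm_name):
        """Return True if a vGPU slot was free, False once the pool is full."""
        if len(self.assigned) >= self.slots:
            return False
        self.assigned.append(vm_name)
        return True


# 2 x K2 cards at 4 x 260Q per card gives 8 slots; try to start 9 VMs.
pool = GpuPool(slots=8)
results = [pool.start_vm(f"vm{i}") for i in range(1, 10)]
print(results.count(True), results[-1])
```

The ninth request is refused rather than sharing an existing slot, which is the difference from CPU-style overcommit.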
I’ve read the GRID vGPU guide for using vGPU with AutoCAD referenced in Jason’s post above and am still debating whether a XenApp-hosted solution with GPU passthrough might be more efficient for a fairly large number of light- to middle-weight users (see for example, http://on-demand.gputechconf.com/gtc/2013/webinar/gtc-express-gpu-accelerated-xendesktop-beyond-3d-designers.pdf). Above all, with few users on at a given moment, they can all leverage whatever power is available, and with a lot of users, they all share resources in what would seem to be a more “fair” way. See also the related comment I made under https://gridforums.nvidia.com/default/topic/247/nvidia-grid-vgpu/grid-vgpu-scalability-guide/.
To clarify one question raised above by mongie: I was under the impression that, with vGPU assignments, if the number of active users exceeds the number of vGPU instances available, the n+1 user would not be locked out of accessing a GRID vGPU. Instead, the vGPU assignments would start to multiplex, so that the overcommitment would be handled by users sharing one or more vGPU instances. Is that the correct interpretation?