Can I mix NVIDIA GRID software editions on one Tesla M10?

Can I mix NVIDIA GRID software editions on one Tesla M10?
The Tesla M10 supports up to 64 concurrent users (512 MB profile).
Is it possible to buy GRID Virtual PC for 32 concurrent users and GRID Virtual Workstation for the other 32 concurrent users?

I would strongly advise against using a 512 MB profile. There’s no NVENC at that size … 1 GB should be your minimum profile; this will allow NVENC to work and improve your users’ experience.

As the M10 is built with 4 GPUs on the same board, you can run different licenses on each physical GPU.
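For reference, the edition each VM actually checks out is set per guest by the licensing client, e.g. /etc/nvidia/gridd.conf on Linux guests. A rough sketch (the server address is a placeholder, and the FeatureType values are from memory — verify them against NVIDIA’s licensing documentation):

```
# /etc/nvidia/gridd.conf (Linux guest) -- values are illustrative
ServerAddress=license-server.example.com   # placeholder hostname
ServerPort=7070
FeatureType=2   # 1 = GRID vGPU (vPC-class), 2 = Quadro vDWS -- check the docs
```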



With Windows 10 do NOT use the 512MB Profile.

Your desktops will most likely lock up.

Furthermore, you can only have one profile per GPU core … so once you power on a VM with a specific profile, that GPU core is “locked in” to that profile.

In our environment we use the M10-1B profile for Windows 10 (1703 and 1709) office desktops, and our engineers use the M10-4Q profile with the same Windows 10 desktops.

With round-robin GPU assignment, the first 4 VMs you power on will lock in all 4 GPU cores on the M10, so each GPU core will then only accept more 1B profiles. If you instead power on just 2 1B-profile VMs, 2 GPUs remain available, so you can then power on 2 4Q-profile VMs and lock in the other 2 GPUs.
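To make the lock-in behavior concrete, here’s a minimal Python sketch — my own model, not NVIDIA’s code — of round-robin (breadth-first) placement on an M10’s four physical GPUs, where the first VM powered on locks each GPU to its profile:

```python
# Hypothetical model of breadth-first vGPU placement with profile lock-in.
# Tesla M10: 4 physical GPUs, 8 GB framebuffer each.
GPU_COUNT = 4
GPU_MEM_GB = 8

class Gpu:
    def __init__(self):
        self.profile = None   # locked-in profile; None until the first VM lands
        self.used_gb = 0

def profile_size_gb(profile):
    # "M10-1B" -> 1 GB, "M10-4Q" -> 4 GB (framebuffer is encoded in the name)
    return int(profile.split("-")[1][:-1])

def place_breadth_first(gpus, profile):
    """Place a VM on the least-loaded GPU that accepts this profile."""
    size = profile_size_gb(profile)
    candidates = [g for g in gpus
                  if g.profile in (None, profile)
                  and g.used_gb + size <= GPU_MEM_GB]
    if not candidates:
        return None  # every GPU is locked to another profile or full
    gpu = min(candidates, key=lambda g: g.used_gb)
    gpu.profile = profile    # the first VM locks the GPU to this profile
    gpu.used_gb += size
    return gpu

gpus = [Gpu() for _ in range(GPU_COUNT)]
# Power on four 1B VMs first: round-robin locks ALL four GPUs to 1B ...
for _ in range(4):
    place_breadth_first(gpus, "M10-1B")
# ... so a 4Q VM now has nowhere to go on this board.
print(place_breadth_first(gpus, "M10-4Q"))  # None
```

Powering on only two 1B VMs first leaves two unlocked GPUs, so the 4Q VMs still find a home — which is exactly the ordering trick described above.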

Running mixed like this on a single server is a pain in the a$$. I started like this and ended up dedicating an R730 just for 4Q-profile VMs in its own standalone cluster. If a bunch of people pile in on a server shared between multiple profiles, you could lock out all GPU cores before the other profile’s users log in.

The workaround is to use the density GPU-assignment option. The problem is that it maxes out the first GPU with VMs before moving to the next, which leads to over-crowding and reduced performance for user VMs instead of a nicely load-balanced design.
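For contrast, the density option behaves like depth-first bin-packing — again a hypothetical sketch of my own, not NVIDIA’s code:

```python
# Hypothetical model of depth-first ("density") vGPU placement: VMs pack
# onto the first GPU with room, so GPU 0 fills before GPU 1 is touched --
# maximum consolidation, minimum load balancing.
GPU_COUNT = 4
GPU_MEM_GB = 8

def place_depth_first(used_gb, size_gb):
    """Return the index of the first GPU with room, or None if all are full."""
    for i in range(GPU_COUNT):
        if used_gb[i] + size_gb <= GPU_MEM_GB:
            used_gb[i] += size_gb
            return i
    return None

used = [0] * GPU_COUNT
placements = [place_depth_first(used, 1) for _ in range(10)]  # ten 1 GB VMs
print(placements)  # [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
print(used)        # [8, 2, 0, 0]
```

Ten users end up with eight of them sharing GPU 0 while GPUs 2 and 3 sit idle — the over-crowding described above.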

Anyway, I hope this helps. I have been using NVIDIA GRID with M10s in our environment for 2 years now (we were a launch customer of the M10).

Great to hear you’re having such success with the M10s!

I take it you’re running vSphere in your environment, and not XenServer? Just for reference, with XenServer you have a lot more granularity for vGPU placement. In a single Resource Pool, you can configure which vGPU Profiles you want to run on specific GPUs. With multi-GPU boards (M10 / M60), you can specify which vGPU profile you want to run on each GPU. So in your case with the M10s, if you were running XenServer, you could dedicate specific GPUs per board, or anywhere in the Resource Pool to run only 4Q profiles and the others to run only 1B profiles. You would use this granularity in combination with the "performance vs density" (breadth vs depth) setting to get the load balancing you desire.
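On XenServer, that per-GPU restriction is configured by editing the allowed vGPU types on each pGPU (in XenCenter, or via the xe CLI). A rough sketch from memory — the UUIDs are placeholders, and you should verify the parameter names against Citrix’s xe documentation:

```
# List the vGPU types and physical GPUs to get their UUIDs
xe vgpu-type-list
xe pgpu-list

# Restrict one pGPU to the M10-4Q type only (UUIDs are placeholders)
xe pgpu-param-set uuid=<pgpu-uuid> enabled-VGPU-types=<M10-4Q-type-uuid>
```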

If you’re concerned about contention when load balancing … when you move to Pascal or newer, you won’t have that issue. With Maxwell you’re using the "Best Effort" scheduler, which is effectively a free-for-all amongst all users on that GPU: the scheduler does its best to ensure everyone gets a fair share, but it doesn’t stop one user from consuming the entire GPU and impacting the others. With Pascal and newer, you can use "Fixed Share" (for consistent performance regardless of what anyone else does) or "Equal Share" (which splits the GPU cycles equally among the powered-on VMs) to dedicate GPU performance. Then you can use your current method to load balance more effectively within a single Resource Pool.
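For reference, on Pascal and newer the scheduler policy is selected host-side via the RmPVMRL registry key of the nvidia module. From memory — please verify the values against NVIDIA’s vGPU software documentation — 0x01 selects Equal Share and 0x11 Fixed Share, with Best Effort the default. On an ESXi host this looks roughly like:

```
# ESXi host -- values from memory, verify in NVIDIA's vGPU docs
esxcli system module parameters set -m nvidia -p "NVreg_RegistryDwords=RmPVMRL=0x01"
# 0x11 = Fixed Share; omit RmPVMRL entirely for the default Best Effort scheduler
```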

Currently, neither (VMware) DRS nor (Citrix) Workload Balancing accounts for GPU load/density when making an optimization recommendation or initial VM placement. Hopefully this will come in future releases.