GPU-Accelerate >>one<< RDSH (Windows Server 2019) on VMWare Essentials Plus

Dear NVIDIA experts.

Since moving to Windows Server 2016/2019 RDSH, we are experiencing high CPU load at many customers with the same number of users, where Server 2012 R2 had no problems at all.
After a lot of testing and talking with others, we’ve come to the conclusion that 2016/2019 simply needs GPU performance.

Now we want to accelerate NORMAL RDSH users (no CAD applications, etc.) on ONE RDSH host with NVIDIA GPU performance at the lowest possible license cost (so NO GRID and NO VMware Enterprise Plus, if possible).
As hypervisor we run VMware vSphere 7.0 (Essentials Plus license).
We know that without GRID and VMware Enterprise Plus we can only pass the full GPU directly through to the VM, and that we also lose functions like vMotion, snapshots, etc.

We’ve checked the VMware Compatibility Guide, and the Tesla T4 would be supported in our hardware/VMware constellation with vDGA so far. So we’ve already put a T4 into one of our VMware hosts, and it is recognized. But now we have some questions:

  • There are now around 20 (or more) PCI devices in the server, all called T4. Some of them can be configured for direct passthrough and some cannot. Why is this?
  • Is it possible to run the Tesla T4 without GRID licenses at all, passing it through to ONE VM directly?
  • Which driver should we use to enable GPU performance on Windows Server 2016/2019 with RDSH?
  • Is it important to change the default graphics device to the T4 on that RDSH VM to get the better performance?
  • Are there any other settings that should be enabled for RDSH optimization in combination with the T4 GPU?

Hi

You’re using the wrong GPU.

If you want to use any Tesla GPU with Graphics, you’ll need to pay for that feature. If you don’t want to pay for any licenses, you need to use a Quadro GPU in Passthrough, then you can use the standard Quadro driver from the public website.

You need to configure the default Graphics adapter and you can do that with a GPO.

Regards

MG

Hi,

Sorry for the delay, I didn’t get the notification.
So a Tesla GPU is only possible with a GRID implementation and licenses, right?

For our purpose (one 2019 RDSH, fixed to one ESXi 7.0 host), would you say the Quadro adapter is the better choice, given that we just experience high CPU load because 2016/2019 seems to expect a GPU for a lot of things?
Would an NVIDIA Quadro RTX4000 be usable WITHOUT GRID for our purposes?

Another thing I’m confused about: how can you change the default graphics adapter?
I only found the GPO that tells RDP to use the default graphics adapter, but not how to define the NVIDIA card as the default graphics adapter (instead of the VMware one).

And a second question using GRID licenses:

Of course we would be willing to pay for the GRID licenses, as they are not that expensive.
But is there actually any way to use the T4 with GRID WITHOUT a VMware Enterprise license?

At the moment the customer only has VMware vSphere 7 Essentials Plus licensed. The really expensive part seems to be the VMware Enterprise license, not GRID itself.

Publishing the T4 to just one VM, limited to one ESXi host, wouldn’t be a problem for us. Neither would losing vMotion or having to reserve the full amount of memory.

Upgrading to the VMware Enterprise license is the really expensive part here, I think.

Or what would you recommend?

Hi

Yes, the RTX4000 will be fine for initial testing, but it only has 8GB of framebuffer, this may be ok depending on your workload and user density, but it will be the first thing you max out.

You don’t need to explicitly define the Default Graphics Adapter, just configuring the GPO will be sufficient.

If you run the T4 in Passthrough, you should be able to use the vGPU driver with it and then license it accordingly (QvDWS / vApps / vCS) per CCU depending on your workload. As you’re not virtualizing the GPU, you won’t need (VMware) Enterprise Plus licensing. Obviously, you’ll then run into all the usual limitations of not virtualizing, but at least it should work.

Regards

MG

Hi Mr. Grid,

So in our case I think you would recommend the T4, since there is a “cheaper” way to use it without VMware Enterprise Plus, as you described. And in case that is not enough, we still have the option of more complex GPU virtualization scenarios, like cascading the T4 and using GPU virtualization with an Enterprise Plus license, right?

I’m a little bit afraid of the RTX4000, because you can find reports of DWM.exe crashing once there are more than 15-30 users on an RDSH, so it could turn out to be a one-way street with problems:

https://docs.microsoft.com/en-us/answers/questions/113122/rdsh-2019-quadro-rtx4000-only-30-user-before-wdmex.html

Or is there a third way, you could recommend to us in our case?

Hi,

We’ve successfully passed the T4 through and used it with the GRID driver in Windows Server on ESXi 7.0 without a VMware Enterprise license.
The GPU is responsive and GRID licensing is working.
Just one last question:

In the PCI device passthrough section of the ESXi host I have 32 device IDs from 1 T4 card.
First of all, I cannot select all of them for passthrough. Is it enough to select ONE device ID so that the whole GPU is assigned to my VM?
And how can I check that the whole GPU performance is available in my VM?


Hi

If that link is the only source you have of a reported issue with the RTX4000, then I really wouldn’t worry about it. Besides, with only 8GB of FB and 30 users crammed on to it, the user you’ve linked to is probably running out of Framebuffer which is causing his issue. He states he tested with lower resolutions compared to his production ones. No idea why you’d test with one use case, then use a different one in production? …

For the T4, adding just one of them will be sufficient. You can test whether it’s working by using tools like GPUProfiler: https://github.com/JeremyMain/GPUProfiler/releases
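As a quick cross-check from inside the VM, the driver's own `nvidia-smi` tool can report framebuffer use and GPU utilization. The following is only a minimal sketch: it assumes `nvidia-smi` is on the PATH of the RDSH VM and that every queried field comes back as a plain number. The function names `parse_gpu_csv` and `query_gpus` are made up for this example.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; with "nounits", memory is in MiB and utilization in percent.
QUERY = "name,memory.used,memory.total,utilization.gpu"


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output from nvidia-smi into a list of dicts."""
    gpus = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        name, used, total, util = (field.strip() for field in record)
        gpus.append({
            "name": name,
            "used_mib": int(used),
            "total_mib": int(total),
            "util_pct": int(util),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi (assumed to be on PATH) and return the parsed GPU list."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    try:
        gpus = query_gpus()
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError) as exc:
        print(f"Could not query nvidia-smi: {exc}")
    else:
        for gpu in gpus:
            print(f"{gpu['name']}: {gpu['used_mib']}/{gpu['total_mib']} MiB used, "
                  f"{gpu['util_pct']}% GPU utilization")
```

If the passed-through card shows up here with its full framebuffer and the utilization moves while users work in their sessions, the VM is using the GPU.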

Regards

MG

Hi,

Thank you, so checking one T4 hardware ID in the passthrough section is sufficient.

Regarding dwm.exe, there are several other reports of dwm.exe crashing with the RTX4000, for example:

https://www.reddit.com/r/techsupport/comments/j6vce3/rdsh_2019_quadro_rtx4000_only_46_user_before/
https://social.technet.microsoft.com/Forums/lync/en-US/6779b586-c158-491c-b76b-353d5a490642/server-2016-rds-connections-maxing-out-and-crashing-dwmexe?forum

In the meantime, we have experienced dwm.exe crashes ourselves, with T4 passthrough and the GRID driver. They happen when around 50-55 users are logged in to the server and the 15 GB of GPU memory is almost completely used.
Is there any way to avoid that? If we put a second T4 into the system and pass it through as well, can we use both T4s and double the memory through the GRID driver, and would that avoid the dwm.exe crashes?
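To put rough numbers on that observation, here is a back-of-the-envelope sketch. It only uses the figures above (15 GB of GPU memory, 50-55 users); the 10% headroom and the 300 MiB per-user figure in the last line are assumptions for illustration, and the function names are hypothetical.

```python
def per_user_mib(used_mib, users):
    """Average framebuffer per logged-in user, in MiB."""
    return used_mib / users


def max_users(total_mib, user_mib, headroom_pct=10):
    """Users that fit if headroom_pct percent of the framebuffer is kept free.

    The headroom value is an assumption, not a vendor recommendation.
    """
    usable_mib = total_mib * (100 - headroom_pct) // 100
    return usable_mib // user_mib


if __name__ == "__main__":
    framebuffer_mib = 15 * 1024  # the 15 GB reported above
    for users in (50, 55):
        print(f"{users} users -> about {per_user_mib(framebuffer_mib, users):.0f} MiB per user")
    # Assumed average of 300 MiB per user, keeping 10% of the framebuffer free
    print("users that fit with headroom:", max_users(framebuffer_mib, 300))
```

This treats every user as consuming the same amount of framebuffer, which real sessions will not do, so it only gives an order of magnitude.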

Hi

You can’t put 2x GPUs in the same VM and split the load across the GPUs, that’s not how RDSH works. You’ll need another VM and run the additional T4 in Passthrough with that. Then you’ll need to load-balance the VMs so you get an even user / workload distribution.

Regards

MG

Okay, so that won’t solve my problem.
What about the crashing dwm.exe? Is it actually connected to the size of the framebuffer and whether it is full, or is it about something else that wouldn’t be solved by a SINGLE GPU with, for example, 32 GB of memory?
Spreading the users across two RDSH servers would of course be a solution too, but we would normally have just one…

Btw: Are you sure RDSH does not support multiple GPUs?
Because Microsoft says that load balancing across multiple GPUs presented to the OS is supported since Server 2019.

Yes, I’m sure. Try it and see for yourself if you like :-)

Is 50 concurrent users on a single VM not enough? The whole idea of this is that you then have multiple VMs of the same spec running on the host and scale out across the physical host. With most modern servers supporting at least 6x T4s, if not more, that would be 300 concurrent users per host, assuming you didn’t run into CPU or Storage contention before hitting that number.
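The scale-out arithmetic can be sketched as follows. The 6 GPUs per host and 50 users per VM are the figures from this post; the 120-user example and the function names are hypothetical, and CPU or storage limits are not modelled.

```python
def host_capacity(gpus_per_host, users_per_vm):
    """Concurrent users per host with one passthrough GPU per RDSH VM."""
    return gpus_per_host * users_per_vm


def vms_needed(total_users, users_per_vm):
    """Number of RDSH VMs (one GPU each) needed for total_users, rounded up."""
    return -(-total_users // users_per_vm)  # integer ceiling division


if __name__ == "__main__":
    print("users per host:", host_capacity(6, 50))
    print("VMs for 120 users:", vms_needed(120, 50))
```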

Regards

MG
