We have a setup with two Quadro M4000 GPUs in our low-end remote desktop session host. After a few years we have replaced our infrastructure with newer hardware. We installed a fresh Windows Server 2022 with 2x RTX A4500 in DDA mode. After some tests we noticed that the two RTX A4500 cards accept far fewer user connections than a single M4000 GPU. Is there any limitation implemented in the GPU from NVIDIA's side, and why can a significantly weaker card accept 3 times as many user sessions as a new card with far more hardware resources?
- 32 Cores CPU AMD Epyc
- 256GB of RAM
- 2x Nvidia RTX A5000
- 2TB NVMe storage
- GPU DDA to one VM
We have created 30 test users and set the following group policies:
- Disabled the UDP protocol (TCP only)
- Disabled WDDM driver
- Set the physical graphics adapter to be used for all RDP sessions
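For anyone who wants to double-check that these three policies actually landed on the session host, here is a minimal Python sketch. It assumes the policies map to the registry values `SelectTransport`, `fEnableWddmDriver` and `bEnumerateHWBeforeSW` under `HKLM\SOFTWARE\Policies\Microsoft\Windows NT\Terminal Services`. Those names and expected values are an assumption based on the commonly documented RDS policy mappings, not something stated in this thread, so please verify them against your own GPO first. The helper names `find_mismatches` and `read_policies` are made up for illustration.

```python
# Assumed mapping of the three policies above to registry values (verify locally).
POLICY_KEY = r"SOFTWARE\Policies\Microsoft\Windows NT\Terminal Services"
EXPECTED = {
    "SelectTransport": 1,       # assumed: 1 = TCP only
    "fEnableWddmDriver": 0,     # assumed: 0 = WDDM display driver disabled
    "bEnumerateHWBeforeSW": 1,  # assumed: 1 = hardware adapter for all sessions
}


def find_mismatches(actual):
    """Return (name, expected, found) for every value that differs or is missing."""
    return [
        (name, want, actual.get(name))
        for name, want in EXPECTED.items()
        if actual.get(name) != want
    ]


def read_policies():
    """Read the current values from the registry (Windows only)."""
    import winreg  # imported here so the module still loads on other platforms

    found = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, POLICY_KEY) as key:
        for name in EXPECTED:
            try:
                found[name], _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass  # value not set by policy
    return found
```

On the session host you would call `find_mismatches(read_policies())`; an empty list means all three values match the assumed settings.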
This is really very frustrating. Now we have our standard graphics cards that we can’t really use.
Many thanks in advance!
Hello @cij and welcome to the NVIDIA developer forums.
I am afraid I am unable to help here, but I think there might be more knowledgeable people in the virtual GPU category, so I refer you to those forums.
I think it might be necessary for you to use vGPU drivers instead of DDA through the VM to achieve better utilization, but I am not the expert here. With DDA it might be a limitation of the VM that forces the low user count.
Hello @MarkusHoHo, thank you very much. I can understand the concerns and also the point about vGPU. But you need to understand my point as well: with the older Maxwell and Pascal generations this works perfectly, and with the new hardware, which is twice as powerful, we are getting issues. This looks strange to me, and the first thing I think of is that there may be an artificial limitation here to push people to switch to vGPU. I ask myself how it can be that a high number of users is possible with vGPU and not with DDA if there is no limitation. But maybe I’m thinking about this wrong.
So, I don’t want to insinuate anything, that’s why I’m asking. But I have often read that users with an RTX 4000 with DDA manage over 30 or 40 user sessions, and if I only manage 8 it looks strange. Maybe there is a bug in the BIOS or driver, I don’t know.
My understanding was always: if you want to assign a GPU exclusively to a VM, use DDA without vGPU. If you want to use a GPU in multiple VMs, use vGPU. But maybe I’m wrong.
Exact same issues here. We switched from 2019 RDSH with 2 x T4 to 2022 RDSH with 2 x RTX A4000. Totally new HPE hardware with 2 x 1400W PSU. The old setup took 50 users easily, the new setup crashes with 8+ users. We even tried a bare metal install, same result. We also tried numerous different drivers, all with the same results. A Google search (or even a search here in this forum) suggests that a lot of system administrators have these issues, so it seems odd that it still hasn’t been addressed?
Indeed, this is odd as I wouldn’t assume you already saturated the FB with just 8 users.
In general, our KMD (kernel-mode driver) needs to reserve memory for each user session on RDSH, and the amount is different for each GPU/OS combination. So it looks like the A4000 and A4500 hit a resource limit very soon.
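To get a rough feel for how much frame buffer each additional RDSH session costs on a given GPU/OS combination, one could sample `nvidia-smi` before and after logging in a batch of test users. The Python sketch below is only an illustration: the function names are made up, and it only sees the frame-buffer usage that `nvidia-smi` reports, so any driver-internal reservation that is not counted there will not show up.

```python
import subprocess

# Standard nvidia-smi query; memory values are reported in MiB with "nounits".
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV rows into {gpu_index: (used_mib, total_mib)}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        usage[int(index)] = (int(used), int(total))
    return usage


def sample_gpu_memory():
    """Run nvidia-smi once and return the parsed usage (needs the NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


def per_session_growth(before, after, new_sessions):
    """Average frame-buffer growth in MiB per added session, for each GPU."""
    return {
        gpu: (after[gpu][0] - before[gpu][0]) / new_sessions
        for gpu in before
    }
```

Typical use would be `before = sample_gpu_memory()`, then log in a known number of test users, then `per_session_growth(before, sample_gpu_memory(), n)`. If the sessions start failing while the reported usage is still far below the total, that would point to a limit other than plain frame-buffer exhaustion.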
Unfortunately, when running a workstation GPU for RDSH you are not using vGPU licensing and are therefore not eligible to raise a support ticket. Please contact your OEM to raise a ticket, but someone with the repro needs to actively work on it and deliver the required logs/dumps.
I don’t have these GPUs in my lab, so I cannot try to reproduce this issue.
The NVIDIA A16 GPU would be the recommended GPU for RDSH in datacenter deployments. You could run 4 VMs with 16GB FB each on 1 A16 GPU.
Thanks Simon, I will definitely look into the A16 then!