GRID K2 pass-through + vSphere 6.5 + Citrix XenApp RDSH VDA Server 2016 = 11 user max

At the start of this year, we launched our first VDI deployment in our library environment. I noticed very quickly that thin clients weren’t able to log in once a VDA had 11 sessions. RDP to that same VDA would not work either.

Long story short, after talking with Citrix support and doing some troubleshooting of my own, I found that if I remove the K2 card from the VM (pass-through, not vGPU), there is no 11-user limit per VDA. We have two K2 cards per host, so four pass-through GPUs available per 2U server. With the GPU passed through, any thin client or fat client that tries to connect to that Delivery Group past the limit is simply kicked out with no error message of any kind.

I noticed in another thread the following statement:

Probably the first and the last officially supported drivers for win2016 and GRID K* are R367 U4/U5 (369.26/369.49)

I sincerely hope that is not the case, or we just wasted tens of thousands of dollars buying twelve K2 cards that are not going to be usable, at least in the context of pass-through for our 2016 RDSH VMs. Moving forward, we know we’ll have to buy M10 cards since the K2s are no longer purchasable.

We are doing pass-through because we don’t have a high enough vSphere license for vGPU. Would assigning vGPU profiles to the XenApp VMs, instead of the pass-through card, fix the 11-user problem?

I’m going to bring up a 2012 R2 based XenApp with a pass-through GPU to test with.

Any assistance would be greatly appreciated!

Hi,

From your description it is hard to say what is going on. Could you specify what workloads you’re running on the XenApp hosts? Keep in mind that the users on a XenApp host share 4GB of framebuffer. We’re currently doing internal testing of XenApp density, especially for Server 2016. I personally don’t have any real-world numbers yet for 2016, but from my experience you should get around 30 users on an M60 (8GB FB) on 2012 R2 (of course, it always depends on the workload).
Could you also specify what you mean by the driver message for Server 2016? It is a fact that the 367.x branch is the last officially supported branch for K1/K2, but we support each branch for several years. That said, there will only be bug fixes rather than new features, which should be understandable as these products are EOL.

Best regards

Simon

Simon,

I thought perhaps it was workload related, in that maybe the GPU was running out of memory. GPU-Z indicated that only a few hundred MB of the 4 GB was being utilized. I’ve tried 11 sessions all running YouTube, and I’ve tried 11 sessions just logged in without running any application. Unfortunately, the 12th session is unable to connect in either scenario. Everything works great as long as you stay at 11 users or fewer connected to that VDA with the pass-through card.
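
For anyone who wants to rule out framebuffer exhaustion the same way without GPU-Z, here is a minimal sketch that logs used versus total GPU memory from inside the VDA. It assumes `nvidia-smi` (installed with the NVIDIA driver) is on the PATH in the guest and that the driver reports memory for the card. The function names `parse_memory_csv`, `framebuffer_headroom`, and `read_gpu_memory` are made up for illustration. Run it once per added session and compare the numbers at session 11 and 12.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Turn nvidia-smi CSV output into a list of dicts (values in MiB)."""
    rows = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue
        try:
            index, used, total = (int(field) for field in fields)
        except ValueError:
            # The driver may print a non-numeric placeholder; skip that GPU.
            continue
        rows.append({"index": index, "used_mib": used, "total_mib": total})
    return rows


def framebuffer_headroom(rows):
    """Return free framebuffer in MiB, keyed by GPU index."""
    return {row["index"]: row["total_mib"] - row["used_mib"] for row in rows}


def read_gpu_memory():
    """Run nvidia-smi and parse its output (assumes it is on the PATH)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    for gpu, free_mib in framebuffer_headroom(read_gpu_memory()).items():
        print(f"GPU {gpu}: {free_mib} MiB framebuffer free")
```

If the free figure is still in the gigabytes when the 12th session is refused, memory exhaustion is unlikely to be the cause.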

What I meant by the driver message was that I thought I had read it was a beta release for the K* cards and not a final release. If the problem is indeed not driver related, then disregard that statement.

I’m in the process of trying to acquire a vSphere Enterprise Plus trial license so I can test with vGPU profiles instead of pass-through. Do you think going that route and applying one of the K2XXQ profiles would allow more than 11 sessions per XenApp VM?

Update:

I can confirm that a Server 2012 R2 VDA does not have the 11-session limit with the latest driver available for 2012 R2 (369.49).

I think it would be worthwhile to bring up another Server 2016 VDA and test it the same way I tested this 2012 R2 VDA. I will update this thread with my findings.

Hi,

Thanks for your update. To answer your question concerning vGPU and XenApp: using vGPU profiles might not help, as you often need as much FB as possible. Pass-through is therefore already the correct choice.
The strange thing is really the 11-user limit you reported for Server 2016. I will try to reproduce this.
In general, you should try to run 4 XenApp VMs on a 2-socket host to best leverage the host resources.

Regards
Simon

Update 2:

The newly created 2016 VDA, optimized the same way the 2012 R2 VDA was (PVS Optimization and VMwareOSOPtimizationTool_b1084, disable task offload), still experiences the 11-user limit.

Hi,

I can confirm that this seems to be a general issue with Server 2016 and Passthrough GPU. The issue doesn’t occur with vGPU profiles. Further investigation with MSFT is ongoing.

Regards

Simon

Thank you for the update. If I need to do anything on my end, don’t hesitate to let me know.

Gary Adkins

Any updates/information on this by chance?

Gary Adkins

Hi Gary,

Sorry for the late reply. This seems to be really weird. Our performance engineering team couldn’t reproduce the issue after starting from scratch; it worked even on pass-through. Unfortunately, I haven’t heard anything further. I’m now in the process of trying to reproduce it myself.

regards

Simon

Hi Gary,

I was now able to reproduce it in my homelab. The funny thing here is that I can use 12 sessions and the 13th session won’t work. Desktop Window Manager creates the session port and afterwards dwm.exe crashes.
Is it the same for you?

Regards

Simon

Simon,

On my end, it’s always the 12th session that won’t work. I’m glad to see that you were able to reproduce it. Even though your count is off by one, it’s fundamentally doing the same thing as on my end.

Gary Adkins

Was there ever a solution found for this? I’m running into the exact same issue. On the 12th Citrix Session dwm.exe crashes.

GPU-Z indicates only 3.1GB of the 4GB are used.