Successful XenApp GRID deployments

Hello GRIDders,

Is there a recommended methodology to determine the successful delivery of GPU functionality with applications delivered via a remoting mechanism such as XenApp? There seem to be many variables to evaluate, such as whether the application expects specific display driver functionality, full-screen rendering, WPF support, etc. This really makes GPU resource delivery via application remoting a crapshoot, at least compared to standard VDI. Even if an application vendor says they will support being remoted with GPU resources, sometimes a subset feature or benchmark within the application will not work.

TIA for any input and guidance

The NVIDIA GRID application certification list is useful:

Citrix, OEMs and NVIDIA are ramping up support to help the software vendors too:

Usually the application shouldn’t know it is virtualised, and on Citrix technologies we don’t use anything other than NVIDIA driver support; vSGA on vSphere can have problems as it supports only older versions of OpenGL.

It’s not perfect but I think the information is improving rapidly as the software vendors realise the demand.

I actually think that the GRID application certification list is more confusing than helpful. I know from talking to some of the listed vendors that there is a difference between being certified on XenApp and certified on XenDesktop, yet the referenced certification list does not touch on that subject. Further, in the XenApp environment there is, according to the Citrix folks I have talked with, a difference between XenApp in a VM and XenApp on bare metal. In a VM, the XenApp instance will only see a single GPU (whether that is a single core or a virtualized portion of the GPU). However, when running on bare metal, XenApp will see all available cores and utilize them across the various XenApp users. Don’t ask me how, but that’s what I was told. So, which is going to give better performance and/or user density: a single XenApp instance on bare metal, or multiple XenApp instances in VMs on the same piece of hardware? Then add to that: for any given application, is a K1 or a K2 (or multiples thereof) going to provide the best TCO?

Bottom line, there are a lot of unanswered questions for XenApp with GRID. Or should I say XenDesktop for Applications 7.x, as the improvements there go way beyond what is available in XenApp 6.5, IMO.

If running XenApp as a VM on a XenServer instance, the most efficiency will be gained by assigning a GPU passthrough to the XenApp VM: up to two VMs per K2 board or four per K1 board. There is naturally no rule that you have to use all of the GRID engines just for the XenApp VM(s), so you could also host XenDesktop instances on the same server if you wish.

If you install XenApp on bare metal, you can indeed leverage the whole GPU but as I understand it, only as a RemoteFX instance. In other words, you cannot leverage OpenGL, DirectX, etc. but only what the host Windows OS natively supports.

What is ultimately better will heavily depend on what users are doing. At the same time, the multiple-to-one mapping via GPU passthrough will generally gain you better performance. I have run some tests but do not have the numbers handy in terms of running a VM using a vGPU vs. the same application under XenApp using a GPU passthrough instance. Again, without having multiple concurrent users, it is hard to say how things would scale as more instances are run (assuming, of course, you are not exhausting the host server’s CPUs).

Hi everyone,

What’s the cheapest solution without compromising user experience, or the best price-performance-ratio solution, must be tested during a PoC, because different users, different applications, different workloads, and different work styles can lead to significantly different results.

But apart from that discussion…

From what I know (for the time being), you can only pass one GPU (a whole-GPU passthrough or a vGPU profile) into a VM, which means you are limited to one GPU per XenApp server as long as you need a supported environment (all other dirty tricks and hacks are at your own risk). If you install patch XA650W2K8R2X64038, you can use OpenGL 4.x based applications and share the GPU resource between multiple user sessions for OpenGL 4.x and DirectX 9 accelerated apps (because XA 6.5 is WS08R2). There is no classic IMA-based XA 6.5 for WS12R2, but all that functionality is built into the Server VDA 7.x and later (for FMA-based XenApp / Server VDAs), and because of WS12R2 you get DirectX 11.1 acceleration as well (DirectX 11 for WS12). For DirectX apps running in RDS sessions, Citrix XenApp leverages GPU support from the Microsoft implementation.


AFAIK Citrix recommends using only one more powerful GPU per XenApp server instead of multiple low-end GPUs. For DirectX applications, only one GPU will be used. By default, Citrix leverages the Microsoft DirectX support, which always uses the same GPU for all RDS sessions. It seems that all DirectX-accelerated applications are processed on the same GPU where the first accelerated session was started. There is an undocumented and UNSUPPORTED registry key to spread the load across all available GPUs (i.e. for bare-metal installations).


For DirectX applications, only one GPU is used by default. That GPU is shared by multiple users. The allocation of sessions across multiple GPUs with DirectX is experimental and requires registry changes. Contact Citrix Support for more information.


For OpenGL applications, you should see more than one GPU being used, but the GPUs may not be evenly distributed among sessions.

There is even experimental GPU acceleration for CUDA or OpenCL applications.


Never… never… never install the OpenGL Software Accelerator (a MESA-based OpenGL 2.1 software renderer) on a server which has a GPU attached (bare metal, vGPU, or passthrough). This will break OpenGL acceleration.

Depending on your OS, you may need to set a policy for the adapter that should be used for rendering DirectX content in RDS/XenApp sessions as well:

Remote Desktop Services (RDS) sessions on the RD Session Host server use the Microsoft Basic Render Driver as the default adapter. To use the GPU in RDS sessions on Windows Server 2012, enable the "Use the hardware default graphics adapter for all Remote Desktop Services sessions" setting in the group policy Local Computer Policy > Computer Configuration > Administrative Templates > Windows Components > Remote Desktop Services > Remote Desktop Session Host > Remote Session Environment.
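For scripted or unattended setups, the same policy can be applied through the registry instead of the GPO editor. This is a sketch, assuming the standard policy-backed value name `bEnumerateHWBeforeSW` under the Terminal Services policy key (verify it against your Windows Server version before relying on it):

```
Windows Registry Editor Version 5.00

; Assumed registry equivalent of "Use the hardware default graphics
; adapter for all Remote Desktop Services sessions"
[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Terminal Services]
"bEnumerateHWBeforeSW"=dword:00000001
```

A reboot (or at least a new RDS session) is typically needed before sessions pick up the hardware adapter.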


I have implemented multiple HDX 3D Pro projects using XenApp. It’s all about achieving the highest user density per server with consistent, optimal performance. Bare metal with a GPU, XS GPU passthrough, or vSphere vDGA are the recommended configurations. The bare-metal configuration with a high-power card like the K5000 or K6000 will provide the highest user density from what I have seen for the use cases I worked on. For GPU passthrough, the K2 card will work best to achieve a high user density for medium to complex use cases. The K1 card will work too if the use case is more video-memory bound than GPU-core bound. You should leverage the GPU affinity/locality configuration, which can increase performance by 10%. Hyperthreading may degrade performance if all the XA VMs exhibit an average of 60-80% CPU utilization, therefore do not put too many VMs per host; size it properly.



The other option to consider is that even with, say, a K1, you can run four XenApp VMs on a XenServer platform and assign a GPU passthrough to each of those VMs, hence tapping into the full benefits of the K1. Virtualized XenApp instances run very efficiently (we run several, as well as one on bare metal so we can compare performance). By distributing connections across the XenApp servers you achieve – as Ron stated above – the maximum user density, plus with four VMs for a K1 or two for a K2, you end up fully utilizing the GRID card.

I agree with GRIDfan that the certification by ISVs is confusing, so I wrote this to try and clarify. Really, it’s something Citrix has little control over, as the application vendors are choosing to self-certify on limited scenarios and to publish their support all over the place. We can try to collate and promote it where it is, but because this is virtualisation, the Windows VM doesn’t know it is virtualised, and all graphical support is via native NVIDIA drivers, there isn’t much we can certify: every app has different CPU, RAM, etc. requirements, and the application vendors choose their own support matrix… We support the use of the apps and the Citrix components, but performance and bugs in CAD programs are up to the software vendor.

I installed a GRID K2 as passthrough for a XA 6.5 VM.
I then installed the K2 drivers and can see the GRID card in Device Manager. I also ran GPU-Z and it detected the GPU fine.

However the application launched is not using the card.

What else do I have to do?

I am not sure if there are any more settings that i have to apply.

How do I enable HDX 3D Pro?

Kitaab, you have to install drivers on the server as well as in the VMs themselves. Please see the full installation guide that steps through the process for both the server hosts and the VM (XenApp in your case). There are a lot of steps involved. See and for details.

I have done all the XenServer bits as well. I run XenDesktop 7.1 VMs and they use the GPU fine.

My issue is I am not sure if passthrough works out of the box, or whether there are some Citrix policies in XenApp 6.5 or some Receiver settings that need to be applied. FP2 something, etc.


Passthrough just works out of the box with XS, and seeing the card in Device Manager means it has been passed through into the VM. nvidia-smi should no longer see it when executed in the XS Dom0. Which application are you using? Which API does it use, OpenGL or DirectX? Did you check all my suggestions in one of the posts above?


Hi Ron,

I am not able to validate whether the GPU is being used or not.
When I run an application that should use the GPU, GPU-Z does not show any indication of the GPU being used at all.

In the case of XenApp 6.5, do I have to install patch XA650W2K8R2X64038 before the GPU can be used by OpenGL-based apps?

I do not want to split the GPU, but I want multiple users to be able to utilize the GPU that has been passed through.

Not sure what I should expect, though.

Also, do I need HDX 3D Pro or something?

With the GPU passthrough configured on your Windows VM, you should see activity by running nvidia-smi on that Windows VM itself (as it won’t show up on the XenServer host).
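If you want to check that programmatically rather than eyeballing the nvidia-smi table, here is a small sketch. Assumptions: nvidia-smi is on the PATH inside the Windows VM, and its `--query-gpu` CSV output format (e.g. `42 %`, one line per GPU); the function names are made up for illustration:

```python
import subprocess

def parse_gpu_utilization(csv_output):
    """Parse the output of
    'nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader'
    (lines like '42 %', one per GPU) into a list of ints."""
    values = []
    for line in csv_output.strip().splitlines():
        # Strip the trailing percent sign and whitespace, e.g. '42 %' -> 42
        values.append(int(line.strip().rstrip('%').strip()))
    return values

def gpu_in_use(threshold=5):
    """Return True if any GPU reports utilization above the threshold
    while your application is running."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader"],
        text=True)
    return any(u > threshold for u in parse_gpu_utilization(out))
```

Run `gpu_in_use()` inside the VM while the 3D application is active; if it stays False, the app is almost certainly falling back to software rendering.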

When I ran the Unigine Valley benchmark, the GPU was being used. I found that by default it used the Direct3D 11 API.
It did run in an ICA session and I could see that the GPU was being used.
It could be that only OpenGL apps have issues.
However, the performance was terrible (around 25 fps) in the ICA session.
The GPU has been passed through and the correct drivers are installed, but somehow when I run dxdiag I can see that it doesn’t pick up the NVIDIA driver. GPU-Z and nvidia-smi show it fine, though.

I am not able to attach images. Is there any other way to add images to the post?

Within the ICA session, if I look at the display properties, I can see
"Default Monitor on Citrix Systems Inc. Display Driver" and not the NVIDIA one.

Image attached

You’ll need to view the screen on an external connection, not on the console within XenServer itself.

I had to install patch XA650W2K8R2X64040 to make OpenGL apps use the GPU.

I am involved in a project using XenDesktop/XenApp for AutoCAD virtualization. They already have Hyper-V, so XenServer is not an option (the initial plan was to use a GRID K2, but Hyper-V is not supported).
My proposal is to use two new Dell Workstation 5810s with the new NVIDIA Quadro K5200 and a bare-metal installation of Windows Server 2012 and XenApp, but I am confused whether this is OK, because I have read some remarks that this kind of configuration cannot use OpenGL and is only for RemoteFX. So please, some help: will this kind of setup be OK?


You’ll be fine with that config. It will use the same driver as the GRID cards and it will support OpenGL, though AutoCAD is DirectX so you don’t need to worry…

Driver’s here btw.

Not sure what your scalability will be like though as you’ll have quite a bit of CPU load and IOPS to contend with.