Scaling issue with 8Q vGPU

Hi

I seem to be having some sort of scaling issue when using a vGPU. It only happens on newly built XenApp VMs; all existing XenApp VMs are fine. XenDesktop shows no issues, so this is purely a XenApp / Windows Server issue.

I’m using ESXi 6.0 U2, XenApp / XenDesktop 7.9 and Windows Server 2012 R2. I’ve tried both fully patched builds and brand-new builds from an .iso without any updates. The GPUs are M60s and I’m using 8GB profiles. There is plenty of CPU and RAM, so resources are not the issue.

My GRID drivers are 362.56 for all my VMs.

The applications are varied: random NVIDIA demos, Redway, generic 3D stuff you can just download and play with. At this stage the applications are not important, because the issue isn’t with them.

Application performance is fine, but the visual experience of the applications and desktop is degraded. It’s as if the server isn’t seeing the GPU, yet SMI and every monitoring tool I use tell me the GPU is being used, and the application FPS supports this.

Here’s what I’ve tried so far:

  • The GPU is properly licensed and I’ve checked the license server to ensure a license is being checked out. I’ve also tried restarting the license server (just in case).
  • The GPO to tell a XenApp server to “Use the hardware default graphics adapter….” is applied and I’ve been through the registry to ensure it is working.
  • I’ve tried reinstalling the NVIDIA drivers and then pushing out the new image. The fact that the drivers install at all tells me the VM can see the GPU.
  • I’ve tried building a brand new VM with no Windows Updates. I’ve also tried all current Windows updates.
  • I’ve tried with no GPOs or Citrix policies applied, although the VM was only using the same policies as the other (working) VMs anyway.
  • I’ve rebooted the physical host.
  • I’ve tried disabling the VMware SVGA 3D adapter so that only the NVIDIA adapter is present and active.
  • NVIDIA SMI shows that the GPU is assigned to the correct VM and the GPU is being used.
  • DXDIAG says that the Citrix Display Driver is being used, Direct 3D Acceleration is “Enabled” and that there are “no problems found”.
  • The issue is also confirmed by the fact that the resolution is completely wrong when opening a published application or logging on to the server’s desktop. It looks as if it’s scaled and running at a lower resolution.
  • Direct RDP does not have the same desktop and icon scaling issues, but I can tell it isn’t right when I look within an application at the fonts being used.
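The SMI check in the list above can be scripted so it is easy to repeat after each rebuild. This is only a minimal sketch, assuming Python is available wherever nvidia-smi runs and that nvidia-smi is on the PATH (on Windows it often is not, so the full path may be needed). The function names are my own, not part of any NVIDIA tooling.

```python
import csv
import subprocess

# Properties requested from nvidia-smi via --query-gpu.
FIELDS = ["name", "driver_version", "utilization.gpu", "memory.used"]


def parse_smi_csv(text):
    """Turn 'csv,noheader' output from nvidia-smi into a list of dicts."""
    rows = []
    lines = text.strip().splitlines()
    for record in csv.reader(lines, skipinitialspace=True):
        if record:
            rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def query_gpus():
    """Run nvidia-smi (assumed to be on PATH) and return one dict per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_smi_csv(result.stdout)
```

Calling query_gpus() returns one dictionary per GPU that nvidia-smi can see. As described above, this only confirms that the GPU is attached and in use; it says nothing about the scaling problem itself.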

I’ve uploaded a few screenshots. Heaven is running at 1024x768 (although looking at the image, you’d never think that). The “Desktop Resolution” screenshots were taken from my MacBook Pro, which runs at 1920x1200 when connecting to ICA resources, and the “SMI” screenshot shows that the GPU is being used.
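As a complement to the screenshots, the resolution the session itself reports can be compared with what the client expects (1920x1200 in my case). This is a rough sketch only: it assumes Python inside the Windows session, the helper names are made up for illustration, and I have not confirmed that the affected VMs actually report a different value here rather than just rendering scaled.

```python
import sys


def session_resolution():
    """Return (width, height) of the primary display as Windows reports it.

    Windows only. A process that is not DPI-aware may be handed virtualised
    values, so treat the result as a hint rather than proof.
    """
    if sys.platform != "win32":
        raise OSError("session_resolution() must run inside the Windows session")
    import ctypes

    user32 = ctypes.windll.user32
    # GetSystemMetrics indices: SM_CXSCREEN = 0, SM_CYSCREEN = 1
    return user32.GetSystemMetrics(0), user32.GetSystemMetrics(1)


def compare_resolution(expected, actual):
    """Describe how the reported resolution differs from the expected one."""
    if expected == actual:
        return "match"
    ratio_w = actual[0] / expected[0]
    ratio_h = actual[1] / expected[1]
    return "mismatch: width x{:.2f}, height x{:.2f}".format(ratio_w, ratio_h)
```

Running compare_resolution((1920, 1200), session_resolution()) in both a working and an affected session gives a quick side-by-side comparison.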

The attached "RDP GPU.jpg" shows two RDP connections. The server on the left is not experiencing the scaling issue; the server on the right is. I can see this by looking at the font size on the monitor that is open: the one on the right is bigger than the one on the left, despite the desktop resolutions and icons being the same size. I know that when I access this server through ICA, the whole desktop will be scaled.

I hope that makes sense. If it doesn’t, let me know and I’ll add some more content.

If anyone has any suggestions, that would be great.

Thanks

Ben
Correct Desktop Resolution.jpg
Scaled Desktop Resolution.jpg
Heaven 1 @ 1024x768.jpg
Heaven 2 @ 1024x768.jpg
Heaven 3 @ 1024x768.jpg
SMI.jpg
RDP GPU.jpg

I’ve also just tried a clean-build Windows 2012 R2 VM with all Windows Updates as of today and no GPU attached, and installed only the 7.9 VDA to create a new XenApp desktop. This works perfectly. The published desktop delivers the correct resolution without any scaling. When I add a vGPU to the master VM and push out the update, the scaling issues are immediately apparent.

Very strange. I’ll keep looking…

Bit more investigation…

If I change the vGPU to Passthrough on the same VM, there are no issues. This only seems to be related to using a vGPU.

In the attached images, VM1 is an original W2012 R2 build with a vGPU attached, and it does not have the scaling issue. VM2 is a clean build (from an .iso, not a template) of W2012 R2 with all current Windows Updates, the same 362.56 drivers and the 7.9 VDA. When using a vGPU it has the scaling issue. However, when I change the GPU to Passthrough, the issue goes away.

I’ll carry on testing…
VM1 with vGPU.jpg
VM2 with Passthrough GPU.jpg
VM2 with vGPU.jpg

On a whim, I tried a different vGPU profile. If I give the VM a 4Q profile, I do not get the scaling issue. Going back up to 8Q, I get it again. This is all on the same VM (screenshot attached).

This is the 3rd or 4th new VM I’ve built to troubleshoot this issue, but just for (my) sanity’s sake, I’ll try one last brand-new VM with a completely clean install, and also reinstall the GRID drivers on the ESXi host.
VM2 vGPU 4Q Profile.jpg

GRID driver reinstall into ESXi has not resolved the issue. On to the VM rebuild…

So a(nother) new VM has not resolved the issue. A 4Q profile works fine; an 8Q profile does not and still scales. However, I notice that an 8A profile works fine and does not scale (see attached).

Strange that it only affects Windows Server 2012 R2. Windows 7, 8.1, 10 and 2016 are all fine. It isn’t a dodgy build either, because they’re all clean builds from the same new .iso and, as mentioned at the top, I’ve tried both with no Windows Updates and with all of them to date.

I can’t drop down to an older GRID driver, because we are using functionality that is only available in the latest one.

I’ll carry on investigating …
VM3 8Q Profile.jpg
VM3 8A Profile.jpg

Just to keep this updated, I’ve been investigating with Jason Southern (NVIDIA) offline. This issue is only present with a clean install of the GRID 3.1 driver on Windows Server 2012 R2 (“clean” meaning a VM without any earlier NVIDIA drivers installed).

Upgrading from (GRID 3.0) “361.40-362.13” to (GRID 3.1) “361.45.09-362.56” allows the VM to use the 8Q GPU profile without issue, whereas the issue occurs when the GRID 3.1 driver is installed directly into a clean VM.

My XenServer lab is offline at the moment, so I’m unable to verify whether this is ESXi specific or whether the XenServer stack has the same issue.

It’s by no means a show stopper, but it can be time-consuming to fix, as a Master Image update needs to be performed to correct it. The workaround I’m currently using for clean-build Windows Server 2012 R2 VMs is, as mentioned above, to install GRID 3.0 first and then upgrade to GRID 3.1.
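For anyone scripting their image builds, the conditions found so far can be written down as a simple check. This is just a sketch of the findings in this thread, not anything from NVIDIA: the function and parameter names are hypothetical, and the rule only reflects what I have tested on ESXi.

```python
# Guest driver version of GRID 3.1 as quoted in this thread.
GRID_3_1_GUEST_DRIVER = "362.56"


def needs_upgrade_path(os_name, vgpu_profile, guest_driver, had_earlier_nvidia_driver):
    """Return True when a build matches the combination that scaled in my tests.

    Affected so far: Windows Server 2012 R2 + 8Q profile + GRID 3.1 guest
    driver installed into a VM that never had an earlier NVIDIA driver.
    """
    return (
        os_name == "Windows Server 2012 R2"
        and vgpu_profile == "8Q"
        and guest_driver == GRID_3_1_GUEST_DRIVER
        and not had_earlier_nvidia_driver
    )
```

If the check returns True, the build should install GRID 3.0 first and then upgrade to GRID 3.1, as described above.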

I’ll keep this updated with progress…

Thanks for the update Ben! Much appreciated! Will make sure team are aware…

No worries Rachel :-)

A bug for this has now been raised with NVIDIA (thanks Jason), and I’ll update again when there’s progress.

If anyone else encounters this same issue, please DM me your details and I can add your info to the bug report.