YouTube poor performance on Grid K1 with vSphere 6.0 U2 and Horizon 7.0

Hi!
I have an HP DL380 G9 server with 128GB RAM, 2x Intel Xeon E5-2630v3 CPUs and an NVIDIA Grid K1 installed.
NVIDIA-vGPU-kepler-VMware_ESXi_6.0_Host_Driver 361.40-1OEM.600.0.0.2494585 driver is installed and running. Horizon View 7.0 is installed.
I have created a Windows Server 2012 R2 VM with 4 vCPUs, 8GB of fully allocated RAM and the K180Q vGPU profile. I am running Agent 4.0 with the direct connection plug-in and have Horizon Client 4.0.1 installed on the machine I connect from, which has an Intel i5 3570 CPU and 8GB RAM. I have a 1Gbit LAN and use PCoIP. When I try to watch YouTube videos on the VM I get noticeable lag (it looks like 15-20 FPS in a game). How can I improve video playback performance? Under HKLM->Software->Teradici I have set pcoip.maximum_frame_rate to 120 and pcoip.max_link_rate to 900000. That seems to have helped a bit, but not completely.

Further testing showed that any video playback is not smooth. It obviously has something to do with the frame rate, because GPU load (according to nvidia-smi and GPU-Z) is about 70%, CPU load about 22% and LAN about 40Mbit (out of 1000Mbit). Any suggestions on how to improve the frame rate?

I have been experimenting a little more and found that if I set Display->Fullscreen in the connection settings, so that only one of my 3 displays is connected, video playback runs much smoother over PCoIP and even better with VMware Blast. But when I set Display->All Monitors, playback becomes disastrous via VMware Blast and laggy via PCoIP. My monitor setup: 1x 1920x1200 + 1x 2560x1600 + 1x 1200x1920 (portrait mode).
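To put the monitor setup above into numbers, here is a rough sketch that adds up the pixels across the three displays and multiplies by a frame rate. It is only a worst-case upper bound, assuming every pixel changes on every frame; remote display protocols normally encode only the regions that change, so the real load is lower. The function names are made up for illustration.

```python
# Monitor layout from the post: (width, height) in pixels.
MONITORS = [(1920, 1200), (2560, 1600), (1200, 1920)]


def total_pixels(monitors):
    """Sum of width * height over all displays."""
    return sum(width * height for width, height in monitors)


def pixels_per_second(monitors, fps):
    """Worst-case pixels to encode per second if every pixel changed each frame."""
    return total_pixels(monitors) * fps


if __name__ == "__main__":
    print(total_pixels(MONITORS))           # 8704000
    print(pixels_per_second(MONITORS, 30))  # 261120000
```

All three displays together come to about 8.7 million pixels, against 2.3 million for the 1920x1200 display alone, which fits with playback being much smoother when only one display is connected.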

You are encountering some of the limitations of your protocol choices.

VMware should be able to advise on the best way to optimise the protocols to deliver the experience you’re expecting.

Setting the max frame rate to 120 fps is pointless.

YouTube runs at a more realistic 24-30fps, so you’re unnecessarily loading the system trying to reach higher than necessary FPS out of the VM.

I would suggest setting this no higher than 60, and if all your users are ever going to be doing is watching video, I’d reduce it further to 45fps.

You are observing the user experience at the end of the remote protocol though, not the experience in the VM, so you’re seeing the effect of the protocol.

YouTube should not drive the GPU to 70%, so look elsewhere. I’d also point out that a VM’s GPU utilisation cannot be measured accurately when it’s configured for vGPU, as we don’t provide any in-VM metrics. I’d make sure you have some form of ad blocking enabled in your browser too, btw.

Your CPUs are below the speed I would consider suitable for a good user experience at 30+ fps in a CPU encoded system. I would typically give 2.8GHz as a minimum clock speed (base, not turbo).

Have you measured the frames out that are being transmitted by PCoIP?

Be aware that when Blast Extreme runs in multiple monitor configurations it does not use H.264 or GPU encode and falls back to JPG/PNG. This will have an effect on the performance.

It’s possible that the CPU speed in your VM is affecting the user experience, as it’s not able to keep up with encoding PCoIP at the frame rate you want. Your client is certainly capable of decoding quickly enough, so that shouldn’t be an issue.

Could you please advise how I can measure the frames out that are being transmitted by PCoIP?
About the CPUs: is the frequency that important? Can I compensate for it by assigning more vCPUs to the VM?

Monitor it using the WMI metrics. This is an old article from View 5, but it will set you on the right path:

http://www.virtualizetips.com/2011/12/14/how-to-monitor-pcoip-performance-in-view-5-with-wmi-counters/
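As a rough sketch of one way to sample the encoder frame rate inside the VM, the script below shells out to the Windows `typeperf` tool and averages the results. The counter path is an assumption based on the "Imaging Encoded Frames/sec" counter mentioned later in this thread; check the exact object and counter names on your own VM (for example in Perfmon or with `typeperf -q`) before relying on it. The function names are made up for illustration.

```python
import csv
import io
import subprocess

# Assumed counter path; verify the object and counter names on your own VM.
COUNTER = r"\PCoIP Session Imaging Statistics\Imaging Encoded Frames/sec"


def parse_typeperf_csv(text):
    """Return the numeric samples from the first counter column of typeperf CSV output."""
    samples = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # blank lines and status messages have fewer than two fields
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # header row, or a sample with no value
    return samples


def average(samples):
    """Mean of the samples, or 0.0 when nothing was collected."""
    return sum(samples) / len(samples) if samples else 0.0


def sample_counter(counter=COUNTER, interval_s=1, count=10):
    """Run typeperf inside the VM (Windows only) and return the collected samples."""
    cmd = ["typeperf", counter, "-si", str(interval_s), "-sc", str(count)]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_typeperf_csv(result.stdout)


if __name__ == "__main__":
    fps = sample_counter()
    print("average encoded frames/sec:", average(fps))
```

Run it inside the VM while a video is playing in the PCoIP session, once with a single display connected and once with all monitors, and compare the averages.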

Frequency is hugely important, and adding more vCPU isn’t going to make any difference to the encoder performance once it has one vCPU to itself.

Make sure that your ESX power management settings have been changed from the default of "Balanced" to "High Performance" (per the VMware Knowledge Base). Per Jason’s comment, the pcoip_server_win32.exe process should be using less than 25% of a 4 vCPU VM. If it is close to 25%, that may be limiting the "Imaging Encoded Frames/sec" shown in Perfmon. However, the CPU you have should be able to handle YouTube videos at 30fps without issue. Are you running them at native resolution or scaling them to full screen? If you need to support lots of pixel changes on these three monitors, you should consider using the PCoIP Hardware Accelerator card, which offloads the PCoIP encoder from the CPU.

You should also consider getting a quad monitor PCoIP Zero Client, as the i5 client you have probably can’t decode that many displays simultaneously either. The symptom for that: if the Encoded FPS is <30fps but the pcoip_server_win32.exe process is below 25%, then the client is probably the limit.
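The rule of thumb from the last two paragraphs can be written down as a small sketch. It takes the encoded FPS and the CPU share of pcoip_server_win32.exe as measured in the VM. Generalising "25% of a 4 vCPU VM" to "one vCPU's share" and the `margin_pct` tolerance for "close to" are assumptions added for illustration, not values from Teradici or VMware.

```python
def diagnose(encoded_fps, encoder_cpu_pct, vcpus=4, target_fps=30, margin_pct=3.0):
    """Guess the likely bottleneck from two measurements taken inside the VM.

    encoded_fps     -- "Imaging Encoded Frames/sec" from Perfmon
    encoder_cpu_pct -- CPU use of pcoip_server_win32.exe as a % of the whole VM
    margin_pct      -- hypothetical tolerance for "close to" one vCPU's share
    """
    one_vcpu_pct = 100.0 / vcpus  # 25% on a 4 vCPU VM
    if encoded_fps >= target_fps:
        return "ok"
    if encoder_cpu_pct >= one_vcpu_pct - margin_pct:
        # The encoder is using nearly a full vCPU: the server side is the limit.
        return "server-encoder"
    # The encoder has headroom but frames are still low: suspect the client.
    return "client"


if __name__ == "__main__":
    print(diagnose(encoded_fps=18, encoder_cpu_pct=24))  # server-encoder
    print(diagnose(encoded_fps=18, encoder_cpu_pct=10))  # client
```

Treat the result as a starting point for where to look first rather than a definitive answer.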

I also agree with Jason that setting pcoip.maximum_frame_rate to 30 (or no more than 45) is the prudent thing to do, since the source content is 24-30fps. However, if you want to set the frame rate above 30, you also need to set the registry key VMware SVGA DevTap\maxAppFrameRate, which limits the sample rate of the vGPU.
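A simplified way to think about how these limits interact: the frame rate delivered can be no higher than the lowest of the content frame rate, pcoip.maximum_frame_rate and maxAppFrameRate. The sketch below only illustrates that reasoning. It assumes the sample rate cap is 30 unless changed, as the comment above implies, and it ignores encoder, network and client limits.

```python
def effective_fps(content_fps, pcoip_max_fps, app_sample_cap=30):
    """Upper bound on delivered frame rate under the three caps discussed above.

    app_sample_cap stands in for maxAppFrameRate; the default of 30 is an
    assumption taken from this thread, not a documented value.
    """
    return min(content_fps, pcoip_max_fps, app_sample_cap)


if __name__ == "__main__":
    # 30fps video with pcoip.maximum_frame_rate at 120: the extra headroom is unused.
    print(effective_fps(30, 120))      # 30
    # 60fps content is still held to 30 until the sample rate cap is raised too.
    print(effective_fps(60, 120))      # 30
    print(effective_fps(60, 60, 60))   # 60
```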