vGPU + Cloud Gaming

rea-developer · January 6, 2024, 5:15pm

Hello, friends. I’m just an ordinary person who decided to set up a home cloud gaming system for my family.

Here are the steps I took:

I bought an RTX A6000 card and divided it into 3 profiles of 16-Q each
I licensed it with vWDS licenses for each profile
I extended each profile to a guest machine

I started testing in 3d mark, and the performance I’m getting is just laughable.

The card has 10762 CUDA cores. If I divide it into 3 profiles in fixed share mode, on average, I would get 3500 CUDA cores, which is comparable to a 3060.

My CPU is an Intel Xeon 8358, which I also divided into 3 parts of 8vCPU 16 threads each.

In 3d mark, I get a score of 7000 at 1080p, while a similar 3060 scores 14000. I’m in tears, and so are my children :)

Could someone please tell me what I’m doing wrong? Can it be said that I’m calculating correctly and that one 16-Q profile should replace a 3060?

Please advise me, maybe there are some great guides on how to properly set up

NVIDIA-SMI for such gaming tasks
How to properly configure the XML config for a VM (Windows), also for gaming.

I have another oddity, the GPU utilization jumps from 0 to 100 and back in a zigzag pattern.

What can I use on guest Windows to measure performance? MSI Afterburner doesn’t work with vGPU.

Please don’t delete this post and I ask for help from everyone who knows how to help. Experts, please respond)

MarkusHoHo · January 8, 2024, 1:11pm

Hi @rea-developer,

no worries, we are not deleting your post. But I would like to move it the the Virtual GPU category if that is ok. Because I am certain there you will have a much better chance of getting the attention of experts.

I hope that is ok!

Thanks.

rea-developer · January 8, 2024, 2:54pm

Thanks a lot! Sure, may i ask you move this post?

sschaber · January 8, 2024, 3:07pm

Hi, first of all I would set the scheduler back to best effort. Afterwards you should remove the FRL which is set to 60 FPS per default.
Then you can run your benchmark again and post the results.

rea-developer · January 8, 2024, 3:35pm

Hello, thank you for the advice. I switched from best effort to fixed share because dividing the card into 4 parts of 12-Q (12GB) in best effort mode really makes everything good, but if 3 people start heavy games (first profile - Cyberpunk 2077, second profile RDR2, third FarCry 6, fourth Dota 2), then everything immediately drops to 28-30 FPS. If there is a fixed share, it guarantees that at least some games will run at 60+ FPS, except for Cyberpunk and FarCry RDR2, they immediately drop to 30FPS :(

You also said to delete FRL, after rebooting the virtual machine, this flag is set again and the machine works again at 60fps… I don’t know what can be done here to completely remove such a thing.

Maybe I need to flexibly configure round-robin?

https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#changing-vgpu-scheduling-policy-nvidia-smi

I don’t have experience in terms of which is best to use. Do you know how to better adapt the scheduler for my tasks?

I came across an old T4 card yesterday, which was set as passthrough. It has 16GB and 2560 shaders. My profile has the same 2600 shaders in fixed share. But on the T4, Cyberpunk 2077 was running at 60 FPS, and I was shocked. Are vGPU profiles significantly weaker in performance?

Also, I saw an old NVIDIA Quadro RTX 6000 (4k shaders). I entered a part of its profile and everything worked well in CyberPunk 2077 there too. It was with vgpu …

What should I do in this case? How can I completely disable FRL so that it never comes back? I read that it’s just a test flag and should not be touched.

rea-developer · January 8, 2024, 3:44pm

In my opinion, even when best effort and FRL 60 fps are enabled, running 3 heavy games causes a drop to 35 fps, and they also drag down a less demanding game, which starts to suffer as well.

Maybe fixed share better than best effort ?

sschaber · January 8, 2024, 3:48pm

You cannot compare the number of CUDA cores between different GPU generations.

To be honest I don’t see a solution to your use case (expectation). You simply won’t be able to run multiple AAA games in parallel on a single GPU with constant 60 fps if you can easily fully leverage the A6000 with a single session.
vGPU is designed for professional applications where you don’t need 100% GPU performance on all sessions at the same time slot (1ms), so the best effort scheduler is very effective. This won’t work for your use case so you may end up with only 2 parallel sessions where you can guarantee the 60fps for AAA games.

Disable FRL should be permanent. Which hypervisor do you use?

rea-developer · January 8, 2024, 3:51pm

I use KVM/QEMU

sschaber · January 8, 2024, 3:55pm

Really depends. Equal should be used instead of fixed here. Equal or fixed guarantees the time slice. But equal depends on the number of running VMs so is much better for you.
For professional use, 98% of our customers are running best effort. Indeed, it might help to run with equal and tweak the time slice interval.

rea-developer · January 8, 2024, 3:59pm

Thanks a lot! But what about FRL? May I ask you, which command should I use to disable FRL in KVM/QEMU?

rea-developer · January 8, 2024, 4:23pm

"Could you please advise if there is a possibility to get not the 537 drivers but the latest 545 ones? Or do you need to have special support from Nvidia Cloud Gaming to use such things?

sschaber · January 8, 2024, 5:32pm

Unfortunately. I’m not very familiar with KVM. You may create a support ticket with our NVES to ask for the proper way to disable FRL permanently on KVM although it is not necessary at all when using equal or fixed share as you don’t have FRL active there.
Indeed, as vGPU customer you can only rely on the branch drivers available in the portal. Cloud gaming is a different story. Our cadence for vGPU drivers is much slower compared to cloud gaming but cloud gaming is nothing you can buy easily.

rea-developer · January 8, 2024, 6:07pm

Oh, that’s sad to hear that the drivers cut the performance. Could I find out approximately by how much? I’m not asking for an exact number, but is it 10-20%? What’s the range? I would be grateful for an answer, it would really help me a lot.

sschaber · January 8, 2024, 6:16pm

Sorry for being unprecise. There shouldn’t be a real difference in performance. I just meant the driver release cadence is different. Gaming drivers are released very often whereas professional driver branches are released less frequent.
For sure it can happen with new games that the optimization is only present in the latest gaming driver that is not available for the professional branch yet. The next vGPU major release will contain the latest driver branch and will be released soon.

You can even change the profile settings in the professional driver so that it should work identical to the gaming driver:

Topic		Replies	Views
Which NVIDIA GPUs are more suitable for high-performance computing? CUDA Programming and Performance	33	2096	November 13, 2024
vGPU for AutoCAD/RDSH questions General Discussion	17	12416	December 10, 2020
Putting best foot forward General Discussion	3	1975	February 20, 2020
Grid newbee, want to add CAD VDI to serveur. Need help General Discussion	16	13848	June 1, 2017
What software to use for our new single NVIDIA T4 Tesla card on VMware 6.7 ESXi Host General Discussion	14	10081	August 17, 2020
18 GPUs in a single rig and it works CUDA Setup and Installation	6	14316	August 10, 2014
Suitable GRID NVIDIA Virtual GPU Technology	14	14922	June 14, 2017
VRAM Allocation Issues Linux	83	20798	April 26, 2025
NVIDIA GRID VGPU support does not match desktop setting + Esxi console blank General Discussion	20	24417	June 15, 2017
vGPU Utilization Per VM NVIDIA Virtual GPU Technology	22	39507	August 25, 2016

vGPU + Cloud Gaming

Related topics