Video Encode and Decode GPU Support Matrix

Greetings,

The latest changes have been published.

Hi,
While the NVENC support matrix breaks H.264 down into several flavors (YUV 4:2:0, YUV 4:4:4), the NVDEC matrix has no such breakdown, just H.264.
I’m asking because we failed to decode H.264 YUV 4:4:4 on several GPU models (with the “no supported on GPU” error), even though all of those GPUs are marked as supporting H.264.
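
When reproducing a failure like this, it can help to first confirm which chroma format the input actually uses. The following is a minimal sketch, assuming `ffprobe` is installed and on the PATH; the helper names `probe_pix_fmt` and `chroma_of` and the small lookup table are illustrative, not part of any NVIDIA or FFmpeg API. It only inspects the file and does not query what the GPU can decode (the Video Codec SDK exposes `cuvidGetDecoderCaps` for that).

```python
import subprocess
import sys

# Common ffmpeg pixel-format names mapped to chroma subsampling.
# Illustrative and deliberately not exhaustive.
CHROMA_BY_PIX_FMT = {
    "yuv420p": "4:2:0",
    "yuvj420p": "4:2:0",
    "nv12": "4:2:0",
    "yuv420p10le": "4:2:0",
    "yuv422p": "4:2:2",
    "yuv422p10le": "4:2:2",
    "yuv444p": "4:4:4",
    "yuv444p10le": "4:4:4",
}


def chroma_of(pix_fmt: str) -> str:
    """Return the chroma subsampling for a pixel-format name, or 'unknown'."""
    return CHROMA_BY_PIX_FMT.get(pix_fmt.strip(), "unknown")


def probe_pix_fmt(path: str) -> str:
    """Ask ffprobe for the pixel format of the first video stream."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "v:0",
            "-show_entries", "stream=pix_fmt",
            "-of", "default=noprint_wrappers=1:nokey=1",
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__" and len(sys.argv) > 1:
    fmt = probe_pix_fmt(sys.argv[1])
    print(f"{sys.argv[1]}: pix_fmt={fmt}, chroma={chroma_of(fmt)}")
```

If the script reports 4:4:4 for the failing files and 4:2:0 for the working ones, that would support the idea that the single "H.264" column in the NVDEC matrix does not cover every chroma format.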

Any thoughts?

Thanks,
Eyal


I got an RTX 2080 to replace a failed GTX 1070 Ti. What the support matrix says about Turing is completely false. Please either present a test case in which your claim that “The video encoder in Turing GPUs has substantially improved quality and performance compared with Pascal. The overall encoding capacity of one NVENC in Turing is comparable to two NVENC’s in Pascal.” holds (especially the bold statements), or remove it.

I tested with the following sample ffmpeg command:

ffmpeg-4.0.3 -copyts -start_at_zero -dn -sn -loglevel error -stats -hwaccel_device 0 -hwaccel cuvid -c:v h264_cuvid -deint adaptive -surfaces 12 -f mpegts -i input_1080i.ts -aspect 16:9 -flags:v +cgop -vcodec h264_nvenc -preset slow -g 400 -refs 4 -bf 3 -qmin 20 -qmax 29 -rc-lookahead 40 -b_adapt 1 -strict_gop 1 -temporal-aq 1 -forced-idr 1 -b_ref_mode middle -profile:v high -level 4.2 -c:a libfdk_aac -ac 2 -b:a 128k -r 50 -f mpegts -y /dev/null

Performance on the RTX 2080 is 225 fps with 1 encode and 115+115 fps with 2 encodes (230 fps total), while performance on the GTX 1070 Ti is 331 fps with 1 encode and 262+262 fps with 2 encodes (524 fps total).

So with two encodes, GTX 1070 Ti is 128% faster and with 1 encode, GTX 1070 Ti is 47% faster.
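
The percentages can be re-derived from the fps figures quoted in this post. This is a small sketch for checking the arithmetic only; the function name `percent_faster` is made up for illustration, and the numbers are the ones reported here for `-preset slow`.

```python
def percent_faster(fast_fps: float, slow_fps: float) -> float:
    """Return how much faster fast_fps is than slow_fps, in percent."""
    return (fast_fps / slow_fps - 1.0) * 100.0


# Aggregate fps reported in the post for -preset slow.
results = {
    "1 encode": {"GTX 1070 Ti": 331, "RTX 2080": 225},
    "2 encodes": {"GTX 1070 Ti": 262 + 262, "RTX 2080": 115 + 115},
}

for label, fps in results.items():
    gain = percent_faster(fps["GTX 1070 Ti"], fps["RTX 2080"])
    print(f"{label}: GTX 1070 Ti is {gain:.0f}% faster")
# prints:
# 1 encode: GTX 1070 Ti is 47% faster
# 2 encodes: GTX 1070 Ti is 128% faster
```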

Please give a sample command where your statement about Turing NVENC performance holds true.

Edit: I see different performance with -preset medium: 392 fps for Turing with 1 encode, 201+201=402 fps for Turing with 2 encodes, 342 fps for Pascal with 1 encode, and 268+268=536 fps for Pascal with 2 encodes. With this preset Turing is ahead with 1 encode, while Pascal is 33% faster with 2 encodes, so Turing does much better here than with preset slow. So why does performance drop so much when 2-pass encoding (preset slow) is used? Can we expect a fix for that?
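
Another way to read these numbers is to ask how much total throughput each card gains when a second encode runs alongside the first. The sketch below uses only the figures reported in this post; `scaling_factor` is a hypothetical helper name, and a value near 1.0 means the second session adds almost no aggregate throughput.

```python
def scaling_factor(single_fps: float, per_stream_fps_with_two: float) -> float:
    """Aggregate fps with two concurrent encodes divided by fps with one."""
    return (2 * per_stream_fps_with_two) / single_fps


# (fps with 1 encode, fps per stream with 2 encodes), as reported in the post.
measurements = {
    ("RTX 2080", "slow"): (225, 115),
    ("GTX 1070 Ti", "slow"): (331, 262),
    ("RTX 2080", "medium"): (392, 201),
    ("GTX 1070 Ti", "medium"): (342, 268),
}

for (gpu, preset), (one, two) in measurements.items():
    factor = scaling_factor(one, two)
    print(f"{gpu:12s} preset {preset:6s}: x{factor:.2f} with 2 encodes")
```

On these figures the RTX 2080 stays close to x1.0 under both presets, while the GTX 1070 Ti gains roughly half again, which is the gap the question above is about.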

Edit 2: In case you ask, I ran the tests with the latest 415.18 Linux driver.

Is it correct that the Quadro RTX 4000 uses the TU104 GPU and not the TU106?

Hello @EwoutH,

Yes, you are correct. The Quadro RTX 4000 uses the TU104 GPU.

Hi, I’m interested in GPU transcoding… My question is: Nvidia Quadro 4000 or 1080 Ti 11 GB? Which card performs better?

Hi Jezda06,

You will get better performance from the 1080 Ti. Check out these independent test stats.

https://gpu.userbenchmark.com/Compare/Nvidia-Quadro-4000-vs-Nvidia-GTX-1080-Ti/m7693vs3918

Can you confirm that the Quadro RTX 4000 has two NVDEC chips?

Are there any numbers published for NVDEC performance between different generations or cards?

Thanks

Hi Ryandbair,

The documentation states there are 2 NVDEC chips in the Quadro RTX 5000/RTX 4000. You can find some performance charts here: https://developer.nvidia.com/nvidia-video-codec-sdk

Best,
Tom

Could you add the new GTX 1660 Ti (based on the TU116 GPU) to the Video Encode and Decode GPU Support Matrix?

Hi EwoutH,

Thanks for bringing this up. I have sent a request to the documentation team to update the matrix. I will post back here when the document lists the new card.

Best,
Tom

Thanks a lot! Could you also send a request for the TU117 (GTX 1650)?

Hello,

The GTX 1650 and 1660 Ti have been added to the matrix.

https://developer.nvidia.com/video-encode-decode-gpu-support-matrix

Cheers,
Tom

Thanks!

In the Video Codec SDK 9.0.18 there was a note saying that the TU117 has the same NVENC unit as the Volta GPU (GV100). This note was removed in SDK 9.0.20, but the NVENC Support Matrix now states that the TU117 has no HEVC B-frame support.

Is it correct that the TU117 has an NVENC unit from the Volta generation? If so, can this be added back to the SDK documentation?

And is the TU117 NVDEC unit identical to the other Turing NVDEC units?

Can someone from Nvidia add the Quadro RTX 3000, T2000 and T1000 to the NVENC table?

Thanks! Can you also add the new mobile Quadro cards? (Quadro RTX 5000, RTX 4000, RTX 3000, T2000, T1000, P620 and P520)

Hello,

We are working to add more GPUs to the matrix. Watch for an update soon.

Hello,

The Video Encode and Decode GPU Support Matrix has been updated.

https://developer.nvidia.com/video-encode-decode-gpu-support-matrix

Hello ThomasK,

Thanks for the update to the GPU Support Matrix. Can you please tell us how many streams we can decode simultaneously on one GPU card (Tesla V100 16 GB)?

Thanks,
Anand S

@anandnatarajan.s,

36 streams: https://images.nvidia.com/content/pdf/vgpu/data-sheet/nvidia-v100.pdf