Expected performance gain

felix.don · October 31, 2024, 12:07pm

Hi Devs))

I’m trying to compare the performance of FFmpeg built with NVDEC support against using NVDEC directly through CUDA.
One difference I’ve noticed is that the direct NVDEC approach eliminates the device-to-host data transfer, which is costly.
Another difference is that in the FFmpeg approach ‘cuvidParseVideoData’ and ‘cuvidMapVideoFrame’ are executed in the same thread (contrary to what the documentation recommends, since ‘cuvidMapVideoFrame’ blocks execution).
So I was expecting a much larger performance gain with the CUDA approach.
What I actually see is rendering that is approximately 2 times faster.
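
To make the threading point concrete, here is a minimal sketch of the split I mean: one thread feeds the parser and another maps frames, with a queue in between. It contains no real NVDEC calls. `DecodedPicture`, `PictureQueue` and `run_pipeline` are hypothetical names used only for illustration, and the comments mark where ‘cuvidParseVideoData’ and ‘cuvidMapVideoFrame’ would go in real code.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for what a display callback would hand over.
struct DecodedPicture {
    int picture_index;
};

// Simple blocking queue shared by the parser thread and the mapping thread.
class PictureQueue {
public:
    void push(DecodedPicture p) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(p);
        }
        ready_.notify_one();
    }

    // Signals that no more pictures will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a picture is available.
    // Returns false once the queue is closed and fully drained.
    bool pop(DecodedPicture& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) {
            return false;
        }
        out = queue_.front();
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<DecodedPicture> queue_;
    bool closed_ = false;
};

// Runs the two-thread pipeline and returns the picture indices
// in the order the mapping thread handled them.
std::vector<int> run_pipeline(int frame_count) {
    PictureQueue queue;
    std::vector<int> mapped;

    std::thread parser([&] {
        for (int i = 0; i < frame_count; ++i) {
            // Real code: cuvidParseVideoData(...) here; its display
            // callback would push the picture index onto the queue.
            queue.push(DecodedPicture{i});
        }
        queue.close();
    });

    std::thread mapper([&] {
        DecodedPicture pic{};
        while (queue.pop(pic)) {
            // Real code: cuvidMapVideoFrame(...), post-processing,
            // then cuvidUnmapVideoFrame(...). Blocking here does not
            // stall the parser thread.
            mapped.push_back(pic.picture_index);
        }
    });

    parser.join();
    mapper.join();
    return mapped;
}
```

Calling `run_pipeline(100)` hands 100 placeholder pictures from one thread to the other in order. The point of the design is only that a blocking map call in the second thread no longer holds up parsing in the first.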

Differences I observe:

GPU frame duration with the CUDA approach is ~4 times shorter.
The latency of CUDA API calls (such as the call to ConvertNV12BLtoNV12) is an order of magnitude higher with the CUDA approach. Why is that?
cuStreamSynchronize takes much longer with the CUDA approach.
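
A rough way to relate the ~4x GPU figure to the ~2x overall figure, purely as a back-of-the-envelope model: assume each frame's time splits into a GPU part G and everything else R, that the two do not overlap, and that only G changes between the two approaches. These are assumptions, not something measured in the reports, and the higher API call latency suggests R is in fact not constant.

```latex
S = \frac{G + R}{G/4 + R} = 2
\;\Longrightarrow\; G + R = \frac{G}{2} + 2R
\;\Longrightarrow\; G = 2R
```

Under that model the GPU work would be about two thirds of the original frame time, and the non-GPU part would be about two thirds of the new frame time (R out of G/4 + R = 1.5R), so the CPU-side latency and synchronization would now dominate.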

I’ve attached two Nsight reports to demonstrate this:

compare_reports.zip (908.4 KB)
