GPU transcoding performance comparison using FFmpeg 4 and 5 branches

marcomay · October 19, 2023, 2:01pm

Hi,

I recently ran some tests transcoding MP4 files using FFmpeg with GPU acceleration.

I noticed a drop in performance with the recent FFmpeg 5.1 release compared with FFmpeg 4.3/4.4 when doing a simple full hardware transcode using h264_nvenc.

Has anyone seen comparable issues? Any hints on how to diagnose this?

Are there benchmark results for GPU encoding/transcoding that I could compare against?

Test

The old FFmpeg v4.4.4 build is called with these parameters:

ffmpeg -y -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -c:v h264_nvenc -b:v 2M output.mp4

The new FFmpeg v5.1.3 build is called with these parameters:

ffmpeg -y -hwaccel cuda -hwaccel_output_format cuda -extra_hw_frames 8 -i input.mp4 -c:v h264_nvenc -b:v 2M output.mp4

To measure the difference, I ran both commands as time ffmpeg -y -benchmark ...

Results

For FFmpeg 4.4.4:

frame=32782 fps=344 q=19.0 Lsize=  343834kB time=00:21:51.32 bitrate=2148.0kbits/s speed=13.8x
video:322022kB audio:20521kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.377177%
bench: utime=30.184s stime=21.803s rtime=95.442s
bench: maxrss=135840kB
[aac @ 0x5601c60686c0] Qavg: 222.746

real    1m35.543s
user    0m30.184s
sys     0m21.845s

For FFmpeg 5.1.2:

frame=32782 fps=241 q=19.0 Lsize=  343956kB time=00:21:51.34 bitrate=2148.7kbits/s speed=9.63x
video:322022kB audio:20643kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.377043%
bench: utime=66.753s stime=27.008s rtime=136.567s
bench: maxrss=161032kB
[aac @ 0x55d7f93bd3c0] Qavg: 579.972

real    2m16.669s
user    1m6.777s
sys     0m27.032s
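
To put the two runs side by side, here is a rough sketch that pulls the timings out of the two bench: lines above and prints the 5.1 / 4.4.4 ratios. The helper names bench_field and ratio are made up for this post (they are not part of FFmpeg), and the fps values are copied by hand from the frame= lines.

```shell
# Hypothetical helper (not part of FFmpeg): extract one timing field from
# a "bench: utime=...s stime=...s rtime=...s" line printed by -benchmark.
bench_field() {
    # $1 = field name (utime, stime or rtime), $2 = the bench line
    printf '%s\n' "$2" | sed -n "s/.*$1=\([0-9.]*\)s.*/\1/p"
}

# Hypothetical helper: ratio of two numbers, rounded to two decimals.
# awk is used because POSIX sh has no floating-point arithmetic.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# bench lines copied from the two runs above
old='bench: utime=30.184s stime=21.803s rtime=95.442s'
new='bench: utime=66.753s stime=27.008s rtime=136.567s'

for field in utime stime rtime; do
    printf '%s: 5.1 / 4.4.4 = %s\n' "$field" \
        "$(ratio "$(bench_field "$field" "$new")" "$(bench_field "$field" "$old")")"
done

# fps values taken from the frame= lines (241 for 5.1, 344 for 4.4.4)
printf 'fps: 5.1 / 4.4.4 = %s\n' "$(ratio 241 344)"
```

With the numbers above this gives roughly 1.43x the wall-clock time (rtime) and 2.21x the user CPU time (utime) on 5.1, at about 0.70 of the 4.4.4 frame rate. The user CPU time grows much more than the wall-clock time does.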

Setup

The input.mp4 was an MP4 container with a video track of ~21 min, H.264 (High@L3.1), 720p, 25 fps, 892 kbps bitrate, yuv420p, and an audio track: AAC, 128 kbps.

The FFmpeg builds were created in a Docker container using nvidia/cuda:12.0.1-devel-ubuntu22.04 as the build image and nvidia/cuda:12.0.1-base-ubuntu22.04 with libnpp-12-0 installed as the runtime image.

The nv-codec-headers came from this tag: https://github.com/FFmpeg/nv-codec-headers/archive/refs/tags/n12.0.16.0.tar.gz.

FFmpeg was built with the following configuration:

RUN cd /tmp/ffmpeg-${FFMPEG_VERSION} && \
    ./configure \
    --prefix=${PREFIX} \
    --disable-debug \
    --disable-doc \
    --disable-ffplay \
    --enable-version3 \
    --enable-gpl \
    --enable-nonfree \
    --enable-small \
    --enable-libfdk-aac \
    --enable-openssl \
    --enable-cuda \
    --enable-cuvid \
    --enable-nvenc \
    --enable-libnpp \
    --enable-shared \
    --extra-cflags="-I${PREFIX}/include -I${PREFIX}/include/ffnvcodec -I/usr/local/cuda/include/" \
    --extra-ldflags="-L${PREFIX}/lib -L/usr/local/cuda/lib64/" \
    --extra-libs=-ldl  && \
    make && \
    make install && \
    make distclean && \
    hash -r

Hardware: NVIDIA GeForce RTX 2080, Driver Version: 536.23, Docker 4.24.1 on Windows 10.

Thanks in advance for your comments and feedback on this issue.

val.zapod.vz · November 29, 2023, 8:27pm

Where is -hwaccel_output_format cuda -i .\test.mp4?

It will be slow otherwise.
