Why does the video_decode sample show a significant performance drop in Jetpack 5.1?

Here are the details of the benchmark. (Can anyone reproduce it on Orin?)

Devices (MAXN, jetson_clocks):

  • AGX Jetpack 5.1 L4T 35.2.1
  • Nano Jetpack 4.6.3 L4T 32.7.3


  • nvv4l2dec: ffmpeg -y -benchmark -c:v hevc_nvv4l2dec -i $input -f null -
  • gstreamer: gst-launch-1.0 filesrc location=$input ! h265parse ! nvv4l2decoder enable-max-performance=1 ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v
  • 00_video_decode: video_decode H265 --disable-rendering --stats $input
  • jetson_ffmpeg: ffmpeg -y -benchmark -c:v hevc_nvmpi -i $input -f null -


Stream #0:0: Video: hevc (Main), yuv420p(tv), 3840x2160, 23.98 fps, 23.98 tbr, 1200k tbn, 23.98 tbc
                 AGX (fps)   Nano (fps)
00_video_decode  32.7        97.96
gstreamer        232.61      97.71
nvv4l2dec        24          63
nv_mpi           28          70
ffmpeg cpu       25          7.3


Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 3840x2160, 25 fps, 25 tbr, 1200k tbn, 50 tbc
                 AGX (fps)   Nano (fps)
00_video_decode  25.7        88.10
gstreamer        132.84      87.29
nvv4l2dec        25          63
nv_mpi           23          69
ffmpeg cpu       64          19


Stream #0:0: Video: h264 (Main), yuv420p(progressive), 1280x544, 24.08 fps, 23.98 tbr, 1200k tbn, 47.95 tb
                 AGX (fps)   Nano (fps)
00_video_decode  71.18       770.58
gstreamer        733.70      764.39
nvv4l2dec        47          409
nv_mpi           46          490
ffmpeg cpu       676         270

For the same 4K HEVC video, gstreamer with nvv4l2decoder reaches 232.61 fps on AGX and 97.71 fps on Nano, which is as expected. The 00_video_decode sample, however, reaches only 32.7 fps on AGX while still reaching 97.96 fps on Nano!
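To put a number on the gap, the gstreamer and 00_video_decode results on AGX can be compared as a ratio. This is only a small sketch of the arithmetic using the figures from the tables above; the `ratio` helper name is made up for illustration, and `LC_ALL=C` is set so that awk reads and prints decimals with a dot regardless of the system locale.

```shell
# Hypothetical helper: print A / B rounded to two decimals.
ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# gstreamer fps divided by 00_video_decode fps on AGX, per test stream
ratio 232.61 32.7    # 4K HEVC      -> 7.11
ratio 132.84 25.7    # 4K H.264     -> 5.17
ratio 733.70 71.18   # 1280x544 H.264 -> 10.31
```

So on AGX the sample runs roughly 5x to 10x slower than the gstreamer pipeline on the same files, while on Nano the two are within about 1% of each other.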

Version of MMAPI:

$ apt show `dpkg -S /usr/src/jetson_multimedia_api | cut -d ':' -f1`
Package: nvidia-l4t-jetson-multimedia-api
Version: 35.2.1-20230124153320
Priority: standard
Section: Utils
Maintainer: NVIDIA Corporation
Installed-Size: 96.4 MB
Pre-Depends: nvidia-l4t-core (>> 35.2-0), nvidia-l4t-core (<< 35.3-0)
Depends: cuda-cudart-11-4, cuda-cudart-dev-11-4, libc6-dev, libglvnd-dev, libx11-dev, nvidia-l4t-camera (= 35.2.1-20230124153320), nvidia-l4t-multimedia (= 35.2.1-20230124153320), nvidia-l4t-multimedia-utils (= 35.2.1-20230124153320)
Homepage: http://developer.nvidia.com/jetson
Download-Size: 75.3 MB
APT-Manual-Installed: no
APT-Sources: https://repo.download.nvidia.com/jetson/common r35.2/main arm64 Packages
Description: NVIDIA Jetson Multimedia API is a collection of lower-level APIs that support flexible application development.

To clarify, are you comparing AGX Orin with Jetson Nano for 4K decoding through 00_video_decode and seeing unexpected performance?

Sorry for the misleading information. I don’t have an AGX Orin. "AGX" refers to AGX Xavier. I’m sure it’ll reproduce on Orin.


FWIW, I am experiencing a similar performance issue on AGX Orin.

Notably, when I drop enable-max-performance=1 in gstreamer, it performs at the same level as:

  • nvv4l2dec from Nvidia’s FFmpeg build
  • nvmpi of your variant of jetson_ffmpeg

I am testing on 4K H.264.

Raw results

Nvidia Jetson FFmpeg build (decoding only)

ffmpeg -y -benchmark -c:v h264_nvv4l2dec -i ~/Downloads/iphone6s_4k.mov -f null -

# ...

frame=  540 fps= 73 q=-0.0 Lsize=N/A time=00:00:18.55 bitrate=N/A speed=2.52x 

jetson-ffmpeg mpi build (ported to new API)

./ffmpeg -y -benchmark -c:v h264_nvmpi -i ~/Downloads/iphone6s_4k.mov -f null -

# ...
frame=  549 fps= 78 q=-0.0 Lsize=N/A time=00:00:18.55 bitrate=N/A speed=2.65x 

Nvidia Jetson FFmpeg build without specifying hardware

ffmpeg -y -benchmark -c:v h264 -i ~/Downloads/iphone6s_4k.mov -f null -

# ...

frame=  556 fps=127 q=-0.0 Lsize=N/A time=00:00:18.55 bitrate=N/A speed=4.24x

# faster!

Jetson system FFmpeg (not Nvidia build), software decoding

ffmpeg -y -benchmark -c:v h264 -i ~/Downloads/iphone6s_4k.mov -f null -

# ...

frame=  556 fps=128 q=-0.0 Lsize=N/A time=00:00:18.55 bitrate=N/A speed=4.26x 

gstreamer with hardware decoder

gst-launch-1.0 filesrc location=$input ! qtdemux ! h264parse ! nvv4l2decoder enable-max-performance=1 ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v

# ...

/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 542, dropped: 0, current: 119,99, average: 119,32

gstreamer with hardware decoder, not forcing performance

gst-launch-1.0 filesrc location=$input ! qtdemux ! h264parse ! nvv4l2decoder  ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v

# ...

/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 548, dropped: 0, current: 72,94, average: 73,98

My 2017 laptop without specifying hardware

ffmpeg -y -benchmark -c:v h264 -i ~/Downloads/iphone6s_4k.mov -f null -

# ...

frame=  556 fps= 86 q=-0.0 Lsize=N/A time=00:00:18.55 bitrate=N/A speed=2.88x

My 2017 laptop with NVDEC

ffmpeg -y -benchmark -hwaccel cuda  -i ~/Downloads/iphone6s_4k.mov -f null -

# ...

frame=  556 fps=174 q=-0.0 Lsize=N/A time=00:00:18.55 bitrate=N/A speed=5.81x 
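For anyone repeating these runs, the fps figures can be pulled out of saved logs rather than read off by eye. The following is a rough sketch that assumes the two line formats shown in the raw results above (ffmpeg's "frame= ... fps= ..." progress line and fpsdisplaysink's "last-message" line); the function names `ffmpeg_fps` and `gst_avg_fps` are made up for illustration.

```shell
# Hypothetical helper: print the fps value from the last ffmpeg progress line on stdin.
ffmpeg_fps() {
  # ffmpeg separates progress updates with carriage returns, so split on them first.
  tr '\r' '\n' | sed -n 's/^frame=.* fps= *\([0-9.][0-9.]*\) .*/\1/p' | tail -n 1
}

# Hypothetical helper: print the average fps from the last fpsdisplaysink message on stdin.
# Accepts either "," or "." as the decimal separator and always prints ".".
gst_avg_fps() {
  sed -n 's/.*average: \([0-9]*\)[.,]\([0-9]*\).*/\1.\2/p' | tail -n 1
}
```

Usage would look like `ffmpeg ... 2>&1 | ffmpeg_fps` or `gst-launch-1.0 ... -v | gst_avg_fps`. The comma handling matters here because the gstreamer output above was produced under a locale that prints "119,32" rather than "119.32".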

I confirm the loss of performance with Jetpack 5.1 (L4T 35.2.1) on AGX Orin 32 GB.

My previous tests were run with:

dpkg-query --show nvidia-l4t-core
nvidia-l4t-core	34.1.1-20220516211757
apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Version: 5.0.1-b118

After flashing AGX Jetpack 5.1 L4T 35.2.1

cat /etc/nv_tegra_release 
# R35 (release), REVISION: 2.1, GCID: 32413640, BOARD: t186ref, EABI: aarch64, DATE: Tue Jan 24 23:38:33 UTC 2023
apt-cache show nvidia-l4t-core

Package: nvidia-l4t-core
Version: 35.2.1-20230124153320

Performance dropped from 111 fps to 39 fps on the same hardware.
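Expressed as a percentage, that regression works out as follows. This is just the arithmetic on the two figures above; the `pct_drop` helper name is made up for illustration.

```shell
# Hypothetical helper: percentage drop from BEFORE to AFTER, one decimal place.
pct_drop() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

pct_drop 111 39   # -> 64.9
```

In other words, the same board lost roughly two thirds of its decode throughput after the upgrade.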

So far I have only tested with community code (a jetson-ffmpeg fork).

But considering the tests by @dourokinga, I expect problems on some other paths as well.

For clarity, please create a new topic for AGX Orin. We would like to keep this topic specific to AGX Xavier.

Also, please try Jetpack 5.1.1.

