Nvvideoconvert issue: is nvvideoconvert in DS4 faster than in DS5?

|•|Hardware Platform Jetson AGX Xavier|
|•|DeepStream Version 4.0 And 5.0|
|•|JetPack Version 4.2 and 4.4 DP|
|•|TensorRT Version 5.0 and 7.1|

Hello guys, during my tests I noticed that the following pipeline gives different results on DS4 and DS5.

Pipeline:
DS5
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! video/x-raw,format=RGBA ! fakesink
DS4
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-4.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! video/x-raw,format=RGBA ! fakesink

On Jetson AGX Xavier with JetPack 4.4 DP and DS 5.0 the execution time is: 0:00:08.959876829
On Jetson AGX Xavier with JetPack 4.2 and DS 4.0 the execution time is: 0:00:03.024086334

If I remove the nvvideoconvert and restart the execution I get the following results:
Pipeline:

DS5
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! fakesink
DS4
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-4.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! fakesink

On Jetson AGX Xavier with JetPack 4.4 DP and DS 5.0 the execution time is: 0:00:02.943324920
On Jetson AGX Xavier with JetPack 4.2 and DS 4.0 the execution time is: 0:00:02.982979002

As you can see, without nvvideoconvert the results are very similar. With nvvideoconvert, the run on DS5 takes about three times as long as the same run on DS4.
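
To put a number on the difference, the four timings above can be compared directly. This is only a small sketch: the helper names `to_seconds` and `slowdown` are made up for illustration, and it assumes the durations are in the `H:MM:SS.fraction` form shown above.

```shell
# Convert an H:MM:SS.fraction duration to seconds (3 decimal places).
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Print how many times longer the first duration is than the second.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# DS5 time divided by DS4 time, with nvvideoconvert in the pipeline.
slowdown 0:00:08.959876829 0:00:03.024086334
# DS5 time divided by DS4 time, decode only.
slowdown 0:00:02.943324920 0:00:02.982979002
```

The first ratio comes out close to 3 and the second close to 1, which points at the conversion step rather than the decoder.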

Is there any reason?
Did I make any mistake?

Regards
Ric

Hi,
Please execute the following steps to run the VIC at max clock and try again:

  1. Disable runtime suspend of VIC
$ echo on > /sys/devices/13e10000.host1x/15340000.vic/power/control
  2. Set userspace governor
$ echo userspace > /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/governor
  3. Set max_freq
$ cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/available_frequencies
$ echo [max_freq_val] > /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/max_freq
  4. Set target frequency
$ echo [freq_val] > /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/userspace/set_freq
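
The steps above can be wrapped in one function so the highest listed frequency does not have to be copied by hand. Treat this as a sketch, not an official script: the function names `pick_max_freq` and `set_vic_max_clock` are hypothetical, the sysfs paths are the Xavier ones from this thread, and it assumes `available_frequencies` is a whitespace-separated list of numbers.

```shell
# Print the largest value from a whitespace-separated list of frequencies.
pick_max_freq() {
    echo "$1" | tr -s '[:space:]' '\n' | sort -n | tail -n 1
}

# Apply the four steps above. The optional argument overrides the VIC
# sysfs directory (default: the Xavier path used in this thread).
set_vic_max_clock() {
    vic_dir="${1:-/sys/devices/13e10000.host1x/15340000.vic}"
    devfreq_dir="$vic_dir/devfreq/15340000.vic"
    max=$(pick_max_freq "$(cat "$devfreq_dir/available_frequencies")") || return 1

    echo on > "$vic_dir/power/control"                 # 1. no runtime suspend
    echo userspace > "$devfreq_dir/governor"           # 2. userspace governor
    echo "$max" > "$devfreq_dir/max_freq"              # 3. raise the ceiling
    echo "$max" > "$devfreq_dir/userspace/set_freq"    # 4. target frequency
}
```

Writing to these files needs root, so after defining the functions in a root shell, calling `set_vic_max_clock` with no argument applies the settings to the path above.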

Hi @DaneLLL
With the solution you provided, the results I get on DS5 are similar to those on DS4.
Just one more thing: do you think that in a later release of JetPack (4.4, not the developer preview) these settings will be enabled by default?

Ric

Hi,
We have developed a feature in r32.4.3:
Support for Dynamic Frequency Scaling (DFS) for Video Image Compositor (VIC) using actmon

Please check the L4T page. If the load is light and performance is sufficient, you may keep it enabled to save power. If you run a heavily loaded system, you can disable it and run at max clock.

@DaneLLL Under what circumstances should we do this ? When running multiple sources in a deepstream pipeline - anything that needs max performance?

How do we disable max clocks on VIC? Would it be:

$ echo off > /sys/devices/13e10000.host1x/15340000.vic/power/control
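
One note on the command above: the standard Linux runtime power management `power/control` file accepts `on` and `auto`, so `auto` rather than `off` is the value that allows runtime suspend again. The thread does not name the governor that DFS uses, so the sketch below avoids guessing it: it records whichever governor is active before the switch to `userspace` and writes that value back later. The function names and the state file are hypothetical.

```shell
VIC_NODE=15340000.vic   # device node name taken from the paths in this thread

# Remember the governor that is active before switching to userspace.
# usage: save_vic_governor VIC_DIR STATE_FILE
save_vic_governor() {
    cat "$1/devfreq/$VIC_NODE/governor" > "$2"
}

# Put the saved governor back and allow runtime suspend again.
# usage: restore_vic_dfs VIC_DIR STATE_FILE
restore_vic_dfs() {
    cat "$2" > "$1/devfreq/$VIC_NODE/governor"
    echo auto > "$1/power/control"
}
```

For example, as root, `save_vic_governor /sys/devices/13e10000.host1x/15340000.vic /tmp/vic_governor` before pinning the clock, and `restore_vic_dfs` with the same two arguments afterwards.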

Hi @DaneLLL,
Are there any criteria for choosing the best value for the target frequency?
Is there a way to make these values persist across a reboot of the device?
Thank you
Ric

Hi,
By default DFS is enabled. For manual control, you can list all frequencies:

$ cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/available_frequencies

And pick the one that fits your use case.
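
When picking an intermediate frequency by hand, it may help to check the value against the list before writing it. A minimal sketch, assuming the same sysfs layout as above; `set_vic_freq` is a hypothetical helper, not an NVIDIA tool.

```shell
# usage: set_vic_freq VIC_DIR FREQ
# Sets FREQ as the VIC target frequency, but only if the driver lists it.
set_vic_freq() {
    devfreq_dir="$1/devfreq/15340000.vic"
    freq="$2"

    # Reject empty input and values missing from available_frequencies.
    if [ -z "$freq" ] || ! tr -s '[:space:]' '\n' \
            < "$devfreq_dir/available_frequencies" | grep -qxF "$freq"; then
        echo "unsupported VIC frequency: $freq" >&2
        return 1
    fi

    echo userspace > "$devfreq_dir/governor"          # manual control
    echo "$freq" > "$devfreq_dir/userspace/set_freq"  # requested target
}
```

Called as root with `/sys/devices/13e10000.host1x/15340000.vic` and one of the listed values, it switches to the userspace governor and sets that target; any other value is refused with an error.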

@DaneLLL - It seems strange that DFS is causing slower pipelines in DS5.0dp. What is the purpose of DFS if this is the case?

Are you able to help with previous questions:

  • Should we always switch to manual and max clocks (this is basically what jetson_clocks does for everything else)?
  • How do we disable manual control and turn DFS back on?
  • How can we set this automatically so we don’t have to do it after every reboot?

Hi,
The purpose of DFS is power saving. Note that sample_1080p_h264.mp4 is 50 seconds of content, so an execution time of 0:00:08.959876829 looks OK. If power consumption is not a concern and you would like to turn off DFS, please add the commands to rc.local to configure it manually. There is an example of using rc.local: