Nvvideoconvert issue: is nvvideoconvert in DS4 better than in DS5?

  • Hardware Platform: Jetson AGX Xavier
  • DeepStream Version: 4.0 and 5.0
  • JetPack Version: 4.2 and 4.4 DP
  • TensorRT Version: 5.0 and 7.1

Hello guys, during my tests I noticed that the following pipeline gives different results on DS4 and DS5.

Pipeline:
DS5
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! video/x-raw,format=RGBA ! fakesink
DS4
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-4.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! video/x-raw,format=RGBA ! fakesink

On Jetson AGX Xavier with JetPack 4.4 DP and DS5.0, the execution time is 0:00:08.959876829.
On Jetson AGX Xavier with JetPack 4.2 and DS4.0, the execution time is 0:00:03.024086334.

If I remove nvvideoconvert and run the pipeline again, I get the following results:
Pipeline:

DS5
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! fakesink
DS4
gst-launch-1.0 filesrc location = /opt/nvidia/deepstream/deepstream-4.0/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! fakesink

On Jetson AGX Xavier with JetPack 4.4 DP and DS5.0, the execution time is 0:00:02.943324920.
On Jetson AGX Xavier with JetPack 4.2 and DS4.0, the execution time is 0:00:02.982979002.

As you can see, without nvvideoconvert the results are very similar. With nvvideoconvert, execution on DS5 is about three times slower than on DS4 (roughly 9.0 s versus 3.0 s).

Is there any reason?
Did I make any mistake?

Regards
Ric

Hi,
Please execute the following steps to run VIC at max clock and try again:

  1. Disable runtime suspend of VIC
$ echo on > /sys/devices/13e10000.host1x/15340000.vic/power/control
  2. Set userspace governor
$ echo userspace > /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/governor
  3. Set max_freq
$ cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/available_frequencies
$ echo [max_freq_val] > /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/max_freq
  4. Set target frequency
$ echo [freq_val] > /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/userspace/set_freq
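
The four steps above can be collected into one script. The following is only a minimal sketch, not an official NVIDIA tool: the function names `pick_max_freq` and `set_vic_max` and the `VIC_DIR` variable are made up for illustration, and it assumes the devfreq directory has the same name as the device directory, as in the AGX Xavier paths quoted above. It takes the largest entry of available_frequencies and uses it as both [max_freq_val] and [freq_val].

```shell
#!/bin/sh
# Sketch: pin the VIC to its highest available clock (must be run as root).
# VIC_DIR defaults to the AGX Xavier path from this thread; override it on other boards.
VIC_DIR="${VIC_DIR:-/sys/devices/13e10000.host1x/15340000.vic}"

# Print the largest value from a whitespace-separated list of frequencies.
pick_max_freq() {
    printf '%s\n' $1 | sort -n | tail -n 1
}

# Apply the four steps from the post to the VIC directory given as $1.
set_vic_max() {
    dir="$1"
    # Assumption: the devfreq node is named after the device directory.
    devfreq="$dir/devfreq/$(basename "$dir")"
    max=$(pick_max_freq "$(cat "$devfreq/available_frequencies")")
    echo on > "$dir/power/control"               # 1. disable runtime suspend
    echo userspace > "$devfreq/governor"         # 2. userspace governor
    echo "$max" > "$devfreq/max_freq"            # 3. raise the frequency cap
    echo "$max" > "$devfreq/userspace/set_freq"  # 4. set the target frequency
}

# Only touch sysfs when explicitly asked to.
if [ "${1:-}" = "apply" ]; then
    set_vic_max "$VIC_DIR"
fi
```

Saved as, for example, vic_max.sh (a hypothetical name), it would be run as `sudo sh vic_max.sh apply`.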

Hi @DaneLLL
with the solution you provided, the results I get on DS5 are similar to those on DS4.
One more thing: do you think that in a future release of JetPack (4.4, not the developer preview) these settings will be enabled by default?

Ric

Hi,
We have developed a feature in r32.4.3:
Support for Dynamic Frequency Scaling (DFS) for Video Image Compositor (VIC) using actmon

Please check the L4T page. If the load is light and performance is sufficient, you may keep it enabled to save power. If you run a heavily loaded system, you can disable it and run at max clock.

@DaneLLL Under what circumstances should we do this? When running multiple sources in a DeepStream pipeline, or for anything that needs max performance?

How do we disable max clocks on VIC? Would it be:

$ echo off > /sys/devices/13e10000.host1x/15340000.vic/power/control

Hi @DaneLLL,
Are there any criteria for choosing the best value for the target frequency?
Is there a way to make these values persistent across a reboot of the device?
Thank you
Ric

Hi,
By default DFS is enabled. For manual control, you can list all frequencies:

$ cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/available_frequencies

And pick one that fits your use case.
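
As a rough illustration of picking a value from that list, the sketch below only writes a target frequency after checking that it is one of the entries in available_frequencies. The helper names `is_available_freq` and `set_vic_freq` and the `VIC_DEVFREQ` variable are hypothetical, and the script assumes the userspace governor has already been selected, since userspace/set_freq is only present with that governor.

```shell
#!/bin/sh
# Sketch: set a VIC target frequency, but only if the device lists it as supported.
# VIC_DEVFREQ defaults to the AGX Xavier devfreq path quoted in this thread.
VIC_DEVFREQ="${VIC_DEVFREQ:-/sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic}"

# Succeed only if frequency $1 appears in available_frequencies under devfreq dir $2.
is_available_freq() {
    for f in $(cat "$2/available_frequencies"); do
        [ "$f" = "$1" ] && return 0
    done
    return 1
}

# Write frequency $1 to userspace/set_freq under devfreq dir $2 after validating it.
set_vic_freq() {
    if is_available_freq "$1" "$2"; then
        echo "$1" > "$2/userspace/set_freq"
    else
        echo "frequency $1 is not in available_frequencies" >&2
        return 1
    fi
}

# Usage: sh vic_freq.sh set <frequency>   (script name is hypothetical)
if [ "${1:-}" = "set" ]; then
    set_vic_freq "$2" "$VIC_DEVFREQ"
fi
```

Which value is best depends on the workload. One way to decide is to rerun the gst-launch-1.0 pipeline from the first post at a few frequencies and compare the execution times.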

@DaneLLL - It seems strange that DFS is causing slower pipelines in DS5.0dp. What is the purpose of DFS if this is the case?

Are you able to help with previous questions:

  • Should we always switch to manual and max clocks (this is basically what jetson_clocks does for everything else)
  • How to disable manual and turn DFS back on
  • How to automatically set this so we don’t have to do it every time after a reboot?
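
On the second question, one possible approach that does not depend on knowing the name of the DFS governor is to record the governor before switching to userspace and to write it back later. The sketch below is an assumption rather than a confirmed procedure: the function names, `VIC_DIR`, and the `STATE_FILE` location are hypothetical, and it has not been verified in this thread that restoring the governor is enough to re-enable DFS. For power/control, the generic Linux runtime PM interface accepts `auto` (allow runtime suspend) and `on` (keep the device powered), so `auto` rather than `off` is the value that undoes step 1.

```shell
#!/bin/sh
# Sketch: remember the VIC governor before manual control and restore it afterwards.
# VIC_DIR defaults to the AGX Xavier path from this thread; STATE_FILE is a made-up location.
VIC_DIR="${VIC_DIR:-/sys/devices/13e10000.host1x/15340000.vic}"
STATE_FILE="${STATE_FILE:-/tmp/vic_governor.saved}"

# Save the current governor of VIC dir $1 into file $2 (call before "echo userspace").
save_vic_governor() {
    cat "$1/devfreq/$(basename "$1")/governor" > "$2"
}

# Write the saved governor from file $2 back and re-allow runtime suspend.
restore_vic_governor() {
    cat "$2" > "$1/devfreq/$(basename "$1")/governor"
    echo auto > "$1/power/control"
}

case "${1:-}" in
    save)    save_vic_governor "$VIC_DIR" "$STATE_FILE" ;;
    restore) restore_vic_governor "$VIC_DIR" "$STATE_FILE" ;;
esac
```

The saved state lives under /tmp and does not survive a reboot, which matters little here because the manual settings themselves are not persistent either.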

Hi,
The purpose of DFS is power saving. Note that sample_1080p_h264.mp4 is a 50-second clip, so an execution time of 0:00:08.959876829 is still much faster than real time and looks OK. If power consumption is not a concern and you would like to turn off DFS, please add the commands to rc.local to configure it manually. There is an example of using rc.local: