An important bug in nvargus and tee/queue when capturing from multiple sensors?

I am using the Jetson Xavier NX platform with JetPack 4.4.1.
As shown in the following figure, when capturing from a single sensor I can get H.264/YUV data at 25 fps in the appsink/fakesink, no matter how many app-bins follow the tee.

But after adding an identical pipeline to capture a second sensor, I get 25 fps from the appsink/fakesink with 2 app-bins, only 21 fps with 3 app-bins, and only 17 fps with 4 app-bins. As the number of app-bins and queues increases, the number of captured frames decreases.
I have confirmed that the queue never emits the "overrun" signal, and in the appsink/fakesink callback I only increment a counter and do no other time-consuming work; even removing the H.265 encoder has the same effect.

How can I get 25 fps capture data when using multiple sensors?

Hi,
Please apply the following items and check if there is improvement:

  1. Set sync=0 on all sinks. This disables the synchronization mechanism in the GStreamer framework.
  2. Execute sudo nvpmodel -m 2 and sudo jetson_clocks. All modes are listed in the developer guide. Mode 2 is 4 CPU cores @ 1.4 GHz. Please also try mode 0 (2 CPU cores @ 1.9 GHz).
  3. Refer to this post to run VIC at max clocks
  4. Refer to this elinux page to boost clocks of camera-related engines.

Hi,
My previous test environment was the 6-core 15 W mode, and every sink plugin already had sync set to false.
I will try your other suggestions.
But I personally think the problem is more likely in the GStreamer framework or nvargus:
when two cameras are running and the number of queue-bins doesn't exceed 2, everything works at 25 fps, so I don't think the capture capability of the hardware is the problem.

hello DaneLLL,
I have tried all your suggestions, but the result is the same as before.
Can you help me a little further? It looks like a software problem in nvargus and tee/queue. The performance of the NX is quite strong; this application shouldn't be too stressful for it.

The same happens when I switch the capture to 1920x1080 mode.

Hi,
The default buffer-number settings in nvarguscamerasrc are

#define MIN_BUFFERS 6
#define MAX_BUFFERS 8

Please increase it to 12, then rebuild and replace libgstnvarguscamerasrc.so for a try. You can find the source code of nvarguscamerasrc at
https://developer.nvidia.com/embedded/linux-tegra
Links to L4T Driver Package (BSP) Sources
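The suggested change would look like this near the top of gstnvarguscamerasrc.cpp (assuming both constants are raised to 12 for the test):

```c
/* gstnvarguscamerasrc.cpp: enlarge the capture buffer pool */
#define MIN_BUFFERS 12
#define MAX_BUFFERS 12
```

After editing, rebuild and copy the resulting libgstnvarguscamerasrc.so over the stock one.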

Hello DaneLLL,
The result was the same after setting it to 12.

Hi,
Do you run two cameras in the same process, or one camera per process?

Please also check sudo tegrastats and see if it offers any clue.

Hello DaneLLL,
The result is the same, whether running two cameras in a single process or two single cameras in two processes.

I think nvarguscamerasrc is a great plugin that includes many features V4L2 doesn't have. However, this problem occurs when using multiple sensors, and the bug is easy to reproduce.

Run one camera in one process:
image

And run two cameras in one process:
image
Is VDD_IN out of range?

Hello DaneLLL,
I have a new finding. In the function bool StreamConsumer::threadExecute(GstNvArgusCameraSrc *src) (gstnvarguscamerasrc.cpp), I printed the frame counter and its timestamp. It works well when using only one camera: a very steady 25 fps.


But when using two cameras, the frame counter sometimes advances by more than 1 between prints, and the time gap equals (frame-counter interval) × 40 ms.


This means some frames are not picked up in time. Is there not enough internal memory for capture? Or did delays on the consumer side block it for too long?

I ran some more detailed tests and printed the consumption time.
It is fast when capturing with a single camera:


But it often exceeds 40 ms when capturing with two cameras:

Hi,
We will try to reproduce the issue first and will update.

And are you able to try JetPack 4.6 (r32.6.1)? Or do you have to stay on r32.4.4?

Hello,DaneLLL
We will stay on r32.4.4 this year. Can you give me a patch after you fix it?
More detailed bug location: a bug in the function "NvBufferTransform" used in gstnvarguscamerasrc.cpp.

Hi,
Please run the command and check if you observe the issue:

gst-launch-1.0 -v nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=1920,height=1080,framerate=30/1' ! tee name=t ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 nvarguscamerasrc sensor-id=1 ! 'video/x-raw(memory:NVMM),width=1920,height=1080,framerate=30/1' ! tee name=t1 ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t1. ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t1. ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t1. ! queue ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0

We ran this on r32.6.1 / Xavier NX developer kit with two Raspberry Pi V2 cameras and saw all sinks at 30 fps. Please give it a try.

Hello DaneLLL,
The command works well on JetPack 4.6 when run from a terminal. I want to run it from C code to further locate the problem.
Do you know how to set video-sink=fakesink in C code? I set it to fakesink, but it still creates an autovideosink.


Hello DaneLLL,
I further located the bug: it has nothing to do with tee or queue, but with the plugin named nvvidconv.
As shown in the figure below, it works well at 25 fps (4546 frames in 3 minutes) without nvvidconv.


When using two queues to convert to 1920x1080, it gets 23.6 fps (4250 frames in 3 minutes).

With the number of nvvidconv instances increased to 4, the frame rate drops to 15.5 fps (2800 frames in 3 minutes).

Hello DaneLLL,
You can reproduce the bug with the following two commands in two terminals:
A:
gst-launch-1.0 -v nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=4000,height=3000' ! tee name=t ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0
B:
gst-launch-1.0 -v nvarguscamerasrc sensor-id=1 ! 'video/x-raw(memory:NVMM),width=4000,height=3000' ! tee name=t ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 t. ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0

Hi,
In your pipeline there is a memory copy in the nvvidconv plugin:

'video/x-raw(memory:NVMM),width=4000,height=3000' ! tee name=t ! queue ! nvvidconv ! 'video/x-raw,width=4000,height=3000'

It copies data from the NVMM buffer to a CPU buffer, so performance is dominated by the CPU cores. The resolution is larger than 4K, so it hits the limit of CPU capability. For the optimal solution we suggest keeping NVMM buffers all the way from source to sink. This eliminates the memory copy and gives the best performance on Jetson platforms.
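Concretely, the suggestion above amounts to keeping (memory:NVMM) caps on both sides of nvvidconv. A sketch based on pipeline A (the downscale to 1920x1080 is an assumption about the use case, just to show the caps):

```shell
gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! \
  'video/x-raw(memory:NVMM),width=4000,height=3000' ! tee name=t ! \
  queue ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1080' ! \
  fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0
```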

Hello DaneLLL,
Can NVIDIA provide patches to support resolutions above 4K? The 4000x3000 resolution is now commonly used by sensors (477/577/586 and so on).