Multiple GStreamer pipelines slow down when run in the same process

If I run the following 4 GStreamer pipelines at the same time as separate processes, they each run at 30 fps:
$ gst-launch-1.0 -v videotestsrc ! fpsdisplaysink video-sink=nveglglessink &
$ gst-launch-1.0 -v videotestsrc ! fpsdisplaysink video-sink=nveglglessink &
$ gst-launch-1.0 -v videotestsrc ! fpsdisplaysink video-sink=nveglglessink &
$ gst-launch-1.0 -v videotestsrc ! fpsdisplaysink video-sink=nveglglessink &

If I run all 4 in the same process they only run at about 15 fps each:
$ gst-launch-1.0 -v \
    videotestsrc ! fpsdisplaysink video-sink=nveglglessink \
    videotestsrc ! fpsdisplaysink video-sink=nveglglessink \
    videotestsrc ! fpsdisplaysink video-sink=nveglglessink \
    videotestsrc ! fpsdisplaysink video-sink=nveglglessink
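
To repeat the single-process test with a different number of streams, the branch list can be generated instead of typed out. This is only a sketch: `build_branches` is a hypothetical helper name, not part of GStreamer, and the branch string is copied from the commands above.

```shell
# Hypothetical helper: prints N copies of the test branch for one
# gst-launch-1.0 invocation (N branches inside a single process).
build_branches() {
    n=$1
    branch='videotestsrc ! fpsdisplaysink video-sink=nveglglessink'
    out=''
    i=0
    while [ "$i" -lt "$n" ]; do
        # Join branches with a single space, no leading space.
        out="${out:+$out }$branch"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Usage on a Jetson with GStreamer installed (word splitting is intended):
#   gst-launch-1.0 -v $(build_branches 4)
```

Running it with 1, 2, 3 and 4 branches should show at which stream count the per-stream rate starts to drop.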

Hi,
For multiple sources in a single process, we suggest using nvcompositor. Please try:

gst-launch-1.0 \
nvcompositor name=mix sink_0::xpos=0 sink_0::ypos=0 sink_1::xpos=0 sink_1::ypos=240 ! nvegltransform ! nveglglessink \
videotestsrc ! nvvidconv ! mix.sink_0 \
videotestsrc ! nvvidconv ! mix.sink_1

We already have software written for the TK1 that we want to move to the Jetson Nano. The TK1 does not exhibit this behavior. Our pipelines are independent but are within the same process.

I see the same on Xavier with R32.4.2.
My guess is that a budget of somewhat under 60 fps is shared among all nveglglessink instances attached to one GstClock.
When you launch the 4 pipelines as separate processes, each one gets its own GstClock.

However, the solution provided by @DaneLLL is good. I’ve checked with up to 5 sources (not tried further):

gst-launch-1.0 -v nvcompositor name=mix sink_0::xpos=0 sink_0::ypos=0 sink_1::xpos=0 sink_1::ypos=240 sink_2::xpos=0 sink_2::ypos=480  sink_3::xpos=0 sink_3::ypos=720  sink_4::xpos=0 sink_4::ypos=960  ! nvegltransform ! nveglglessink  \
 videotestsrc ! nvvidconv ! tee name=t0 ! queue ! mix.sink_0    t0. ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false \
 videotestsrc ! nvvidconv ! tee name=t1 ! queue ! mix.sink_1    t1. ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false \
 videotestsrc ! nvvidconv ! tee name=t2 ! queue ! mix.sink_2    t2. ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false \
 videotestsrc ! nvvidconv ! tee name=t3 ! queue ! mix.sink_3    t3. ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false \
 videotestsrc ! nvvidconv ! tee name=t4 ! queue ! mix.sink_4    t4. ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false
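
For other source counts, the same command can be generated rather than edited by hand. A minimal sketch, assuming the layout used above (xpos=0, ypos in steps of 240); `compositor_pipeline` is a hypothetical helper name, and only the elements and properties already shown in the command are used.

```shell
# Hypothetical helper: prints the gst-launch-1.0 arguments for N test sources
# stacked vertically in one nvcompositor, each with its own fps counter.
compositor_pipeline() {
    n=$1
    step=${2:-240}   # vertical offset between sources, 240 as in the command above
    pads=''
    branches=''
    i=0
    while [ "$i" -lt "$n" ]; do
        pads="$pads sink_$i::xpos=0 sink_$i::ypos=$((i * step))"
        branches="$branches videotestsrc ! nvvidconv ! tee name=t$i ! queue ! mix.sink_$i t$i. ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false"
        i=$((i + 1))
    done
    printf '%s\n' "nvcompositor name=mix$pads ! nvegltransform ! nveglglessink$branches"
}

# Usage on a Jetson (word splitting is intended):
#   gst-launch-1.0 -v $(compositor_pipeline 5)
```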

The nvcompositor will not work with our current system architecture. We already have a working solution with the TK1 (which is no longer supported) that does not slow down the same way the Jetson Nano does. This would be a step backward for us.

Hi,
You can run sudo jetson_clocks or set sync=false on nveglglessink.

I’ve already run jetson_clocks and we can’t set sync=false since we need to sync to audio.

We would like to switch our project from the TK1 (which is EOL) to the Jetson Nano but this issue is a showstopper since we are getting much better performance from the TK1.

Hi,
We would suggest using nvcompositor in this use case. There is a similar implementation, nvmultistreamtiler, used to composite 8 sources in the DeepStream SDK. The solution should be good and stable.

Thank you for suggesting a possible workaround but switching to using the nvcompositor or nvmultistreamtiler would require major architectural changes to our existing software which already contains hundreds of thousands of lines of code. Since the Jetson Nano is basically the next generation after the TK1, I am surprised that it performs so much worse in this situation. We didn’t expect the TK1 to be EOL so quickly and would like to use the Jetson Nano instead but this issue is becoming a showstopper for us. Our software runs ok on the Jetson Nano otherwise.

It appears that the sum of the stream rates in a process is capped at the display output rate.

At 3840x2160p60 output I can run 1x60fps stream or 2x30fps streams. If I try to run 3x30fps streams each stream drops to 20fps.

At 3840x2160p30 output I can run 1x30fps stream. If I try to run 2x30fps streams each stream drops to 15fps.
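
The numbers above fit a simple shared-budget pattern: each stream gets the smaller of its own rate and the output rate divided by the number of streams. The sketch below only reproduces that arithmetic. It is a guess fitted to these measurements, not a confirmed description of how nveglglessink behaves, and `per_stream_fps` is a hypothetical helper name.

```shell
# Hypothetical model of the measurements above, not a confirmed mechanism:
# all sinks in one process share a budget equal to the output refresh rate.
per_stream_fps() {
    output_hz=$1
    streams=$2
    stream_fps=$3
    share=$((output_hz / streams))   # integer division
    if [ "$share" -lt "$stream_fps" ]; then
        echo "$share"
    else
        echo "$stream_fps"
    fi
}

per_stream_fps 60 1 60   # prints 60
per_stream_fps 60 2 30   # prints 30
per_stream_fps 60 3 30   # prints 20
per_stream_fps 30 1 30   # prints 30
per_stream_fps 30 2 30   # prints 15
```

The same formula gives 15 for four 30 fps streams on a 60 Hz output, which would match the original test if that display was running at 60 Hz (the first post does not say).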

The nvcompositor “solution” will not work with our software as architected since we currently need to have multiple streams in multiple processes. We would need to put all of our streams into one process in order to use the nvcompositor, which would be very difficult if not impossible for us to do.

Hi,
The issue is under investigation. It may take some time. On current L4T release(s), we suggest leveraging the nvcompositor plugin to composite the sources into a single video plane.

Hi,
This is supported in JP4.5 (r32.5). Please upgrade to this release and give it a try. Thanks.