Filesrc loop in deepstream - framerate slowdown

• Jetson Nano
• DeepStream 5.0 (GA)
• JetPack 4.4
• TensorRT 7.1.3.0
• bug

I am trying to add the ability to loop over static video files in DeepStream, using the Python binding API.
The functionality basically does the following:

  • create a bin (one per input) with filesrc->qtdemux->queue->h264parse->nvv4l2decoder
  • connect this bin to nvstreammux (a minimal sketch follows this list)
  • further processing happens in the pipeline after nvstreammux
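
As a rough illustration of the bin layout above, here is a minimal sketch. The element names, the file path, and the already-existing pipeline and streammux objects are assumptions for the sketch, not the exact code from my project:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# one source bin: filesrc -> qtdemux -> queue -> h264parse -> nvv4l2decoder
source_bin = Gst.Bin.new("source-bin-0")
filesrc = Gst.ElementFactory.make("filesrc", "filesrc0")
filesrc.set_property("location", "/path/to/video.mp4")  # illustrative path
demux = Gst.ElementFactory.make("qtdemux", "demux0")
queue = Gst.ElementFactory.make("queue", "queue0")
parser = Gst.ElementFactory.make("h264parse", "parser0")
decoder = Gst.ElementFactory.make("nvv4l2decoder", "decoder0")

for element in (filesrc, demux, queue, parser, decoder):
    source_bin.add(element)

filesrc.link(demux)   # the demuxer src pad is linked to the queue from a pad-added handler
queue.link(parser)
parser.link(decoder)

# expose the decoder output as a ghost pad and connect it to nvstreammux
ghost_pad = Gst.GhostPad.new("src", decoder.get_static_pad("src"))
source_bin.add_pad(ghost_pad)
pipeline.add(source_bin)                        # pipeline created elsewhere
mux_sink = streammux.get_request_pad("sink_0")  # streammux created elsewhere
ghost_pad.link(mux_sink)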

For the restart I add a simple probe to the nvv4l2decoder sink pad:

        sink_pad = nvv4l2decoder.get_static_pad("sink")
        if not sink_pad:
            LOG.error("Unable to get sink pad on nvv4l2decoder")
            return False
        else:
            sink_pad.add_probe(Gst.PadProbeType.EVENT_DOWNSTREAM | Gst.PadProbeType.EVENT_FLUSH,
                               self.restart_stream_probe,
                               None)

and restart is done in the following way

def restart_stream_probe(self, pad, info, c_element):
    event = info.get_event()
    if event is not None and event.type == Gst.EventType.EOS:
        LOG.debug(f'EOS on {self.input_id}')
        GLib.timeout_add_seconds(1, self.rewind_source, None)
        return Gst.PadProbeReturn.DROP
    elif event is not None and event.type in (Gst.EventType.FLUSH_START, Gst.EventType.FLUSH_STOP):
        return Gst.PadProbeReturn.DROP
    return Gst.PadProbeReturn.OK


def rewind_source(self, data):
    LOG.debug(f'entering seek on {self.input_id}')
    # stop the bin, seek the filesrc back to the start, then restart playback
    self.element.set_state(Gst.State.NULL)
    self.filesrc.seek(1.0,
                      Gst.Format.TIME,
                      Gst.SeekFlags.FLUSH,
                      Gst.SeekType.SET, 0,
                      Gst.SeekType.NONE, 0)
    self.element.set_state(Gst.State.PLAYING)
    return False

I observe that the whole app's framerate goes down continuously when the app does this kind of restart. With live streams such as RTSP there is no problem.
My first version used uridecodebin (as in the provided DeepStream samples), but the result was even worse: uridecodebin recreated all of its child pads on every restart, so my assumption was that there is a leak in uridecodebin. That is why I rewrote this function so that no elements are created during the seek.
When the video file is quite short (20 s, for example) the problem shows up more quickly.
The Jetson has the “standard” performance tweaks applied:

  • jetson_clocks
  • nvpmodel -m 0

And nvv4l2decoder is configured with (a short sketch follows this list):

  • bufapi-version = 1
  • enable-max-performance=1
  • num-extra-surfaces=8 (this helps a lot)
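
For reference, setting these from Python looks roughly like this (the decoder variable name is illustrative):

# nvv4l2decoder tuning used above
decoder.set_property("bufapi-version", True)          # set to 1 in the list above
decoder.set_property("enable-max-performance", True)  # likewise set to 1
decoder.set_property("num-extra-surfaces", 8)         # this helps a lot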

So the question is: what is the correct way to restart on EOS? Is there a different way to do it? Or is there a memory/performance bug when a stream is restarted like this?
Similar functionality exists in deepstream_perf_demo.cpp, but for a simple “rewind” without timestamp progression I don't need its accumulated_base calculations.

The progress of the slowdown can be seen in the observations below.

Can you upload your python script so that we can reproduce the problem?

Dear Fiona,
I did some more detailed research and here is the result. I suspect I am missing some essential behaviour of GStreamer, or of nvstreammux, during a long run.

First of all, I repeat that my motivation is to prepare a service that can analyze video files from a certain directory, something like a multi-URI source, except that I do not process separate images matched by a mask but entire video sequences, e.g. offline security camera recordings.

These are therefore not “live” inputs, and long-term stability of the run is important. It can be a job that runs for several days, gradually downloading individual video files.

I started with a simple test: looping a single file to test the stability.

After going through probably all the available examples and documentation on the net, the recommended approach for dynamic file input is as follows (this could easily be added to the DeepStream “python examples”; there is something in the C code, but nothing is documented for Python).

  1. For each file input, I make a separate bin that contains:
     a. filesrc
     b. qtdemux (or matroskademux)
     c. queue (a buffer, for balancing the data stream)
     d. h264parse (or h265parse)
     e. nvv4l2decoder

One might argue that decodebin is better, but I wanted to have control over all the elements and not have “black boxes” during the tests.
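
Since qtdemux (and matroskademux) expose their source pads dynamically, the demuxer is linked to the queue from a pad-added callback. A minimal sketch, assuming demux and queue are the elements created for this bin (names illustrative):

def on_demux_pad_added(demux, pad, queue):
    # link only the video pad produced by the demuxer to the queue
    caps = pad.get_current_caps() or pad.query_caps(None)
    structure_name = caps.get_structure(0).get_name()
    if structure_name.startswith("video/"):
        queue_sink = queue.get_static_pad("sink")
        if not queue_sink.is_linked():
            pad.link(queue_sink)

demux.connect("pad-added", on_demux_pad_added, queue)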

The probe to restart at EOS:

sink_pad = nvv4l2decoder.get_static_pad("sink")
if not sink_pad:
    LOG.error("Unable to get sink pad on nvv4l2decoder")
    return False
else:
    sink_pad.add_probe(Gst.PadProbeType.EVENT_DOWNSTREAM | Gst.PadProbeType.EVENT_FLUSH,
                       self.restart_stream_probe,
                       None)

def restart_stream_probe(self, pad, info, c_element):
    event = info.get_event()
    if event is not None and event.type == Gst.EventType.EOS:
        LOG.debug(f'EOS on {self.input_id}')
        GLib.timeout_add_seconds(2, self.rewind_source2, pad)
        return Gst.PadProbeReturn.DROP
    elif event is not None and event.type in (Gst.EventType.FLUSH_START, Gst.EventType.FLUSH_STOP):
        return Gst.PadProbeReturn.DROP
    return Gst.PadProbeReturn.OK

The restart itself is much more stable when no seek is done and instead only the filesrc is recreated:

def rewind_source2(self, pad):
    LOG.info(f'recreating source on {self.input_id}')
    # stop the whole source bin before swapping the filesrc
    self.element.set_state(Gst.State.NULL)

    # drop the old filesrc
    self.filesrc.unlink(self.demuxer)
    self.element.remove(self.filesrc)

    # create a fresh filesrc pointing at the same file
    self.filesrc = Gst.ElementFactory.make("filesrc", f"filesrc{self.source_number}")
    uri = self.configuration.get(DEFLT, 'uri', fallback='')
    p = urlparse(uri)
    LOG.info(p.path)
    self.filesrc.set_property("location", p.path)
    self.element.add(self.filesrc)
    self.filesrc.link(self.demuxer)

    # restart the bin
    self.filesrc.sync_state_with_parent()
    self.element.set_state(Gst.State.PLAYING)
    return False

This version is stable even during long-term running.

And now the test info.

I created the following pipeline for the test.

To measure FPS, I used a simple buffer probe with a function that is also given in the DeepStream Python examples; it averages over 5-second windows (see the sketch after the function).

def get_fps(self):
    end_time = time.time()
    if self.is_first:
        self.start_time = end_time
        self.is_first = False
    if end_time - self.start_time > 5:
        LOG.info("{} fps at {} is {}".format(self.instance_name, self.stream_name, float(self.frame_count) / 5.0))
        self.frame_count = 0
        self.start_time = end_time
    else:
        self.frame_count = self.frame_count + 1
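
The counter above is driven from a simple buffer probe, roughly like this (the pad it is attached to and the names are illustrative; the real code lives on the same object that holds frame_count/start_time):

def fps_buffer_probe(self, pad, info, u_data):
    # called once per buffer arriving on the measured pad
    self.get_fps()
    return Gst.PadProbeReturn.OK

measured_pad = element_after_mux.get_static_pad("sink")  # illustrative pad choice
measured_pad.add_probe(Gst.PadProbeType.BUFFER, self.fps_buffer_probe, None)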

I set up 4 streams as follows.
• Stream5 - short video (~ 20s)
• Stream6 - short video (~ 30s)
• Stream7 - long video (> 10min)
• Stream8 - videotestsrc, gstreamer plugin

I set the caps for stream 8 to a fixed framerate of 20/1.
Streams 5, 6 and 7 are restarted at EOS according to the procedure above.
nvstreammux - batched-push-timeout = 40000, batch-size = 4 (a short sketch follows)
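
For completeness, the stream 8 source and the nvstreammux settings above look roughly like this (nvstreammux width/height and the conversion to NVMM memory are omitted; element names are illustrative):

# stream 8: videotestsrc capped to a fixed 20/1 framerate
videotestsrc = Gst.ElementFactory.make("videotestsrc", "stream8-src")
capsfilter = Gst.ElementFactory.make("capsfilter", "stream8-caps")
capsfilter.set_property("caps", Gst.Caps.from_string("video/x-raw, framerate=20/1"))
videotestsrc.link(capsfilter)   # both elements added to the pipeline beforehand

# nvstreammux settings used in the test
streammux.set_property("batch-size", 4)
streammux.set_property("batched-push-timeout", 40000)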

The test itself ran ~10 h 30 m on a Jetson Nano.
There are ~60k measured FPS values, used to create the following charts.

It can be seen from the chart above that the FPS gradually decreases. Interesting is the behaviour of stream 8 (videotestsrc), which maintains 20 fps for about 2 hours and then also begins to decline.

The standard deviation is also interesting.

  • Stream 8 does not restart at all and should provide a constant FPS, yet its behaviour still changes.
  • Stream 7 restarts at long intervals, with a 2 s delay between restarts.
  • Streams 5 and 6 restart at short intervals, with a 2 s delay between restarts.

So I am interested in why this degradation occurs; as I said at the beginning, I am probably missing some essential behaviour of GStreamer (e.g. latency recalculation) or of nvstreammux over a long run.

It seems you have added new functions to the sample Python script; can you provide the complete script?

That is a bit of a problem, because it is not just a script but a whole project. I need to create a separate script for you.

I understand. But the parts of the pipeline affect each other, so the script itself is the best description of the whole pipeline.

I understand; that is why I am running more tests to isolate the problematic element(s). Over the whole weekend the same pipeline was running with only the “videotest” src, to check whether the problem lies with pgie, analytics, … In that configuration the framerate stays constant.

If it is hard to provide your app for us to reproduce the problem, can you close this topic? You can create a new one when you find a better way to share the useful information with us.

Dear Fiona, I am preparing an example for you.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Can this be reproduced with an NVIDIA prebuilt model? If so, please provide the whole script and config file you use.

@rho Hello, I would like to implement the same thing in my Python application. Do you have an example of this? It would help me a lot. Thanks!
