Pipelines with multiple input sources hang with DeepStream 6

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU (T4)
• DeepStream Version 6.0, 6.0.1, 6.1
• JetPack Version (valid for Jetson only)
• TensorRT Version DeepStream container default
• NVIDIA GPU Driver Version (valid for GPU only) 450 / 510
• Issue Type( questions, new requirements, bugs) bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

We have reliably working DeepStream 5.1 applications using multiple input sources which are connected to an nvstreammux element. We run them at scale in production without any issues. Since upgrading these applications to DeepStream 6 we can’t use multiple input sources anymore. I’ve tried out the three available versions of DS6:

DS6.0:

  • Pipelines with more than 1 input source randomly fail in ~5% of cases.

DS6.0.1 and DS6.1:

  • 1 input source works reliably.
  • 2 input sources fail randomly in ~5% of cases like with DS6.0.
  • With more than 2 input sources the pipeline immediately hangs and never shows any GPU utilisation.

All that has changed is the container we are using, the Python DeepStream bindings and the TensorRT engine which we generated for the required TensorRT version.

Furthermore, I tried out the new nvstreammux element in 6.1 but this showed the same issue as described above.

I’ve analysed the GStreamer debug logs and double-checked every message put on the message bus to the best of my knowledge, but couldn’t find anything helpful.

I came across other forum entries which sound similar to our problem; however, they were never resolved, just closed, like this one.

The release notes of DS 6.0.1 mention:

Minor bug fixes in Gst-nvvideo4linux2 encoder/decoder, Gst-nvstreammux, and Gst-nvstreamdemux plugins

I suspect this to be a bug due to a change in DeepStream 6 and potentially related to the bug 6.0.1 tried to fix.

Please see the GStreamer diagram below for further reference. It shows a case where a pipeline with 2 input sources was working. The error cases are pipelines with the same structure, just with more input bins.

Kind regards,
Jens


Can the deepstream-app sample work with your multiple sources on DS 6.0 and DS 6.1?

Thank you for your reply. I will test it in the next days and get back to you!

I had a look through the example apps but couldn’t find a suitable one. Is there any example which doesn’t need an external display attached? I’m running this on an AWS EC2 instance and am just writing the results to a file.

The deepstream-app sample supports running without a display output. See "DeepStream Reference Application - deepstream-app" in the DeepStream 6.1.1 Release documentation.

The output is configurable.

Thank you. I tried an example with multiple input streams from a local file and couldn’t reproduce the issue with deepstream-app. However, the setup of that application is different: it uses a decodebin, for example, which I don’t use, and there are other obvious differences, such as my using the Python bindings and extracting the data in a fakesink. That being said, my setup worked well with DeepStream 5.1.

Which changes were made to the nvv4l2decoder or nvstreammux elements from 5.1 to 6? What were the bugfixes in 6.0.1, which I mentioned in my first post, about? They did change the described behaviour. It would be great if you could point me at a few things that are worth checking.

There are lots of changes, which can be found in the DeepStream 6.0.1 release notes document.

You may also dump the graph of deepstream-app to compare the difference between the decodebin and your source bin.
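For the Python application side of that comparison, a minimal sketch of dumping the pipeline graph could look like the following. It assumes PyGObject is installed and relies on GStreamer's `GST_DEBUG_DUMP_DOT_DIR` environment variable, which needs to be set before `Gst.init()` is called. The function names, directory and graph name are placeholders, not part of DeepStream.

```python
import os


def enable_dot_dumps(dump_dir):
    """Prepare dot-file dumping. Call this before Gst.init()."""
    # GStreamer only writes .dot files when this variable points to a
    # directory at initialisation time.
    os.makedirs(dump_dir, exist_ok=True)
    os.environ["GST_DEBUG_DUMP_DOT_DIR"] = dump_dir
    return dump_dir


def dump_graph(pipeline, name):
    """Write <dump_dir>/<name>.dot for the given Gst.Pipeline."""
    # Imported lazily so enable_dot_dumps() can run before GStreamer
    # is initialised (assumes PyGObject with GStreamer 1.0 bindings).
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, name)
```

One way to use it: call `enable_dot_dumps("/tmp/ds-graphs")` at the top of the application, then `dump_graph(pipeline, "three-sources")` once the pipeline is PLAYING or after it has stopped making progress. The resulting `.dot` files can be rendered with Graphviz and compared against a graph from the sample app.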

What I found in the Release Notes of 6 and 6.0.1 is just:

Minor bug fixes in Gst-nvvideo4linux2 encoder/decoder, Gst-nvstreammux, and Gst-nvstreamdemux plugins

Gst-nvinfer plugin:
● NHWC format support
● TAO converter update for ONNX path

Neither helps. Proper, detailed information on the first point could help narrow down the source of the problem.

I’ve already tried switching to uridecodebin and it doesn’t change the observed behaviour. (The uridecodebin and my source bin were already set up in the same way.)

As for multiple streams with Python, there is the sample deepstream_python_apps/deepstream_test_3.py at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub

Thank you, I had an in-depth look at all the provided Python examples before I posted my question. This particular example uses nvurisrcbin, which I have now tried as a replacement for uridecodebin since the two looked quite similar to me; however, that doesn’t change the behaviour either. From what I can tell, it chooses the same setup I originally had for decoding MP4 files, which was also the case with uridecodebin. I’ve dumped a screenshot of a working pipeline with 1 source here to show the chosen settings for the source bin, so that it’s easier to compare with the screenshot I provided above.


Is this plugin even supported anymore? It’s not listed in the DeepStream Plugin Guide.

The pipeline setup we are using in our cloud workloads is pretty simple: it just decodes MP4 videos or HLS streams, performs inference with one nvinfer element, and extracts the data within a fakesink. This setup works perfectly fine and reliably with DeepStream 5.1, which we are heavily using in production. It stopped working after an upgrade to DeepStream 6, 6.0.1 and 6.1.
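To make that structure concrete, here is a rough sketch that builds a gst-launch style description for N file inputs feeding one nvstreammux, one nvinfer and a fakesink. It is an approximation, not our production code: it assumes H.264 video in MP4 containers (hence qtdemux and h264parse), and the file names, nvinfer config path and mux resolution are placeholders.

```python
def build_launch_description(video_paths, infer_config, width=1280, height=720):
    """Return a gst-launch style description for len(video_paths) inputs."""
    if not video_paths:
        raise ValueError("at least one input is required")

    n = len(video_paths)
    # Batch all sources in one nvstreammux, run a single nvinfer and
    # discard the output in a fakesink (no display needed).
    parts = [
        f"nvstreammux name=mux batch-size={n} width={width} height={height} "
        f"! nvinfer config-file-path={infer_config} ! fakesink sync=false"
    ]
    # One decode branch per input, linked to the mux request pad sink_<i>.
    # Assumption: H.264 in MP4; paths must not contain spaces.
    for i, path in enumerate(video_paths):
        parts.append(
            f"filesrc location={path} ! qtdemux ! h264parse "
            f"! nvv4l2decoder ! mux.sink_{i}"
        )
    return " ".join(parts)
```

The returned string could be passed to `Gst.parse_launch()` or to `gst-launch-1.0` to check whether the hang with 3 or more inputs also occurs outside the Python application.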

I was able to observe a change in the faulty behaviour when going from 6 to 6.0.1:

DS6.0:

  • Pipelines with more than 1 input source randomly fail in ~5% of cases.

DS6.0.1 and DS6.1:

  • 1 input source works reliably.
  • 2 input sources fail randomly in ~5% of cases like with DS6.0.
  • With more than 2 input sources the pipeline immediately hangs and never shows any GPU utilisation.

Is there anyone from the engineering team who could provide us with more detailed information about what was changed in the plugins and why? What were the bugfixes about? The fact that they change the behaviour between versions while still not fixing our particular issue suggests that our problem is related to whatever has been “fixed”.

Could you or someone from the engineering team please double check if there is anything obviously wrong based on the provided screenshot in my first post? (repost below:)

We would be very grateful for any information regarding this issue as we rely on DeepStream for our production deployments and we would like to make use of new features from TensorRT and DeepStream going forward.

What kind of error? Is there a log for the failure?

I haven’t been able to pin down a specific error message. The last log messages are coming from the decoder.

The logs were produced with DS 6.0.1:
3-input-sources-log.txt (15.0 MB)

I’ll try to get you a log for the error that’s happening with 2 input sources, too.

Hello @jens.w Can you let us know if it is still an issue?

Yes, this is still an issue. Do you have any update?

I didn’t find any error in the log. What kind of failure happens to your app?

Thank you for having a look!

I couldn’t find any error on our side either after thorough and lengthy investigation. It works perfectly with DeepStream 5 and 5.1 which we are using in production.

The pipeline hangs / freezes. It doesn’t finish executing. The pipeline is “active” but not doing anything. In that state it is not printing any log messages at all and there is no activity on the GPU.

It seems to me that some elements are waiting for buffers from other elements due to an error. I think this is an error in nvinfer or nvvideo4linux2, especially since the observed behaviour changes between 6.0 and the “bugfix” NVIDIA released with 6.0.1.
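One way to test that guess would be to record when each element last passed a buffer and report the ones that have gone quiet. The sketch below is plain Python with no GStreamer dependency. The class and element names are made up for illustration, and the pad-probe wiring in the trailing comment is an assumption about how it would be hooked into an application.

```python
import threading
import time


class StallWatchdog:
    """Track when each named element last passed a buffer."""

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen = {}
        self._lock = threading.Lock()

    def register(self, name):
        # Start the timer at registration so an element that never
        # produces a single buffer is reported as well.
        with self._lock:
            self._last_seen[name] = self._clock()

    def touch(self, name):
        # Call from a buffer probe (streaming thread), hence the lock.
        with self._lock:
            self._last_seen[name] = self._clock()

    def stalled(self):
        """Return sorted names that were silent for longer than the timeout."""
        now = self._clock()
        with self._lock:
            return sorted(
                name for name, seen in self._last_seen.items()
                if now - seen > self._timeout_s
            )


# Hypothetical wiring in a GStreamer app (not executed here):
#   def on_buffer(pad, info, name):
#       watchdog.touch(name)
#       return Gst.PadProbeReturn.OK
#   pad = decoder.get_static_pad("src")
#   pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer, "decoder_0")
```

Polling `stalled()` from a timer and printing the result would show whether the decoders, the muxer or nvinfer is the first element to stop producing buffers.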

Could this bug report please be raised with the DeepStream engineering team?

Thank you!

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

To make sure it is a bug, we need to reproduce the freeze issue. Can you provide a method to reproduce the issue? It is better to reproduce with DS sample apps.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.