DeepStream Reference App: DsExample Processing Problem

Please provide complete information as applicable to your setup.

**• GPU**
• DeepStream Version: 6.2
• TensorRT Version: Whatever comes in the DS 6.2 Docker container; also running Cloud Native Stack 10.2
• NVIDIA GPU Driver Version: 535.86.10
**• Issue Type (questions, new requirements, bugs): Question**

I have something of an open question as I try to understand more about the DeepStream SDK.

I’m running the DeepStream reference app (“deepstream-app”) in a K8s cluster that also includes the Cloud Native Stack (and, therefore, GPU Operator). The DS app has been set to run the “DsExample” module/filter immediately after the tracking module. The DsExample filter has also been compiled to use the “gstdsexample_optimized.cpp” version of the filter.

For a given number of RTSP streams passed in, say, 16, on some GPUs, e.g. an RTX 3060 Ti, things work just fine. In fact, I can throw 32, or even 48 streams at the RTX 3060 Ti and the GPU utilization averages around 97% and the FPS performance decreases in proportion to the number of streams I throw at it (I’ve yet to make it fall over).

However, if I use a “weaker” GPU (e.g. an A2), it doesn’t take very long for the application to stop processing (i.e. the application continues to run, but the performance metrics show all streams grinding to a halt at 0 FPS). This happens for any number of streams greater than 8. I can appreciate that the DsExample requires some horsepower when processing full frames, which I’ve configured it to do. I can also see via the GPU Operator metrics (via DCGM exporter) that when it does stop processing, the GPU utilization seems to have hit 100% and then dropped off shortly after. If I limit the number of streams to 8, the GPU utilization hovers around 96-97%.

With some debug code, I can see that “gst_dsexample_submit_input_buffer” ceases to be called, and the semaphore/conditional block on the process_lock (called if the process_queue is empty) is never signalled and so blocks forever.
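The blocking behaviour described above can be made visible with a timed wait. What follows is a minimal Python sketch, not the DsExample code: `ProcessQueue`, `push`, and `pop` are hypothetical names, and the real plugin is C++ with its own lock and condition primitives. It only illustrates the idea that a consumer which waits with a timeout can report starvation instead of blocking forever.

```python
import threading
from collections import deque


class ProcessQueue:
    """Hypothetical stand-in for the filter's process queue (not DsExample code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._items = deque()

    def push(self, item):
        """Producer side: enqueue one item and wake a waiting consumer."""
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def pop(self, timeout_s):
        """Consumer side: return the next item, or None if none arrived in time."""
        with self._cond:
            # wait_for returns False if the predicate is still false at timeout.
            arrived = self._cond.wait_for(lambda: len(self._items) > 0,
                                          timeout=timeout_s)
            if not arrived:
                return None
            return self._items.popleft()
```

A consumer loop built on `pop` could log a warning each time it gets `None`, which would show when the upstream side stopped delivering rather than leaving the thread silently parked.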

I’m curious as to what about the A2 (or GPUs in general) would cause “gst_dsexample_submit_input_buffer” to not be called on the filter. It could very well be that it’s starved because no frames are making it through one or more upstream filters, but I’m not sure how to check that.

Thanks in advance for any help and/or insight!

“gstdsexample_optimized.cpp” is implemented as an “in-place” transform plugin: the input buffer is also the output buffer. If “gst_dsexample_submit_input_buffer” is not called, that means no input buffer is arriving from upstream.

We need your complete pipeline to reproduce the issue.

I figured as much, but I’m not asking you to reproduce the issue (at this point, anyway). It isn’t so much a bug but a question.

(Although it is interesting that if I disable the DsExample element in the pipeline, things seem to run just fine.)

Is there an easy way to tell where in the pipeline (i.e. which element/filter) the flow is being interrupted? Or whether the start of the pipeline is getting any input at all?

If you do want to see the actual pipeline, let me know what you’d need from me (code, containers, diagram, etc.) that would allow you to do so.

Maybe another question might be: What about the RTX 3060 Ti when compared to an A2 allows it to perform so much better?

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

GStreamer provides debug logs for this kind of investigation. See Basic tutorial 11: Debugging tools (gstreamer.freedesktop.org).
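As a rough illustration of how such a log might be used to see where flow stops, here is a small Python sketch. It assumes the log was captured with `GST_DEBUG` set to some level, written to a file via `GST_DEBUG_FILE`, and with `GST_DEBUG_NO_COLOR=1`, and that lines follow the layout shown in the tutorial (timestamp first, object name in angle brackets after the function name). The helper names `last_seen` and `quiet_first` are made up for this example, and "the element that went quiet first" is only a heuristic, not a diagnosis.

```python
import re

# Leading timestamp such as "0:00:02.500000000".
_TS = re.compile(r"^(\d+):(\d{2}):(\d{2})\.(\d+)\s")
# Object tag such as ":<filesrc0>" or ":<nvtracker0:sink>"; keeps the element name.
_OBJ = re.compile(r":<([^>:]+)(?::[^>]*)?>")


def last_seen(lines):
    """Map each element name to the timestamp (seconds) of its last log line."""
    seen = {}
    for line in lines:
        ts = _TS.match(line)
        obj = _OBJ.search(line)
        if not ts or not obj:
            continue  # skip lines that do not follow the assumed layout
        hours, minutes, seconds, frac = ts.groups()
        t = int(hours) * 3600 + int(minutes) * 60 + int(seconds) + float("0." + frac)
        seen[obj.group(1)] = t
    return seen


def quiet_first(lines):
    """Element names ordered by last activity, earliest (quietest) first."""
    seen = last_seen(lines)
    return sorted(seen, key=seen.get)
```

Running `quiet_first(open("gst.log"))` on a captured log (the file name is a placeholder) gives a starting point for which element to inspect first; an element that stops logging well before the source does is a candidate for where buffers stopped flowing.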

From your previous description, gst-dsexample stops receiving calls to “gst_dsexample_submit_input_buffer”, which means there is no buffer coming from the upstream elements (in your case, no buffer output from the tracker). That is why we want to reproduce the problem and identify the root cause.

In principle, the two types of GPUs target different areas and use different frameworks and drivers, so we cannot compare them directly. The GPU is also not the only hardware that affects performance.
