• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 7.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version: 535.86.10
• Issue Type: questions
We are developing a real-time video analytics application using NVIDIA DeepStream SDK, designed to process multiple RTSP camera streams (1920x1080) from various sources. Our system dynamically manages streams using the new nvstreammux plugin and incorporates a Python-based AI processing pipeline using nvinferserver. This AI pipeline leverages CuPy and TensorFlow for high-resolution object detection. The final processed video streams are encoded and transmitted via mediamtx as RTSP outputs.
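For reference, a stripped-down single-stream version of the pipeline looks roughly like the sketch below (the config file name, URIs, and encoder choice are placeholders rather than our exact production setup; additional sources are added dynamically on extra m.sink_%u pads):

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Assumes the new nvstreammux is enabled (USE_NEW_NVSTREAMMUX=yes in the environment).
pipeline = Gst.parse_launch(
    "nvstreammux name=m batch-size=1 ! "
    "nvinferserver config-file-path=config_triton_infer.txt ! "  # Triton (python-backend model)
    "nvvideoconvert ! nvv4l2h264enc ! h264parse ! "
    "rtspclientsink location=rtsp://mediamtx:8554/out "          # publish to mediamtx
    "nvurisrcbin uri=rtsp://camera-1/stream ! m.sink_0"          # one of the RTSP cameras
)

loop = GLib.MainLoop()
pipeline.set_state(Gst.State.PLAYING)
try:
    loop.run()
finally:
    pipeline.set_state(Gst.State.NULL)
```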
However, we observe intermittent visual artifacts in the RTSP output, which appear sporadically and disappear after a short duration, only to reoccur. This issue persists despite normal performance in other scenarios. Reducing the number of active streams reduces the artifact frequency, indicating a resource or configuration bottleneck.
With the expensive Triton code commented out, everything runs fine.
We have three years of experience with the same RTSP setup; this is not network related.
The GPU load is somewhat high, which can cause GstBuffers to be consumed late at times. Delayed GstBuffer consumption can fill the source element's queue (in this case the source is the RTSP client), which can lead to packet loss. Can you try other models with a lower GPU load?
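One way to check whether this back-pressure is happening is to put a bounded, leaky queue right after decoding and watch its fill level; if it sits at its limit while the artifacts appear, buffers are being consumed late. A rough sketch (the queue name and limits are just examples, and it assumes a GLib main loop is already running for the pipeline):

```python
from gi.repository import GLib

def watch_queue(pipeline, name="decode_q", interval_s=1):
    """Periodically print the fill level of a named queue element.

    Assumes the pipeline contains something like
    "queue name=decode_q leaky=downstream max-size-buffers=30"
    immediately after the decoder.
    """
    q = pipeline.get_by_name(name)

    def _log():
        print("queue fill: %d buffers, %.0f ms" % (
            q.get_property("current-level-buffers"),
            q.get_property("current-level-time") / 1e6))
        return True  # keep the periodic timeout alive

    GLib.timeout_add(int(interval_s * 1000), _log)
```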
Our model is not a neural network; it is a model with expensive GPU computation, written as a Triton python-backend. My question is: with this expensive GPU computation in the mix, will I be able to run the DeepStream pipeline without artifacts by running it in parallel in a separate process or in a different container?
It is not the expensive GPU computation itself but the delayed GstBuffer consumption that causes the artifacts. If you don't want this expensive python-backend computation to impact the pipeline, you need to make sure the Triton processing has nothing to do with the GStreamer pipeline.
You need to do the RTSP stream receiving, the Triton inferencing, and the RTSP send-out in different threads or processes, and the buffering between the stages should be large enough.
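For example, something along these lines (a minimal sketch with placeholder stages and illustrative queue sizes, not a complete implementation): each stage runs in its own process and only talks to its neighbours through deep queues, so a slow Triton call stalls the inference stage rather than the RTSP receiver or sender.

```python
import multiprocessing as mp
import queue
import time

def capture_frame():           # placeholder for RTSP receive + decode
    time.sleep(1 / 25)
    return b"frame"

def run_inference(frame):      # placeholder for the Triton python-backend call
    time.sleep(0.1)            # simulate the expensive GPU computation
    return {"boxes": []}

def render_and_send(overlay):  # placeholder for overlay + encode + RTSP out
    pass

def rtsp_receiver(frames_out):
    while True:
        try:
            frames_out.put_nowait(capture_frame())  # never block the receiver
        except queue.Full:
            pass                                    # drop a frame instead

def triton_worker(frames_in, overlays_out):
    while True:
        overlays_out.put(run_inference(frames_in.get()))

def rtsp_sender(overlays_in):
    while True:
        render_and_send(overlays_in.get())

if __name__ == "__main__":
    frames = mp.Queue(maxsize=300)    # deep buffers absorb inference jitter
    overlays = mp.Queue(maxsize=300)
    for target, args in [(rtsp_receiver, (frames,)),
                         (triton_worker, (frames, overlays)),
                         (rtsp_sender, (overlays,))]:
        mp.Process(target=target, args=args, daemon=True).start()
    time.sleep(60)                    # run for a minute in this sketch
```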
I have done something similar.
In fact, for the AI I use a separate container that just decodes video at 10 fps and feeds those 10 fps to the Triton inference server.
That model then sends the overlays as a pickled byte stream to the other container where rtsp-in-rtsp-out runs.
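The overlay hand-off between the containers is roughly like the sketch below (the TCP transport and length-prefix framing here are only illustrative; the actual payload fields differ):

```python
import pickle
import socket
import struct

def send_overlay(sock, overlay):
    """Pickle an overlay dict and send it length-prefixed over TCP."""
    payload = pickle.dumps(overlay)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_overlay(sock):
    """Read one length-prefixed pickled overlay from the socket."""
    header = sock.recv(4, socket.MSG_WAITALL)
    (length,) = struct.unpack("!I", header)
    return pickle.loads(sock.recv(length, socket.MSG_WAITALL))
```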
But I still see artifacts in the RTSP output when the utilisation is high.
The DeepStream SDK is just an SDK that provides GPU-accelerated video/audio inferencing and processing functions.
Currently the nvinferserver performance is not smooth, while you want it to work with a constant-frame-rate live stream input. You need to guarantee that the buffering between the decoding/sending container and the Triton inference server is large enough to cover the fluctuation in inferencing performance, which may introduce extra delay between the RTSP decoding container and the Triton inference server.
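As a rough illustration of how large "large enough" needs to be (the numbers below are only examples): the buffer between the decoding container and Triton has to hold at least frame_rate × worst-case extra latency frames, otherwise frames back up into the decoder and ultimately into the RTSP source.

```python
import math

def required_buffer_depth(frame_rate_fps, worst_case_latency_s, typical_latency_s):
    """Frames of headroom needed to ride out an inference latency spike."""
    extra = max(0.0, worst_case_latency_s - typical_latency_s)
    return math.ceil(frame_rate_fps * extra)

# e.g. 10 fps into Triton, inference that is usually 80 ms but spikes to 2 s:
print(required_buffer_depth(10, 2.0, 0.08))   # -> 20 frames of headroom
```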
I don't understand your setup. The rtsp-in-rtsp-out sample is not suitable for your case. You mentioned that there is a separate video decoding container; what does this video decoding container do? Is the RTSP protocol stack also included in the video decoding container?