Intermittent Artifacts in DeepStream RTSP Output with Dynamic Multi-Stream Video Analytics Using Triton Inference Server (Python Backend)

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 7.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version: 535.86.10
• Issue Type: Question

We are developing a real-time video analytics application using the NVIDIA DeepStream SDK, designed to process multiple RTSP camera streams (1920x1080) from various sources. The system dynamically manages streams using the new nvstreammux plugin and incorporates a Python-based AI processing pipeline via nvinferserver. This AI pipeline leverages CuPy and TensorFlow for high-resolution object detection. The final processed video streams are encoded and transmitted via mediamtx as RTSP outputs.

However, we observe intermittent visual artifacts in the RTSP output: they appear sporadically, disappear after a short duration, and then recur. Performance is otherwise normal, but reducing the number of active streams reduces the artifact frequency, which points to a resource or configuration bottleneck.

The artifact issue occurs when we extract the batch for analytics using the Triton Inference Server Python backend (GitHub - triton-inference-server/python_backend: a Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python).
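For reference, the batch extraction in the Python backend follows the usual TritonPythonModel pattern; the sketch below is a simplified stand-in for src/driver_triton.py, and the tensor names, shapes, and dummy output are placeholders rather than the actual code.

```python
# model.py -- minimal Triton python_backend sketch (simplified stand-in, not
# the actual src/driver_triton.py). Tensor names "IMAGE"/"OVERLAY" and the
# dummy output are hypothetical placeholders.
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Extract the batch sent by nvinferserver, e.g. shape [10, 1080, 1920, 3].
            frames = pb_utils.get_input_tensor_by_name(request, "IMAGE").as_numpy()

            # ... the expensive GPU computation (CuPy / TensorFlow) would run here ...
            result = np.zeros((frames.shape[0], 4), dtype=np.float32)  # dummy output

            out = pb_utils.Tensor("OVERLAY", result)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```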

3. Pipeline & Processing Details

3.1 Data Flow

  1. Stream Ingestion: uridecodebin receives RTSP input.
  2. Frame Rate Normalization: Frames are synchronized to a common framerate.
  3. Stream Multiplexing: nvstreammux dynamically batches frames.
  4. Demultiplexing: nvstreamdemux splits streams for per-stream processing.
  5. Encoding: nvv4l2encoder encodes frames.
  6. RTSP Transmission: Encoded streams are streamed via rtspclientsink.
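A rough sketch of this flow with Gst.parse_launch is shown below (two sources only; the element properties, encoder choice, and output paths are illustrative assumptions, not the exact application code).

```python
# Sketch of the pipeline topology described above (illustrative values only).
# The new nvstreammux is selected via the env var USE_NEW_NVSTREAMMUX=yes.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

pipeline = Gst.parse_launch(
    "uridecodebin uri=rtsp://cam1/stream ! videorate ! mux.sink_0 "
    "uridecodebin uri=rtsp://cam2/stream ! videorate ! mux.sink_1 "
    "nvstreammux name=mux batch-size=2 ! "
    "nvinferserver config-file-path=config_triton.txt ! "
    "nvstreamdemux name=demux "
    "demux.src_0 ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! "
    "rtspclientsink location=rtsp://localhost:8554/out0 "
    "demux.src_1 ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! "
    "rtspclientsink location=rtsp://localhost:8554/out1"
)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```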

5. Observations & Issues

5.1 Artifact Characteristics

  • Appearance: Frame corruption.
  • Affected Streams: All RTSP channels.
  • MP4 Files: Stuttering, but no artifacts.
  • Mitigation Attempts: Reducing batch size, frame rate, or resolution reduces artifacts.

  • Timing: Artifacts occur randomly, not specifically during stream addition/removal.

The code is available in src/driver_triton.py.

Please help us understand and fix the issue.

This is caused by Ethernet packet loss. You may need to identify whether it happens with the input RTSP sources or with the RTSP output.

No @Fiona.Chen,

With the expensive Triton code commented out, everything runs fine.
We have 3 years of experience with the same RTSP setup; it is not network related.

The image you posted here and the description that follows show that Ethernet packet loss happened while the RTSP payload was being transferred.

Have you set a large enough delay (latency) on your RTSP client? Are you using TCP as the low-level protocol for the RTP payload?
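For reference, both can be configured on the rtspsrc that uridecodebin creates internally, via its source-setup signal; the values below are only examples.

```python
# Example only: force RTP-over-TCP and add jitter-buffer latency on the
# rtspsrc created inside uridecodebin. The latency value is illustrative.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

def on_source_setup(bin_, source):
    """Configure the internal rtspsrc before it starts."""
    if source.get_factory().get_name() == "rtspsrc":
        source.set_property("latency", 2000)                 # ms of jitter buffering
        Gst.util_set_object_arg(source, "protocols", "tcp")  # RTP over RTSP/TCP

src = Gst.ElementFactory.make("uridecodebin", "src")
src.set_property("uri", "rtsp://cam1/stream")
src.connect("source-setup", on_source_setup)
```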

I see your model has a [1080, 1920, 3] input dimension and the batch size is 10. What is your GPU?

3070 Ti, 3050 Ti, A2000 Ada, etc.; we have checked on all of them.

Also, please check the CPU and GPU load when you run with Triton inferencing.

CPU is at 10%, and GPU spikes to 80-90%, sometimes even to 95%.

The GPU load is a little high, which may cause the GstBuffer to be consumed late at times. The delayed GstBuffer consumption may fill the source element's queue (in this case, the source is the RTSP client), which may cause packet loss. Can you try with other models whose load is lower?
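One common way to keep a slow consumer from backpressuring the RTSP source is a leaky queue right after decoding, so that late frames are dropped instead of stalling rtspsrc; the sketch below only illustrates the idea, with example values.

```python
# Sketch: a leaky queue placed after the decoder drops old buffers when the
# downstream (inference/encode) path falls behind, instead of letting the
# rtspsrc receive queue fill up. Property values are illustrative.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

decode_queue = Gst.ElementFactory.make("queue", "decode_queue")
decode_queue.set_property("leaky", 2)              # 2 = leak downstream (drop oldest)
decode_queue.set_property("max-size-buffers", 30)  # cap depth in buffers
decode_queue.set_property("max-size-bytes", 0)     # disable byte limit
decode_queue.set_property("max-size-time", 0)      # disable time limit
```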

Our model is not a neural-network-based model but one with expensive GPU computation, written in the Python backend. My question now is: with this expensive GPU computation in the mix, will I be able to run the DeepStream pipeline without artifacts, in parallel in a separate process or in a different container?

It is not the expensive GPU computation itself but the pending GstBuffer consumption that causes the artifacts. If you don't want this expensive GPU computation in the Python backend to impact the pipeline, you need to make sure the Triton processing has nothing to do with the GStreamer pipeline.

How can I do that? How can I separate the two contexts so that there is no interference between Triton and the video streamer?

Ajithkumar A K

You need to do the RTSP stream receiving, the Triton inferencing, and the RTSP send-out in different threads or processes, and the buffering between the threads should be large enough.
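A minimal sketch of that separation is shown below; the stage bodies are dummy stand-ins and the queue depths are illustrative, not tuned values.

```python
# Minimal sketch: RTSP receive, Triton inference and RTSP send-out in separate
# processes, decoupled by bounded queues. Stage bodies are dummy stand-ins and
# the queue depths are illustrative.
import multiprocessing as mp
import random
import time

def receive_stage(out_q):
    """Stand-in for the RTSP receive/decode process."""
    i = 0
    while True:
        out_q.put(f"frame-{i}")        # placeholder for a decoded frame
        i += 1
        time.sleep(0.1)                # ~10 fps

def inference_stage(in_q, out_q):
    """Stand-in for the Triton inferencing process."""
    while True:
        frame = in_q.get()
        time.sleep(random.uniform(0.05, 0.15))  # simulate fluctuating latency
        out_q.put(f"overlay-for-{frame}")

def send_stage(in_q):
    """Stand-in for the encode + RTSP send-out process."""
    while True:
        print("sending", in_q.get())

if __name__ == "__main__":
    decoded_q = mp.Queue(maxsize=300)  # must absorb worst-case inference jitter
    results_q = mp.Queue(maxsize=300)
    stages = [(receive_stage, (decoded_q,)),
              (inference_stage, (decoded_q, results_q)),
              (send_stage, (results_q,))]
    for target, args in stages:
        mp.Process(target=target, args=args, daemon=True).start()
    time.sleep(10)  # let the demo run briefly
```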

I have done something similar. In fact, for AI I use a separate container that just decodes video at 10 fps and feeds those 10 fps to the Triton inference server.
That model then sends the overlays as a pickled byte stream to the other container where rtsp-in-rtsp-out runs, roughly as sketched below.
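```python
# Sketch of the overlay hand-off between containers: length-prefixed pickled
# overlay records over a plain TCP socket (the real transport and the overlay
# structure in our code are simplified away here).
import pickle
import socket
import struct

def send_overlay(sock: socket.socket, overlay) -> None:
    payload = pickle.dumps(overlay)
    sock.sendall(struct.pack("!I", len(payload)) + payload)  # 4-byte length prefix

def recv_overlay(sock: socket.socket):
    (length,) = struct.unpack("!I", sock.recv(4))
    data = b""
    while len(data) < length:
        data += sock.recv(length - len(data))
    return pickle.loads(data)
```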

But I still see artifacts in the RTSP output when the utilization is high.

Please let me know what to do.

The DeepStream SDK is just an SDK that provides GPU-accelerated video/audio inferencing and processing functions.

Currently, nvinferserver performance is not perfectly smooth. If you want it to work with a constant-frame-rate live stream input, you need to guarantee that the buffering between the decoding/sending container and the Triton inference server is large enough to cover the inferencing performance fluctuation, which can introduce extra delay between the RTSP decoding container and the Triton inference server.
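As a rough sizing guide (the numbers are illustrative, not measured from your setup):

```python
# Rough sizing of the buffer between the decoding container and Triton
# (numbers are illustrative, not measured values from this setup).
fps = 10                     # frames per second fed to Triton
worst_extra_delay_s = 5.0    # worst-case extra inference delay observed
min_buffer_frames = int(fps * worst_extra_delay_s)
print(f"buffer at least {min_buffer_frames} frames between decode and Triton")
```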

I don’t understand it. The rtsp-in-rtsp-out sample is not suitable for your case. You have mentioned that there is a separate video decoding container; what does this video decoding container do? Is the RTSP protocol stack also included in the video decoding container?