• Hardware Platform: GPU
• DeepStream Version: 8.0
• NVIDIA GPU: RTX 5090
Pipeline description (DeepStream)
I am running a DeepStream multi-camera pipeline with pre-decode recording inside nvurisourcebin.
Main inference pipeline
nvurisourcebin (x N sources)
→ nvstreammux
→ nvstreamdemux
→ nvinfer
→ nvmetamux
- nvstreammux batches all sources
- nvstreamdemux splits per-stream
- Inference and metadata aggregation work correctly even with 60 cameras
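For context, the source-to-mux wiring looks roughly like the sketch below. This is minimal and illustrative only: the batch size, resolution, timeout, URIs and callback name are placeholders, not the actual application code.

```python
# Minimal wiring sketch: how each nvurisourcebin is linked to nvstreammux.
# Property values, URIs and the callback name are illustrative.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.Pipeline.new("multi-cam-pipeline")

streammux = Gst.ElementFactory.make("nvstreammux", "mux")
streammux.set_property("batch-size", 60)                # one slot per camera (placeholder)
streammux.set_property("width", 1920)                   # placeholder resolution
streammux.set_property("height", 1080)
streammux.set_property("batched-push-timeout", 40000)   # placeholder timeout (us)
pipeline.add(streammux)

def on_src_pad_added(src_bin, pad, mux_sink_pad):
    # Link the video output pad of nvurisourcebin to its reserved mux sink pad
    caps = pad.get_current_caps() or pad.query_caps(None)
    if caps.get_structure(0).get_name().startswith("video"):
        pad.link(mux_sink_pad)

camera_uris = [f"rtsp://camera-{i:02d}/stream" for i in range(60)]  # placeholder URIs
for i, uri in enumerate(camera_uris):
    src_bin = Gst.ElementFactory.make("nvurisourcebin", f"source-{i}")
    src_bin.set_property("uri", uri)
    pipeline.add(src_bin)
    mux_sink_pad = streammux.request_pad_simple(f"sink_{i}")
    src_bin.connect("pad-added", on_src_pad_added, mux_sink_pad)

# Downstream wiring (nvstreamdemux → nvinfer → nvmetamux) omitted for brevity.
```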
Recording branch (inside nvurisourcebin)
Inside each nvurisourcebin, I added a pre-decode tee to record the original RTSP stream:
rtspsrc
→ tee_rtsp_pre_decode
    ├── queue
    │     → h264parse
    │     → splitmuxsink (record original stream)
    │
    └── decode
          → nvstreammux (main pipeline)
- Recording is done before decode
- splitmuxsink is used to record video
- The same recording pipeline works perfectly when run as a standalone GStreamer pipeline (60 cameras OK)
I also set the queue element's size limits to unlimited (a simplified sketch of how the recording branch is attached follows below):
- queue.set_property("max-size-buffers", 0)
- queue.set_property("max-size-time", 0)
- queue.set_property("max-size-bytes", 0)
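For reference, here is a simplified sketch of how the recording branch is attached to the pre-decode tee. The helper name, segment duration, output path and the async-finalize setting are illustrative rather than my exact code; how the tee is located inside nvurisourcebin is application-specific.

```python
# Simplified sketch: attach queue → h264parse → splitmuxsink to the
# pre-decode tee inside one source bin. Names and settings are placeholders.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def attach_recording_branch(source_bin: Gst.Bin, tee: Gst.Element, cam_id: int) -> None:
    """Link tee → queue → h264parse → splitmuxsink for one camera."""
    queue = Gst.ElementFactory.make("queue", f"rec-queue-{cam_id}")
    parser = Gst.ElementFactory.make("h264parse", f"rec-parse-{cam_id}")
    recorder = Gst.ElementFactory.make("splitmuxsink", f"rec-sink-{cam_id}")

    # Unbounded queue, as described above
    queue.set_property("max-size-buffers", 0)
    queue.set_property("max-size-time", 0)
    queue.set_property("max-size-bytes", 0)

    # Placeholder recording settings: 10-minute MP4 segments per camera
    recorder.set_property("location", f"/data/recordings/cam{cam_id}_%05d.mp4")
    recorder.set_property("max-size-time", 10 * 60 * Gst.SECOND)
    recorder.set_property("async-finalize", True)

    for element in (queue, parser, recorder):
        source_bin.add(element)
        element.sync_state_with_parent()

    # Request a new src pad from the pre-decode tee and link the branch
    tee_src = tee.request_pad_simple("src_%u")
    tee_src.link(queue.get_static_pad("sink"))
    queue.link(parser)
    parser.link(recorder)
```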
Observed issue
- With 30 cameras:
  - Recording works correctly
  - No frame drop or freeze
- With 60 cameras:
  - Recording branch stalls
  - Output video files stop growing
  - Missing segments / frozen recordings
  - Inference pipeline continues to run normally
Question
- Is this a known limitation when adding a pre-decode recording branch inside nvurisourcebin at high source counts?
- Can backpressure from nvstreammux / the global DeepStream clock propagate upstream and affect the pre-decode tee branch?
- Is there a recommended architecture for large-scale pre-decode recording (e.g. using appsink, an external recorder pipeline, or a separate process)?
- Are there specific properties in nvurisourcebin / rtspsrc that should be set to avoid this behavior at scale?
