Multiple RTSP with multiple nvinfer

hritik.shah · June 2, 2026, 11:59am

Hardware Platform: Jetson Orin Nano 8GB
DeepStream Version: 7.1
JetPack Version: 6.1 (L4T R36.4)
TensorRT Version: 10.3
Issue Type: Optimization Question

I have multiple RTSP sources and multiple services (CV models). My goal is to maximize the number of cameras and services I can run on a single Jetson.

This is what I am doing:
rtspsrc → nvv4l2decoder → tee ─┬→ service_mux_A (batch=N) → nvinfer_A → probe
                               └→ service_mux_B (batch=N) → nvinfer_B → probe

1 decoder per camera (shared across all services on that camera)
1 nvinfer per service type (shared across all cameras)
nvstreammux batch-size capped at 1 (TRT engines built with maxBatchSize=1)
CPU RGBA output from nvvideoconvert (no NVMM) to avoid VIC exhaustion
New camera hot-adds a decoder and connects its tee to all active service muxes simultaneously

Can this be optimized in any way to get better throughput?

Fiona.Chen · June 3, 2026, 10:12am

Do you have multiple RTSP streams (cameras) to be added to the inference pipeline dynamically? The sample /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-server can handle such a case. Do you know the maximum number of streams (cameras)?

What is the relationship between the two models “nvinfer_A” and “nvinfer_B”? If both models infer on all input video streams, a single “nvstreammux” is enough.

Where and how do you implement “CPU RGBA output from nvvideoconvert (no NVMM)” in the pipeline?

What do you do in the “probe”?

What do you mean by “get a better throughput”? The FPS value?

hritik.shah · June 4, 2026, 4:44am

Yes, I have multiple RTSP streams to be added dynamically.

I don't know the maximum number of streams. That is what I want to find out: I want the maximum possible number of streams.

The two models are different, and the number of streams each of them infers on is also dynamic. I add and remove streams at runtime.

This is how I use nvvideoconvert:
rtspsrc → nvv4l2decoder(num-extra-surfaces=0) → tee
tee → queue(leaky, max=2) → nvstreammux(batch=N) → nvinfer → nvvideoconvert(compute-hw=1) → capsfilter(video/x-raw,RGBA)

By better throughput I mean the number of streams I can add.

Also:

How does batch_size in the infer config affect the number of streams the engine can handle?
I had a “failed in mem copy” issue, which was fixed by using copy-hw=1, scaling-compute-hw=1. Is this a correct fix?
I then rebuilt the engines with batch_size=32. In testing, up to 32-33 RTSP streams were added on the same infer, and then I got this error: “libnvrm_gpu.so: NvRmGpuLibOpen failed, error=6”. So is batch_size the maximum number of streams I can add? If yes, can I change the batch_size at runtime without having to rebuild the engine before the run?

Fiona.Chen · June 4, 2026, 6:33am

From the DeepStream pipeline point of view, the maximum number of streams it can support depends on the slowest part of the pipeline. You need to find the bottleneck in your pipeline yourself.

The Jetson Orin Nano hardware decoder capability is listed on the NVIDIA page “Jetson AGX Orin for Next-Gen Robotics | NVIDIA”.
The performance of the model TRT engine at different batch sizes can be measured with the TensorRT tool “trtexec”.
What did you do with the RGBA data after “nvvideoconvert”?
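
The point that the slowest stage bounds the stream count can be made concrete with a small calculation. The sketch below is only an illustration: the stage names and every capacity figure are hypothetical placeholders, not measurements from an Orin Nano. Real values would have to come from the decoder specification, trtexec runs and timing of the probe.

```python
import math

def max_streams(stage_capacity_fps, stream_fps):
    """Return (stream_count, limiting_stage) for a linear pipeline.

    stage_capacity_fps maps a stage name to the total frames per second
    that stage can sustain across all streams combined.
    """
    # The stage with the lowest total capacity is the bottleneck.
    limiting_stage = min(stage_capacity_fps, key=stage_capacity_fps.get)
    count = math.floor(stage_capacity_fps[limiting_stage] / stream_fps)
    return count, limiting_stage

# Hypothetical capacities, for illustration only.
capacities = {
    "decode": 600.0,
    "nvinfer_A": 400.0,
    "nvinfer_B": 150.0,
    "probe": 900.0,
}
count, stage = max_streams(capacities, stream_fps=20.0)
print(count, stage)
```

Stages that share the GPU (for example the two models) are not independent of each other, so a figure computed this way is best read as an upper bound.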

Can you explain this more clearly? The streams will be added and removed dynamically, but we want to know whether the two models infer on exactly the same streams at the same moment. For example, when 5 streams have been added to the pipeline, will model A infer on streams 1, 2, 3 while model B infers on streams 3, 4, 5? Or will both model A and model B infer on streams 1, 2, 3, 4, 5?

The nvinfer batch size is the batch size of the TensorRT model engine. If your model is built as a batch size 32 engine, the engine can infer on at most 32 frames at a time. If you build a batch size 1 engine, you need to run the engine 32 times for 32 frames. For most models we have tried, inferring 32 frames once with a batch size 32 engine is faster than inferring 32 frames with a batch size 1 engine 32 times. We don't know your models, so you may need to measure them yourself.
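
The batch-size arithmetic above can be written down as a short sketch. The function names are made up for illustration, and the latency values passed to `throughput_fps` are hypothetical; in practice they would be the per-execution latencies reported by trtexec for each engine.

```python
import math

def engine_executions(num_frames, engine_batch_size):
    """Number of engine runs needed to infer num_frames frames."""
    if engine_batch_size < 1:
        raise ValueError("engine_batch_size must be at least 1")
    return math.ceil(num_frames / engine_batch_size)

def throughput_fps(engine_batch_size, latency_ms):
    """Frames per second implied by one full-batch execution latency."""
    return engine_batch_size * 1000.0 / latency_ms

print(engine_executions(32, 32))  # one run of a batch size 32 engine
print(engine_executions(32, 1))   # 32 runs of a batch size 1 engine
print(engine_executions(33, 32))  # a 33rd frame needs a second run

# Hypothetical latencies, for illustration only.
print(throughput_fps(32, latency_ms=80.0))
print(throughput_fps(1, latency_ms=5.0))
```

Comparing the two `throughput_fps` results for measured latencies shows whether the larger batch actually pays off for a given model.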

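On the “failed in mem copy” fix with copy-hw=1 and scaling-compute-hw=1: it works.

On whether batch_size is the maximum number of streams you can add: no. As I have explained, the maximum number of streams depends on your pipeline.

On changing the batch_size at runtime without rebuilding the engine: no.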

hritik.shah · June 4, 2026, 6:51am

Yes, both models infer on exactly the same streams at the same moment. Both model A and model B will infer on streams 1, 2, 3, 4, 5.

Yes, but I got the error “libnvrm_gpu.so: NvRmGpuLibOpen failed, error=6” exactly when the 33rd stream was added with the batch_size=32 engine, for both models tested separately, and the RAM was not actually exhausted. My models are Yolov11n and RF-DETRs.

What does the error “libnvrm_gpu.so: NvRmGpuLibOpen failed, error=6” mean?
If I load the streams beyond the first 32 on another infer of the same model, will that work and increase the number of streams?
How will the config parameter “interval” change the inference in my use case, and what is the best suggested interval, considering all my streams run at 20 fps?

Fiona.Chen · June 5, 2026, 3:07am

Please use only one nvstreammux for your case.
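
One possible way to arrange a single-mux pipeline is sketched below as a gst-launch style description built in Python. This is an assumption about the layout, not something spelled out in this thread: it chains the two nvinfer elements in series after one nvstreammux, assumes H.264 cameras, and uses placeholder URIs, config file names and resolution. The two nvinfer config files would also need distinct gie-unique-id values.

```python
def build_pipeline(uris, config_a, config_b, width=1280, height=720):
    """Build a pipeline description with one nvstreammux and two nvinfer."""
    n = len(uris)
    parts = [
        # One shared muxer; batch-size follows the number of cameras.
        f"nvstreammux name=mux batch-size={n} width={width} height={height} "
        f"live-source=1 ! "
        # Both models see every batch, one after the other.
        f"nvinfer config-file-path={config_a} ! "
        f"nvinfer config-file-path={config_b} ! "
        f"fakesink sync=false"
    ]
    for i, uri in enumerate(uris):
        # One decoder per camera, linked to its own muxer sink pad.
        parts.append(
            f"rtspsrc location={uri} ! rtph264depay ! h264parse ! "
            f"nvv4l2decoder ! mux.sink_{i}"
        )
    return " ".join(parts)

# Hypothetical camera URIs and config names.
pipeline = build_pipeline(
    ["rtsp://cam0/stream", "rtsp://cam1/stream"],
    "config_infer_A.txt",
    "config_infer_B.txt",
)
print(pipeline)
```

The resulting string is meant as a starting point for experiments, for example with gst-launch-1.0 or Gst.parse_launch. Adding and removing cameras at runtime needs application code that requests and releases muxer pads, which this sketch does not cover.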

It looks like file descriptors (fd) are being exhausted. Please try “ulimit -n 4096”.
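
If the pipeline is launched from a Python application, the same limit can be inspected and raised from inside the process with the standard `resource` module. This is a sketch of an in-process equivalent of “ulimit -n 4096”, with made-up helper names. It assumes a Linux system, and it does not by itself confirm that file descriptors are the cause of the error.

```python
import os
import resource

def raise_nofile_limit(target=4096):
    """Raise the soft open-file limit to target, capped by the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Never lower a limit that is already high enough or unlimited.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]

def open_fd_count():
    """Count descriptors open in this process, or None if /proc is missing."""
    fd_dir = "/proc/self/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))

new_soft = raise_nofile_limit(4096)
print("soft limit:", new_soft, "open fds:", open_fd_count())
```

Logging `open_fd_count()` each time a camera is added would show whether the count is approaching the soft limit around the 33rd stream.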

Do you mean adding another pipeline? As I have said, the pipeline capability is decided by its slowest part. If the second pipeline shares the same resources, nothing will change.

The “interval” parameter skips inference on some batches. If the bottleneck is the GPU load of your models, it may help to improve throughput. Every model is different, and TensorRT engines built with different batch sizes for the same model behave differently. The same model also performs differently on different GPUs. The right value depends on your pipeline and GPU load, so you need to measure it yourself.
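
The effect of “interval” on GPU load can be estimated with simple arithmetic. The sketch below assumes the documented nvinfer meaning of interval as the number of consecutive batches skipped between inferences, so interval=k means every (k+1)th batch is inferred. The helper names are made up for illustration, and the stream count is only an example.

```python
def inferred_fps(stream_fps, interval):
    """Per-stream rate at which frames actually reach the model."""
    if interval < 0:
        raise ValueError("interval must be non-negative")
    return stream_fps / (interval + 1)

def gpu_frames_per_second(num_streams, stream_fps, interval):
    """Total frames per second one model has to process."""
    return num_streams * inferred_fps(stream_fps, interval)

# 20 fps streams, as in the question above.
for interval in (0, 1, 3):
    per_stream = inferred_fps(20, interval)
    total = gpu_frames_per_second(32, 20, interval)
    print(interval, per_stream, total)
```

Frames in skipped batches get no fresh detections from that model, so the trade-off is between GPU load and how stale the results are allowed to be.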

yingliu · June 23, 2026, 5:56am

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
