Jetson Orin NX (16GB) + DeepStream 7.1 + Ultralytics YOLO11 - 20-25 second latency between the source RTSP feeds and the processed output

We’re developing a real-time restaurant monitoring system using a Jetson Orin NX 16GB that processes 8 RTSP feeds from Reolink cameras. Our application extracts critical metadata including:

  • License plate detection and recognition

  • People counting and tracking

  • Zone-based time tracking (customer dwell time)

  • Parking space occupancy detection

The processed data needs to be displayed on a real-time dashboard with less than 2-second latency for operational decision-making.

Setup Details:

  • Hardware Platform: Jetson Orin NX 16GB

  • DeepStream SDK: 7.1

  • JetPack Version: 6.2.1+b38

  • TensorRT Version: 10.7.0.23-1+cuda12.6

  • Operating System: Ubuntu 22.04 (via JetPack 6.2)

  • Cameras: 8x Reolink cameras via RTSP

  • Model Format: FP16 precision

  • Approaches Tried:

    1. Docker container with TensorRT optimization

    2. Native DeepStream pipeline implementation

Problem

We’re seeing approximately 15-20 seconds of latency between the source RTSP feeds and the processed output, which makes real-time monitoring impossible. The delay is consistent across both our Docker/TensorRT and DeepStream implementations. The script also sometimes stops.

Please see the attached zip folder to understand the architecture. “deepstream_app.py” is the main file.

deepstream_setup.zip (12.8 KB)

Requirements

  • Maximum acceptable latency: 2 seconds end-to-end

  • System runs continuously (15+ hours daily)

  • All 8 camera feeds must be processed simultaneously

Questions

  1. What are the recommended DeepStream configurations for minimizing RTSP latency with multiple streams?

  2. Should we consider different power modes (MAXN vs 15W) for better real-time performance?

  3. Is the Orin NX 16GB capable of handling 8 simultaneous feeds with our latency requirements, or should we consider distributing the load?

  4. Do we need to replace our cameras with ones connected directly to the Jetson device? Any suggestions for the camera?

I just need suggestions from an expert on improving the architecture, so that the dashboard shows real-time data with less than 2 seconds of latency.

1&2. From the code, the “batched-push-timeout” setting is wrong. It should be “1000000/max_fps”; for example, if the max fps is 25, it should be 40000. Please also set a high power mode for better inference performance. Please refer to this faq.
3. Could you simplify the pipeline to narrow down this issue? For example, if using “source-> nv3dsink”, does the latency issue persist for each camera? If using “source-> nvstreammux -> nv3dsink”, does the latency issue persist?
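
As a rough sketch of the two suggestions above, the snippet below computes “batched-push-timeout” as 1000000/max_fps (microseconds per frame) and builds the two reduced test pipelines as gst-launch-1.0 description strings. The helper names, the placeholder camera URL, the mux resolution, batch-size=1, live-source=1, and the use of uridecodebin as the source are all assumptions for illustration; none of them come from the attached code.

```python
def batched_push_timeout_us(max_fps: int) -> int:
    """Return 1000000 / max_fps, i.e. one frame interval in microseconds."""
    if max_fps <= 0:
        raise ValueError("max_fps must be positive")
    return 1_000_000 // max_fps


def source_only_pipeline(uri: str) -> str:
    """Test 1: source -> nv3dsink (no muxer, no inference).

    uridecodebin is an assumed stand-in for the RTSP source.
    """
    return f"uridecodebin uri={uri} ! nv3dsink"


def source_mux_pipeline(uri: str, max_fps: int,
                        width: int = 1920, height: int = 1080) -> str:
    """Test 2: source -> nvstreammux -> nv3dsink for a single camera.

    width/height, batch-size=1 and live-source=1 are assumed values.
    """
    timeout = batched_push_timeout_us(max_fps)
    return (
        f"nvstreammux name=mux batch-size=1 width={width} height={height} "
        f"live-source=1 batched-push-timeout={timeout} ! nv3dsink "
        f"uridecodebin uri={uri} ! mux.sink_0"
    )


if __name__ == "__main__":
    camera = "rtsp://CAMERA_ADDRESS/STREAM_PATH"  # hypothetical placeholder
    print("gst-launch-1.0", source_only_pipeline(camera))
    print("gst-launch-1.0", source_mux_pipeline(camera, max_fps=25))
```

Running each printed command against one camera at a time should show whether the delay already exists before the muxer and inference stages are added.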

Thanks for the suggestions, @fanzh. I have updated the code, but the detection is not stable yet. Also, after the code has been running for a while, it shows warnings like “Do Not Touch the Hot surface”. Is that a problem? We have to run it for 15 hours each day. Also, sometimes the frame is green or black. Is that caused by the RTSP feed? Should we use cameras that are directly connected to the Jetson? Would that give better performance?

If you don’t mind, I sent you a message. Could you please check that as well?

  1. “Do Not Touch the Hot surface” is not a DeepStream log. Could you provide a screenshot? Is it a warning from the system or from a third-party application?
  2. Regarding “the frame is green or black”, it should be related to RTSP receiving. You can use “source-> nvstreammux -> nv3dsink” to test this specifically. Since UDP is better for transmitting video, please set ‘protocols’ to 7, which means “UDP/UDP-MCAST/TCP”.
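
To show why the value 7 means “UDP/UDP-MCAST/TCP”, here is a minimal sketch that treats ‘protocols’ as a bitmask. The bit values are assumed to mirror GStreamer's RTSP lower-transport flags (UDP = 1, UDP-MCAST = 2, TCP = 4); the class and function names are made up for this example.

```python
from enum import IntFlag


class LowerTransport(IntFlag):
    """Assumed bit values for the RTSP lower-transport flags."""
    UDP = 0x1
    UDP_MCAST = 0x2
    TCP = 0x4


def describe(value: int) -> str:
    """List the transports enabled in a 'protocols' bitmask."""
    return "/".join(flag.name for flag in LowerTransport if flag & value)


# OR-ing the three flags together gives the value suggested above.
ALL_THREE = LowerTransport.UDP | LowerTransport.UDP_MCAST | LowerTransport.TCP

if __name__ == "__main__":
    print(int(ALL_THREE), describe(ALL_THREE))
    print(4, describe(4))  # TCP only, for comparison
```

With all three bits set, the source is allowed to use any of the three transports rather than being restricted to one.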

Sorry for the late reply. Is this still a DeepStream issue to support? Thanks!
Do Not Touch the Hot surface is a warning for high temperature. Please refer to this topic for a fix.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.