Sudden high latency in DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) T4
• DeepStream Version 7.0
• JetPack Version (valid for Jetson only)
• TensorRT Version: as per the DeepStream 7.0 installation guide
• NVIDIA GPU Driver Version (valid for GPU only) 535
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
I tried my pipeline with 2 nvinfer elements and ran into this problem. Can you help me with it?
Most of the time the latency jumps from 20 ms to 200 ms for more than 1 minute, even when nothing changes in the RTSP input (the picture shows the same content), and I don't know why. Can you help me with that?
(Nothing changes in the video or the pipeline; it just happens at a different time.)





The other question is: why does it only use about 70% of the GPU when the inference time is high?

Can you tell us your complete pipeline and the parameters you set with the pipeline?

I sent a pipeline graph generated from the GStreamer dot file.

How did you send the dot file? I can’t get it

I converted it to a PNG.

I even checked with a simpler pipeline and I still have this issue!


Please see this txt file:
m.heydari - Mon Apr 14 2025 14_44_03 GMT+0330 (Iran Standard Time).txt (60.1 KB)

  1. Please set the TCP protocol with rtspsrc if you are using RTSP sources.
  2. Please set a larger latency property value with rtspsrc if you are using RTSP sources (see the sketch after this list).
  3. It seems your pipeline uses PGIE+SGIE. How many cars are detected when the latency rises? Have you measured the GPU load when the latency became larger?
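
As an illustration of points 1 and 2, here is a minimal Python/GStreamer sketch; the stream URL and element name are placeholders, not taken from your pipeline:

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Placeholder RTSP source; replace the URL with your own camera stream.
src = Gst.ElementFactory.make("rtspsrc", "rtsp-src-0")
src.set_property("location", "rtsp://camera.example/stream")

# 1. Force TCP transport so lost UDP packets cannot disturb the stream.
Gst.util_set_object_arg(src, "protocols", "tcp")

# 2. Enlarge the jitterbuffer (milliseconds) to absorb network jitter.
src.set_property("latency", 1000)

# Linking into the rest of the DeepStream pipeline is omitted here.
```

If you are using the deepstream-app source group instead of raw rtspsrc, the equivalent keys should be `latency` and `select-rtp-protocol=4` (TCP only) in the `[source...]` section.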

1. I did it.
2. I set it to 1000.
3. At least 8 objects for every source.
I did all of them, but the latency increased when I added the latency property.

GPU utilization is between 70% and 90%.
VRAM usage is 1.07 GiB.

I changed the batch-size of the secondary GIE to 16 and it works well for 3 videos (30 ms latency), but when I increase the number of sources to 4 I get at least 150 ms (GPU utilization sometimes hits 100% with only 1.07 GiB of VRAM). How can I fix this with a VRAM/GPU-utilization trade-off?
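
For reference, a minimal sketch of how the SGIE batch size can be set from Python; the element name and config-file path are placeholders, and as far as I know a property set on the element takes precedence over the value in the nvinfer config file:

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Placeholder secondary nvinfer (LPD) element and config path.
sgie = Gst.ElementFactory.make("nvinfer", "sgie-lpd")
sgie.set_property("config-file-path", "lpd_sgie_config.txt")

# Batch up to 16 detected cars per TensorRT execution so the GPU is
# fed fewer, larger inference calls instead of many tiny ones.
sgie.set_property("batch-size", 16)
```

A larger SGIE batch size mainly trades a little extra VRAM for fewer kernel launches; it cannot create compute headroom the GPU does not have.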

You need to know the maximum number of cars you need to handle. The GPU compute power is limited. GPU loading reaching 100% means the inferencing is overloaded. Either find a faster LPD model or switch to a better GPU.

What are your car detection and LPD model types? FP32, FP16 or INT8?

TAO YOLOv4 FP16 pruned (+retrained) models.
The maximum number of objects is 20 per source.

You can measure the performance of your model with trtexec to find out the maximum number of objects it can support (see the back-of-the-envelope sketch below).
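
As a sketch of that reasoning (all numbers below are placeholders, not measurements from this thread): the SGIE runs once per detected car, so the trtexec throughput puts a ceiling on sources × fps × objects per frame.

```python
# Rough capacity estimate for the SGIE (LPD) stage.
# Substitute the values trtexec reports for your own engine; trtexec
# throughput is in queries/second, one query being one execution at
# the batch size passed to --shapes.
trtexec_qps = 400.0        # placeholder executions per second
trtexec_batch = 1          # batch size used in the trtexec run
capacity_objects_per_sec = trtexec_qps * trtexec_batch

num_sources = 4
fps_per_source = 30
objects_per_frame = 20     # worst case mentioned above
needed_objects_per_sec = num_sources * fps_per_source * objects_per_frame

print(f"needed {needed_objects_per_sec}/s, capacity {capacity_objects_per_sec:.0f}/s")
# If needed exceeds capacity, frames queue up in front of the SGIE and
# end-to-end latency climbs, which matches the symptom described above.
```

Note this ignores the PGIE's own GPU time, so the real headroom is smaller than the SGIE-only number.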

Did you mean we can find out the maximum number of objects from these two files? Can you help me with this?

I ran: /usr/src/tensorrt/bin/trtexec --onnx=nvinfer_cars_yolov4/yolov4_70pruned20epoch_car_1class.onnx --fp16 --shapes=Input:1x3x544x960 --duration=10 --infStreams=2 --exportProfile=p.json

car.txt (7.4 KB)
lpd.txt (7.3 KB)

Actually you need to measure the TensorRT engine performance with trtexec; please refer to Low performance when running pipeline with RTX 4090 - #25 by Fiona.Chen (see the sketch below for one way to do it).
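
Below is a small sketch, not the exact procedure from the linked topic, of benchmarking the already-built engine and pulling the throughput line out of trtexec's output; the engine filename and the output format being matched are assumptions:

```python
# Benchmark the serialized TensorRT engine (rather than the ONNX file)
# and extract the reported throughput. Paths and the regex are assumptions.
import re
import subprocess

cmd = [
    "/usr/src/tensorrt/bin/trtexec",
    "--loadEngine=yolov4_lpd.engine",   # placeholder engine file
    "--duration=10",
]
out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

match = re.search(r"Throughput:\s*([\d.]+)\s*qps", out)
if match:
    print(f"engine throughput: {float(match.group(1)):.1f} qps")
else:
    print("no throughput line found; inspect the trtexec log manually")
```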

What are yolov8s and the other models in Performance — DeepStream documentation?
How can I train with those models?

We have some yolo model samples here: deepstream_tools/yolo_deepstream/deepstream_yolo at main · NVIDIA-AI-IOT/deepstream_tools

YOLOv8 is a 3rd-party open source model; please refer to Explore Ultralytics YOLOv8 - Ultralytics YOLO Docs for the training information.

There is no update from you for a period, so we assume this is not an issue anymore. Hence we are closing this topic. If you need further support, please open a new one. Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.