Delay gathered over time

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) NVIDIA Jetson Xavier NX
• DeepStream Version deepstream-app version 6.0.1
DeepStreamSDK 6.0.1

• JetPack Version (valid for Jetson only) # R32 (release), REVISION: 6.1, GCID: 27863751, BOARD: t186ref, EABI: aarch64, DATE: Mon Jul 26 19:36:31 UTC 2021

• TensorRT Version
ii libnvinfer-bin 8.2.1-1+cuda10.2 arm64 TensorRT binaries
ii libnvinfer-dev 8.2.1-1+cuda10.2 arm64 TensorRT development libraries and headers
ii libnvinfer-doc 8.2.1-1+cuda10.2 all TensorRT documentation
ii libnvinfer-plugin-dev 8.2.1-1+cuda10.2 arm64 TensorRT plugin libraries
ii libnvinfer-plugin8 8.2.1-1+cuda10.2 arm64 TensorRT plugin libraries
ii libnvinfer-samples 8.2.1-1+cuda10.2 all TensorRT samples
ii libnvinfer8 8.2.1-1+cuda10.2 arm64 TensorRT runtime libraries
ii python3-libnvinfer 8.2.1-1+cuda10.2 arm64 Python 3 bindings for TensorRT
ii python3-libnvinfer-dev 8.2.1-1+cuda10.2 arm64 Python 3 development package for TensorRT

• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

I have a Python pipeline to process an RTSP stream, stream an RTSP out and generate events with the msg broker plugin.

I am using a resnet10 from the demo apps to detect cars. My problem is that it gathers delay over time, to be specific ~5 sec for each hour, so after 24 hours it has ~ 2 min delay in comparison to the RTSP source.
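As a quick sanity check on the numbers above, the reported drift can be converted into a rate. This is only arithmetic on the figures from the post; the 25 fps frame rate is an assumption, since the camera frame rate is not stated.

```python
# Drift reported in the post: about 5 seconds of extra delay per hour.
drift_s_per_hour = 5.0

# Extrapolate to 24 hours and express the drift as a fraction of real time.
delay_after_24h_s = drift_s_per_hour * 24
drift_fraction = drift_s_per_hour / 3600.0

# ASSUMPTION: 25 fps source; the post does not give the camera frame rate.
assumed_fps = 25.0
frames_behind_per_hour = drift_s_per_hour * assumed_fps

print(f"delay after 24 h: {delay_after_24h_s:.0f} s")
print(f"drift: {drift_fraction * 100:.3f} % of real time")
print(f"frames falling behind per hour at {assumed_fps:.0f} fps: {frames_behind_per_hour:.0f}")
```

The drift is a small fraction of real time, so whatever causes it is only marginally slower than the source rather than grossly overloaded.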

I added a filter to convert the frames to RGBA, and it seemed to fix the issue for a while, but now it is happening again. (Maybe I’m wrong and it was never fixed: it ran for 24 hours with only a 3 sec delay, but let’s assume that this did not fix it.)

Already set:
sink.set_property("async", False)
sink.set_property("sync", 0)

For a single RTSP source, I think it should work in real-time - usually, there is only a car and some people in the frame, but it is usually empty.

Any idea or suggestion on what can be the cause of this and how to fix it?

You may need to measure the pipeline delay first, to exclude the network delay. You can refer to the DeepStream SDK FAQ in the Intelligent Video Analytics / DeepStream SDK category of the NVIDIA Developer Forums.

Hi, thanks for the recommendation. I have a few questions.

  1. Is there any way to test that with the Python bindings? If not, please let me know which pipeline I should use just for this test.

  2. The video stream is directly connected to the Jetson Xavier, so that should not be an issue since it is only going through a switch, but it is connected to the internet through 5G, which may be the cause for the latency. If I remove the RTSP out and just let the msg broker use MQTT then the delay should disappear, is that right?

For a Python app, you need to add pad probe functions and calculate the latency yourself.
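A minimal sketch of the bookkeeping such probes could share is shown below. `LatencyTracker` is a hypothetical helper, not part of DeepStream or GStreamer, and it assumes that a buffer keeps the same PTS between the two probe points, which may not hold for every element.

```python
import time


class LatencyTracker:
    """Pairs entry and exit timestamps of buffers by their PTS (hypothetical helper)."""

    def __init__(self, clock=time.monotonic, max_pending=1000):
        self._clock = clock
        self._max_pending = max_pending
        self._entered = {}   # pts -> time the buffer passed the upstream probe
        self.samples = []    # measured latencies in seconds

    def mark_in(self, pts):
        # Keep only the first sighting of a PTS.
        self._entered.setdefault(pts, self._clock())
        # Evict the oldest entry so dropped frames cannot grow the dict forever.
        if len(self._entered) > self._max_pending:
            self._entered.pop(next(iter(self._entered)))

    def mark_out(self, pts):
        # pop() so that later buffers with the same PTS (for example several
        # RTP packets belonging to one frame) are ignored.
        start = self._entered.pop(pts, None)
        if start is None:
            return None
        latency = self._clock() - start
        self.samples.append(latency)
        return latency
```

In a PyGObject application, the upstream probe callback `(pad, info, u_data)` would call `tracker.mark_in(info.get_buffer().pts)`, the downstream one would call `tracker.mark_out(info.get_buffer().pts)`, and both would return `Gst.PadProbeReturn.OK`. Each is attached with `pad.add_probe(Gst.PadProbeType.BUFFER, callback, 0)`. A monotonic clock is used here so that system time adjustments do not distort the samples.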

The source RTSP stream arrives over the network (even with 5G), and transferring video data over a network may introduce some delay. You can measure the actual network bandwidth (not the theoretical bandwidth) to confirm whether it is sufficient for low-latency video.

Can you tell us which “delay” you are talking about?
Your whole app is RTSP source -> DeepStream RTSP client -> DeepStream inferencing pipeline -> DeepStream RTSP output -> final RTSP client

The RTSP source and the final RTSP client are outside the DeepStream app and pipeline. Are you referring to the delay between the RTSP source and the final RTSP client? If so, you need to exclude the network delay in RTSP source -> DeepStream RTSP client and in DeepStream RTSP output -> final RTSP client; network delay is outside DeepStream’s control.

The source video is connected locally through a switch with the Jetson Xavier: CCTV → SWITCH POE → ROUTER → JETSON - so the stream is reaching the Jetson directly.

I am accessing the camera live stream and comparing it with the RTSP out from the DeepStream using VLC at rtsp://localhost:8554/ds-test
When I start the program the difference is ~3 seconds, but when I checked later the difference had increased by about 5 sec per hour. Taking into account the hardware used, I don’t think this is normal behavior for a resnet10 model (the same one as in the demo apps).

The stream still reaches the Jetson through the network, so the network protocol stack is involved.

Can you measure the delay of the DeepStream inferencing pipeline alone, to exclude the network delay (which may not be stable, since most network payloads are carried over UDP)? I mean the delay of DeepStream RTSP client -> DeepStream inferencing pipeline -> DeepStream RTSP output, which is the part the DeepStream app is responsible for.

I added a pad probe after streammux and another after rtppay, right before the sink, both using time.time, to get the time from streammux until just before the sink plugin. Below are the results. Let me know if this method is OK.

I ran the same program on the same hardware, but using an .mp4 video.

Process time: 0.026813030242919922
Process time: 0.027075529098510742
Process time: 0.016327857971191406
Process time: 0.017354249954223633
Process time: 0.024331331253051758
Process time: 0.02550220489501953

Process time: 0.008718013763427734
Process time: 0.009101390838623047
Process time: 0.019330978393554688
Process time: 0.020936012268066406
Process time: 0.009273052215576172
Process time: 0.009803533554077148
Process time: 0.018743038177490234
Process time: 0.023789405822753906
Process time: 0.009517431259155273
Process time: 0.010687828063964844

Process time: 0.010860443115234375
Process time: 0.011258840560913086
Process time: 0.013165950775146484
Process time: 0.015964984893798828
Process time: 0.07453012466430664
Process time: 0.016382455825805664
Process time: 0.017992019653320312

You can use this method to measure the inferencing delay, but before that, please check the network delay to guarantee that it will not impact the DeepStream pipeline.

Hi. Below are the results, from Jetson to CCTV.

64 bytes from 172.24.1.201: icmp_seq=1 ttl=64 time=0.346 ms
64 bytes from 172.24.1.201: icmp_seq=2 ttl=64 time=0.724 ms
64 bytes from 172.24.1.201: icmp_seq=3 ttl=64 time=1.05 ms
64 bytes from 172.24.1.201: icmp_seq=4 ttl=64 time=0.499 ms
64 bytes from 172.24.1.201: icmp_seq=5 ttl=64 time=0.401 ms
64 bytes from 172.24.1.201: icmp_seq=6 ttl=64 time=0.507 ms
64 bytes from 172.24.1.201: icmp_seq=7 ttl=64 time=0.370 ms
64 bytes from 172.24.1.201: icmp_seq=8 ttl=64 time=0.629 ms
64 bytes from 172.24.1.201: icmp_seq=9 ttl=64 time=0.404 ms
64 bytes from 172.24.1.201: icmp_seq=10 ttl=64 time=0.655 ms
64 bytes from 172.24.1.201: icmp_seq=11 ttl=64 time=0.754 ms
64 bytes from 172.24.1.201: icmp_seq=12 ttl=64 time=0.695 ms
64 bytes from 172.24.1.201: icmp_seq=13 ttl=64 time=0.682 ms
64 bytes from 172.24.1.201: icmp_seq=14 ttl=64 time=0.743 ms
64 bytes from 172.24.1.201: icmp_seq=15 ttl=64 time=0.432 ms
64 bytes from 172.24.1.201: icmp_seq=16 ttl=64 time=0.471 ms
64 bytes from 172.24.1.201: icmp_seq=17 ttl=64 time=0.365 ms
64 bytes from 172.24.1.201: icmp_seq=18 ttl=64 time=1.40 ms
64 bytes from 172.24.1.201: icmp_seq=19 ttl=64 time=0.799 ms
64 bytes from 172.24.1.201: icmp_seq=20 ttl=64 time=0.796 ms
64 bytes from 172.24.1.201: icmp_seq=21 ttl=64 time=0.862 ms
64 bytes from 172.24.1.201: icmp_seq=22 ttl=64 time=0.806 ms
64 bytes from 172.24.1.201: icmp_seq=23 ttl=64 time=0.797 ms
64 bytes from 172.24.1.201: icmp_seq=24 ttl=64 time=0.904 ms
64 bytes from 172.24.1.201: icmp_seq=25 ttl=64 time=0.762 ms
64 bytes from 172.24.1.201: icmp_seq=26 ttl=64 time=0.518 ms
64 bytes from 172.24.1.201: icmp_seq=27 ttl=64 time=0.461 ms
64 bytes from 172.24.1.201: icmp_seq=28 ttl=64 time=0.786 ms
64 bytes from 172.24.1.201: icmp_seq=29 ttl=64 time=0.479 ms
64 bytes from 172.24.1.201: icmp_seq=30 ttl=64 time=0.567 ms
64 bytes from 172.24.1.201: icmp_seq=31 ttl=64 time=0.677 ms
64 bytes from 172.24.1.201: icmp_seq=32 ttl=64 time=0.659 ms
64 bytes from 172.24.1.201: icmp_seq=33 ttl=64 time=0.833 ms
64 bytes from 172.24.1.201: icmp_seq=34 ttl=64 time=0.560 ms
64 bytes from 172.24.1.201: icmp_seq=35 ttl=64 time=0.426 ms
64 bytes from 172.24.1.201: icmp_seq=36 ttl=64 time=0.826 ms
64 bytes from 172.24.1.201: icmp_seq=37 ttl=64 time=0.620 ms
64 bytes from 172.24.1.201: icmp_seq=38 ttl=64 time=0.602 ms
64 bytes from 172.24.1.201: icmp_seq=39 ttl=64 time=0.446 ms
64 bytes from 172.24.1.201: icmp_seq=40 ttl=64 time=0.610 ms
64 bytes from 172.24.1.201: icmp_seq=41 ttl=64 time=0.381 ms
64 bytes from 172.24.1.201: icmp_seq=42 ttl=64 time=0.583 ms
64 bytes from 172.24.1.201: icmp_seq=43 ttl=64 time=0.643 ms
64 bytes from 172.24.1.201: icmp_seq=44 ttl=64 time=0.678 ms
64 bytes from 172.24.1.201: icmp_seq=45 ttl=64 time=0.426 ms
64 bytes from 172.24.1.201: icmp_seq=46 ttl=64 time=0.365 ms
64 bytes from 172.24.1.201: icmp_seq=47 ttl=64 time=0.552 ms
64 bytes from 172.24.1.201: icmp_seq=48 ttl=64 time=0.424 ms
64 bytes from 172.24.1.201: icmp_seq=49 ttl=64 time=0.722 ms
64 bytes from 172.24.1.201: icmp_seq=50 ttl=64 time=0.358 ms
64 bytes from 172.24.1.201: icmp_seq=51 ttl=64 time=0.430 ms
64 bytes from 172.24.1.201: icmp_seq=52 ttl=64 time=0.419 ms
64 bytes from 172.24.1.201: icmp_seq=53 ttl=64 time=0.361 ms
64 bytes from 172.24.1.201: icmp_seq=54 ttl=64 time=0.815 ms
64 bytes from 172.24.1.201: icmp_seq=55 ttl=64 time=0.546 ms
64 bytes from 172.24.1.201: icmp_seq=56 ttl=64 time=0.695 ms
64 bytes from 172.24.1.201: icmp_seq=57 ttl=64 time=0.535 ms
64 bytes from 172.24.1.201: icmp_seq=58 ttl=64 time=0.447 ms
64 bytes from 172.24.1.201: icmp_seq=59 ttl=64 time=0.635 ms
64 bytes from 172.24.1.201: icmp_seq=60 ttl=64 time=0.372 ms
64 bytes from 172.24.1.201: icmp_seq=61 ttl=64 time=0.705 ms

How was the time calculated?

I used ping from the Jetson to CCTV.
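For reference, ping output like the above can be condensed into a few figures. The sketch below is only an illustration: `summarize_ping` is a hypothetical helper, and the sample reuses the first three lines of the output posted here. Note that ping reports ICMP round-trip time only; it says nothing about throughput or about jitter on the RTP video stream itself.

```python
import re
import statistics


def summarize_ping(output):
    """Return (count, min, mean, max) of the round-trip times in milliseconds."""
    rtts = [float(m) for m in re.findall(r"time=([\d.]+) ms", output)]
    if not rtts:
        raise ValueError("no 'time=... ms' fields found")
    return len(rtts), min(rtts), statistics.mean(rtts), max(rtts)


# First three lines of the ping output from the post.
sample = """\
64 bytes from 172.24.1.201: icmp_seq=1 ttl=64 time=0.346 ms
64 bytes from 172.24.1.201: icmp_seq=2 ttl=64 time=0.724 ms
64 bytes from 172.24.1.201: icmp_seq=3 ttl=64 time=1.05 ms
"""

count, lowest, mean, highest = summarize_ping(sample)
print(f"{count} replies, min {lowest} ms, mean {mean:.3f} ms, max {highest} ms")
```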

Have you noticed such delay times?

There are also some open source UDP/TCP tools, e.g. https://iperf.fr/

I tested it, and it seems that the frames are arriving, but the delay still builds up over time. I used a custom model and it gathers delay faster than the resnet10. I also tested a pipeline with LPD and LPR: it gathers delay and starts skipping frames after a few minutes of delay (which is normal behavior with sync/qos = 0).

The strange thing is that GPU utilization is very low, with only occasional spikes (I set the inference interval to 3), so there are spare resources, yet the delay still accumulates.

There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks

If the model impacts the delay, please measure the inferencing latency separately. If the inferencing time is larger than the frame interval, the delay will accumulate.
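The accumulation condition can be illustrated with a simplified model: a single processing stage fed by a live source, with no frame dropping. This is a sketch, not a description of how DeepStream schedules work. `accumulated_delay` is a hypothetical helper, and the 0.04 s frame interval (25 fps) and the 0.05 s processing time are assumptions chosen for illustration.

```python
def accumulated_delay(processing_times, frame_interval):
    """Simulate how far a single processing stage falls behind a live source.

    processing_times: per-frame processing time in seconds.
    frame_interval:   time between source frames in seconds.
    Returns the backlog in seconds after the last frame.
    """
    backlog = 0.0
    for t in processing_times:
        # Each frame costs its processing time while the source allows one
        # frame interval; the stage can catch up but never run ahead.
        backlog = max(0.0, backlog + t - frame_interval)
    return backlog


# ASSUMPTION: 25 fps source, i.e. a 0.04 s frame interval.
interval = 0.04

# About 0.027 s per frame, similar to the larger "Process time" samples above.
print(accumulated_delay([0.027] * 1000, interval))

# Hypothetical slower model at 0.05 s per frame: the backlog grows steadily.
print(accumulated_delay([0.05] * 100, interval))
```

In this model the backlog stays at zero while the per-frame time is below the frame interval and grows by the difference on every frame once it exceeds it.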

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.