Multi-RTSP FPS drop with python deepstream_test_3.py example

Hello everyone!

I have a serious problem when running the examples from the GitHub repository NVIDIA-AI-IOT/deepstream_python_apps (a project demonstrating the use of Python for the DeepStream sample apps shipped with the SDK, which are currently in C/C++).

I am using the Docker container nvcr.io/nvidia/deepstream-l4t:5.1-21.02-samples

• Hardware Platform: Jetson Nano Developer Kit 4GB

• DeepStream Version: 5.1

• JetPack Version: 32.5.1

• Issue Type: bug

The problem is that when I run the deepstream_python_apps/apps/deepstream-test3/deepstream_test_3.py example with more than 2 RTSP streams as sources, the DeepStream pipeline sometimes drops the frame rate to 0.2 FPS and one random stream stops receiving data (the only FPS info shown is from the other streams).

Typically, for the first ~2300 to 3000 frames I see the FPS info for every stream; then suddenly I see FPS info for only some of the streams, and their FPS is 0.2 (or something very low compared with the initial frame rate).

This happens both with and without the On Screen Display (I modified deepstream_test_3.py to remove the OSD and use a fakesink).
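The symptom above (one stream silently disappearing from the FPS printout) could be detected automatically instead of by watching the console. The following is only a minimal sketch: StreamWatchdog and its methods are hypothetical names, not part of deepstream_test_3.py, and the 5 second threshold is an arbitrary assumption.

```python
import time
from typing import Dict, List, Optional


class StreamWatchdog:
    """Remembers when each stream last delivered a frame (hypothetical helper)."""

    def __init__(self, num_streams: int, stall_after_s: float = 5.0) -> None:
        # stall_after_s is an assumed threshold, tune it to the stream frame rate.
        self.stall_after_s = stall_after_s
        self.last_seen: Dict[int, Optional[float]] = {
            stream_id: None for stream_id in range(num_streams)
        }

    def frame_arrived(self, stream_id: int, now: Optional[float] = None) -> None:
        """Record that a frame from stream_id was just observed."""
        self.last_seen[stream_id] = time.monotonic() if now is None else now

    def stalled(self, now: Optional[float] = None) -> List[int]:
        """Return the ids of streams with no frame in the last stall_after_s seconds.

        Streams that never delivered a frame are reported as stalled too.
        """
        now = time.monotonic() if now is None else now
        return [
            stream_id
            for stream_id, seen in self.last_seen.items()
            if seen is None or now - seen > self.stall_after_s
        ]
```

One possible use is to call frame_arrived() from the buffer probe where the example already reads per-frame metadata (passing the source index of the frame), and to log stalled() periodically from a timer.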

I am simulating the streams with VLC (Media > Stream), because I do not know of a publicly available RTSP stream with persons/objects recognized by peoplenet.

I ran some tests with 1, 2, 3 and 4 streams (same video source file, different streaming processes), e.g.:

$ python3 deepstream_test_3.py rtsp://157.27.81.131:9999/

$ python3 deepstream_test_3.py rtsp://157.27.81.131:9999/ rtsp://157.27.81.131:9998/

$ python3 deepstream_test_3.py rtsp://157.27.81.131:9999/ rtsp://157.27.81.131:9998/ rtsp://157.27.81.131:9997/

$ python3 deepstream_test_3.py rtsp://157.27.81.131:9999/ rtsp://157.27.81.131:9998/ rtsp://157.27.81.131:9997/ rtsp://157.27.81.131:9996/

I am working with DeepStream 5.1 (the latest at the time of writing) on a Jetson Nano Developer Kit 4GB. The system runs from a 64 GB SD card with release 32.5.1 ( R32 (release), REVISION: 5.1, GCID: 26202423, BOARD: t210ref, EABI: aarch64, DATE: Fri Feb 19 16:45:52 UTC 2021; ). It is powered through the barrel jack by a 5V-4A power supply. I have already set the Jetson to MAXN and run sudo jetson_clocks.

• How to reproduce the issue?

Steps to reproduce on a freshly installed Jetson image, starting from a new terminal (the apt-get step is required because import gi fails in python3 otherwise):

(host)$ xhost +

(host)$ docker pull nvcr.io/nvidia/deepstream-l4t:5.1-21.02-samples

(host)$ docker run --gpus all --rm -it --net=host -e GST_DEBUG=3 -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY nvcr.io/nvidia/deepstream-l4t:5.1-21.02-samples

(container)$ apt-get update && apt-get install -y libpython3.6 python3-gi python-gst-1.0

(container)$ cd /opt/nvidia/deepstream/deepstream/samples && git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps

(container)$ cd /opt/nvidia/deepstream/deepstream/samples/deepstream_python_apps/apps/deepstream-test3

(container)$ python3 deepstream_test_3.py rtsp://yourstreamhere_num1/ rtsp://yourstreamhere_num2/ rtsp://yourstreamhere_num3/ rtsp://yourstreamhere_num4/

Some final notes:

  1. With the environment flag GST_DEBUG=3 I can see that when the FPS drop occurs there are several warnings regarding the rtpjitterbuffer. I also tried opening the streams with another instance of VLC during the drop event, and the streams seem to work fine in VLC.

  2. The FPS drop happens only sometimes, mostly in the 2000 to 5000 frame range, but it is not deterministic.

  3. The streams are private videos of people walking in a closed environment, so the PGIE detects objects often during the video (not all the time though, there are several frames without any object).

  4. Several minutes after the FPS drop, at some point my Jetson freezes and I am forced to do a hardware reset (removing the barrel jack).

  5. I also tried on other Jetson Nano boards, with the same results.
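To correlate the rtpjitterbuffer warnings from note 1 with the moment of the FPS drop, the GST_DEBUG output could be saved to a file and bucketed by time. This is a minimal sketch, assuming the line layout visible in the console output of this thread (timestamp, PID, thread, level, category); the function names are hypothetical, and color escape codes are assumed to be disabled (for example with GST_DEBUG_NO_COLOR=1).

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Layout assumed from the log lines in this thread:
# "0:00:04.288251451   453     0x3bc076d0 WARN   nvinfer file.cpp:616:func:<obj> msg"
GST_LINE = re.compile(
    r"^(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}\.\d+)\s+"
    r"(?P<pid>\d+)\s+(?P<thread>0x[0-9a-fA-F]+)\s+"
    r"(?P<level>[A-Z]+)\s+(?P<category>\S+)\s+(?P<rest>.*)$"
)


def parse_gst_line(line: str) -> Optional[Tuple[float, str, str]]:
    """Return (seconds since start, level, category), or None for other lines."""
    match = GST_LINE.match(line.strip())
    if match is None:
        return None
    seconds = (
        int(match.group("h")) * 3600
        + int(match.group("m")) * 60
        + float(match.group("s"))
    )
    return seconds, match.group("level"), match.group("category")


def count_warnings(
    lines: Iterable[str], category: str = "rtpjitterbuffer", bucket_s: int = 10
) -> Counter:
    """Count WARN lines of one category per bucket_s-second window."""
    buckets: Counter = Counter()
    for line in lines:
        parsed = parse_gst_line(line)
        if parsed is None:
            continue
        seconds, level, cat = parsed
        if level == "WARN" and cat == category:
            buckets[int(seconds // bucket_s) * bucket_s] += 1
    return buckets
```

A sudden rise of the count in one window, compared with the frame number at which the FPS printout degrades, would show whether the warnings precede the drop or follow it.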

Has anyone run into the same problem? Does anyone have a solution, or know the cause?

Thank you very much, any help is really appreciated

Federico

It will take some time to reproduce the issue. Will be back to you if there is any progress.

Have you measured your model performance on Nano with different batch size?

Hello,
First of all, thanks for the reply.
About your question regarding the model performance:
my model is the default one (Resnet10) provided in the Docker image and in the txt configuration.
I have only checked the FPS output from the Python script, without any code edits, to get a result that is as independent as possible.
About the batch: the TRT engine is rebuilt each time I run a new configuration (1 RTSP with batch 1, 2 RTSP with batch 2, etc.). At least, this is what should happen, right? Because the code states this:

pgie.set_property('config-file-path', "dstest3_pgie_config.txt")
pgie_batch_size=pgie.get_property("batch-size")
if(pgie_batch_size != number_sources):
    print("WARNING: Overriding infer-config batch-size",pgie_batch_size," with number of sources ", number_sources," \n")
    pgie.set_property("batch-size",number_sources)

In the console output I do get that “WARNING”, and the TRT engine is also created every time I use a batch size other than 1, since the pgie txt config file provided in the example explicitly states “b1”. Here is the console output with 3 RTSP streams (following the instructions in the original post: pull, docker run, git clone, etc.):

Using winsys: x11 
ERROR: Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-5.1/samples/deepstream_python_apps/apps/deepstream-test3/../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine open error
0:00:04.288251451   453     0x3bc076d0 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1691> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-5.1/samples/deepstream_python_apps/apps/deepstream-test3/../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine failed
0:00:04.288450256   453     0x3bc076d0 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1798> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-5.1/samples/deepstream_python_apps/apps/deepstream-test3/../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine failed, try rebuild
0:00:04.288589425   453     0x3bc076d0 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
WARNING: INT8 not supported by platform. Trying FP16 mode.
INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 2 output network tensors.
0:00:43.257983924   453     0x3bc076d0 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1749> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b3_gpu0_fp16.engine successfully
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x368x640       
1   OUTPUT kFLOAT conv2d_bbox     16x23x40        
2   OUTPUT kFLOAT conv2d_cov/Sigmoid 4x23x40         

0:00:43.625813291   453     0x3bc076d0 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary-inference> [UID 1]: Load new model:dstest3_pgie_config.txt sucessfully
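The two engine paths in the log above differ only in the batch size and precision parts of the file name, which would explain why the "b1 ... int8" file from the config is never found once the batch size is overridden to 3 and INT8 falls back to FP16. The naming pattern below is inferred from those two file names only, not from any documentation, and engine_file_name is a hypothetical helper.

```python
def engine_file_name(
    model_file: str, batch_size: int, gpu_id: int = 0, precision: str = "fp16"
) -> str:
    """Build an engine file name following the pattern seen in the log above.

    The pattern "<model>_b<batch>_gpu<id>_<precision>.engine" is an assumption
    derived from the two paths printed by nvinfer in this thread.
    """
    return f"{model_file}_b{batch_size}_gpu{gpu_id}_{precision}.engine"


# Name the config asks for (batch 1, INT8) versus the one written for 3 sources.
requested = engine_file_name("resnet10.caffemodel", 1, precision="int8")
generated = engine_file_name("resnet10.caffemodel", 3, precision="fp16")
print(requested)
print(generated)
print("reusable:", requested == generated)
```

Under that assumption, the name requested by the config and the name actually serialized never match for any run with more than one source on this board, which is consistent with the roughly 40 second rebuild seen between the 0:00:04 and 0:00:43 timestamps.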

Of course, with fewer RTSP streams I get more FPS per stream, but with more streams it levels off at around 11-16 FPS per stream, with the original RTSP stream at 25 FPS and a resolution of 2992x1680 px.
Here are the benchmarks (FPS printed at ~1000 frames):

1 RTSP : 24.8 FPS
2 RTSP : 24.2 FPS
3 RTSP : 16.6 FPS
4 RTSP : 12.6 FPS
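Multiplying each per-stream figure above by the number of streams gives the total number of frames the pipeline processes per second. A small sketch of that arithmetic, using only the four measurements listed (the function name is hypothetical):

```python
from typing import Dict

# Per-stream FPS measured at ~1000 frames, keyed by number of RTSP sources.
per_stream_fps: Dict[int, float] = {1: 24.8, 2: 24.2, 3: 16.6, 4: 12.6}


def aggregate_fps(measurements: Dict[int, float]) -> Dict[int, float]:
    """Total frames per second across all streams for each configuration."""
    return {n: round(n * fps, 1) for n, fps in measurements.items()}


for sources, total in aggregate_fps(per_stream_fps).items():
    print(f"{sources} RTSP : {total} FPS in total")
```

The totals come out at 24.8, 48.4, 49.8 and 50.4 FPS, so the pipeline appears to saturate at roughly 50 frames per second overall once there are two or more sources. The numbers alone do not show which stage (decoding, inference, tiling) is the limit.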

Anyway, to speed up streaming with VLC while looping over a short file (my videos are about 1 min long on average), I use the VLC CLI.
First, create a playlist.xspf file with this content:

<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
<trackList>
 <track><location>file:///path/to/your/video.mp4</location></track>
</trackList>
</playlist>

Then

$ cvlc --random --loop playlist.xspf :sout=#gather:rtp{sdp=rtsp://:YOURPORTHERE/} :network-caching=1500 :sout-all :sout-keep

Run the command in 4 terminals, changing the port each time; each stream will be published at rtsp://yourip:YOURPORTHERE/
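The playlist and the four cvlc invocations could also be generated from one Python script instead of four terminals. This is only a sketch: the helper names are hypothetical, the options are copied from the cvlc command above, and the sout chain is written with single braces, which is the VLC syntax as far as I know (the doubled braces in the f-string are Python escaping and produce single ones).

```python
import subprocess
from pathlib import Path
from typing import Iterable, List
from xml.sax.saxutils import escape

PLAYLIST_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
<trackList>
 <track><location>{location}</location></track>
</trackList>
</playlist>
"""


def write_playlist(video: Path, playlist: Path) -> None:
    """Write a one-track XSPF playlist that points at the given video file."""
    location = escape(video.resolve().as_uri())
    playlist.write_text(PLAYLIST_TEMPLATE.format(location=location), encoding="utf-8")


def cvlc_command(playlist: Path, port: int) -> List[str]:
    """Build the cvlc argument list for one looping RTSP stream on a port."""
    return [
        "cvlc", "--random", "--loop", str(playlist),
        f":sout=#gather:rtp{{sdp=rtsp://:{port}/}}",
        ":network-caching=1500", ":sout-all", ":sout-keep",
    ]


def launch_streams(playlist: Path, ports: Iterable[int]) -> List[subprocess.Popen]:
    """Start one cvlc process per port (requires VLC to be installed)."""
    return [subprocess.Popen(cvlc_command(playlist, port)) for port in ports]
```

For example, write_playlist(Path("video.mp4"), Path("playlist.xspf")) followed by launch_streams(Path("playlist.xspf"), [9999, 9998, 9997, 9996]) would publish the four streams used in the test commands of the original post; the returned processes can be stopped with terminate().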

Let me know if you find something!
Thanks again
Federico