Memory leak on RTSP reconnect

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): Jetson
• DeepStream Version: DS 7.0
• JetPack Version (valid for Jetson only): 6.0
• TensorRT Version: 8.6.2
• Issue Type (questions, new requirements, bugs): bugs
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file content, the command line used, and other details for reproducing.)
I found that running deepstream-test5-app in DS 7.0 with an RTSP source
leaks memory when RTSP reconnections occur.
Below are logs of /proc/meminfo and RSS captured while deepstream-test5-app was running,
together with the memory-leak report for deepstream-test5-app from valgrind.
We ran deepstream-test5-app twice,
once for 6 hours to collect the /proc/meminfo and RSS data, and once for 6 hours to collect the valgrind log.
I followed this topic for how to collect the valgrind log.
meminfo_ps.zip (17.5 KB)
valgrind_6h.log (106.6 KB)

Over those 6 hours, MemAvailable in /proc/meminfo decreased by 213 MB, while RSS increased by only 55 MB.
Moreover, the total leak reported by valgrind was just 27 KB.

Why do the leak amounts indicated by valgrind, RSS, and MemAvailable differ so much?
MemAvailable did not decrease when no RTSP reconnection occurred.

Container

nvcr.io/nvidia/deepstream:7.0-triton-multiarch

Command to get log of valgrind

valgrind --tool=memcheck --leak-check=full --num-callers=100 --show-leak-kinds=definite,indirect --track-origins=yes --log-file=valgrind_ffserver_6h ../deepstream-test5-app -t -c test5_config_file_src_infer.txt

Command to get log of /proc/meminfo

deepstream-test5-app -t -c test5_config_file_src_infer.txt

Commands to log memory

# Edit the directory paths in the crontab file, then install it
cat crontab | crontab
nohup bash ./ps_5sec.sh psinfo.log

log_scripts.zip (1.5 KB)
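
For reference, here is a minimal sketch of what such a 5-second logger can look like (the actual script is ps_5sec.sh in log_scripts.zip; the process lookup below is an illustrative assumption):

#!/bin/bash
# Minimal sketch: log MemAvailable and the app's RSS every 5 seconds.
# Usage: ./memlog.sh [logfile]
LOG="${1:-psinfo.log}"
while true; do
  TS=$(date '+%Y-%m-%d_%H:%M:%S')
  AVAIL=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)    # KB
  PID=$(pgrep -f deepstream-test5-app | head -n1)           # assumed process name
  RSS=$(awk '/VmRSS/ {print $2}' /proc/$PID/status)         # KB
  echo "$TS MemAvailable=${AVAIL}KB RSS=${RSS}KB" >> "$LOG"
  sleep 5
done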

test5_config_file_src_infer.txt

################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2018-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: LicenseRef-NvidiaProprietary
#
# NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
# property and proprietary rights in and to this material, related
# documentation and any modifications thereto. Any use, reproduction,
# disclosure or distribution of this material and related documentation
# without an express license agreement from NVIDIA CORPORATION or
# its affiliates is strictly prohibited.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=0 # change to disable
rows=2
columns=2
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0


[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=4 # change to RTSP
uri=rtsp://192.168.130.8:554/test.mpeg4
#uri=file://../../../../../samples/streams/sample_1080p_h264.mp4
num-sources=2
gpu-id=0
nvbuf-memory-type=0

[source1]
enable=0 # change to disable
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file://../../../../../samples/streams/sample_1080p_h264.mp4
num-sources=2
gpu-id=0
nvbuf-memory-type=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=1 # change to fakesink
sync=1 # change to disable
source-id=0
gpu-id=0
nvbuf-memory-type=0

[sink1]
enable=0 # change to disable
#Type - 1=FakeSink 2=EglSink 3=File 4=UDPSink 5=nvdrmvideosink 6=MsgConvBroker
type=6
msg-conv-config=dstest5_msgconv_sample_config.txt
#(0): PAYLOAD_DEEPSTREAM - Deepstream schema payload
#(1): PAYLOAD_DEEPSTREAM_MINIMAL - Deepstream schema payload minimal
#(256): PAYLOAD_RESERVED - Reserved type
#(257): PAYLOAD_CUSTOM   - Custom schema payload
msg-conv-payload-type=0
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_kafka_proto.so
#Provide your msg-broker-conn-str here
msg-broker-conn-str=<host>;<port>;<topic>
topic=<topic>
#Optional:
#msg-broker-config=../../deepstream-test4/cfg_kafka.txt

[sink2]
enable=0
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
## only SW mpeg4 is supported right now.
codec=3
sync=1
bitrate=2000000
output-file=out.mp4
source-id=0

# sink type = 6 by default creates msg converter + broker.
# To use multiple brokers use this group for converter and use
# sink type = 6 with disable-msgconv = 1
[message-converter]
enable=0
msg-conv-config=dstest5_msgconv_sample_config.txt
#(0): PAYLOAD_DEEPSTREAM - Deepstream schema payload
#(1): PAYLOAD_DEEPSTREAM_MINIMAL - Deepstream schema payload minimal
#(256): PAYLOAD_RESERVED - Reserved type
#(257): PAYLOAD_CUSTOM   - Custom schema payload
msg-conv-payload-type=0
# Name of library having custom implementation.
#msg-conv-msg2p-lib=<val>
# Id of component in case only selected message to parse.
#msg-conv-comp-id=<val>

# Configure this group to enable cloud message consumer.
[message-consumer0]
enable=0
proto-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_kafka_proto.so
conn-str=<host>;<port>
config-file=<broker config file e.g. cfg_kafka.txt>
subscribe-topic-list=<topic1>;<topic2>;<topicN>
# Use this option if message has sensor name as id instead of index (0,1,2 etc.).
#sensor-list-file=dstest5_msgconv_sample_config.txt

[osd]
enable=0 # change to disable
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1 # change to 1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0
## If set to TRUE, system timestamp will be attached as ntp timestamp
## If set to FALSE, ntp timestamp from rtspsrc, if available, will be attached
# attach-sys-ts-as-ntp=1

[primary-gie]
enable=0 # change to disable
gpu-id=0
batch-size=4
## 0=FP32, 1=INT8, 2=FP16 mode
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;1;1;1
bbox-border-color3=0;1;0;1
nvbuf-memory-type=0
interval=0
gie-unique-id=1
model-engine-file=../../../../../samples/models/Primary_Detector/resnet18_trafficcamnet.etlt_b4_gpu0_int8.engine
labelfile-path=../../../../../samples/models/Primary_Detector/labels.txt
config-file=../../../../../samples/configs/deepstream-app/config_infer_primary.txt
#infer-raw-output-dir=../../../../../samples/primary_detector_raw_output/

[tracker]
enable=0 # change to disable
# For NvDCF and NvDeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
tracker-width=960
tracker-height=544
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
# ll-config-file required to set different tracker types
# ll-config-file=../../../../../samples/configs/deepstream-app/config_tracker_IOU.yml
# ll-config-file=../../../../../samples/configs/deepstream-app/config_tracker_NvSORT.yml
ll-config-file=../../../../../samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
# ll-config-file=../../../../../samples/configs/deepstream-app/config_tracker_NvDCF_accuracy.yml
# ll-config-file=../../../../../samples/configs/deepstream-app/config_tracker_NvDeepSORT.yml
gpu-id=0
display-tracking-id=1

[tests]
file-loop=0

I am checking.

How did you test or simulate RTSP reconnection? Is it a physical camera?

I used ffserver to test the RTSP reconnection.

Here are the command lines and configuration file for running ffserver.
After executing the following commands, the RTSP stream can be received at rtsp://<IP address>:554/test.mpeg4.
When I ran deepstream-test5-app, an RTSP reconnection occurred about every 7.5 minutes.

ffserver.conf.txt (336 Bytes)

sudo ffserver -d -f ffserver.conf.txt &
sudo ffmpeg -re -f lavfi -i "movie=sample_1080p_h264.mp4:loop=0, setpts=N/(FRAME_RATE*TB)" -c:v libx264 -override_ffserver -s 1920x1080 http://localhost:8099/feed1.ffm >> /tmp/ffmpeg.log 2>&1 < /dev/null &
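
Before starting the DeepStream app, the stream can be sanity-checked from the test machine with, for example (assuming the ffmpeg tools are installed there):

ffprobe rtsp://<IP address>:554/test.mpeg4
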
  1. Noticing that rtsp-reconnect-interval-sec and rtsp-reconnect-attempts are not set, how did you test RTSP reconnection?
  2. To narrow down this issue: with no RTSP reconnection, are MemAvailable and RSS fine?
  3. Since MemAvailable is a system-wide value, did you run other applications at the same time?
  1. Even if rtsp-reconnect-interval-sec and rtsp-reconnect-attempts are not set, the bus_callback function at line 246 of deepstream_app.c will cause RTSP to reconnect (see the search command after this list).

  2. Yes, they are fine. MemAvailable does not keep decreasing when RTSP is not reconnected, and RSS does not keep rising.
    meminfo_ps_33h_no_reconnect.zip (150.5 KB)

  3. Only the MemAvailable and RSS logging scripts from the initial post ran at the same time as deepstream-test5-app;
    no other applications were run.
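
For reference, the reconnect handling can be located in the sample-app sources with a search like this (the path assumes a default DeepStream installation):

grep -rn "reset_source_pipeline" /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-app/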

I checked whether MemAvailable decreases with the pipeline you showed.
meminfo.log (125.1 KB)
MemAvailable did not decrease.

The decrease in MemAvailable occurs when RTSP reconnection occurs in deepstream-test5-app.

I checked your log. MemAvailable decreased by 103 MB (from 28964 MB to 28861 MB) over 7 hours.
To narrow down this issue, since MemAvailable is a system-wide value, can you move ffserver and the ffmpeg encoding to other machines, so as to rule out the effect of those processes?

Memory is consumed for the first few minutes after the application starts.
Therefore, please look at the MemAvailable values from around 11:06 onward, when MemAvailable levels off.
Also, when I ran that test, ffserver and ffmpeg were already running on other machines.

The following picture is a graph of meminfo.csv included in meminfo_ps.zip.

Thanks for sharing!

  1. During 1 hour in my test there was only one reconnection. How did you simulate RTSP reconnection? Can you simulate frequent RTSP reconnections? Thanks!
  2. In the logs from the issue description, how many times did DeepStream reconnect? Could you share the DeepStream logs? There will be a “reset_source_pipeline” print if the app reconnects.
  1. To trigger RTSP reconnections, first execute the following commands on a machine different from the one that will run deepstream-test5-app.
    The video file to be played by RTSP is /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4.
    # Put sample_1080p_h264.mp4 and ffserver.conf.txt in the current directory
    sudo docker run -itd --name rtspserver --net=host -v ${PWD}:${PWD} -w ${PWD} ubuntu:18.04
    sudo docker exec -it rtspserver bash
    apt update && apt install -y ffmpeg
    ffserver -d -f ffserver.conf.txt &
    ffmpeg -re -f lavfi -i "movie=sample_1080p_h264.mp4:loop=0, setpts=N/(FRAME_RATE*TB)" -c:v libx264 -override_ffserver -s 1920x1080 http://localhost:8099/feed1.ffm >> /tmp/ffmpeg.log 2>&1 < /dev/null &
    
    Next, change the IP address in the configuration file I provided to the IP address of the machine streaming the RTSP,
    and run deepstream-test5-app with that configuration file on the machine you want to test.
    deepstream-test5-app -t -c test5_config_file_src_infer.txt
    
  2. 48 reconnections occurred in DeepStream during the first 6 hours after we started deepstream-test5-app.
    The attached file is the log of deepstream-test5-app running for about 20 hours;
    we counted the occurrences of “reset_source_pipeline” within the first 6 hours (a sample command follows this list).
    deepstream-test5-app.log (2.8 MB)
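
The count can be reproduced with a simple search over the attached log (restricting it to the first 6 hours requires filtering by timestamp first):

grep -c "reset_source_pipeline" deepstream-test5-app.log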

I did two tests.
test1: I tested on Orin with DS 7.1 using the same test method as yours; in particular, I did not run valgrind. There were 66 reconnections in about 7 hours.
deepstram.txt (528.5 KB)
meminfo.log (1.7 MB)
ps.log (840.2 KB)
Here are the statistics after the app ran stably (after 1 hour).
MemAvailable decreased by about 77 MB, from 23974844 KB to 23897496 KB:
2024/11/28 06:32:09 31433612 6445580 24083404 ----> 2024/11/28 07:32:45 31433612 6335916 23974844 ----> 2024/11/28 14:35:12 31433612 6253428 23897496
RSS increased by about 14 MB, from 102032 KB to 116660 KB:
2024-11-28_06:32:32 74412 ----> 2024-11-28_07:32:34 102032 ----> 2024-11-28_14:32:33 116660
Since there is no precise definition of how MemAvailable is calculated, I will focus on the memory leak of the DeepStream app first. The leak is relatively mild on DS 7.1 in this test case; please use DS 7.1 if possible.
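
As a quick check, assuming each line of meminfo.log has the form “date time MemTotal MemFree MemAvailable” (all in KB) as in the timeline above, the overall MemAvailable delta can be computed with:

awk 'NR==1{first=$5} {last=$5} END{printf "%.1f MB decrease\n", (first-last)/1024}' meminfo.log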

test2: I tested on Orin with DS 7.1 using the same valgrind command as yours.
deepstream-valgrind.txt (538.8 KB)
valgrind_6.log (71.0 KB)
From the DeepStream log, there were 78 reconnections in 10 hours. From the valgrind log, there are mainly two cases when searching for “definitely lost” (a sample search follows this list):
a> a leak in libgstrtpmanager.so.
b> a leak in libnvdsgst_multistream.so. The corresponding code is not open source; we will investigate and fix it in a later version. Thanks for your report!
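
The relevant records can be pulled out of the valgrind log with, for example:

grep -n -A 20 "definitely lost" valgrind_6.log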

Thank you for investigating the memory leak.
I would also like you to investigate why MemAvailable decreases by more than RSS increases.