Memory leak in DeepStream 3.0

Hello,
I found an issue possibly related to the DS 3.0 SDK. The app's memory usage increases over time when running on GeForce GPUs such as the RTX 2060/2070. Maybe it is caused by the lower hardware performance of the 2060/2070, but GPU utilization is only 40%/30% and the video output from the DeepStream pipeline has no delay. Below are our test environments:

(1) DeepStream SDK 3.0
(2) Ubuntu 18.04
(3) i3/i5 CPU & RTX 2060/2070 GPU
(4) YOLOv3-tiny object detector & IOU tracker algorithm
(5) 8 real-time 1080p streams, transferred over RTP and received with the udpsrc plugin in the pipeline, then decoded, run through inference, and followed by other application logic (sketched below)
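
For reference, here is a minimal single-stream sketch in C (using gst_parse_launch) of the kind of pipeline described in (5). It is not our exact app: the real app links 8 udpsrc branches into nvstreammux, the real decoder outputs NVMM (GPU) buffers, the tracker and downstream logic are omitted, and the port, RTP caps, element properties and config file name (yolov3_tiny_config.txt) are assumptions for illustration only.

#include <gst/gst.h>

int main (int argc, char *argv[])
{
  GError *err = NULL;
  GstElement *pipeline;
  GstBus *bus;
  GstMessage *msg;

  gst_init (&argc, &argv);

  /* One RTP branch feeding nvstreammux; the real app has 8 such branches
   * (sink_0 .. sink_7) with batch-size=8. decodebin is a placeholder: in
   * the real pipeline the decoder must produce NVMM buffers for
   * nvstreammux/nvinfer. */
  pipeline = gst_parse_launch (
      "udpsrc port=5000 caps=\"application/x-rtp,media=video,"
      "encoding-name=H264,payload=96\" ! rtph264depay ! h264parse ! "
      "decodebin ! queue ! mux.sink_0 "
      "nvstreammux name=mux batch-size=1 width=1920 height=1080 ! "
      "nvinfer config-file-path=yolov3_tiny_config.txt ! "
      "fakesink sync=false", &err);
  if (!pipeline) {
    g_printerr ("Failed to build pipeline: %s\n", err->message);
    g_clear_error (&err);
    return -1;
  }

  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  /* Block until error or EOS, then tear down. */
  bus = gst_element_get_bus (pipeline);
  msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
      GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
  if (msg)
    gst_message_unref (msg);
  gst_object_unref (bus);

  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}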

I can still get about 25 FPS in the above environments, but memory usage keeps increasing, faster on the i3/2060 than on the i5/2070. I found some leak records using valgrind's memcheck:

...
==2133== 
==2133== 184 (80 direct, 104 indirect) bytes in 1 blocks are definitely lost in loss record 60,036 of 63,041
==2133==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2133==    by 0x5D57AB8: g_malloc (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
==2133==    by 0x5D6F975: g_slice_alloc (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
==2133==    by 0x5D6FE28: g_slice_alloc0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
==2133==    by 0x5807BF3: gst_query_new_custom (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x6928F58D: gst_nvquery_num_surfaces_per_buffer_new (in /usr/local/deepstream/libnvdsgst_helper.so)
==2133==    by 0x67D13F8B: gst_nvstreammux_alloc_output_buffers (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libnvdsgst_multistream.so)
==2133==    by 0x67D15399: gst_nvstreammux_sink_event (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libnvdsgst_multistream.so)
==2133==    by 0x57EC306: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x57EC79A: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x57ECBC8: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x57EA56F: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133== 

...

==2133== 
==2133== 304 (160 direct, 144 indirect) bytes in 2 blocks are definitely lost in loss record 60,568 of 63,041
==2133==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2133==    by 0x5D57AB8: g_malloc (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
==2133==    by 0x5D6F975: g_slice_alloc (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
==2133==    by 0x5D6FE28: g_slice_alloc0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
==2133==    by 0x5807BF3: gst_query_new_custom (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x6928F928: gst_nvquery_nppstream_new (in /usr/local/deepstream/libnvdsgst_helper.so)
==2133==    by 0x6B564257: gst_nvvidconv_start(_GstBaseTransform*) (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libnvdsgst_vidconv.so)
==2133==    by 0x54CCF14F: ??? (in /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.1404.0)
==2133==    by 0x54CCF3E4: ??? (in /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.1404.0)
==2133==    by 0x57F158A: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x57F2005: gst_pad_set_active (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133==    by 0x57CFDDC: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.1404.0)
==2133== 

...
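Both leak records above end in gst_query_new_custom, called from the closed-source DeepStream helpers (gst_nvquery_num_surfaces_per_buffer_new and gst_nvquery_nppstream_new), so I cannot see whether the query is ever released. For what it's worth, in plain GStreamer code the caller owns a custom query and has to drop it with gst_query_unref after sending it; below is a minimal sketch of that ownership pattern (the query, structure and field names are made up, not the real DeepStream ones).

#include <gst/gst.h>

/* Sketch only: "num-surfaces-per-buffer" and "num-surfaces" are invented
 * names; the real DeepStream query helpers are not public. The point is
 * the ownership pattern: the caller creates the query, sends it, reads
 * the answer, and must unref it afterwards. */
static guint
query_surfaces_per_buffer (GstPad *srcpad)
{
  GstStructure *s = gst_structure_new_empty ("num-surfaces-per-buffer");
  GstQuery *query = gst_query_new_custom (GST_QUERY_CUSTOM, s);
  guint num = 1;

  if (gst_pad_peer_query (srcpad, query)) {
    const GstStructure *ans = gst_query_get_structure (query);
    gst_structure_get_uint (ans, "num-surfaces", &num);
  }

  /* Without this unref, every call leaks the query -- the same shape as
   * the "definitely lost" records above. */
  gst_query_unref (query);
  return num;
}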

We also tested on another hardware platform:
(1) E3 CPU + 2080 Ti GPU
(2) 16 real-time 1080p streams
The memory leak issue is not evident there; memory usage increased by only 0.3% after 2 days, which I think is acceptable.

So my question is: why does the memory usage keep increasing while the FPS stays at about 25 and the stream output from the pipeline shows no accumulated delay? Does anybody know? Thanks! @ChrisDing @amycao

Memory is one factor for FPS, but not the only one; GPU decoding capacity, video conversion, inference model complexity, and GPU computing power also matter. You can observe with nvidia-smi that GPU memory increases at first and then settles into a stable state.
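
If you want to log that over a long run instead of watching nvidia-smi by hand, here is a small standalone sketch (my own, not from the SDK) that polls GPU 0's memory through NVML once a minute; the device index and interval are arbitrary choices, and you link it with -lnvidia-ml.

#include <stdio.h>
#include <unistd.h>
#include <nvml.h>

/* Polls GPU 0 memory usage once a minute and prints it, so the
 * "grows then stabilizes" behaviour can be logged over a long run.
 * Build (paths may differ): gcc gpumem.c -lnvidia-ml -o gpumem */
int main (void)
{
  nvmlDevice_t dev;

  if (nvmlInit () != NVML_SUCCESS) {
    fprintf (stderr, "failed to init NVML\n");
    return 1;
  }
  if (nvmlDeviceGetHandleByIndex (0, &dev) != NVML_SUCCESS) {
    fprintf (stderr, "failed to get device 0\n");
    nvmlShutdown ();
    return 1;
  }

  for (;;) {
    nvmlMemory_t mem;
    if (nvmlDeviceGetMemoryInfo (dev, &mem) == NVML_SUCCESS)
      printf ("GPU mem used: %llu / %llu MiB\n",
              (unsigned long long) (mem.used >> 20),
              (unsigned long long) (mem.total >> 20));
    sleep (60);
  }

  /* not reached */
  nvmlShutdown ();
  return 0;
}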

Maybe the root cause is the GeForce GPU series, which is not as well supported for DL workloads such as inference.