DMAbuf file descriptors leak in EGL Stream API after disconnect

Hello,

We’re using EGL Stream API to share video frames from single input to multiple applications. There is one process running as a video server, responsible for capturing and sharing frames. Server runs constantly in the background as systemd service. Also, up to 3 additional processes could be started anytime as a video clients.
prod-cons

Issue - On the server (producer) side there is a DMAbuf file descriptors leak after disconnection/reconnection of the clients.

Setup - Jetson TX2-4GB running L4T 32.4.4 with CUDA 10.2.

Test setup

I used EGLStream_CUDA_CrossGPU sample code from cuda-samples repo (latest master branch) as a reference.
Slightly modified Makefile to fix build for CUDA 10.2. As well as did quick modifications to main.cpp. Processes don’t stop and run until termination to emulate many reconnections to producer. Changes prod-cons.patch (3.5 KB) are attached.
Build command - cd Samples/2_Concepts_and_Techniques/EGLStream_CUDA_CrossGPU && make CUDA_PATH=/usr/local/cuda-10.2/ dbg=1. Compiled binary EGLStream_CUDA_CrossGPU (804.2 KB) is attached.

Steps to reproduce:
Start consumer first - ./EGLStream_CUDA_CrossGPU
Consumer waits for producer.
In the second terminal start producer - ./EGLStream_CUDA_CrossGPU -proctype prod
Run script to collect the FD usage statistics for both processes - ./cuda_test_fd.sh.
Script cuda_test_fd.sh (398 Bytes) is attached.
Press Enter button in Consumer terminal to repeat iteration. Collect statistics again and check that DMAbuf FDs are leaking.
According to the code, stream and frames should be destroyed (consequently, DMAbuf FDs should be closed) after each iteration, but in fact number of FDs grows very fast.

Where could be the issue? Is something wrong in sample code with sequence of Stream termination?
Any help is appreciated.
Thanks!

Hi,
The demo of EGL producer/consumer is shared in this post:
Problems getting EGL Stream transferred to another process on same machine - #7 by DaneLLL

Please check if you can make a patch on this set up so that we can run and reproduce it. And would be great if you can upgrade to Jetpack 4.6.2 and try.

Hi,
Thanks for quick reply.

I’m aware about this demo. But our case is different. We have CUDA producer/consumer instead of OpenGL producer/consumer used in eglstreamcube/ctree/gears. That’s why I’m using demo from official Nvidia’s cuda-samples repo - EGLStream_CUDA_CrossGPU. It matches our case.
Patch and steps for reproducing are attached to original post under “Test setup” section. Just in case, “Test setup” section is collapsed for convenience. All technical details are there. Using them you can easily reproduce issue on your side.

I tried latest L4T 32.7.2 on my setup. Observe the same issue. I used same code and steps, as described in my original post under Test setup section.

Thanks,
Vlad.