DMAbuf file descriptors leak in EGL Stream API after disconnect

vlad · June 16, 2022, 4:40am

Hello,

We’re using the EGL Stream API to share video frames from a single input with multiple applications. One process runs as a video server, responsible for capturing and sharing frames. The server runs constantly in the background as a systemd service. In addition, up to 3 processes can be started at any time as video clients.

Issue - On the server (producer) side, DMAbuf file descriptors leak after clients disconnect and reconnect.

Setup - Jetson TX2-4GB running L4T 32.4.4 with CUDA 10.2.

Test setup

I used the EGLStream_CUDA_CrossGPU sample code from the cuda-samples repo (latest master branch) as a reference.
I slightly modified the Makefile to fix the build for CUDA 10.2 and made quick modifications to main.cpp: the processes no longer stop, but keep running until terminated, to emulate many reconnections to the producer. The changes are attached as prod-cons.patch (3.5 KB).
Build command - cd Samples/2_Concepts_and_Techniques/EGLStream_CUDA_CrossGPU && make CUDA_PATH=/usr/local/cuda-10.2/ dbg=1. The compiled binary EGLStream_CUDA_CrossGPU (804.2 KB) is attached.

Steps to reproduce:
1. Start the consumer first - ./EGLStream_CUDA_CrossGPU
   The consumer waits for the producer.
2. In a second terminal, start the producer - ./EGLStream_CUDA_CrossGPU -proctype prod
3. Run the script to collect FD usage statistics for both processes - ./cuda_test_fd.sh.
   The script cuda_test_fd.sh (398 Bytes) is attached.
4. Press Enter in the consumer terminal to repeat the iteration. Collect the statistics again and check that DMAbuf FDs are leaking.

According to the code, the stream and frames should be destroyed (and, consequently, the DMAbuf FDs closed) after each iteration, but in fact the number of FDs grows very fast.
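
For readers without the attachment, here is a minimal sketch of this kind of FD check. It is not the attached cuda_test_fd.sh; the function names are hypothetical, and it assumes a Linux /proc filesystem and that dma-buf FDs show up with "dmabuf" in their link target (for example anon_inode:dmabuf).

```python
import os
from collections import Counter


def fd_targets(pid):
    """Return the symlink target of every open FD of `pid` (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The FD was closed between listdir() and readlink(); skip it.
            continue
    return targets


def summarize(targets):
    """Group FD link targets into coarse kinds and count each kind."""
    kinds = Counter()
    for target in targets:
        if "dmabuf" in target:
            # Assumption: dma-buf FDs carry "dmabuf" in the link target.
            kinds["dmabuf"] += 1
        elif target.startswith("socket:"):
            kinds["socket"] += 1
        elif target.startswith("pipe:"):
            kinds["pipe"] += 1
        elif target.startswith("anon_inode:"):
            kinds["anon_inode"] += 1
        else:
            kinds["file"] += 1
    return kinds
```

Calling summarize(fd_targets(pid)) for the producer PID before and after each Enter press, and comparing the "dmabuf" counts, should show whether the count returns to its baseline or keeps growing.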

Where could the issue be? Is something wrong with the stream termination sequence in the sample code?
Any help is appreciated.
Thanks!

DaneLLL · June 16, 2022, 8:34am

Hi,
The demo of EGL producer/consumer is shared in this post:
Problems getting EGL Stream transferred to another process on same machine - #7 by DaneLLL

Please check if you can make a patch on this setup so that we can run and reproduce it. It would also be great if you could upgrade to JetPack 4.6.2 and try.

vlad · June 23, 2022, 12:59am

Hi,
Thanks for the quick reply.

I’m aware of this demo, but our case is different. We have a CUDA producer/consumer instead of the OpenGL producer/consumer used in eglstreamcube/ctree/gears. That’s why I’m using the demo from NVIDIA’s official cuda-samples repo, EGLStream_CUDA_CrossGPU. It matches our case.
The patch and the steps to reproduce are attached to the original post under the “Test setup” section, which is collapsed for convenience. All technical details are there. Using them, you can easily reproduce the issue on your side.

I tried the latest L4T 32.7.2 on my setup and observe the same issue. I used the same code and steps as described in my original post under the “Test setup” section.

Thanks,
Vlad.

DaneLLL · July 1, 2022, 12:41am

Hi,
We will first set up and try to replicate the issue, and then investigate further.

DaneLLL · July 1, 2022, 6:58am

Hi,
Please check this user guide. It is for the Drive platform, but Jetson platforms are similar. Generally we create the consumer first and then the producer, and the consumer can stay alive to wait for producers. It is the same as this demo:
Problems getting EGL Stream transferred to another process on same machine - #7 by DaneLLL

Please check whether you can adapt to this standard approach.

vlad · July 6, 2022, 4:44am

Hi,
Have you been able to replicate the issue on your side by following my steps?

DaneLLL · July 6, 2022, 4:53am

Hi,
No. We checked the code, and it is a bit strange that the producer/consumer is destroyed and re-initialized in a loop while the process is still alive. It is more reasonable not to destroy the producer/consumer while the process is alive, and to re-use the same producer and consumer.

vlad · July 6, 2022, 5:04am

Thanks for pointing that out.

I checked this user guide. It’s the same as the one in the L4T documentation for the 32.4.4 release. I wasn’t able to find anything new there.

We’re following these guidelines, and it looks like NVIDIA’s GitHub samples follow them too. The difference is that our case, like EGLStream_CUDA_CrossGPU, uses a CUDA consumer/producer instead of OpenGL ones.

In my test setup (based on EGLStream_CUDA_CrossGPU) the consumer was created first, as you suggest.

Thanks,
Vlad.

vlad · July 6, 2022, 5:33am

I did that intentionally to simulate behavior closer to our real case. We can’t keep the consumers always running, because they’re desktop applications. Users of the system can open and close them at any time.

Do you mean that the EGL stream implementation can’t properly support dynamic creation and destruction of multiple streams?
Are there any guidelines or documentation on using a CUDA consumer/producer with EGL Streams?

Thanks!

DaneLLL · July 6, 2022, 5:54am

Hi,
The document describes the functions, and the sample is for demonstration. In the sample code, the consumer/producer is initialized and destroyed along with the process. After applying the patch, the processes stay alive and the consumer/producer is initialized and destroyed in a loop. This is not the same as the sample and may not work properly.

For your use case, one possible solution is to have a consumer/producer daemon: whenever you need a consumer-producer connection, it can fork one child process for the consumer and one child process for the producer. After the consumer-producer connection is done, you can destroy the consumer/producer and exit the processes.
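
The fork-per-connection idea can be sketched roughly as follows. This is only an illustration of the process pattern, not EGL or CUDA code: run_in_child and session are hypothetical names, and session stands in for whatever sets up and tears down one producer or consumer connection. It assumes the daemon forks before it initializes CUDA or EGL itself, since a CUDA context is generally not usable in a forked child.

```python
import os


def run_in_child(session):
    """Run `session()` in a forked child and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: every FD it opens (streams, dma-buf FDs) is released by the
        # kernel when the process exits, even if the session never closes it.
        code = 1
        try:
            session()
            code = 0
        finally:
            # _exit() skips the parent's atexit handlers and buffered output.
            os._exit(code)
    # Parent (daemon): wait for the connection to finish, then report status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Because each connection lives and dies with its own process, the long-running daemon's FD table is not affected by whatever the session leaves open.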

system · July 27, 2022, 3:17am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
