cuInit() memory leak

The CUDA API has a cuInit() call.

There is no cuExit() call.

That wouldn’t be so bad if it were not for the fact that cuInit() leaks memory.

Either cuInit() should not leak resources, or there should be a cuExit() call.

I verified this with valgrind on Linux.

I get a whole bunch of leaks, one of which is even a pthread leak.

==8766== 384 bytes in 1 blocks are possibly lost in loss record 741 of 890
==8766==    at 0x3B84711A: calloc (vg_replace_malloc.c:760)
==8766==    by 0x3AC284F6: allocate_dtv (dl-tls.c:286)
==8766==    by 0x3AC284F6: _dl_allocate_tls (dl-tls.c:530)
==8766==    by 0x3DEAB227: allocate_stack (allocatestack.c:627)
==8766==    by 0x3DEAB227: pthread_create@@GLIBC_2.2.5 (pthread_create.c:644)
==8766==    by 0x3C2D80DE: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x3C2CD025: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x3C29E104: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x3C33CF33: cuInit (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x42A172: osino_client_init (osino_cuda.c:78)
==8766==    by 0x4092EF: ctrl_init() (ctrl.cpp:344)
==8766==    by 0x4090A9: ctrl_create(int, int, float) (ctrl.cpp:367)
==8766==    by 0x4062C9: main (sdlmain.cpp:451)

Bump. Still an issue with the latest cuda lib.

Did you file a bug report with NVIDIA? If so, what is its status?

Filing a bug may be a good idea. I’m quite certain that if you file one, you will be asked for a full description of how to reproduce your observation; what is currently in this thread is definitely not enough.

Regarding leaks, attention to them usually arises when there is repetitive creation and destruction of a resource. If that repeated process results in ever-increasing memory utilization, it’s the typical symptom of a leak that I am familiar with.

Since cuInit() would never be used that way (it only makes sense to call it once per process), you might need to be more specific about what you mean by a leak. If cuInit() acquires a resource and fails to release it immediately prior to process termination, I don’t think that would generally fit my definition of a “leak”. Termination of the process results in automatic destruction/release of all its resources; I have never observed a counter-example to this.

Regarding the various posts where people report CUDA device memory still in use after termination of a process, in my experience this simply means they haven’t terminated the proper processes, which I acknowledge may be difficult to do in some cases. The multi-process structure of Python certainly doesn’t help here, but I’ve never witnessed an ordinary C++ process fail to fully release its resources upon termination. Also, since this mostly results (in my experience) from abnormal process termination, having a tidy sequence of free operations prior to normal process termination is no guarantee of anything.

Specifically with respect to valgrind, there has been a continuing situation where valgrind may produce “reports” based on usage of the CUDA APIs. One of the reasons for this is ioctl usage, which in some cases valgrind cannot properly inspect without help. That help involves the creation of descriptors/files by the application developer (in this case, the CUDA devs) to instruct valgrind what to do with a particular observation.

I’ve never seen any indication that the CUDA devs intend to create such descriptors/files, and valgrind reports of this type have surfaced before with no resulting activity that I have witnessed. But of course you are welcome to file bug(s).
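In the meantime, valgrind’s suppression mechanism can at least silence these reports per project. A sketch of a suppression file (the entry name is arbitrary and the single `obj:` frame pattern is illustrative, chosen to match the libcuda frames in the backtrace above):

```
# cuda.supp — pass to valgrind with --suppressions=cuda.supp
{
   cuinit-possible-leaks
   Memcheck:Leak
   match-leak-kinds: possible
   ...
   obj:*/libcuda.so*
}
```

Run as `valgrind --leak-check=full --suppressions=cuda.supp ./app`; matching records are then counted as suppressed rather than listed, so genuine leaks in your own code remain visible.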