cuInit() memory leak

The CUDA API has a cuInit() call.

There is no cuExit() call.

That wouldn’t be so bad if it were not for the fact that cuInit() leaks memory.

Either cuInit() should not leak resources, or there should be a cuExit() call.

I verified this with valgrind on Linux.

I get a whole bunch of leaks, one of which is even a pthread leak.

==8766== 384 bytes in 1 blocks are possibly lost in loss record 741 of 890
==8766==    at 0x3B84711A: calloc (vg_replace_malloc.c:760)
==8766==    by 0x3AC284F6: allocate_dtv (dl-tls.c:286)
==8766==    by 0x3AC284F6: _dl_allocate_tls (dl-tls.c:530)
==8766==    by 0x3DEAB227: allocate_stack (allocatestack.c:627)
==8766==    by 0x3DEAB227: pthread_create@@GLIBC_2.2.5 (pthread_create.c:644)
==8766==    by 0x3C2D80DE: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x3C2CD025: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x3C29E104: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x3C33CF33: cuInit (in /usr/lib/x86_64-linux-gnu/libcuda.so.440.100)
==8766==    by 0x42A172: osino_client_init (osino_cuda.c:78)
==8766==    by 0x4092EF: ctrl_init() (ctrl.cpp:344)
==8766==    by 0x4090A9: ctrl_create(int, int, float) (ctrl.cpp:367)
==8766==    by 0x4062C9: main (sdlmain.cpp:451)

Bump. Still an issue with the latest cuda lib.

Did you file a bug report with NVIDIA? If so, what is its status?

Filing a bug may be a good idea. I’m quite certain that if you file one, you will be asked for a full description of how to reproduce your observation; what is currently in this thread is definitely not enough.

Regarding leaks, attention to them usually arises when there is repetitive creation and destruction of a resource. If that repeated process results in ever-increasing memory utilization, it’s the typical symptom of a leak that I am familiar with.

Since cuInit() would never be used that way (it only makes sense to call it once per process), you might need to be more specific about what you mean by a leak. If cuInit() acquires a resource and fails to release it immediately prior to process termination, I don’t think that would generally fit my definition of a “leak”. Termination of the process results in automatic destruction/release of all its resources; I have never observed a counter-example to this.

Regarding the various posts where people report CUDA device memory still in use after termination of a process, in my experience this simply means they haven’t terminated the proper processes, which I acknowledge may be difficult to do in some cases. The multi-process structure of Python certainly doesn’t help here, but I’ve never witnessed an ordinary C++ process fail to fully release its resources upon termination. Also, since this mostly results (in my experience) from abnormal process termination, having a tidy sequence of free operations prior to normal process termination is no guarantee of anything.

Specifically with respect to valgrind, there has been a continuing situation where valgrind may produce “reports” based on usage of the CUDA APIs. One of the reasons for this is ioctl usage, which in some cases valgrind cannot properly inspect without help. That help involves the creation of descriptors/files by the application developer (in this case, the CUDA devs) to instruct valgrind what to do with a particular observation.

I’ve never seen any indication that the CUDA devs intend to create such descriptors/files, and valgrind reports of this type have surfaced before with no resulting activity that I have witnessed. But of course you are welcome to file bug(s).
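In the meantime, valgrind’s suppression mechanism can at least silence these reports per project. A sketch of a suppression file (the entry name is arbitrary and the single `obj:` frame pattern is illustrative, chosen to match the libcuda frames in the backtrace above):

```
# cuda.supp — pass to valgrind with --suppressions=cuda.supp
{
   cuinit-possible-leaks
   Memcheck:Leak
   match-leak-kinds: possible
   ...
   obj:*/libcuda.so*
}
```

Run as `valgrind --leak-check=full --suppressions=cuda.supp ./app`; matching records are then counted as suppressed rather than listed, so genuine leaks in your own code remain visible.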