Where should I call cuCtxDestroy in a DLL?

I have a DLL that uses a global CUDA context (so that all threads can share the same GPU memory). I call cuCtxCreate() in the C++ constructor of a global object, but it’s not clear where to put cuCtxDestroy(), because I also have global variables (image buffers) that call cudaFree in their destructors.

The relative program order is:

  1. Constructors()
  2. DLL_PROCESS_ATTACH
  3. DLL_PROCESS_DETACH
  4. Destructors()

If I put cuCtxDestroy() in a C++ destructor of a global, there is no guarantee it will be called after the destructors for the image buffers.

The only solution I can think of is to get rid of all global objects that use CUDA in their constructors or destructors, and instead create and destroy those objects dynamically in DLL_PROCESS_ATTACH and DLL_PROCESS_DETACH. But that’s less convenient than being able to declare global image buffers anywhere in the code.

Is there a better way?

Try calling cuCtxCreate() at the very beginning of your program and cuCtxDestroy() at the very end of your program. Do not call either function in constructors or destructors and do not use global variables.
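Here is a minimal sketch of what I mean, not a drop-in implementation. GpuSession, ImageBuffer, EventLog and run_example are names I made up for illustration, and the CUDA calls are replaced by log entries so the sketch runs without a GPU. The idea is that one object, created at the start of the program, owns every buffer and frees them all before it destroys the context.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in event log so the sketch runs without a GPU.
using EventLog = std::vector<std::string>;

// Hypothetical image buffer. A real version would allocate device memory
// in the constructor and call cudaFree() in the destructor.
class ImageBuffer {
public:
    ImageBuffer(EventLog& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("alloc " + name_);
    }
    ~ImageBuffer() { log_.push_back("free " + name_); }
    ImageBuffer(const ImageBuffer&) = delete;
    ImageBuffer& operator=(const ImageBuffer&) = delete;
private:
    EventLog& log_;
    std::string name_;
};

// Hypothetical session object: created once at the very beginning of the
// program and destroyed once at the very end. It is not a global.
class GpuSession {
public:
    explicit GpuSession(EventLog& log) : log_(log) {
        log_.push_back("ctx create");   // real code: cuCtxCreate()
    }
    ~GpuSession() {
        // Free every buffer (newest first) before the context goes away.
        while (!buffers_.empty()) buffers_.pop_back();
        log_.push_back("ctx destroy");  // real code: cuCtxDestroy()
    }
    ImageBuffer& make_buffer(const std::string& name) {
        buffers_.push_back(std::make_unique<ImageBuffer>(log_, name));
        return *buffers_.back();
    }
private:
    EventLog& log_;
    std::vector<std::unique_ptr<ImageBuffer>> buffers_;
};

// What the program's entry point would do, reduced to the essentials.
EventLog run_example() {
    EventLog log;
    {
        GpuSession session(log);
        session.make_buffer("a");
        session.make_buffer("b");
    }   // buffers are freed here, then the context is destroyed
    assert(log.back() == "ctx destroy");
    return log;
}
```

Because the session owns the buffers, the order no longer depends on how the compiler orders global destructors. The price is that buffers have to be requested from the session instead of being declared anywhere in the code.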

Let me know if it works.


Doesn’t unloading the DLL destroy the context anyway?

If you create the context when you load the DLL then you would definitely destroy the context when you unload the DLL. But as I understood it, that’s not what Uncle Joe wants.

Speaking of unloading the DLL, there is some weird behavior, which may be a bug or a feature. If you destroy the context when you unload the DLL, the memory reserved by the context upon creation (about 100 MB on a GTX 480) is not deallocated. If you load and unload the DLL repeatedly, you will eventually run out of memory.