Thread-Safe? cublas.dll / cufft.dll

Does anybody know whether cublas.dll and cufft.dll are thread-safe?

TIA
chris

Is anything wrong with my question, or does really nobody know?

It would help if you were more specific. What are you trying to do?

The libraries operate correctly when called from multiple threads assigned to different GPUs.

AFAIK there is no way to assign anything to a specific GPU in (at least) CUBLAS. It is a (simple) function library that doesn't deal with devices at its interface (the exported functions).

But my question is simple: can you use the DLL from different host threads of one host process at the same time?

The next question might be: how do you initialize it in this case? Once per process, or once per thread?

hope that clarifies what i want to know…

CUBLAS may not provide the means to deal with devices, but it still needs a CUDA device context to operate. If you haven't explicitly set the device using cudaSetDevice(), a device will be selected for you at the first call.

Reading sections 4.5.2.2 and 4.5.3.3 of the programming guide might help.
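To make the point about the device context concrete, here is a minimal single-threaded sketch. It assumes the legacy cublas.h API (cublasInit() / cublasShutdown()) and that device 0 exists; the device index is only an illustration. The idea is just to pick the device yourself before the first CUBLAS call rather than rely on the implicit default.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cublas.h>  // legacy CUBLAS API (cublasInit / cublasShutdown)

int main() {
    // Assumption: device 0 exists. Select it explicitly before any
    // CUBLAS call so the library does not fall back to a default device.
    const int device = 0;
    if (cudaSetDevice(device) != cudaSuccess) {
        std::fprintf(stderr, "cudaSetDevice(%d) failed\n", device);
        return 1;
    }

    // cublasInit() must be called before any other CUBLAS function.
    if (cublasInit() != CUBLAS_STATUS_SUCCESS) {
        std::fprintf(stderr, "cublasInit failed\n");
        return 1;
    }

    // ... CUBLAS calls go here, using memory allocated on this device ...

    cublasShutdown();
    return 0;
}
```

Something like `nvcc example.cu -lcublas` should build it, assuming the CUDA toolkit is installed.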

Your question is not as simple as you think. For example:

If you have two CUDA contexts assigned to two different threads (and therefore two GPUs - one thread for each), each thread allocates memory on its own GPU, and each thread passes only its own pointers to CUBLAS, then CUBLAS is thread-safe.

But if you allocate memory on one GPU in one thread and then use those pointers in simultaneous calls to CUBLAS from another thread, CUBLAS is not thread-safe.

So, the libraries are thread-safe if each thread has a different CUDA context - meaning each thread is assigned to a different GPU. To put it another way, the libraries are thread-safe but CUDA contexts are not.
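A rough sketch of that "one host thread per GPU" pattern, in case it helps. This is not confirmed by anyone from NVIDIA in this thread; it assumes the legacy cublas.h API, uses std::thread for the host threads, and the worker() function and vector size are made up for illustration. Each thread selects its own device, initializes CUBLAS itself, and only ever touches pointers it allocated.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>
#include <cublas.h>  // legacy CUBLAS API

// Hypothetical worker: one host thread bound to one GPU.
static void worker(int device, int *ok) {
    *ok = 0;
    // Bind this thread to its own device before any CUBLAS call.
    if (cudaSetDevice(device) != cudaSuccess) return;
    // Assumption from the thread: initialize CUBLAS once per host thread.
    if (cublasInit() != CUBLAS_STATUS_SUCCESS) return;

    const int n = 1024;
    std::vector<float> host(n, 1.0f);
    float *dev = nullptr;

    // Allocate, use, and free device memory inside the same thread;
    // the pointer is never handed to another thread.
    if (cublasAlloc(n, sizeof(float), (void **)&dev) == CUBLAS_STATUS_SUCCESS) {
        cublasSetVector(n, sizeof(float), host.data(), 1, dev, 1);
        cublasSscal(n, 2.0f, dev, 1);  // scale the vector by 2 on the GPU
        cublasGetVector(n, sizeof(float), dev, 1, host.data(), 1);
        *ok = (cublasGetError() == CUBLAS_STATUS_SUCCESS) ? 1 : 0;
        cublasFree(dev);
    }

    // Shut down in the thread that called cublasInit().
    cublasShutdown();
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::fprintf(stderr, "no CUDA device found\n");
        return 1;
    }

    std::vector<int> ok(count, 0);
    std::vector<std::thread> threads;
    for (int d = 0; d < count; ++d)
        threads.emplace_back(worker, d, &ok[d]);  // one thread per GPU
    for (auto &t : threads)
        t.join();

    for (int d = 0; d < count; ++d)
        std::printf("device %d: %s\n", d, ok[d] ? "ok" : "failed");
    return 0;
}
```

The main thread never calls CUBLAS here, so the only contexts involved are the ones the worker threads create for themselves.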

Initialize what? CUBLAS?

"…
Function cublasInit()

initializes the CUBLAS library and must be called before any other
CUBLAS API function is invoked. It allocates hardware resources
necessary for accessing the GPU.
…"

Nowhere in CUBLAS is there a way to create contexts or select devices. I know there is in CUDA. So I keep asking that "simple" question: is it host-thread-safe, apart from passing GPU device pointers between threads, of course?
Does CUBLAS handle thread-specific things, like creating a distinct context for each thread it is called from?

But by now I think it is thread-safe and I have to call cublasInit() once in each thread…

Maybe someone from NVIDIA can answer that. If it behaves like the rest of CUDA, a context for GPU 0 is created if one doesn't already exist.

Try it and let us know if it works. I’d be worried an exiting thread would clean up the context and cause a segfault in another thread - I saw this occur in CUFFT.