Is CUDA thread-safe?

I have multiple CPU threads calling CUDA, with potential overlap. The threads do the usual copy-data-to-card, invoke kernel, copy-results-from-card logic. Are the CUDA calls thread-safe? If two threads invoke a kernel at the same time (for example, on a dual-core CPU) is that supported?

Same question for async calls…

So far it’s all been rock-solid, but I’ve not seen mention of this in the documentation so was hoping for some confirmation.

Thanks!

When using multiple threads with CUDA, bear in mind that every thread has its own CUDA context. That means that if GPU memory is allocated in one thread, a second thread does not have access to that GPU memory.
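As a rough sketch of the copy, launch, copy-back pattern from the question, the code below has each host thread allocate, use and free its own device buffer, so no device pointer is ever handed from one thread to another. The kernel `scale`, the function `worker` and the use of `std::thread` are assumptions made for illustration and do not come from the thread. Note that since CUDA 4.0 the runtime shares one context per device among the host threads of a process, so on newer toolkits the restriction above is looser; keeping buffers thread-local works under either model.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles every element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Each host thread owns its device buffer from allocation to release,
// so no device pointer crosses a thread boundary.
void worker(int id, int n, bool *ok) {
    *ok = false;
    std::vector<float> host(n, static_cast<float>(id + 1));
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    float *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return;

    // Copy input to the card.
    cudaError_t err = cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Launch the kernel; the launch itself returns immediately.
        scale<<<(n + 255) / 256, 256>>>(dev, n);
        err = cudaGetLastError();
    }
    // Copy results back; this blocking copy waits for the kernel to finish.
    if (err == cudaSuccess)
        err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);

    *ok = (err == cudaSuccess) && host[0] == 2.0f * (id + 1);
}

int main() {
    const int kThreads = 2;  // e.g. one per core on a dual-core CPU
    bool ok[kThreads] = {false, false};

    std::vector<std::thread> threads;
    for (int t = 0; t < kThreads; ++t)
        threads.emplace_back(worker, t, 1 << 20, &ok[t]);
    for (auto &th : threads) th.join();

    for (int t = 0; t < kThreads; ++t)
        std::printf("thread %d: %s\n", t, ok[t] ? "ok" : "failed");
    return 0;
}
```

Something like `nvcc -o threads threads.cu` should build it; an older toolchain may need `-std=c++11` for `std::thread`.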

As for launching kernels at the same time, this is not a problem from a CUDA point of view. Multiple kernel calls are automatically put in a queue and executed sequentially.
For performance reasons it may be important to know that once this queue is full (I think it currently allows for 16 kernel calls to be queued), the CPU thread will synchronize and wait until a slot in the queue is free, i.e. until a kernel has finished. This synchronization is a busy-waiting loop run by the CPU.
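One possible way to reduce CPU spinning is sketched below, assuming a toolkit that provides `cudaSetDeviceFlags` with `cudaDeviceScheduleBlockingSync` (CUDA 4.0 or later). The flag asks the runtime to block the host thread on a synchronization primitive instead of spinning while it waits for the device. It is documented for synchronizing calls such as `cudaDeviceSynchronize`; whether it also changes the wait on a full launch queue is not confirmed here. The kernel `busy` and the launch count are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: just keeps the GPU occupied for a while.
__global__ void busy(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 1000; ++k) v = v * 0.5f + 1.0f;
        data[i] = v;
    }
}

int main() {
    // Request blocking waits. This is done before any other runtime call,
    // because older toolkits reject the flag once the context is active.
    cudaError_t err = cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
    if (err != cudaSuccess) {
        std::printf("cudaSetDeviceFlags: %s\n", cudaGetErrorString(err));
        return 1;
    }

    const int n = 1 << 20;
    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return 1;
    cudaMemset(dev, 0, n * sizeof(float));

    // Kernel launches are asynchronous: each call only enqueues work
    // and returns to the host thread before the kernel has finished.
    for (int launch = 0; launch < 8; ++launch)
        busy<<<(n + 255) / 256, 256>>>(dev, n);

    // With the flag set, the host thread should sleep here rather than spin.
    err = cudaDeviceSynchronize();
    std::printf("sync: %s\n", cudaGetErrorString(err));

    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}
```

The trade-off is latency: a blocked thread has to be woken by the operating system, so spinning can still be the better choice for very short kernels.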

Thanks! Got to get rid of these busy loops :-)

Good to know!
But will there also be overlap of memory copies and kernel execution, like with multiple streams?