CUDA Device Sharing


I’m working on a multithreaded application that runs some CUDA filters on a stream of images.

I have two CPU threads, and each one must run a CUDA kernel on a different image at the same time.

My question is: is it safe for two CPU threads to run concurrently, each allocating device memory and launching kernels? So far I've tried it and it seems to work, but that could just be a coincidence. I'm puzzled by the following statement in the CUDA programming guide:
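For context, here is roughly what each worker thread does. This is a simplified sketch, not my actual filter code; the kernel, image sizes, and function names are placeholders:

```cuda
#include <cuda_runtime.h>
#include <thread>
#include <vector>

// Placeholder kernel standing in for one of the image filters.
__global__ void filterKernel(float* img, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) img[i] *= 0.5f;  // dummy per-pixel operation
}

// Each CPU thread runs this independently: allocate, copy in, launch, copy out.
void processImage(const std::vector<float>& host, std::vector<float>& out) {
    int n = static_cast<int>(host.size());
    float* dImg = nullptr;
    cudaMalloc(&dImg, n * sizeof(float));
    cudaMemcpy(dImg, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    filterKernel<<<(n + 255) / 256, 256>>>(dImg, n);
    cudaMemcpy(out.data(), dImg, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dImg);
}

int main() {
    std::vector<float> imgA(1 << 20, 1.0f), imgB(1 << 20, 2.0f);
    std::vector<float> outA(imgA.size()), outB(imgB.size());
    // Two CPU threads touching the same device at the same time.
    std::thread t1(processImage, std::cref(imgA), std::ref(outA));
    std::thread t2(processImage, std::cref(imgB), std::ref(outB));
    t1.join();
    t2.join();
    return 0;
}
```

Both threads call into the runtime on the same device with no explicit synchronization between them, which is exactly the situation the quoted passage makes me worry about.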

“Once the runtime has been initialized in a host thread, any resource (memory, stream, event, etc.) allocated via some runtime function call in the host thread is only valid within the context of the host thread. Therefore only runtime functions calls made by the host thread (memory copies, kernel launches, …) can operate on these resources. This is because a CUDA context is created under the hood as part of initialization and made current to the host thread, and it cannot be made current to any other host thread.”

Does this mean that two separate contexts exist when two threads make CUDA calls, or rather that only one thread should be making CUDA calls at all?

Any help will be greatly appreciated.