Parallelizing cudaMalloc on different GPU cards

Hello, forum!

Is there a way to call cudaMalloc from two different CPU threads?

I’m trying to allocate memory on 2 GPU cards in parallel, using functions whose first line is a cudaSetDevice call to select which card each thread will use. When I compile my code, the error output says that the pthreads library isn’t available to nvcc. Would streams be useful in this case?

Thanks for any answers.

You can have a cudaMalloc operation in 2 different CPU threads, targeting 2 different cards. Take a look at the OpenMP sample code:

http://docs.nvidia.com/cuda/cuda-samples/index.html#cudaopenmp

I have used pthreads with nvcc, targeting multiple GPUs. I’m not sure what you mean by “pthreads library isn’t available to nvcc”.
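
As a minimal sketch of the pattern described above (one CPU thread per GPU, each calling cudaSetDevice before cudaMalloc), something like this should work; the function and variable names here are illustrative, not from your code:

```cuda
#include <cstdio>
#include <pthread.h>
#include <cuda_runtime.h>

// Each CPU thread binds itself to one GPU, then allocates on that GPU.
void *allocOnDevice(void *arg) {
    int dev = *(int *)arg;
    cudaSetDevice(dev);                      // select this thread's GPU
    void *d_buf = NULL;
    cudaError_t err = cudaMalloc(&d_buf, 1 << 20);  // 1 MiB on that GPU
    printf("device %d: cudaMalloc -> %s\n", dev, cudaGetErrorString(err));
    if (err == cudaSuccess) cudaFree(d_buf);
    return NULL;
}

int main() {
    int devCount = 0;
    cudaGetDeviceCount(&devCount);
    int n = devCount < 2 ? devCount : 2;     // use up to two GPUs

    pthread_t threads[2];
    int ids[2] = {0, 1};
    for (int i = 0; i < n; ++i)
        pthread_create(&threads[i], NULL, allocOnDevice, &ids[i]);
    for (int i = 0; i < n; ++i)
        pthread_join(threads[i], NULL);
    return 0;
}
```

On Linux this typically builds with something like `nvcc app.cu -o app`; if the linker complains about pthreads, passing the flag through to the host compiler (e.g. `-Xcompiler -pthread`) may be what's missing.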

Certainly you can use streams to manage concurrency, but each stream is associated with a particular device:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#device-selection

They are not a replacement for CPU threads if you need host-side multithreading.
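
To make the device association concrete, here is a short sketch (assuming at least two GPUs are present): a stream belongs to whichever device was current when it was created, and kernel launches must go into a stream of the current device:

```cuda
#include <cuda_runtime.h>

int main() {
    cudaStream_t s0, s1;

    cudaSetDevice(0);
    cudaStreamCreate(&s0);   // s0 is associated with device 0

    cudaSetDevice(1);
    cudaStreamCreate(&s1);   // s1 is associated with device 1

    // A kernel launch into s0 would fail here, because device 1 is
    // current; you would have to call cudaSetDevice(0) first.

    cudaSetDevice(0);
    cudaStreamDestroy(s0);
    cudaSetDevice(1);
    cudaStreamDestroy(s1);
    return 0;
}
```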