Multiple processes sharing the same device: MPI+CUDA

Smurty · May 5, 2017, 9:39pm

Hi all,

I have multiple processes sharing a single device. Each process transfers some data to the GPU and then launches a kernel. I understand that CUDA creates a separate context for each process and that kernels in different contexts are executed serially. So, I was wondering if the same is true for memory transfers as well?

In other words, are memory transfers from multiple processes to a GPU executed serially or concurrently?

Thanks!

Robert_Crovella · May 5, 2017, 10:06pm

PCIe bus transfers in a particular direction are always serialized, even when they originate from the same process. I believe transfers in opposite directions can overlap, if you have satisfied the rules for concurrent transfers.
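
As a rough illustration of what satisfying those rules might look like inside a single process, here is a minimal sketch. It assumes the usual requirements: page-locked (pinned) host buffers, `cudaMemcpyAsync` issued into two different non-default streams, and a device that reports two copy engines (`asyncEngineCount == 2`). The function name `bidirectional_copy`, the `CHECK` macro, and the buffer sizes are made up for the example.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper: bail out of the enclosing int-returning function on error.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t e_ = (call);                                        \
        if (e_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,          \
                    cudaGetErrorString(e_));                            \
            return 1;                                                   \
        }                                                               \
    } while (0)

// Issues one host-to-device and one device-to-host copy in separate streams.
// Whether they actually overlap depends on the hardware (see asyncEngineCount).
int bidirectional_copy(size_t bytes) {
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("asyncEngineCount = %d\n", prop.asyncEngineCount);

    void *h_in = nullptr, *h_out = nullptr, *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMallocHost(&h_in, bytes));   // pinned host memory
    CHECK(cudaMallocHost(&h_out, bytes));  // pinned host memory
    CHECK(cudaMalloc(&d_in, bytes));
    CHECK(cudaMalloc(&d_out, bytes));
    memset(h_in, 0, bytes);
    CHECK(cudaMemset(d_out, 0, bytes));

    cudaStream_t up, down;
    CHECK(cudaStreamCreate(&up));
    CHECK(cudaStreamCreate(&down));

    // Opposite directions, different streams: eligible to run concurrently.
    CHECK(cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, up));
    CHECK(cudaMemcpyAsync(h_out, d_out, bytes, cudaMemcpyDeviceToHost, down));
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaStreamDestroy(up));
    CHECK(cudaStreamDestroy(down));
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    CHECK(cudaFreeHost(h_in));
    CHECK(cudaFreeHost(h_out));
    return 0;
}

int main() {
    return bidirectional_copy(64u << 20);  // 64 MiB in each direction
}
```

A timeline in a profiler is the easiest way to confirm whether the two copies really overlap on a given card.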

You may want to take a look at CUDA MPS.

Smurty · May 6, 2017, 5:50am

Thanks, I’ll look at MPS.

So then, there is no real point in having multiple processes share the same device using normal CUDA, right? Since the calls will all be executed serially.

tera · May 6, 2017, 1:12pm

It certainly simplifies matters if you only have one process (less space used for contexts, potential for overlapping kernels). You lose memory protection of course.
PCIe transfers in opposite directions can overlap if you use Tesla or Quadro cards. On GeForce cards you can program around that issue by using cudaMemcpy() for one direction and mapped (aka zero-copy) memory and a custom kernel for the other. But you’d still need the kernels to be in the same context, i.e. come from the same process.
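
A minimal sketch of that workaround, under some assumptions: the upload uses the async variant `cudaMemcpyAsync` from pinned memory in its own stream so the host does not block, and the download is done by a small copy kernel that writes through a mapped host pointer. The names `copy_to_mapped`, `zero_copy_demo`, and `CHECK` are hypothetical, and the grid size is arbitrary.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: bail out of the enclosing int-returning function on error.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t e_ = (call);                                        \
        if (e_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,          \
                    cudaGetErrorString(e_));                            \
            return 1;                                                   \
        }                                                               \
    } while (0)

// Grid-stride copy: reads device memory, stores into mapped host memory,
// so the device-to-host traffic is generated by the kernel itself.
__global__ void copy_to_mapped(const float *src, float *mapped_dst, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (; i < n; i += stride) mapped_dst[i] = src[i];
}

int zero_copy_demo(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *h_up = nullptr, *h_down = nullptr, *h_down_dev = nullptr;
    float *d_up = nullptr, *d_down = nullptr;

    CHECK(cudaMallocHost((void **)&h_up, bytes));                        // pinned
    CHECK(cudaHostAlloc((void **)&h_down, bytes, cudaHostAllocMapped));  // mapped
    CHECK(cudaHostGetDevicePointer((void **)&h_down_dev, h_down, 0));
    CHECK(cudaMalloc((void **)&d_up, bytes));
    CHECK(cudaMalloc((void **)&d_down, bytes));
    for (size_t i = 0; i < n; ++i) { h_up[i] = 1.0f; h_down[i] = -1.0f; }
    CHECK(cudaMemset(d_down, 0, bytes));  // device data to download: all zeros

    cudaStream_t up, down;
    CHECK(cudaStreamCreate(&up));
    CHECK(cudaStreamCreate(&down));

    // Host-to-device via the copy engine, device-to-host via the kernel.
    CHECK(cudaMemcpyAsync(d_up, h_up, bytes, cudaMemcpyHostToDevice, up));
    copy_to_mapped<<<256, 256, 0, down>>>(d_down, h_down_dev, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // The kernel should have overwritten the -1.0f fill with zeros.
    int ok = (h_down[0] == 0.0f && h_down[n - 1] == 0.0f);

    CHECK(cudaStreamDestroy(up));
    CHECK(cudaStreamDestroy(down));
    CHECK(cudaFree(d_up));
    CHECK(cudaFree(d_down));
    CHECK(cudaFreeHost(h_up));
    CHECK(cudaFreeHost(h_down));
    return ok ? 0 : 1;
}

int main() {
    // Older setups may need this before any other runtime call to enable mapping.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    return zero_copy_demo(1u << 24);
}
```

Both the copy and the kernel are issued from one process here, which matches the point above that they need to share a context.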
