Multiple thread/process access to single GPU

Hi,

I’m working with legacy C code from which I would like to offload compute-intensive routines/code blocks to the GPU using CUDA. I have a GPU mounted in a workstation with 2 dual-core CPUs. My legacy code is multi-process: routines run concurrently in separate parallel processes on all 4 cores, and all of these routines would like to access the GPU. Is there a way to schedule tasks from all 4 cores to run concurrently on the GPU, or otherwise use the GPU optimally?

Thanks,
GP

The driver does it for you. Simply launch a kernel from each of your CPU threads.
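For example, a minimal sketch with pthreads (untested; the kernel, sizes, and names are just placeholders):

[code]
// Each CPU thread targets the same GPU and launches its own kernel.
// The driver handles scheduling the launches on the device.
#include <pthread.h>
#include <cuda_runtime.h>

__global__ void scale(float *d, float f, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= f;
}

void *worker(void *arg)
{
    int n = 1 << 20;
    float *d;
    cudaSetDevice(0);                          // all threads use the same GPU
    cudaMalloc(&d, n * sizeof(float));
    scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);
    cudaDeviceSynchronize();                   // wait for this thread's kernel
    cudaFree(d);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; ++i) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; ++i) pthread_join(t[i], NULL);
    return 0;
}
[/code]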

Paulius

[quote name='paulius' date='May 12 2008, 01:14 PM']
The driver does it for you. Simply launch a kernel from each of your CPU threads.
[/quote]

Thanks Paulius, but can we launch a kernel from each separate process to the same GPU at the same time?

Yes, but they won't be executed concurrently on a single GPU; they will run sequentially.

Also worth noting: the CUDA contexts of different threads/processes live in separate address spaces, so you cannot share data between them on the GPU.
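To illustrate (an untested sketch): if two processes fork before initializing CUDA, each gets its own context on the same physical GPU, and a device pointer allocated in one process is meaningless in the other:

[code]
// Parent and child each become an independent CUDA client on device 0.
// Note the fork happens before any CUDA call, since a CUDA context
// cannot be used across fork().
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <cuda_runtime.h>

int main(void)
{
    pid_t pid = fork();

    float *d_buf;
    cudaSetDevice(0);               // same physical GPU for both processes
    cudaMalloc(&d_buf, 1 << 20);    // allocation lives in THIS process's context
    printf("pid %d got device pointer %p\n", (int)getpid(), (void *)d_buf);
    cudaFree(d_buf);

    if (pid > 0) wait(NULL);
    return 0;
}
[/code]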

Can I launch the same kernel into different streams from (i) separate processes and (ii) separate threads? Could this be a way to overlap data transfer and compute for separate processes and/or separate threads?
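To make the question concrete, within a single thread I have something like this sketch in mind (untested; kernel and sizes are placeholders), where each stream's copy can overlap with another stream's compute on hardware that supports it:

[code]
// Per-stream pipeline: async H2D copy, kernel, async D2H copy.
// Pinned host memory (cudaMallocHost) is required for the copies
// to actually run asynchronously.
#include <cuda_runtime.h>

__global__ void work(float *d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;
}

int main(void)
{
    const int n = 1 << 20, nstreams = 2;
    cudaStream_t s[nstreams];
    float *h, *d;

    cudaMallocHost(&h, nstreams * n * sizeof(float));   // pinned host memory
    cudaMalloc(&d, nstreams * n * sizeof(float));
    for (int i = 0; i < nstreams; ++i) cudaStreamCreate(&s[i]);

    for (int i = 0; i < nstreams; ++i) {
        float *hp = h + i * n, *dp = d + i * n;
        cudaMemcpyAsync(dp, hp, n * sizeof(float), cudaMemcpyHostToDevice, s[i]);
        work<<<(n + 255) / 256, 256, 0, s[i]>>>(dp, n);
        cudaMemcpyAsync(hp, dp, n * sizeof(float), cudaMemcpyDeviceToHost, s[i]);
    }
    cudaDeviceSynchronize();    // wait for all streams to finish

    for (int i = 0; i < nstreams; ++i) cudaStreamDestroy(s[i]);
    cudaFree(d);
    cudaFreeHost(h);
    return 0;
}
[/code]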