How to do GPU allocation in N GPU + M process env

Can anyone tell me how, for example, on a two (or more) CPU box with a four-GPU Tesla unit attached, I can ensure that, of two processes running, each process uses a free GPU rather than a GPU on which the other process's kernel is already executing? Can I scan for free GPUs and 'lock' one for my use? I don't want to end up with processes' kernels being serialised while GPUs sit idle.

regards
Mark

Interestingly, CUDA APIs do NOT support this. You just cannot figure out who is using what.

People on the forum have attempted to come up with heuristics to figure out which GPU is in use (by estimating the percentage of free CUDA memory, etc.), but there is still no formal, fool-proof mechanism to lock and allocate a GPU for a CUDA operation, AFAIK. Not sure if things have changed much with CUDA 2.0.

I went through the CUDA 2.0 docs; things haven't changed in this direction :(

While stuffing GPU after GPU into a Tesla box or the X2s or whatever, NVIDIA has to provide a way to allocate a free GPU…

Come on… This is the most basic thing that any programmer would expect to find in an API.

I don't think this is a big thing to implement.

When can we expect this support in the driver? Is it in the pipeline? Kindly throw some light on this. More and more people are getting annoyed by the lack of such basic support…

I wrote a Python script that scans the output of lsof /dev/nvidia*. CUDA apps open the NVIDIA device they use with the "mem" descriptor. The Python script returns a free GPU number, which is then passed to the CUDA application on the command line. To avoid race conditions, GPUs are considered "reserved" for 30 s, allowing time for the program to initialize and acquire the GPU before another program starts running. Jobs are run with the Sun Grid Engine.

Since X opens /dev/nvidia* with the mem descriptor too, this method only works on boxes with X disabled.
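
For illustration, here is a simplified sketch of the idea (not the actual script): it assumes four GPUs and no X server, and it leaves out the 30 s reservation window.

```python
# Sketch of the lsof-based scanner described above. Assumes four GPUs,
# no X server running, and omits the 30 s "reserved" window used to
# reduce races between jobs starting close together.
import subprocess

NUM_GPUS = 4  # assumption: adjust to the number of GPUs in the box

def busy_gpus():
    """Return the set of GPU indices that some process has mapped via 'mem'."""
    devices = ["/dev/nvidia%d" % i for i in range(NUM_GPUS)]
    out = subprocess.Popen(["lsof"] + devices, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE,
                           universal_newlines=True).communicate()[0]
    busy = set()
    for line in out.splitlines():
        fields = line.split()
        # lsof columns: COMMAND PID USER FD TYPE ... NAME
        if len(fields) >= 5 and fields[3] == "mem":
            suffix = fields[-1][len("/dev/nvidia"):]
            if suffix.isdigit():
                busy.add(int(suffix))
    return busy

def pick_free_gpu():
    """Return the lowest-numbered GPU with no CUDA process attached, or None."""
    in_use = busy_gpus()
    for i in range(NUM_GPUS):
        if i not in in_use:
            return i
    return None

if __name__ == "__main__":
    print(pick_free_gpu())
```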

Another option is to run a job scheduler (like the Sun Grid Engine), create a number of resources GPU1, GPU2, GPU3, …, and have jobs written to request a specific GPU. Unless all your jobs take roughly the same time, though, you may end up with a pile of jobs waiting for GPU2 while all the GPU1 jobs have finished.

Neither of these approaches is ideal, but that is really all we have to work with. There have been feature requests on file with NVIDIA since CUDA 0.8 to add something to the API to solve this problem, but nothing has ever come of it.

One other simple idea people have suggested is to check the amount of free memory on the card with cuMemGetInfo to determine whether it is in use. You can try it, but race conditions can easily put two programs on the same GPU.
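
As a rough illustration, a minimal sketch of that heuristic using PyCUDA's mem_get_info() (which wraps cuMemGetInfo) might look like this; the 90% free-memory threshold is an arbitrary assumption and the race condition still applies:

```python
# Free-memory heuristic via PyCUDA's wrapper around cuMemGetInfo.
# The 90% threshold is an arbitrary assumption; two programs can still
# race and end up on the same GPU.
import pycuda.driver as drv

drv.init()

def pick_gpu_by_free_memory(threshold=0.9):
    """Return the first GPU whose free/total memory ratio exceeds `threshold`,
    or None if every GPU looks busy."""
    for i in range(drv.Device.count()):
        ctx = drv.Device(i).make_context()
        try:
            free, total = drv.mem_get_info()  # same numbers as cuMemGetInfo
        finally:
            ctx.pop()
            ctx.detach()
        if free / float(total) > threshold:
            return i
    return None

if __name__ == "__main__":
    print(pick_gpu_by_free_memory())
```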

You could write a small scheduler and use the GPUWorker class (search the forum for it).

Thanks for the suggestions.

The feedback from NVIDIA has been that there are currently two options, both also suggested here: use a custom daemon process to arbitrate access, or use the "mem" descriptor information from /dev/nvidia* on Linux.

The obvious thing to do would be to provide this in the API and implement it in the driver. I'm told they've had a number of requests for this, and I've added my name to that list. Hopefully something will come of that.

In the meantime, I'll be investigating a simple library that implements the arbitration logic, using a shared-memory segment to record GPU allocations. I really need some way to kill a running kernel and re-init a GPU as well, though!
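
As a starting point, something along these lines might work, using per-GPU lock files with fcntl advisory locks as a simpler stand-in for a proper shared-memory segment; the lock directory and GPU count are just placeholders:

```python
# Arbitration sketch: per-GPU lock files with fcntl advisory locks as a
# simpler stand-in for a shared-memory segment. The lock directory and GPU
# count are assumptions; locks are released automatically when the holding
# process exits, so a crashed job cannot leave a GPU permanently "reserved".
import fcntl
import os

NUM_GPUS = 4
LOCK_DIR = "/var/lock/cuda"  # assumed location; must exist and be writable

def acquire_free_gpu():
    """Try to lock each GPU in turn; return (gpu_id, lock_fd) or (None, None)."""
    for gpu in range(NUM_GPUS):
        fd = os.open(os.path.join(LOCK_DIR, "gpu%d.lock" % gpu),
                     os.O_CREAT | os.O_RDWR)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return gpu, fd  # keep fd open for the lifetime of the process
        except IOError:
            os.close(fd)    # another process holds this GPU
    return None, None

if __name__ == "__main__":
    gpu, lock_fd = acquire_free_gpu()
    if gpu is None:
        raise SystemExit("no free GPU")
    print("using GPU %d" % gpu)
    # ... pass `gpu` to cudaSetDevice() in the CUDA application ...
```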