Sharing GPUs on multi-GPU, multi-user systems: when cudaSetDevice() goes bad

PBS Pro (and probably SGE, LSF, Torque and all the other batch systems) knows squat about allocating GPUs, nor does the CUDA runtime seem to allow exclusive ownership of a device.

Here’s an LD_PRELOAD-able shim that overrides cudaSetDevice() and ensures that GPU-requesting programs will get exclusive use of a GPU, or die trying.
It arbitrates GPU allocation with lockfiles in /var/lock/cuda (or the directory given by CUDA_LOCKFILE_DIR).
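
In outline, the shim does something like the following (a simplified sketch, not the full source; the lockfile naming and the try_lock_gpu helper here are illustrative): the override scans the devices, claims the first one whose lockfile it can create with O_CREAT|O_EXCL, and forwards to the real cudaSetDevice() found via dlsym(RTLD_NEXT, ...).

#define _GNU_SOURCE                     /* for RTLD_NEXT */
#include <dlfcn.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <cuda_runtime.h>

/* Illustrative helper: try to create a per-GPU lockfile; O_CREAT|O_EXCL
   fails if another process already owns the device. */
static int try_lock_gpu(int dev)
{
    const char *dir = getenv("CUDA_LOCKFILE_DIR");
    char path[256];
    snprintf(path, sizeof(path), "%s/gpu%d.lock",
             dir ? dir : "/var/lock/cuda", dev);
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0)
        return 0;
    dprintf(fd, "%d\n", getpid());      /* record the owner's PID */
    close(fd);
    return 1;
}

/* The override: ignore the requested device, take the first free one. */
cudaError_t cudaSetDevice(int requested)
{
    cudaError_t (*real)(int) =
        (cudaError_t (*)(int))dlsym(RTLD_NEXT, "cudaSetDevice");
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    (void)requested;
    for (int dev = 0; dev < ndev; ++dev) {
        if (try_lock_gpu(dev)) {
            if (getenv("CUDA_LOCKFILE_VERBOSE"))
                fprintf(stderr, "cuPlayNicely: claimed GPU %d\n", dev);
            return real(dev);
        }
    }
    fprintf(stderr, "cuPlayNicely: no free GPU available\n");
    exit(EXIT_FAILURE);                 /* or die trying */
}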

Use it by setting LD_PRELOAD=/path/to/cuPlayNicely.so. It will tell you what it’s up to if you set CUDA_LOCKFILE_VERBOSE.

Enjoy.

Matt
cuPlayNicely.tar (10 KB)

Cool! Thanks for sharing.

Wow, never thought about this approach…nice work!

But I was just wondering: will cudaSetDevice() be called at all if the user didn’t write that call manually…?

No. If it’s not called explicitly in the user code, device #0 will be used. I can’t think of an obvious way to override that.

You could always move the call to the real cudaSetDevice() to the DSO’s constructor, but then any program that preloads the library will acquire a GPU, required or not.
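
Something like this hypothetical fragment (with a pick_free_gpu() helper standing in for the lockfile scan sketched above):

/* Hypothetical fragment: claim a GPU as soon as the DSO is loaded,
   rather than waiting for the application to call cudaSetDevice().
   Downside, as noted: every preloaded program grabs a GPU. */
__attribute__((constructor))
static void claim_gpu_on_load(void)
{
    cudaError_t (*real)(int) =
        (cudaError_t (*)(int))dlsym(RTLD_NEXT, "cudaSetDevice");
    real(pick_free_gpu());   /* hypothetical helper: lockfile scan as above */
}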

Also, for completeness, you’d probably want to override the driver API’s set-device function, in case you have a user who’s into that sort of self-abuse.
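
In the driver API the device is bound at context creation, so the hook would presumably be cuCtxCreate(). A rough, untested fragment along the same lines:

#include <cuda.h>

/* Hypothetical driver-API hook: the device is pinned when the context
   is created, so arbitrate there. Caveat: newer toolkits #define
   cuCtxCreate to a versioned entry point (e.g. cuCtxCreate_v2), so the
   name handed to dlsym must match whatever the application actually calls. */
CUresult cuCtxCreate(CUcontext *pctx, unsigned int flags, CUdevice dev)
{
    CUresult (*real)(CUcontext *, unsigned int, CUdevice) =
        (CUresult (*)(CUcontext *, unsigned int, CUdevice))
            dlsym(RTLD_NEXT, "cuCtxCreate");
    (void)dev;                              /* ignore the caller's choice */
    CUdevice freedev;
    cuDeviceGet(&freedev, pick_free_gpu()); /* hypothetical helper as above */
    return real(pctx, flags, freedev);
}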

M

Device management will be less of an issue soon.

Hello
Thank you for this useful tool.

But there is a little question:
how can the function “my_fini” be called automatically to release the device even when the program is interrupted?
Is there a command to “unload” the library after one kills the program?

Thanks,

Guillermo