Curious little problem that may be something few folks have, but before I try to solve it I wanted to know what was out there.
I have a “cuda sandbox” machine with an integrated GPU (SM1.1), a 295 (2xSM1.3) and a Tesla (SM1.3, much memory). I also use a build system (SCons) pushing cuda around with python-wrapped C/C++.
What I’d really like to be able to do is say “scons -uj2”, meaning that two processes can run at once. The first of those processes should be able to say “I want an SM 1.3 card with 800MB of memory” and get half of the 295; the second would then come along and automatically get the other half of the 295 that’s not in use. With -uj3, the third process would get the Tesla, and with -uj4 the fourth process would say “there aren’t any cards available; I’ll sleep and check every so often to see when one opens up.” So what I want boils down to:
- Set min requirements for the card you want (SM and memory for now)
- Get the first card not in use by another app matching those requirements (or the “least impressive” card matching those requirements)
- If no cards matching the min requirements are available, either return with an error (so the app can exit or wait) or just wait for a card to become available, sleeping nicely until one does
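For what it’s worth, the matching/waiting part of that list is easy to sketch in plain Python, independent of any GPU API. Everything here is a placeholder: the `DeviceInfo` record and the `in_use` flag would have to be filled in from the driver API (or pycuda, or whatever) in practice — this just shows the “least impressive matching card, else sleep and retry” policy:

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class DeviceInfo:
    """Hypothetical per-card record; fill from the CUDA driver API in practice."""
    index: int
    sm_major: int
    sm_minor: int
    free_mem_mb: int
    in_use: bool

def pick_card(devices, min_sm=(1, 3), min_mem_mb=800):
    """Return the 'least impressive' free card meeting the minimums, or None."""
    candidates = [d for d in devices
                  if not d.in_use
                  and (d.sm_major, d.sm_minor) >= min_sm
                  and d.free_mem_mb >= min_mem_mb]
    if not candidates:
        return None
    # Least impressive first: lowest SM version, then least free memory.
    return min(candidates, key=lambda d: ((d.sm_major, d.sm_minor), d.free_mem_mb))

def wait_for_card(poll_devices, min_sm=(1, 3), min_mem_mb=800,
                  poll_secs=5.0, timeout: Optional[float] = None):
    """Sleep-and-retry until a matching card frees up (or the timeout expires)."""
    deadline = None if timeout is None else time.time() + timeout
    while True:
        card = pick_card(poll_devices(), min_sm, min_mem_mb)
        if card is not None:
            return card
        if deadline is not None and time.time() >= deadline:
            return None
        time.sleep(poll_secs)
```

With half the 295 free and the Tesla free, `pick_card` hands back the 295 half first (same SM, less memory), keeping the big-memory card available for a job that actually needs it.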
I can see how to do much of it except the “first card not used by another app” part… maybe that could be done with nvidia-smi, which I’ve seen referenced (but I don’t seem to have it). But it seemed a simple enough thing that I figured someone else has probably solved it by now with a nice library (or just a scrap of code). Any suggestions? I don’t want a full queueing system (in a sense this is a single multithreaded application with different threads on different cards; hopefully that will work OK, but it doesn’t make sense with a cluster-style queuing system); just something simple.
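One simple way to get “first card not used by another app” — at least among cooperating processes, which is all the SCons -j case really needs — is a per-card lock file: each spawned process tries a non-blocking `flock` on `/tmp/cuda-card-N.lock` and takes the first card it wins. This is only a sketch under the assumption that all contenders go through the same mechanism (nothing stops an unrelated app from grabbing a card behind its back), with a made-up lock-file naming scheme:

```python
import fcntl
import os

def acquire_card(candidate_indices, lockdir="/tmp"):
    """Try to lock one of the given card indices.

    Returns (index, lock_fd) on success, None if every card is taken.
    The kernel drops the flock automatically when the process exits or
    the fd is closed, so a crashed job can't wedge a card.
    """
    for idx in candidate_indices:
        path = os.path.join(lockdir, "cuda-card-%d.lock" % idx)
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o666)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return idx, fd   # we own this card until fd is closed
        except OSError:
            os.close(fd)     # someone else holds it; try the next card
    return None

def release_card(fd):
    """Give the card back explicitly (exit would do it implicitly)."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

The nice property of `flock` over, say, writing a PID into a file is that stale locks clean themselves up — the -uj4 fourth process can just retry `acquire_card` in a sleep loop until a lock frees up.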