We’ve recently moved our second 8800 GTX into the same server as our first GTX card, and we need a way to ensure concurrent CUDA jobs don’t use the same card.
I’d like to request a new Device Management function to make this easier:
cudaError_t cudaSetDeviceLeastUsed(int *dev);

Sets the active host thread to run device code on the CUDA device being used by the fewest host processes. The selected device number is returned in *dev. This function is atomic, so if there are N available devices and N processes call cudaSetDeviceLeastUsed() at the same time, they are all guaranteed to be assigned to different devices.
For the time being, we’re adding this logic into the program we run most frequently, but a global way to do this right for all CUDA jobs would be nice.