Multi-User GPGPU

Hello,

we have a Tesla S1070 GPU rack in our cluster system that we want to use in multi-user mode. Unfortunately, this does not work correctly. I wrote a DGEMM benchmark using CUBLAS in which the dgemm routine is called many times. When another user starts a CUDA application, in most cases its execution is blocked, but after several tries the second application starts and my benchmark is killed with an unspecified launch failure. How does the driver check the availability of the device? And why can another application kill my program even though my device memory has not been freed yet? This is a real issue for our multi-user mode…
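For context, the benchmark boils down to a loop like the sketch below. This is not the original code, just a minimal reconstruction using the CUBLAS v2 API; the matrix size `N` and iteration count are illustrative. The error check after each call is where the "unspecified launch failure" shows up once the other user's application has taken over the device.

```cuda
// Minimal sketch of a repeated-DGEMM benchmark (CUBLAS v2 API).
// N and iters are placeholder values, not from the original benchmark.
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    const int N = 1024, iters = 100;
    const double alpha = 1.0, beta = 0.0;
    size_t bytes = (size_t)N * N * sizeof(double);

    double *dA, *dB, *dC;
    cudaMalloc((void**)&dA, bytes);
    cudaMalloc((void**)&dB, bytes);
    cudaMalloc((void**)&dC, bytes);

    cublasHandle_t handle;
    cublasCreate(&handle);

    for (int i = 0; i < iters; ++i) {
        cublasStatus_t st = cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                                        N, N, N, &alpha, dA, N, dB, N,
                                        &beta, dC, N);
        cudaError_t err = cudaDeviceSynchronize();
        if (st != CUBLAS_STATUS_SUCCESS || err != cudaSuccess) {
            // In the failure case described above, err reports
            // "unspecified launch failure" once the second app starts.
            fprintf(stderr, "iteration %d failed: %s\n",
                    i, cudaGetErrorString(err));
            break;
        }
    }

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```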

CUDA Driver Version: 3.10
CUDA Runtime Version: 3
Compute mode: Exclusive (only one host thread at a time can use this device)
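The compute mode reported above can also be queried programmatically through the CUDA runtime API, which is handy for checking each node of the cluster; a small sketch:

```cuda
// Print the compute mode of device 0 via the CUDA runtime API.
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    switch (prop.computeMode) {
        case cudaComputeModeDefault:
            printf("Default (multiple host threads may use the device)\n");
            break;
        case cudaComputeModeExclusive:
            printf("Exclusive (only one host thread at a time)\n");
            break;
        case cudaComputeModeProhibited:
            printf("Prohibited (no host thread may use the device)\n");
            break;
    }
    return 0;
}
```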

Kind Regards,

Tim

It sounds a lot like this bug. Compute exclusivity seems to break for kernels that require a reasonably large number of registers. Tim Murray indicated a fix is coming for it, but I don't know precisely which driver versions are affected, or when in the release cycle the fix will make it into the drivers.

Mh… Thanks for the link to the other thread. I think I am using the latest driver. Is there a way to sign up to a notification list for the corresponding bug? Or does anybody know when this will be fixed?

The fix for this bug is coming out with CUDA 3.2/R260.

Ok. Thanks for that.
