Locking on streams in CUDA?

Jules123 · April 25, 2013, 6:31pm

Hi,

I am trying to find a way to have the CPU block when GPUs have run out of streams.

I have arrays:
cudaStream_t streams[MAX_GPU_COUNT][STREAMS_MAX];
boost::mutex mutexes[MAX_GPU_COUNT][STREAMS_MAX];

The data and results are kept in separate arrays, again dimensioned by [MAX_GPU_COUNT][STREAMS_MAX].

I want to use the streams to launch multiple kernels on my Titan cards under Windows 7, locking the appropriate mutex in the mutexes array before each launch. I then want to use cudaStreamAddCallback to process the results and unlock the mutex, marking the stream as available again for another kernel launch.
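
A minimal sketch of one way to make the launching thread block when no stream is free. It rests on assumptions that are not in the post: std::mutex and std::condition_variable stand in for Boost, a plain std::thread stands in for whatever thread would run a cudaStreamAddCallback callback, and StreamSlotPool and simulate_launches are hypothetical names. It uses a pool of free slot indices rather than one mutex per stream, because a std::mutex has to be unlocked by the thread that locked it, and a stream callback generally runs on a different thread from the one that launched the kernel.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: tracks which stream indices of one GPU are free.
class StreamSlotPool {
public:
    explicit StreamSlotPool(std::size_t count) {
        // Store indices so that slot 0 is handed out first.
        for (std::size_t i = 0; i < count; ++i) free_.push_back(count - 1 - i);
    }

    // Called by the launching thread; blocks while every slot is busy.
    std::size_t acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !free_.empty(); });
        assert(!free_.empty());
        std::size_t slot = free_.back();
        free_.pop_back();
        return slot;
    }

    // Safe to call from any thread, e.g. the one running a stream callback.
    void release(std::size_t slot) {
        {
            std::lock_guard<std::mutex> lock(m_);
            free_.push_back(slot);
        }
        cv_.notify_one();
    }

    // Number of slots currently free.
    std::size_t available() {
        std::lock_guard<std::mutex> lock(m_);
        return free_.size();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::vector<std::size_t> free_;
};

// Stand-in for "launch a kernel, then the callback fires": each job takes a
// slot and a separate thread gives it back. Returns the number of jobs issued.
std::size_t simulate_launches(StreamSlotPool& pool, std::size_t jobs) {
    std::vector<std::thread> callbacks;
    std::size_t issued = 0;
    for (std::size_t j = 0; j < jobs; ++j) {
        std::size_t slot = pool.acquire();  // blocks when all slots are in use
        ++issued;
        // A real program would launch on streams[gpu][slot] here.
        callbacks.emplace_back([&pool, slot] { pool.release(slot); });
    }
    for (auto& t : callbacks) t.join();
    return issued;
}
```

With one pool per GPU, acquire() would take the place of locking mutexes[gpu][stream] before a launch, and release(slot) would be the last thing the callback does.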

Then I discovered that boost/thread.hpp does not compile using nvcc. I tried the advice to separate device and host code but was not able to do that because I am using Thrust as well.

I wonder if anyone knows of an elegant way to do the locking and unlocking?

Hope that makes sense,
Jules

tera · April 25, 2013, 9:01pm

Why can’t you have the callback in nvcc-compiled code call a function in cpp-compiled code that calls into Thrust?

Jules123 · April 25, 2013, 9:51pm

Thanks Tera,

You got me thinking. I have managed to separate the host and device code, so I am using boost::mutex now.

However, my design has a flaw. If the callback routine or a method it calls crashes, then the corresponding mutex in mutexes[MAX_GPU_COUNT][STREAMS_MAX] is never unlocked. I know that it is recommended to wrap a mutex in a class and call unlock in its destructor, but how can that be done with CUDA callbacks?
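
A minimal sketch of the wrap-it-in-a-destructor idea applied to the body of a callback, under assumptions that are not in the post: StreamSlot, SlotReleaser, process_results and on_stream_done are hypothetical names, an atomic busy flag stands in for one entry of the mutexes array, and "crash" is taken to mean a C++ exception or an early return. A hard fault such as an access violation would not run destructors, so this does not cover that case.

```cpp
#include <atomic>
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical per-stream "busy" flag, standing in for mutexes[gpu][stream].
struct StreamSlot {
    std::atomic<bool> busy{false};
};

// Marks the slot free when it goes out of scope, on every exit path.
class SlotReleaser {
public:
    explicit SlotReleaser(StreamSlot& slot) : slot_(slot) {}
    ~SlotReleaser() { slot_.busy.store(false); }
    SlotReleaser(const SlotReleaser&) = delete;
    SlotReleaser& operator=(const SlotReleaser&) = delete;

private:
    StreamSlot& slot_;
};

// Hypothetical result handler; throws to imitate a failure during processing.
void process_results(bool fail) {
    if (fail) throw std::runtime_error("result processing failed");
}

// Body of what a stream callback would run. Returns true on success.
// Exceptions are caught here so that none escapes into the caller.
bool on_stream_done(StreamSlot& slot, bool fail) {
    assert(slot.busy.load());     // the slot was marked busy at launch time
    SlotReleaser releaser(slot);  // destructor frees the slot however we leave
    try {
        process_results(fail);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

In a real callback registered with cudaStreamAddCallback, the slot would presumably be reached through the userData pointer, and the releaser would be the first local object constructed.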

Thanks for any help,
Jules

tera · April 26, 2013, 12:33am

I guess that question boils down to “how to detect that my code has crashed”. I’m not sure why you’d expect your callback routine to crash. I’d much more expect something to go wrong on the CUDA side, where you could call the class’s destructor from the error-handling code that handles return codes other than cudaSuccess. You could also call cudaStreamQuery() in a few strategic places to see if the stream is still busy or whether there is any error pending.
