CUDA provides no efficient, reliable inter-block synchronization mechanism except kernel launches themselves. If you split your calculation into two kernels and launch them in the same stream, CUDA guarantees that all blocks of the first kernel have finished, and their writes to device memory are visible, before the second kernel starts. Kernel launch overhead is low enough that you shouldn't worry about it unless your kernels run for less than about 5 to 10 microseconds.
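As a sketch of this pattern, the hypothetical two-phase reduction below uses the kernel boundary as its inter-block synchronization point: the first kernel writes one partial sum per block, and the second kernel, launched in the same stream, can safely read every entry because all of the first kernel's blocks are guaranteed to have completed. The kernel and variable names are illustrative, not from the original text.

```cuda
#include <cuda_runtime.h>

// Phase 1: each block reduces its slice of `in` into one partial sum.
// No block can see another block's result within this kernel.
__global__ void partialSums(const float* in, float* partial, int n) {
    extern __shared__ float s[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    s[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    // Standard shared-memory tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) s[tid] += s[tid + stride];
        __syncthreads();
    }
    if (tid == 0) partial[blockIdx.x] = s[0];
}

// Phase 2: reads ALL partial sums. This is only safe because the
// kernel boundary guarantees phase 1 finished and its device-memory
// writes are visible before this kernel begins.
__global__ void finalSum(const float* partial, float* out, int numBlocks) {
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        float total = 0.0f;
        for (int b = 0; b < numBlocks; ++b) total += partial[b];
        *out = total;
    }
}

void reduce(const float* d_in, float* d_partial, float* d_out, int n) {
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    // Same stream (the default stream here), so the launches execute
    // in order; no explicit synchronization call is needed between them.
    partialSums<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_partial, n);
    finalSum<<<1, 1>>>(d_partial, d_out, blocks);
}
```

Note that the ordering guarantee comes from the stream, not from any barrier in the code: launching the two kernels into different streams would forfeit it.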