Hello,
I have implemented the following function for inter-block synchronization (to sync threads across different blocks).
__device__ void inter_block_barrier(unsigned int *count)
{
    int value;
    if((threadIdx.x==0)&&(threadIdx.y==0))
    {
        value=atomicInc(count,gridDim.x*gridDim.y);
        while(count[0] !=0);
    }
    __syncthreads();
}
I am using 20,000 blocks in my code, each with 64 threads. My device is a GT200. This function does not seem to synchronize the blocks. Can anyone see a flaw in the logic?
The variable count[0] is set to zero before the function is called (though ideally that should not be needed).
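For reference, here is a small standalone sketch for checking how the counter used by the barrier behaves. As far as I understand the CUDA programming guide, atomicInc(address, val) reads the old value, stores ((old >= val) ? 0 : (old+1)), and returns the old value. The names atomic_inc_probe, limit, seen and calls are made up for this test and are not part of my real code; limit = 4 just stands in for gridDim.x*gridDim.y, and a single thread is used so the sequence of values is easy to read.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical probe kernel (not from the real code): one thread calls
// atomicInc repeatedly and records the value each call returns, i.e. the
// counter value just BEFORE that call's update.
__global__ void atomic_inc_probe(unsigned int *count, unsigned int limit,
                                 unsigned int *seen, int calls)
{
    for (int i = 0; i < calls; ++i) {
        // Stores (old >= limit) ? 0 : old + 1 and returns old.
        seen[i] = atomicInc(count, limit);
    }
}

int main()
{
    const unsigned int limit = 4;  // assumption: stands in for gridDim.x*gridDim.y
    const int calls = 6;           // a few more calls than limit, to see the wrap
    unsigned int *d_count = 0;
    unsigned int *d_seen = 0;
    unsigned int h_seen[calls];

    cudaMalloc((void **)&d_count, sizeof(unsigned int));
    cudaMalloc((void **)&d_seen, calls * sizeof(unsigned int));
    cudaMemset(d_count, 0, sizeof(unsigned int));  // counter starts at zero

    atomic_inc_probe<<<1, 1>>>(d_count, limit, d_seen, calls);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(h_seen, d_seen, sizeof(h_seen), cudaMemcpyDeviceToHost);

    for (int i = 0; i < calls; ++i)
        printf("call %d: value before increment = %u\n", i + 1, h_seen[i]);

    cudaFree(d_count);
    cudaFree(d_seen);
    return 0;
}
```

The printed sequence shows on which call the counter goes back to zero, which is the condition the while loop in the barrier is waiting for.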
Thanks and regards,
Nachiket