I am having a problem with inter-thread communication.
I launch a single grid containing one block of 32 threads:
int main()
{
    dim3 block(32, 1);
    dim3 grid(1, 1);
    Test<<<grid, block>>>(x);
}
I thought the kernel below would do what I wanted:
__global__ void Test(int *x)
{
    __shared__ int temp = *x;
    __syncthreads();
    temp++;
    __syncthreads();
    *x = temp;
}
I expected *x to be 32 afterwards (one increment from each of the 32 threads).
It only works as expected in EmuDebug.
In Release, the code above gives me 1.
I have no idea what the problem is.
Can you help me?
Thanks.
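One possible explanation, offered as an assumption rather than a confirmed diagnosis: temp++ is a read-modify-write that all 32 threads perform on the same shared variable at the same time, so the increments can overwrite one another. Below is a minimal sketch of a variant that makes each increment atomic with atomicAdd. The names TestAtomic and countThreads are hypothetical and not part of the code above, and the sketch assumes *x starts at 0 and that the GPU supports atomics on shared memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical variant of Test: each thread adds 1 to a shared counter atomically.
__global__ void TestAtomic(int *x)
{
    __shared__ int temp;

    // Only one thread loads the starting value into shared memory.
    if (threadIdx.x == 0) {
        temp = *x;
    }
    __syncthreads();  // make the initial value visible to every thread

    // Atomic read-modify-write, so no increment is lost.
    atomicAdd(&temp, 1);
    __syncthreads();  // wait until every thread has added its 1

    // Only one thread writes the result back to global memory.
    if (threadIdx.x == 0) {
        *x = temp;
    }
}

// Hypothetical host helper: runs one block of 32 threads and returns the result.
// Assumption: the counter starts at 0. Error checking is omitted for brevity.
int countThreads()
{
    int h_x = 0;
    int *d_x = nullptr;

    cudaMalloc(&d_x, sizeof(int));
    cudaMemcpy(d_x, &h_x, sizeof(int), cudaMemcpyHostToDevice);

    dim3 block(32, 1);
    dim3 grid(1, 1);
    TestAtomic<<<grid, block>>>(d_x);

    // This copy waits for the kernel to finish before reading the value.
    cudaMemcpy(&h_x, d_x, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    return h_x;
}
```

Calling countThreads() from main should return one increment per thread in the block. The two __syncthreads() calls are kept so that the initial load and the final store do not overlap with the increments.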