strange behaviour/weird bug

I have already spent many days and hours on this.

I have a huge kernel that is invoked many thousands of times, and suddenly it seems that the device memory of one block is copied to another block. It always happens at the same moment, and it makes no difference whether I build in release or debug mode. Any hints?

__syncthreads();

if ( threadIdx.x == 0 )
    sdata[0] = gdata[ blockIdx.x ];
__syncthreads();

…Code…

__syncthreads();

if ( threadIdx.x == 0 )
    gdata[ blockIdx.x ] = sdata[0];

__syncthreads();
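
To make the pattern easier to discuss, here is a minimal sketch of what the kernel boils down to, together with a host-side check for cross-block corruption. Everything named here is made up for illustration: the kernel name `per_block_round_trip`, the helper `count_corrupted_slots`, the constant `kStride`, and the single increment that stands in for the elided code. The real kernel is much larger, so this only shows how the symptom could be detected, not where it comes from.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumption: slot b starts at b * kStride, so a value that leaked in from
// another block is easy to recognise afterwards.
constexpr int kStride = 100000;

// Hypothetical stand-in for the kernel in the post: each block loads its own
// slot of gdata into shared memory, works on it, and writes it back.
__global__ void per_block_round_trip(int *gdata)
{
    __shared__ int sdata[1];

    if (threadIdx.x == 0)
        sdata[0] = gdata[blockIdx.x];
    __syncthreads();  // every thread of the block now sees the loaded value

    // Placeholder for the elided code: only thread 0 modifies sdata[0].
    if (threadIdx.x == 0)
        sdata[0] += 1;
    __syncthreads();  // all work on sdata is finished

    if (threadIdx.x == 0)
        gdata[blockIdx.x] = sdata[0];
}

// Launches the kernel `launches` times and returns how many slots differ from
// the value their own block should have produced. Returns -1 on a CUDA error.
int count_corrupted_slots(int num_blocks, int threads_per_block, int launches)
{
    assert(num_blocks > 0 && num_blocks <= 20000);    // keeps b * kStride in int range
    assert(threads_per_block > 0 && launches >= 0 && launches < kStride);

    std::vector<int> host(num_blocks);
    for (int b = 0; b < num_blocks; ++b)
        host[b] = b * kStride;

    int *dev = nullptr;
    const size_t bytes = num_blocks * sizeof(int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess)
        return -1;
    if (cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return -1;
    }

    for (int i = 0; i < launches; ++i) {
        per_block_round_trip<<<num_blocks, threads_per_block>>>(dev);
        // Launch failures stay silent unless they are queried explicitly.
        if (cudaGetLastError() != cudaSuccess) {
            cudaFree(dev);
            return -1;
        }
    }

    // Errors raised while a kernel runs surface at the next synchronising call.
    if (cudaDeviceSynchronize() != cudaSuccess ||
        cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost) != cudaSuccess) {
        cudaFree(dev);
        return -1;
    }
    cudaFree(dev);

    int corrupted = 0;
    for (int b = 0; b < num_blocks; ++b) {
        if (host[b] != b * kStride + launches) {
            std::printf("slot %d holds %d\n", b, host[b]);
            ++corrupted;
        }
    }
    return corrupted;
}
```

Calling `count_corrupted_slots(64, 128, 1000)` from `main` should return 0 for this stripped-down version, since each block only ever touches its own slot. If the same check is wrapped around the real kernel and reports mismatches, the slot indices and the foreign values give a first hint about which block wrote where.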