Hi everyone!
I have been working with 2D arrays (array[W][H]) in CUDA for some time, and from the beginning I have calculated my index like this:
int idx = threadIdx.x + blockIdx.x * blockDim.x;
int idy = threadIdx.y + blockIdx.y * blockDim.y;
int index = idx + W * idy;
The problem with this method is that different threads can end up with the same index. For example, in my case:
W = 256;
H = 24;
dim3 threadPerBlock(24, 8);
dim3 dimGrid((W / threadPerBlock.x) + 1, (H / threadPerBlock.y) + 1); // which means dim3 dimGrid(11, 4)
So, for example, for index = 256, thread (16,0) of block (10,0) matches (idx = 256, idy = 0), and thread (0,1) of block (0,0) matches as well (idx = 0, idy = 1).
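To check by hand which threads land on a given index, the same arithmetic can be replayed on the host without launching a kernel. This is only a sketch: `Launch`, `Hit` and `threadsHitting` are hypothetical names (not part of the CUDA API), and the nested loops simply stand in for blockIdx and threadIdx.

```cpp
#include <vector>

// Hypothetical description of a launch: array size, block size, grid size.
struct Launch { int W, H, blockW, blockH, gridW, gridH; };

// One (block, thread) pair.
struct Hit { int blockX, blockY, threadX, threadY; };

// Returns every (block, thread) pair for which idx + W * idy == target,
// using the same formulas as the kernel. Host-side emulation only.
std::vector<Hit> threadsHitting(const Launch& c, int target) {
    std::vector<Hit> hits;
    for (int by = 0; by < c.gridH; ++by)
        for (int bx = 0; bx < c.gridW; ++bx)
            for (int ty = 0; ty < c.blockH; ++ty)
                for (int tx = 0; tx < c.blockW; ++tx) {
                    int idx = tx + bx * c.blockW;   // threadIdx.x + blockIdx.x * blockDim.x
                    int idy = ty + by * c.blockH;   // threadIdx.y + blockIdx.y * blockDim.y
                    if (idx + c.W * idy == target)
                        hits.push_back({bx, by, tx, ty});
                }
    return hits;
}
```

Calling threadsHitting(Launch{256, 24, 24, 8, 11, 4}, 256) with the sizes from this post lists every thread that computes index 256. More than one entry in the result means the index is not unique.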
Until now this has not been a problem, but now I need a unique index for every thread!
I also tried calculating the index like this:
int idx = threadIdx.x + blockIdx.x * blockDim.x;
int idy = threadIdx.y + blockIdx.y * blockDim.y;
int index = idy * (gridDim.x * blockDim.x) + idx;
That seems to work, but only when W divides evenly by the block width, and that is not my case because 256/24 is not a whole number.
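Whether a given row pitch produces overlapping indices can also be checked on the host. The sketch below is an assumption-laden emulation, not kernel code: `duplicatedIndexCount` and its parameter names are made up for illustration, and it only counts how many index values are computed by more than one thread when every thread uses index = idx + pitch * idy.

```cpp
#include <vector>

// Counts the distinct index values produced by more than one thread when
// each thread computes index = idx + pitch * idy.
// Host-side emulation; assumes pitch > 0 and positive block/grid sizes.
int duplicatedIndexCount(int blockW, int blockH, int gridW, int gridH, int pitch) {
    const int spanX = blockW * gridW;   // idx takes each value in [0, spanX) once per row
    const int spanY = blockH * gridH;   // idy takes each value in [0, spanY)
    // Largest possible index is (spanX - 1) + pitch * (spanY - 1).
    std::vector<int> hits((spanX - 1) + pitch * (spanY - 1) + 1, 0);
    for (int idy = 0; idy < spanY; ++idy)
        for (int idx = 0; idx < spanX; ++idx)
            ++hits[idx + pitch * idy];
    int duplicated = 0;
    for (int h : hits)
        if (h > 1) ++duplicated;        // value reached by two or more threads
    return duplicated;
}
```

With the launch from this post, pitch = W (256) should report overlapping values, while pitch = gridDim.x * blockDim.x (11 * 24 = 264) should report none. The catch is that a 264-wide pitch no longer lines up with rows that are W = 256 elements wide.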
Are there other ways to calculate unique indices?