Hi everyone!

I have been working with 2D arrays (array[W][H]) in CUDA for some time, and from the beginning I have computed my index like this:

```
int idx = threadIdx.x + blockIdx.x * blockDim.x; // global column
int idy = threadIdx.y + blockIdx.y * blockDim.y; // global row
int index = idx + W * idy;
```

The problem with this method is that different threads can end up with the same index. For example, in my case:

```
const int W = 256;
const int H = 24;
dim3 threadPerBlock(24, 8);
dim3 dimGrid((W / threadPerBlock.x) + 1, (H / threadPerBlock.y) + 1); // i.e. dim3 dimGrid(11, 4)
```

So, for example, index 256 is produced both by thread (16,0) of block (10,0), where idx = 16 + 10*24 = 256 and idy = 0, and by thread (0,1) of block (0,0), where idx = 0 and idy = 1.
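To double-check this on the CPU, here is a small host-only sketch (plain C++, nothing runs on the GPU) that repeats the same index arithmetic over every thread of the launch and counts how many land on a given index. The `Dim2` struct and both function names are hypothetical helpers of mine standing in for `threadIdx`, `blockIdx`, `blockDim` and `gridDim`.

```cpp
// Hypothetical stand-in for the x/y parts of CUDA's dim3 / uint3.
struct Dim2 { int x, y; };

// Same arithmetic as in the kernel: index = idx + W * idy.
int linear_index(Dim2 thread, Dim2 block, Dim2 blockDim, int W) {
    int idx = thread.x + block.x * blockDim.x; // global column
    int idy = thread.y + block.y * blockDim.y; // global row
    return idx + W * idy;
}

// Walks every (block, thread) pair of the launch and counts how many
// of them compute exactly `target` as their index.
int threads_mapping_to(int target, Dim2 gridDim, Dim2 blockDim, int W) {
    int count = 0;
    for (int by = 0; by < gridDim.y; ++by)
        for (int bx = 0; bx < gridDim.x; ++bx)
            for (int ty = 0; ty < blockDim.y; ++ty)
                for (int tx = 0; tx < blockDim.x; ++tx)
                    if (linear_index(Dim2{tx, ty}, Dim2{bx, by}, blockDim, W) == target)
                        ++count;
    return count;
}
```

With my configuration, `threads_mapping_to(256, Dim2{11, 4}, Dim2{24, 8}, 256)` returns 2: one thread has global coordinates (idx, idy) = (256, 0) and the other has (0, 1).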

Until now this has not been a problem, but now I need unique indexing!

I also calculated the index doing :

```
int idx = threadIdx.x + blockIdx.x * blockDim.x;
int idy = threadIdx.y + blockIdx.y * blockDim.y;
int index = idy * (gridDim.x * blockDim.x) + idx; // row stride = total threads launched in x
```

That seems to work, but only when W is an exact multiple of the block width, which is not my case because 256/24 is not a whole number.
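To make the mismatch concrete, here is a host-only sketch (plain C++, no GPU needed) comparing the row stride used by that second formula with the real row width W. The function names are mine, for illustration only.

```cpp
// Row stride implied by index = idy * (gridDim.x * blockDim.x) + idx.
int launch_stride(int gridDimX, int blockDimX) {
    return gridDimX * blockDimX;
}

// Index the second formula gives for global coordinates (idx, idy).
int strided_index(int idx, int idy, int gridDimX, int blockDimX) {
    return idy * launch_stride(gridDimX, blockDimX) + idx;
}

// Offset of element (idx, idy) in a row-major array whose rows hold W elements.
int array_offset(int idx, int idy, int W) {
    return idy * W + idx;
}
```

With gridDim.x = 11 and blockDim.x = 24 the stride is 264, so the thread at (idx, idy) = (0, 1) gets index 264 while the array element (0, 1) sits at offset 256. The two only agree on every row when the stride equals W.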

Are there other ways to compute unique indexes?