 Hey guys

I am debugging a very annoying problem, and I have traced it to an index that is supposed to be unique to each thread but may not be as unique as I thought.

I need 2^19 threads, each with its own unique ID, ranging from 0 to 2^19 - 1 (524288 IDs in total).

My kernel launch is as follows:

```
launch<<<2048, 256, 0>>>(...)
```

such that 2048*256 = 524288 = 2^19

My unique thread index is as follows:

```
typedef unsigned __int32 word;

const word x = blockIdx.x * blockDim.x + threadIdx.x;
const word y = blockIdx.y * blockDim.y + threadIdx.y;
const word unique = ((word)x+(word)y);
```

Does each thread really get a unique value, and do the values really cover 0 to 2^19 - 1?

Hi,

In general, x + y is not unique. Here’s a simple example:

• Case 1: x == 1, y == 1 --> unique = 1 + 1 == 2

• Case 2: x == 2, y == 0 --> unique = 2 + 0 == 2
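The collision above can be checked on the host without a GPU. The following is a minimal sketch in plain C++; `distinct_sums` is a hypothetical helper (not from the original posts) that counts how many different values x + y takes over a small range of (x, y) pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper: counts the distinct values of x + y for
// 0 <= x < width and 0 <= y < height.
std::size_t distinct_sums(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    std::set<std::uint32_t> seen;
    for (std::uint32_t y = 0; y < height; ++y) {
        for (std::uint32_t x = 0; x < width; ++x) {
            seen.insert(x + y);  // same formula as `unique` in the question
        }
    }
    return seen.size();
}
```

For a 4 x 4 range there are 16 (x, y) pairs but only 7 distinct sums (0 through 6), so most pairs share an ID with another pair. With a height of 1, y is always 0 and every x keeps its own value.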

BTW, you are currently running 2048 blocks with 256 threads in each block. The y dimension of both your blocks and your grid is 1.

Take a look at the SDK sample code. To get two-dimensional blocks/threads, you should use something like this:

```
dim3 kernelBlockGrid(iDivUp(KERNEL_W, threadBlock.x), iDivUp(KERNEL_H, threadBlock.y));
```
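The `iDivUp` helper is not shown in the thread. Assuming it is an integer division that rounds up (which is what the grid-size calculation needs so that a partial block still gets launched), a sketch could look like this. The `_sketch` suffix marks the function as illustrative rather than the SDK's actual definition.

```cpp
#include <cassert>

// Assumed behaviour of iDivUp: divide a by b and round the result up,
// so that the returned count of size-b blocks covers all a elements.
int iDivUp_sketch(int a, int b) {
    assert(a >= 0 && b > 0);
    return (a + b - 1) / b;
}
```

Under this assumption, 524288 elements with 256 threads per block give 2048 blocks, while 100 elements with 16 threads per block give 7 blocks, the last of which is only partly used and needs a bounds check inside the kernel.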

Hope that helps

eyal

You are correct, but I don’t want 2D indexes. You raised a valid point, though: in this case I don’t think the value of y ever changes (it stays 1 all the time). So the unique variable is actually unique, I think?

Actually, y should probably be zero, not 1: with a one-dimensional launch, blockIdx.y and threadIdx.y are both 0 (only blockDim.y is 1). So I guess the values are unique then.

eyal

It looks to me like y should always be zero, and the values will be unique in your 1-dimensional case.
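One way to convince yourself of the 1-dimensional case is to replay the index arithmetic on the host. The sketch below is plain C++ rather than a kernel, and `ids_1d_are_unique` is a hypothetical helper; it loops over every (block, thread) pair of a 1D launch and counts how often each ID is produced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side check: emulates a launch<<<gridDimX, blockDimX>>>
// and returns true if every ID in [0, gridDimX * blockDimX) occurs exactly once.
bool ids_1d_are_unique(std::uint32_t gridDimX, std::uint32_t blockDimX) {
    assert(gridDimX > 0 && blockDimX > 0);
    const std::size_t total = static_cast<std::size_t>(gridDimX) * blockDimX;
    std::vector<std::uint32_t> hits(total, 0);
    for (std::uint32_t blockIdxX = 0; blockIdxX < gridDimX; ++blockIdxX) {
        for (std::uint32_t threadIdxX = 0; threadIdxX < blockDimX; ++threadIdxX) {
            const std::uint32_t x = blockIdxX * blockDimX + threadIdxX;
            const std::uint32_t y = 0;  // blockIdx.y * 1 + threadIdx.y, both 0 in 1D
            ++hits[x + y];
        }
    }
    for (std::uint32_t h : hits) {
        if (h != 1) return false;  // an ID was skipped or duplicated
    }
    return true;
}
```

Calling `ids_1d_are_unique(2048, 256)` replays the launch from the question and covers the IDs 0 through 524287.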

To correctly calculate a unique index for the 2-D case, I think this would work:

```
const word x = blockIdx.x * blockDim.x + threadIdx.x;
const word y = blockIdx.y * blockDim.y + threadIdx.y;
const word unique = x+y*blockDim.x*gridDim.x;
```
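The 2-D formula can be checked the same way on the host. This is a sketch in plain C++, not a kernel: `Dim2`, `flat_index_2d` and `ids_2d_are_unique` are hypothetical names, and the parameters merely mirror the CUDA built-ins of the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Dim2 { std::uint32_t x, y; };  // hypothetical stand-in for dim3 / uint3

// Row-major flattening: y selects the row, blockDim.x * gridDim.x is the row width.
std::uint32_t flat_index_2d(Dim2 blockIdx, Dim2 blockDim, Dim2 threadIdx, Dim2 gridDim) {
    const std::uint32_t x = blockIdx.x * blockDim.x + threadIdx.x;
    const std::uint32_t y = blockIdx.y * blockDim.y + threadIdx.y;
    return x + y * blockDim.x * gridDim.x;
}

// Emulates every thread of a 2D launch and checks each ID occurs exactly once.
bool ids_2d_are_unique(Dim2 gridDim, Dim2 blockDim) {
    assert(gridDim.x > 0 && gridDim.y > 0 && blockDim.x > 0 && blockDim.y > 0);
    const std::size_t total = static_cast<std::size_t>(gridDim.x) * blockDim.x
                            * gridDim.y * blockDim.y;
    std::vector<std::uint32_t> hits(total, 0);
    for (std::uint32_t by = 0; by < gridDim.y; ++by)
        for (std::uint32_t bx = 0; bx < gridDim.x; ++bx)
            for (std::uint32_t ty = 0; ty < blockDim.y; ++ty)
                for (std::uint32_t tx = 0; tx < blockDim.x; ++tx) {
                    const std::uint32_t id = flat_index_2d({bx, by}, blockDim, {tx, ty}, gridDim);
                    if (id >= total) return false;  // ID outside the expected range
                    ++hits[id];
                }
    for (std::uint32_t h : hits) {
        if (h != 1) return false;  // an ID was skipped or duplicated
    }
    return true;
}
```

The multiplier on y is the total number of threads along x, so each row of threads gets its own non-overlapping range of IDs. With a y dimension of 1 the formula reduces to plain x, which matches the 1-dimensional launch in the question.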