Using 1D/2D indexes in a kernel


I'm learning CUDA C on my own, using the book CUDA by Example as a reference. I understand what threadIdx and blockIdx are, but I can't work out when to use threadIdx.x versus threadIdx.y (and likewise blockIdx.x versus blockIdx.y). In the vector add example they use only something like this:

    c[threadIdx.x] = a[threadIdx.x] + b[threadIdx.x];

and sometimes they use

    x = threadIdx.x + blockIdx.x * blockDim.x;
    y = threadIdx.y + blockIdx.y * blockDim.y;

But what's the difference, and when should I use the first form and when the second? I understand that the second form is just to linearize, but why linearize both X and Y?

Could you explain this to me, as simply as you can? Thank you.

Having x and y is just for your convenience. Use whatever suits you best.
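To make the "convenience" part concrete, here is a minimal sketch of a matrix add where x and y map straight onto column and row. The names `matAdd2D` and `runMatAdd2D`, the 16x16 block size, and the 100x60 matrix are my own illustrative assumptions, not something taken from the book. It also assumes a device and toolkit that support `cudaMallocManaged`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: one thread per matrix element, addressed in 2D.
__global__ void matAdd2D(const float *a, const float *b, float *c,
                         int width, int height)
{
    int x = threadIdx.x + blockIdx.x * blockDim.x;  // column of this thread
    int y = threadIdx.y + blockIdx.y * blockDim.y;  // row of this thread
    if (x < width && y < height) {                  // edge blocks may overhang
        int i = y * width + x;                      // row-major offset
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: runs the kernel on a width x height matrix and
// returns the number of wrong elements, or -1 if a CUDA call failed.
int runMatAdd2D(int width, int height)
{
    const int n = width * height;
    const size_t bytes = n * sizeof(float);
    float *a = nullptr, *b = nullptr, *c = nullptr;
    if (cudaMallocManaged(&a, bytes) != cudaSuccess ||
        cudaMallocManaged(&b, bytes) != cudaSuccess ||
        cudaMallocManaged(&c, bytes) != cudaSuccess) {
        cudaFree(a); cudaFree(b); cudaFree(c);
        return -1;
    }
    for (int i = 0; i < n; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }

    dim3 block(16, 16);                              // 256 threads per block
    dim3 grid((width  + block.x - 1) / block.x,      // round up so the grid
              (height + block.y - 1) / block.y);     // covers the whole matrix
    matAdd2D<<<grid, block>>>(a, b, c, width, height);

    int errors = (cudaDeviceSynchronize() == cudaSuccess) ? 0 : -1;
    for (int i = 0; errors >= 0 && i < n; ++i)
        if (c[i] != 3.0f * i) ++errors;              // expect a[i] + b[i] = 3i

    cudaFree(a); cudaFree(b); cudaFree(c);
    return errors;
}

int main()
{
    int errors = runMatAdd2D(100, 60);
    printf("mismatches: %d\n", errors);
    return errors != 0;
}
```

The hardware does not care that the data is a matrix. The 2D launch only saves you from working out the row and column yourself.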

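For very large problems, using a second grid dimension also raises the total number of blocks you can launch, since the grid is limited to 65535 blocks in each direction. (On devices of compute capability 3.0 and later the x limit is 2^31 - 1; y and z are still 65535.)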
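As a rough sketch of that, the code below spreads 1D blocks over a 2D grid once the block count goes past 65535 and then flattens the two block indexes back into one. The names `addOne` and `runAddOne`, the 256-thread block, and the 20,000,000-element array are assumptions chosen for illustration. It also assumes `n > 0` and a device that supports `cudaMallocManaged`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: 1D blocks in a 2D grid, flattened to one linear index.
__global__ void addOne(int *data, size_t n)
{
    size_t blockId = blockIdx.x + (size_t)blockIdx.y * gridDim.x;  // linear block number
    size_t i = threadIdx.x + blockId * blockDim.x;                // linear element index
    if (i < n) data[i] += 1;                                      // skip surplus threads
}

// Hypothetical helper: returns the number of wrong elements,
// or -1 if a CUDA call failed. Assumes n > 0.
long long runAddOne(size_t n)
{
    int *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) return -1;
    for (size_t i = 0; i < n; ++i) data[i] = (int)(i % 1000);

    const unsigned threads  = 256;
    const size_t   maxGridX = 65535;                 // per-direction limit quoted above
    size_t   blocks = (n + threads - 1) / threads;   // blocks needed, rounded up
    unsigned gridX  = (unsigned)(blocks < maxGridX ? blocks : maxGridX);
    unsigned gridY  = (unsigned)((blocks + gridX - 1) / gridX);
    dim3 grid(gridX, gridY);                         // may launch a few surplus blocks

    addOne<<<grid, threads>>>(data, n);

    long long errors = (cudaDeviceSynchronize() == cudaSuccess) ? 0 : -1;
    for (size_t i = 0; errors >= 0 && i < n; ++i)
        if (data[i] != (int)(i % 1000) + 1) ++errors;

    cudaFree(data);
    return errors;
}

int main()
{
    // 20,000,000 elements need 78125 blocks of 256 threads, more than 65535.
    long long errors = runAddOne(20000000);
    printf("mismatches: %lld\n", errors);
    return errors != 0;
}
```

The kernel still treats the data as a plain vector. Only the launch configuration is 2D.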

So I can use the indexes however I like? I mean, if I always treat my problems as a vector, using just threadIdx.x and blockIdx.x, it won't have much impact on performance? Or is it better to try to use 1D, 2D, or 3D grids/blocks?

There is no impact on performance, so just go ahead with a 1D grid and block.
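If a kernel launched in 1D does need a row and column, they can be recovered from the flat index. Below is a minimal sketch of that idea; `fillRowCol`, `runFillRowCol`, and the `row * 1000 + col` fill pattern are hypothetical and exist only so the result is easy to check. It assumes a device that supports `cudaMallocManaged`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: 1D grid and 1D blocks over a row-major 2D array.
__global__ void fillRowCol(int *out, int width, int height)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;  // flat element index
    if (i < width * height) {
        int row = i / width;                        // recover the 2D position
        int col = i % width;
        out[i] = row * 1000 + col;
    }
}

// Hypothetical helper: returns the number of wrong elements,
// or -1 if a CUDA call failed.
int runFillRowCol(int width, int height)
{
    const int n = width * height;
    int *out = nullptr;
    if (cudaMallocManaged(&out, n * sizeof(int)) != cudaSuccess) return -1;

    const int threads = 256;
    const int blocks  = (n + threads - 1) / threads;  // round up
    fillRowCol<<<blocks, threads>>>(out, width, height);

    int errors = (cudaDeviceSynchronize() == cudaSuccess) ? 0 : -1;
    for (int row = 0; errors >= 0 && row < height; ++row)
        for (int col = 0; col < width; ++col)
            if (out[row * width + col] != row * 1000 + col) ++errors;

    cudaFree(out);
    return errors;
}

int main()
{
    int errors = runFillRowCol(100, 60);
    printf("mismatches: %d\n", errors);
    return errors != 0;
}
```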