Hi all, I am starting to work with CUDA and am currently reading the CUDA toolkit documentation:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#heterogeneous-programming

There, chapter 2.1 says that the thread ID and the thread index are related in 2D by this formula:

for a two-dimensional block of size (Dx, Dy), the thread ID of a thread of index (x, y) is (x + y Dx)

But the matrix addition example provided in this chapter is:

``````
__global__ void MatAdd(float A[N][N], float B[N][N], float C[N][N])
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < N && j < N)
        C[i][j] = A[i][j] + B[i][j];
}
``````

So from this example the formula is i = blockIdx.x * blockDim.x + threadIdx.x (and likewise for j with the y components).

Can someone tell me which formula is correct?

As I understand it, the thread index is two-dimensional in this case. Why, then, is the thread ID of index (x, y) the one-dimensional value (x + y Dx)?

Do i and j in this code denote the thread index?

Thanks,
Mikhail

threadID is a unique scalar that identifies each thread within a threadblock, regardless of whether that threadblock is 1-, 2-, or 3-dimensional. From a programming perspective, threadID is rarely important.

threadIdx.x, threadIdx.y, and threadIdx.z are built-in variables provided by the runtime environment. An “index” computed from those variables can take almost any form. There is no predefined connection between such a computed “index”, which may have any relationship to the underlying threadblock structure, and threadID, which has a specific, predefined, and unmodifiable relationship with a given underlying threadblock structure.
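To make the threadID definition concrete, here is a minimal host-side sketch in plain C++ (no GPU needed) that evaluates the guide's formula for every thread of one block. `Dim3`, `thread_id`, and `id_histogram` are hypothetical names made up for illustration; only the formula x + y*Dx + z*Dx*Dy comes from the programming guide.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CUDA's dim3/uint3 so the sketch builds as plain C++.
struct Dim3 { unsigned x, y, z; };

// Thread ID within a block, per the programming guide:
// x + y*Dx + z*Dx*Dy (for a 2D block, z == 0, so this is x + y*Dx).
unsigned thread_id(Dim3 tidx, Dim3 bdim) {
    assert(tidx.x < bdim.x && tidx.y < bdim.y && tidx.z < bdim.z);
    return tidx.x + tidx.y * bdim.x + tidx.z * bdim.x * bdim.y;
}

// Visits every thread index of one block and counts how often each ID occurs.
std::vector<int> id_histogram(Dim3 bdim) {
    std::vector<int> hits(bdim.x * bdim.y * bdim.z, 0);
    for (unsigned z = 0; z < bdim.z; ++z)
        for (unsigned y = 0; y < bdim.y; ++y)
            for (unsigned x = 0; x < bdim.x; ++x)
                ++hits[thread_id(Dim3{x, y, z}, bdim)];
    return hits;
}
```

For a 4x3 block, thread (1, 2) gets ID 1 + 2*4 = 9, and every ID from 0 to 11 occurs exactly once, which is what "unique within a threadblock" means here.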

Thanks

…And in the specific formula that confused you, (x + y * Dx), x and y are the thread's indices inside its own block, not the i and j of the example:

``````
x + y * Dx  ->  threadIdx.x + threadIdx.y * blockDim.x
``````

``````
global index(i) = blockIdx.i * blockDim.i + threadIdx.i, where i = x or y;
``````
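As a rough illustration of how the two quantities differ, the following host-side C++ sketch computes both the grid-wide (i, j) pair used by MatAdd and the scalar thread ID inside the block. `Vec2`, `Index2D`, and the function names are hypothetical helpers invented for this example; they only mimic the x/y components of `blockIdx`, `blockDim`, and `threadIdx`.

```cpp
#include <cassert>

// Hypothetical 2D stand-in for the x/y parts of blockIdx, blockDim, threadIdx.
struct Vec2 { unsigned x, y; };

// Global (grid-wide) index along one dimension, as computed in MatAdd.
unsigned global_index(unsigned block_idx, unsigned block_dim, unsigned thread_idx) {
    assert(thread_idx < block_dim);
    return block_idx * block_dim + thread_idx;
}

// Scalar thread ID inside a 2D block: x + y*Dx. It ignores blockIdx entirely.
unsigned local_thread_id(Vec2 tidx, Vec2 bdim) {
    return tidx.x + tidx.y * bdim.x;
}

struct Index2D { unsigned i, j; };

// The (i, j) pair that MatAdd would use for this thread.
Index2D matadd_indices(Vec2 bidx, Vec2 bdim, Vec2 tidx) {
    return Index2D{global_index(bidx.x, bdim.x, tidx.x),
                   global_index(bidx.y, bdim.y, tidx.y)};
}
```

With 16x16 blocks, thread (3, 5) of block (1, 2) handles element i = 19, j = 37, while the same thread index in block (0, 0) handles i = 3, j = 5. Both threads have the same thread ID, 3 + 5*16 = 83, because the thread ID only describes the position inside a block.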