Thread id and thread index in 2D

Hi all, I am starting to work with CUDA and am currently reading the CUDA Toolkit documentation.

Chapter 2.1 says that in 2D the thread ID and the thread index are related by this formula:

For a two-dimensional block of size (Dx, Dy), the thread ID of a thread of index (x, y) is (x + y Dx).

But in the matrix addition example provided in this chapter:

__global__ void MatAdd(float A[N][N], float B[N][N], float C[N][N])
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < N && j < N)
        C[i][j] = A[i][j] + B[i][j];
}


So from this example the formula is:
Threadindex(i)=blockID(i)*blockdim(i)+threadID(i), where i=x or y;

Can someone tell me which formula is correct?

As I understand it, the thread index is two-dimensional in this case. Why then does (x, y) map to (x + y Dx), which is 1D?

Do i and j in this code denote the thread index?


threadID and threadIdx.x,y,z are not the same thing.

threadID is a scalar number that uniquely identifies each thread within a threadblock, regardless of whether that threadblock is 1-, 2-, or 3-dimensional. From a programming perspective, threadID is rarely important.

threadIdx.x, threadIdx.y, and threadIdx.z are built-in variables provided by the runtime environment. An “index” computed from those variables can take almost any form. There is no predefined connection between such a computed “index”, which may have any relationship to the underlying threadblock structure, and threadID, which has a specific, predefined, unique, and unmodifiable relationship with a given underlying threadblock structure.
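
To make the distinction concrete, here is a minimal sketch that computes the scalar threadID from threadIdx inside a single 2D block, using the (x + y Dx) formula quoted in the question. The names flatThreadId, writeThreadIds, and runOneBlock are made up for this illustration and are not part of CUDA; error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not a CUDA built-in): the 2D formula x + y * Dx.
__host__ __device__ inline int flatThreadId(int x, int y, int Dx) {
    return x + y * Dx;
}

// Each thread stores its scalar ID in the slot selected by that same ID.
__global__ void writeThreadIds(int *out) {
    int id = flatThreadId(threadIdx.x, threadIdx.y, blockDim.x);
    out[id] = id;
}

// Hypothetical host helper: launches one block of size (Dx, Dy) and returns
// the array the threads wrote. Slots no thread touched would stay at -1.
std::vector<int> runOneBlock(int Dx, int Dy) {
    std::vector<int> host(Dx * Dy);
    size_t bytes = host.size() * sizeof(int);
    int *dev = nullptr;
    cudaMalloc(&dev, bytes);
    cudaMemset(dev, 0xFF, bytes);  // every byte 0xFF, so each int reads as -1
    writeThreadIds<<<1, dim3(Dx, Dy)>>>(dev);
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

For a block with Dx = 4, the thread with threadIdx = (1, 2) gets threadID 1 + 2 * 4 = 9. Nothing in this sketch involves blockIdx, because threadID only identifies a thread within its own block.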


…And as for the specific expression that confused you, (x + y * Dx):

x + y * Dx means -> threadIdx.x + threadIdx.y * blockDim.x

The thread index you wrote:

Threadindex(i)=blockID(i)*blockdim(i)+threadID(i), where i=x or y;

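is better written as:

Threadindex =blockID(i)*blockdim(i)+threadID(i), where i=x or y;

(Threadindex without the (i)).
It is, for example, the row or column index of the element the thread works on. Note that threadID(i) in this formula is really threadIdx.x or threadIdx.y, not the scalar thread ID.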
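
As a rough sketch of how that row/column index is used in practice, here is a version of the MatAdd kernel from the question together with a host wrapper that launches it. This is only an illustration under some assumptions: the matrices are stored row-major in flat arrays instead of float[N][N], the 16 x 16 block size is an arbitrary choice, globalIndex and addOnDevice are made-up names, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not a CUDA built-in): global index along one dimension.
__host__ __device__ inline int globalIndex(int blockId, int blockSize, int threadInBlock) {
    return blockId * blockSize + threadInBlock;
}

// Element-wise sum of two n x n matrices stored row-major in flat arrays.
__global__ void matAdd(const float *A, const float *B, float *C, int n) {
    int i = globalIndex(blockIdx.x, blockDim.x, threadIdx.x);
    int j = globalIndex(blockIdx.y, blockDim.y, threadIdx.y);
    if (i < n && j < n)  // threads past the matrix edge do nothing
        C[i * n + j] = A[i * n + j] + B[i * n + j];
}

// Hypothetical host wrapper: copies inputs to the device, launches enough
// 16 x 16 blocks to cover the matrix, and copies the result back.
std::vector<float> addOnDevice(const std::vector<float> &A,
                               const std::vector<float> &B, int n) {
    size_t bytes = size_t(n) * n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x,   // round up so the grid covers n
              (n + block.y - 1) / block.y);
    matAdd<<<grid, block>>>(dA, dB, dC, n);

    std::vector<float> C(size_t(n) * n);
    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return C;
}
```

With 16 x 16 blocks, the thread with threadIdx.x = 3 in the block with blockIdx.x = 2 gets i = 2 * 16 + 3 = 35. The scalar thread ID is never needed here; only the per-dimension global indices i and j are.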