simple question about blockIdx/gridDim

bdg146psu · November 17, 2008, 7:55pm

I’m sure this has been discussed before, but I’m finding the search function of this forum to be somewhat lacking (and very frustrating).

I’ve reached a point where my kernel needs more than 65535 blocks, so I need to go to a 2-dimensional grid for the first time. The example in the programming guide is redundant due to the 16x16 grid size, and I’m not really interested in the “coordinates” of the block anyway. I’m really only interested in numbering the blocks sequentially, so that each block has a unique number. So, to obtain this unique number, I could use a formula such as:

blockNumber = (blockIdx.y * gridDim.x) + blockIdx.x;

right?

I just wanted to double check that I am interpreting things (particularly gridDim) in the correct manner.

Thanks.

Ailleur · November 17, 2008, 8:08pm

Looks okay to me!

mandrak · November 18, 2008, 6:37am

Yes. It is ok.
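
One way to double-check the formula is a small host-side sketch in plain C++ (not a kernel, so it runs without a GPU). The names `blockNumber` and `numbersAreUnique` are made up for this illustration; `bx`, `by` and `gdx` stand in for blockIdx.x, blockIdx.y and gridDim.x. It walks every block of a hypothetical grid and confirms that each number from 0 to gridDim.x * gridDim.y - 1 comes up exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same formula as in the question; bx, by and gdx stand in for
// blockIdx.x, blockIdx.y and gridDim.x (illustrative names only).
unsigned blockNumber(unsigned bx, unsigned by, unsigned gdx) {
    assert(bx < gdx);  // blockIdx.x is always smaller than gridDim.x
    return by * gdx + bx;
}

// Returns true if every block of a gdx-by-gdy grid gets a distinct
// number in the range [0, gdx * gdy).
bool numbersAreUnique(unsigned gdx, unsigned gdy) {
    std::vector<bool> seen(static_cast<std::size_t>(gdx) * gdy, false);
    for (unsigned by = 0; by < gdy; ++by) {
        for (unsigned bx = 0; bx < gdx; ++bx) {
            unsigned n = blockNumber(bx, by, gdx);
            if (n >= seen.size() || seen[n]) return false;  // out of range or duplicate
            seen[n] = true;
        }
    }
    return true;
}
```

For example, `numbersAreUnique(1000, 100)` covers a grid of 100000 blocks, more than the 65535 that fit in one grid dimension, and `blockNumber(2, 3, 1000)` is 3002.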

In general you can have:

A one-dimensional grid of blocks, where each block has a one-dimensional array of threads:

UniqueBlockIndex = blockIdx.x;
UniqueThreadIndex = blockIdx.x * blockDim.x + threadIdx.x;

A one-dimensional grid of blocks, where each block has a two-dimensional array of threads:

UniqueBlockIndex = blockIdx.x;
UniqueThreadIndex = blockIdx.x * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x;

A one-dimensional grid of blocks, where each block has a three-dimensional array of threads:

UniqueBlockIndex = blockIdx.x;
UniqueThreadIndex = blockIdx.x * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;

A two-dimensional grid of blocks, where each block has a one-dimensional array of threads:

UniqueBlockIndex = blockIdx.y * gridDim.x + blockIdx.x;
UniqueThreadIndex = UniqueBlockIndex * blockDim.x + threadIdx.x;

A two-dimensional grid of blocks, where each block has a two-dimensional array of threads:

UniqueBlockIndex = blockIdx.y * gridDim.x + blockIdx.x;
UniqueThreadIndex = UniqueBlockIndex * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;

A two-dimensional grid of blocks, where each block has a three-dimensional array of threads:

UniqueBlockIndex = blockIdx.y * gridDim.x + blockIdx.x;
UniqueThreadIndex = UniqueBlockIndex * blockDim.z * blockDim.y * blockDim.x + threadIdx.z * blockDim.y * blockDim.z + threadIdx.y * blockDim.x + threadIdx.x;

Three-dimensional grids of blocks aren’t supported in CUDA (yet), so you have only those 6 combinations.
UniqueThreadIndex means unique per grid.

Of course, you can also calculate a LocalThreadIndex (per block) for a three-dimensional array of threads, i.e.:
LocalThreadIndex = threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;
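
The six cases can be folded into one expression, because CUDA sets any dimension you do not specify to 1 and the matching index is then always 0. Below is a hedged host-side sketch in plain C++; `Dim3` is a hypothetical stand-in for CUDA's dim3/uint3 types, and the function and parameter names are made up for this illustration. It uses the LocalThreadIndex formula just above, with blockDim.y * blockDim.x as the factor on threadIdx.z.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CUDA's dim3 / uint3 built-in types.
struct Dim3 { unsigned x, y, z; };

// Thread number within its block (0 .. threads per block - 1).
// tid and bdim stand in for threadIdx and blockDim.
std::size_t localThreadIndex(Dim3 tid, Dim3 bdim) {
    assert(tid.x < bdim.x && tid.y < bdim.y && tid.z < bdim.z);
    return (static_cast<std::size_t>(tid.z) * bdim.y + tid.y) * bdim.x + tid.x;
}

// Block number within a two-dimensional grid.
// bid and gdim stand in for blockIdx and gridDim.
std::size_t uniqueBlockIndex(Dim3 bid, Dim3 gdim) {
    assert(bid.x < gdim.x && bid.y < gdim.y);
    return static_cast<std::size_t>(bid.y) * gdim.x + bid.x;
}

// Thread number that is unique across the whole grid:
// (block number) * (threads per block) + (thread number within the block).
std::size_t uniqueThreadIndex(Dim3 tid, Dim3 bid, Dim3 bdim, Dim3 gdim) {
    std::size_t threadsPerBlock =
        static_cast<std::size_t>(bdim.x) * bdim.y * bdim.z;
    return uniqueBlockIndex(bid, gdim) * threadsPerBlock
           + localThreadIndex(tid, bdim);
}
```

With a one-dimensional grid and one-dimensional blocks, e.g. `uniqueThreadIndex({5,0,0}, {2,0,0}, {256,1,1}, {10,1,1})`, this reduces to blockIdx.x * blockDim.x + threadIdx.x, i.e. 2 * 256 + 5 = 517.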

bdg146psu · November 18, 2008, 1:58pm

Thanks a lot for the help. I think Monday was adversely affecting my brain.

Jatukam · May 13, 2009, 3:00pm

Hello,

Your 6th point seems a little strange to me.

Although 3-dimensional grids are not supported, you use blockDim.z.

I assume that

threadIdx.z * blockDim.y * blockDim.z + threadIdx.y * blockDim.x + threadIdx.x;

was a typo and should in fact be:

threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;

and I think you mean:

UniqueBlockIndex * gridDim.z * gridDim.y * gridDim.x

instead of:

UniqueBlockIndex * blockDim.z * blockDim.y * blockDim.x

As a beginner, I am getting confused by all this. Can anyone check what I just said?

Thanks a lot.

Jamie_K · May 13, 2009, 3:28pm

I would say you are right that it should have been

... threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;

But I disagree about the other part. I think that this portion is correct:

UniqueBlockIndex * blockDim.z * blockDim.y * blockDim.x ...

It is equivalent to the simpler version that omits the z dimension only when blockDim.z is 1, but conceptually blockDim.z * blockDim.y * blockDim.x is the number of threads per block. So the total is UniqueBlockIndex * (number of threads per block) + (thread number within this block).
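
To see why the blockDim.z in the threadIdx.z term of the sixth case has to be blockDim.x, here is a small host-side check in plain C++ (a sketch with made-up function names, not code from this thread). It counts how many distinct per-block indices each version produces for a hypothetical block size; `tx`, `ty`, `tz` stand in for threadIdx and `bdx`, `bdy`, `bdz` for blockDim.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Per-block index as posted in the sixth case:
// threadIdx.z * blockDim.y * blockDim.z + threadIdx.y * blockDim.x + threadIdx.x
unsigned localIndexAsPosted(unsigned tx, unsigned ty, unsigned tz,
                            unsigned bdx, unsigned bdz, unsigned bdy) {
    return tz * bdy * bdz + ty * bdx + tx;
}

// Corrected per-block index:
// threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x
unsigned localIndexCorrected(unsigned tx, unsigned ty, unsigned tz,
                             unsigned bdx, unsigned bdy) {
    return tz * bdy * bdx + ty * bdx + tx;
}

// Counts the distinct indices over all threads of one bdx-by-bdy-by-bdz block.
// A correct formula yields exactly bdx * bdy * bdz distinct values.
std::size_t countDistinct(bool corrected,
                          unsigned bdx, unsigned bdy, unsigned bdz) {
    assert(bdx > 0 && bdy > 0 && bdz > 0);
    std::set<unsigned> seen;
    for (unsigned tz = 0; tz < bdz; ++tz)
        for (unsigned ty = 0; ty < bdy; ++ty)
            for (unsigned tx = 0; tx < bdx; ++tx)
                seen.insert(corrected
                    ? localIndexCorrected(tx, ty, tz, bdx, bdy)
                    : localIndexAsPosted(tx, ty, tz, bdx, bdz, bdy));
    return seen.size();
}
```

For a 4 x 2 x 3 block (24 threads), `countDistinct(true, 4, 2, 3)` returns 24, while `countDistinct(false, 4, 2, 3)` returns only 20, so some threads share an index. The two versions agree only when blockDim.x equals blockDim.z.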

Jatukam · May 13, 2009, 3:33pm

Thank you !
It’s much clearer now.

NCC-1701D · May 18, 2009, 4:24am

Thanks a lot, mandrak.
Quite helpful, this… :)
