Grid-Block-Thread Configuration

ElGuapo_Oficial · January 23, 2014, 4:33am

Hi guys!

I still get confused by the [grid, block, thread] configuration on the kernel (amazing!).
Max ‘x’ dimension values = [2^31-1, 1024, 1024] respectively for compute capability 3.0 (e.g. GeForce GTX 680).

It seems to me that the kernel launch takes 3 parameters, but only 2 are ever used, right?
I’m actually launching a kernel with this configuration:
<<<1250000, 1024>>>

How is this read? 1 Grid with 1250000 blocks, each block with 1024 threads? Isn’t 1024 maximum ‘x’ dimension for blocks?

What if I want 3 grids with 16 blocks, each block with 32 threads… what is the proper configuration for these parameters?

Thanks in advance for the noob question :)

pasoleatis · January 23, 2014, 9:13am

Hello,

The kernel launch needs 2 dim3 arguments. dim3 is an integer structure with 3 components. If you pass just a number, the dimensions not mentioned are automatically set to 1. So in your case you launch a grid of (1250000,1,1) blocks, and each block has (1024,1,1) threads. The optional third argument is the number of bytes of dynamically allocated shared memory per block; if it is not set, it is 0. Each kernel launch has exactly one grid, so for a grid with 16 blocks of 32 threads each you would launch with:
<<<16,32>>>
You can also define 2 dim3 variables:

dim3 grid,threads;
grid.x=16;
grid.y=1;
grid.z=1;
threads.x=32;
threads.y=1;
threads.z=1;

launch with <<<grid,threads>>>
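A minimal sketch of the two equivalent launches above, for anyone who wants to check it on their own card. The kernel countThreads and the helper threadsLaunched are made-up names for illustration, and error checking is left out to keep it short:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread that runs bumps a shared counter once.
__global__ void countThreads(unsigned int *counter)
{
    atomicAdd(counter, 1u);
}

// Hypothetical helper: launches countThreads with the given configuration
// and returns how many threads actually ran.
unsigned int threadsLaunched(dim3 grid, dim3 threads)
{
    unsigned int *d_counter = nullptr;
    unsigned int h_counter = 0;
    cudaMalloc(&d_counter, sizeof(unsigned int));
    cudaMemset(d_counter, 0, sizeof(unsigned int));

    countThreads<<<grid, threads>>>(d_counter);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&h_counter, d_counter, sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return h_counter;
}

int main()
{
    dim3 grid(16);     // y and z default to 1
    dim3 threads(32);  // y and z default to 1

    // Explicit dim3 variables.
    printf("dim3 launch: %u threads\n", threadsLaunched(grid, threads));

    // Plain integers are converted to dim3(16,1,1) and dim3(32,1,1).
    printf("int launch:  %u threads\n", threadsLaunched(16, 32));
    return 0;
}
```

On a working CUDA device both lines should report 16 * 32 = 512 threads.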

ElGuapo_Oficial · January 23, 2014, 5:57pm

Thanks for taking the time pasoleatis! (I also noted you helped me with the “speed up” post :)

I realize my mistake now!

Looking at a picture (Figure 7 in the CUDA C Programming Guide v5.0), it looked like I could launch several grids of blocks…

Something like:
kernel<<<2,16,32>>>

That is, 2 grids of 16 blocks, each block with 32 threads.

But now, looking at a different picture (a more specific one), I realize “Grid 1” and “Grid 2” in the picture belong to different kernel launches.

http://ixbtlabs.com/articles3/video/cuda-1-p5.html

Sorry for the super noob mistake (I personally blame Figure 7 of programming guide v5.0 :P).

Thanks again!

pasoleatis · January 23, 2014, 10:06pm

Hello,

I am not sure what kernel<<<2,16,32>>> launches. I think it will launch 2 blocks with 16 threads each and allocate 32 bytes of shared memory per block.
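One way to check this is to let the kernel report its own launch shape. This is only a sketch: reportConfig and launchAndReport are made-up names, the 32-byte buffer is merely touched rather than used for anything, and error checking is omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 of block 0 writes out the launch shape.
__global__ void reportConfig(unsigned int *out)
{
    // Dynamically sized shared memory; its size in bytes comes from the
    // third launch argument (32 here).
    extern __shared__ unsigned char scratch[];
    scratch[threadIdx.x] = 0;  // 16 threads, so indices 0..15 fit in 32 bytes

    if (blockIdx.x == 0 && threadIdx.x == 0) {
        out[0] = gridDim.x;   // number of blocks in the grid
        out[1] = blockDim.x;  // number of threads per block
    }
}

// Hypothetical helper: fills h_out[0] with the block count and
// h_out[1] with the threads per block seen by the kernel.
void launchAndReport(unsigned int *h_out)
{
    unsigned int *d_out = nullptr;
    cudaMalloc(&d_out, 2 * sizeof(unsigned int));
    cudaMemset(d_out, 0, 2 * sizeof(unsigned int));

    // 2 blocks, 16 threads per block, 32 bytes of dynamic shared memory.
    reportConfig<<<2, 16, 32>>>(d_out);

    cudaMemcpy(h_out, d_out, 2 * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}

int main()
{
    unsigned int shape[2] = {0, 0};
    launchAndReport(shape);
    printf("blocks = %u, threads per block = %u\n", shape[0], shape[1]);
    return 0;
}
```

On a working CUDA device this should report 2 blocks of 16 threads, so the third number is not a thread count.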

The image you mentioned is something like :

kernel1<<<16,32>>>

host code

kernel2<<<16,32>>>
This means that the kernel1 launch is sent to the GPU and starts executing independently of the host code that follows. The host then starts executing its own code, and when that is finished the second kernel launch is sent to the GPU. kernel2 follows kernel1: the kernels run at the same time as the host code, but each kernel is executed only after the previous kernel has finished.
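A small sketch of that ordering, assuming both kernels are launched into the default stream. The kernels addOne and timesTwo and the helper runBoth are made-up names, and error checking is omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical first kernel: adds 1 to every element.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical second kernel: doubles every element.
__global__ void timesTwo(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Hypothetical helper: runs both kernels and returns element 0.
int runBoth()
{
    const int n = 16 * 32;
    int *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(int));
    cudaMemset(d_data, 0, n * sizeof(int));  // all elements start at 0

    addOne<<<16, 32>>>(d_data, n);  // the launch returns to the host right away

    // Host code; it can run while addOne is still executing on the GPU.
    long hostWork = 0;
    for (int i = 0; i < 1000; ++i) hostWork += i;

    // Queued behind addOne: in the same stream it starts only after
    // addOne has finished.
    timesTwo<<<16, 32>>>(d_data, n);

    cudaDeviceSynchronize();  // wait for both kernels before reading results

    int first = 0;
    cudaMemcpy(&first, d_data, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    printf("host work = %ld, data[0] = %d\n", hostWork, first);
    return first;
}

int main()
{
    runBoth();
    return 0;
}
```

Because timesTwo only runs after addOne is done, data[0] should end up as (0 + 1) * 2 = 2 rather than 1.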

According to the programming guide, only 2 numbers are required for each kernel launch: Programming Guide :: CUDA Toolkit Documentation

To begin with, I suggest writing simple programs first and then building more and more complex code.
