Launching a 3D grid


How do I launch a 3-dimensional grid of thread blocks?

According to the CUDA documentation, thread blocks can be 1D, 2D or 3D, and the grid of thread blocks can also be 1D, 2D or 3D.
I just do not understand how to launch a 3D grid, as the cuLaunchGrid function only takes two size parameters (width and height).

As I understand it: when I call cuFuncSetBlockShape(kernel, x, y, z), these are the x, y and z that I can read via the PTX registers %ntid.x, %ntid.y and %ntid.z.

So what is the equivalent for cuLaunchGrid and the %nctaid register (which also has 3 parts)?

Am I completely misunderstanding something?

Best regards

You can’t

Grids are two-dimensional. Even though the type is dim3, z has to be 1. See Appendix B.16 of the “CUDA C Programming Guide”.

Hmmm, OK. That is just not the impression you get when reading section 2.2.2 of “PTX: Parallel Thread Execution ISA Version 2.2”.

I quote: “Each grid has a 1D, 2D, or 3D shape specified by the parameter nctaid.”

And from table 123 of the same manual: “The %nctaid special register contains a 3D grid shape vector…”

And something equivalent is stated in table 122.

Thanks for your response.

So just let one dimension represent two…
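
For anyone wondering what "letting one dimension represent two" could look like, here is a minimal sketch. It assumes the logical y and z extents (ny and nz) are folded into the height of the 2D grid, so the grid is launched as nx by ny*nz. The names Index3, foldedHeight, foldY, unfold and checkRoundTrip are made up for illustration and are not part of any CUDA API.

```cpp
#include <cassert>

// Hypothetical helper type; not part of any CUDA API.
struct Index3 { unsigned x, y, z; };

// Height of the 2D grid to launch when logical y and z are folded together.
constexpr unsigned foldedHeight(unsigned ny, unsigned nz) { return ny * nz; }

// Map a logical (y, z) block coordinate to the folded 2D y coordinate.
constexpr unsigned foldY(unsigned y, unsigned z, unsigned ny) { return z * ny + y; }

// Recover the logical 3D block index from the 2D block index (bx, by).
constexpr Index3 unfold(unsigned bx, unsigned by, unsigned ny) {
    return Index3{bx, by % ny, by / ny};
}

// Compile-time spot check: folded y = 7 with ny = 4 is logical (y = 3, z = 1).
static_assert(unfold(3, 7, 4).y == 3 && unfold(3, 7, 4).z == 1, "7 = 1 * 4 + 3");

// Check that every logical (y, z) survives a fold/unfold round trip
// and that the folded coordinate stays inside the launched grid height.
inline void checkRoundTrip(unsigned ny, unsigned nz) {
    for (unsigned z = 0; z < nz; ++z) {
        for (unsigned y = 0; y < ny; ++y) {
            const Index3 i = unfold(0, foldY(y, z, ny), ny);
            assert(i.y == y && i.z == z);
            assert(foldY(y, z, ny) < foldedHeight(ny, nz));
        }
    }
}
```

Inside a kernel, bx and by would come from the block index (%ctaid.x and %ctaid.y in PTX), and ny would have to be passed in as a kernel parameter, since the hardware only sees the folded height. The product ny*nz also has to fit within the maximum grid height the device allows.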

Fermi hardware does support 3D grids (which is why they are in the PTX spec), but they’re not exposed in the current CUDA API. This should be in the next release.

I for one will be glad - I hate writing all that indexing code!