Hi,

How do I launch a 3-dimensional grid of thread blocks?

According to the CUDA documentation, thread blocks can be 1D, 2D or 3D, and the grid of thread blocks can also be 1D, 2D or 3D.

I just do not understand how to launch a 3D grid, since the cuLaunchGrid function only takes two size parameters (width and height).

As I understand it, when I call cuFuncSetBlockShape(kernel, x, y, z), these are the x, y and z values that I can read via the PTX registers %ntid.x, %ntid.y and %ntid.z.

So what is the equivalent for cuLaunchGrid and the %nctaid register (which also has 3 parts)?
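
To make the question concrete, here is a minimal sketch of the launch sequence I mean. It assumes `kernel` is a valid CUfunction already obtained with cuModuleGetFunction; the helper name and all sizes are made up for illustration.

```c
#include <cuda.h>

/* Hypothetical helper: `kernel` is assumed to be a loaded CUfunction,
 * and the dimensions below are arbitrary example values. */
static CUresult launch_example(CUfunction kernel)
{
    /* Block shape takes three sizes; as I understand it, these are
     * what the kernel sees in %ntid.x, %ntid.y and %ntid.z. */
    CUresult rc = cuFuncSetBlockShape(kernel, 8, 8, 4);
    if (rc != CUDA_SUCCESS)
        return rc;

    /* Grid launch takes only width and height, which I assume map to
     * %nctaid.x and %nctaid.y. There is no third argument for a depth. */
    return cuLaunchGrid(kernel, 16, 16);
}
```

The block shape call accepts three sizes, but the grid launch only accepts two, so I do not see where a value for %nctaid.z would come from.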

Am I completely misunderstanding something?

Best regards

Troels