3D grid dimensions for compute capability 6.1


I am using CUDA to solve PDEs on 3D matrices with 3D kernels I have written myself. I understand that for compute capabilities 2.0 to 5.2 the product of the grid dimensions in x, y, and z must not exceed 1024, but I do not know whether this still holds for capabilities above 5.2. I am working on a GTX 1080 Ti (compute capability 6.1) and would like to optimize the grid size. Does anyone know where I could find this information?

I know the maximum values that can be entered are 1024 for x and y, and 64 for z, but I do not know what the limits are when selecting grid dimensions.

You’re referring to the block size, not the grid size.

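The block dimension limits are the same for compute capability 6.1: at most 1024 threads per block in total, with per-dimension maxima of 1024 in x and y, and 64 in z.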

The deviceQuery sample code will report the relevant dimensions for your GPU, and you can also refer to table 14 in the programming guide.
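If you would rather check the limits from your own code, something along these lines should work. It is only a minimal sketch: it reads the same fields that deviceQuery prints, using `cudaGetDeviceProperties`, and tests a candidate block shape against them. The helper names `blockFitsLimits` and `printLaunchLimits`, and the 8 x 8 x 8 block, are made up for illustration and are not part of the CUDA API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns true if `block` respects both the
// per-dimension limits and the total threads-per-block limit in `prop`.
bool blockFitsLimits(const cudaDeviceProp& prop, dim3 block) {
    if (block.x == 0 || block.y == 0 || block.z == 0) return false;
    if (block.x > static_cast<unsigned>(prop.maxThreadsDim[0])) return false;
    if (block.y > static_cast<unsigned>(prop.maxThreadsDim[1])) return false;
    if (block.z > static_cast<unsigned>(prop.maxThreadsDim[2])) return false;
    // Use a 64-bit product so large dimensions cannot overflow.
    unsigned long long total =
        static_cast<unsigned long long>(block.x) * block.y * block.z;
    return total <= static_cast<unsigned long long>(prop.maxThreadsPerBlock);
}

// Hypothetical helper: prints the block and grid limits of `device`.
// Returns false if the device properties could not be queried.
bool printLaunchLimits(int device) {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return false;
    }
    std::printf("Compute capability:    %d.%d\n", prop.major, prop.minor);
    std::printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
    std::printf("Max block dimensions:  %d x %d x %d\n",
                prop.maxThreadsDim[0], prop.maxThreadsDim[1],
                prop.maxThreadsDim[2]);
    std::printf("Max grid dimensions:   %d x %d x %d\n",
                prop.maxGridSize[0], prop.maxGridSize[1],
                prop.maxGridSize[2]);

    dim3 block(8, 8, 8);  // example 3D block, chosen arbitrarily
    std::printf("Block %u x %u x %u fits: %s\n", block.x, block.y, block.z,
                blockFitsLimits(prop, block) ? "yes" : "no");
    return true;
}
```

Calling `printLaunchLimits(0)` from `main` and compiling with `nvcc` prints the limits for the first GPU. The grid limits are reported separately in `maxGridSize` and are much larger than the block limits, so for a 3D PDE kernel the block shape is normally the constrained choice, and the grid is then sized to cover the domain.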


Apologies, thanks for clarifying.