I am using CUDA to solve PDEs on 3D matrices with 3D kernels I have written myself. I understand that for compute capabilities 2.0 to 5.2, the product of the grid size dimensions in x, y, and z must not exceed 1024. However, I do not know whether this still holds for compute capabilities above 5.2. I am working on a GTX 1080 Ti (compute capability 6.1) and would like to optimize the grid size. Does anyone know where I could find this information?
I know that the maximum values that can be passed are 1024 for x and y, and 64 for z, but I do not know what limits apply when choosing the grid dimensions.
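One way to check these limits directly on the card, rather than relying on a table, might be to query the device properties through the CUDA runtime. The sketch below reads `maxThreadsPerBlock`, `maxThreadsDim`, and `maxGridSize` from `cudaDeviceProp`, which report the per-block and per-grid limits separately. It assumes the GPU of interest is device 0, and `blockShapeFits` is a hypothetical helper written only for this illustration, not part of CUDA.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): returns true if a 3D block shape
// respects both the per-dimension limits and the total threads-per-block
// limit reported in the device properties.
bool blockShapeFits(dim3 block, const cudaDeviceProp &prop) {
    if (block.x == 0 || block.y == 0 || block.z == 0) return false;
    if (block.x > (unsigned)prop.maxThreadsDim[0] ||
        block.y > (unsigned)prop.maxThreadsDim[1] ||
        block.z > (unsigned)prop.maxThreadsDim[2]) {
        return false;
    }
    unsigned long long total =
        (unsigned long long)block.x * block.y * block.z;
    return total <= (unsigned long long)prop.maxThreadsPerBlock;
}

int main() {
    int device = 0;  // assumption: the GPU of interest is device 0
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    printf("Device: %s (compute capability %d.%d)\n",
           prop.name, prop.major, prop.minor);
    printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
    printf("Max block dimensions:  %d x %d x %d\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1],
           prop.maxThreadsDim[2]);
    printf("Max grid dimensions:   %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

    // Example 3D block shape for a 3D kernel: 8 * 8 * 8 = 512 threads.
    dim3 candidate(8, 8, 8);
    printf("8 x 8 x 8 block fits: %s\n",
           blockShapeFits(candidate, prop) ? "yes" : "no");
    return 0;
}
```

Compiled with `nvcc`, this should print the limits for whichever card is installed, so the same check would carry over to other compute capabilities without looking anything up.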