I am writing a CUDA program for a GTX 1080. The kernel executes as intended when launched with 65535 blocks of 1024 threads each:
kernelfunc <<< 65535,1024 >>> ();
But when I increase the number of blocks in the grid beyond 65535, the kernel does not launch. The deviceQuery output reports a maximum grid size of 2147483647 in the x dimension, so I expected a larger grid to work. Does anyone have any idea why I cannot launch more blocks in a single call? Or am I misinterpreting the deviceQuery output?
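
For anyone trying to reproduce this, here is a minimal sketch of how the failure could be surfaced. It is not my actual program: the kernel body is an empty placeholder, device 0 is assumed to be the GTX 1080, and 65536 is just the first block count past the one that works. It prints the limits the runtime reports and then checks the error status after the launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder standing in for the real kernel; the body is empty on purpose.
__global__ void kernelfunc() {}

int main() {
    // Print the limits the runtime reports (assumes the GTX 1080 is device 0).
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("compute capability: %d.%d\n", prop.major, prop.minor);
    std::printf("maxGridSize: (%d, %d, %d)\n",
                prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    std::printf("maxThreadsPerBlock: %d\n", prop.maxThreadsPerBlock);

    // One block more than the launch that works.
    const unsigned int numBlocks = 65536;
    kernelfunc<<<numBlocks, 1024>>>();

    // An invalid launch configuration is reported by cudaGetLastError().
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Errors raised while the kernel runs are reported on synchronization.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("kernel execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("launch with %u blocks succeeded\n", numBlocks);
    return 0;
}
```

Posting the error string from this check together with the exact nvcc command line used to build the program should make it easier to narrow down the cause.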