Nvidia GTX 1080 CUDA cores

I am writing a CUDA program for a GTX 1080. The kernel executes as intended with the following launch configuration:

kernelfunc <<< 65535,1024 >>> ();

But when I increase the number of blocks in the grid beyond 65535, the kernel does not launch. The DevQuery output lists a maximum of 2147483647 (x). Does anyone have an idea why I cannot launch more blocks in a single call? Am I perhaps interpreting the DevQuery output incorrectly?
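One way to narrow this down is to print the limits the runtime itself reports and to check the launch for an error, since a kernel launched with an unsupported configuration fails without any message unless the error is queried. The following is only a minimal sketch: the empty `kernelfunc` body is a stand-in for the real kernel, device 0 is assumed, and the block count of 65536 is chosen simply as one more than the configuration that worked. Note that `maxGridSize` counts blocks per grid dimension, not threads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in for the kernel in the question (assumption: the real body is not shown).
__global__ void kernelfunc() {}

int main() {
    // Query the limits of device 0 (assumed to be the GTX 1080).
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("compute capability: %d.%d\n", prop.major, prop.minor);
    std::printf("max grid size in blocks: x=%d y=%d z=%d\n",
                prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    std::printf("max threads per block: %d\n", prop.maxThreadsPerBlock);

    // One block more than the configuration that worked.
    kernelfunc<<<65536, 1024>>>();

    // cudaGetLastError reports errors from the launch itself,
    // such as an invalid grid or block size.
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // cudaDeviceSynchronize reports errors raised while the kernel was running.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("kernel execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("kernel launched and completed\n");
    return 0;
}
```

If the launch fails, the error string printed after `cudaGetLastError` usually indicates whether the configuration was rejected or whether something else went wrong.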

The bug was found in the compile code, which did not include