Limited to 512 kernels without GPU Debug Information in CUDA 2.0

Basically, my program is fine and it runs properly.

The problem arises when I try to speed it up by setting “Generate GPU Debug Information” to “No”: I then get an error unless I stay below 512 kernels…

Why?