0 as number of threads per block behaviour


I’ve been testing some CUDA features/behaviours, and i couldnt find any documentation on this matter. If someone who experimented with this before could tell me his experience would be great.

kernel_name<<< num_blocks, 0 >>> (params);

What happens if i call a kernel with this configuration?? (i have a test program, and it works fine :S, but i dont know what CUDA does for real). Is not that i’m calling kernels with it, but im writting some kind of utility for CUDA and i would like to know if i should give support to 0 as a parameter.


Why would you like to run 0 threads per block? It seems very stupid. You should not support 0 as a parameter.

I dont know, maybe 0 has some special meaning like… “let cuda guess what’s best for your architecture”, that’s what i would like to know, if it has some special meaning, or its just some non-supported random behaviour number.

if you put o threads per block you get a launch error. The kernel will not be executed.

humm okay thanks, i’ll have to check my output again.

Damn true, forget this topic ._., output was exactly the same than input (usually only minor changes).

The correct way to detect errors like this is to check the return codes from CUDA functions. Don’t rely on changes in output to discover that a function somewhere has failed.

Id you use some error checking this will automatically stop the program with error.