What is the maximum number of threads per block?

Hi, Friends

As far as I know, the maximum number of threads per block is 512 in CUDA.
Does that mean y is limited to 512 in a kernel launch such as foo<<<x, y>>>?
Is foo<<<x, 1024>>> valid or invalid?

Thank you.

James

Yes, y is the number of threads per block, and any value > 512 (or any dim3 with x > 512, y > 512, z > 64, or x*y*z > 512) is illegal. The actual maximum number of threads per block that a kernel can be launched with may be less than 512, depending on the kernel's shared memory requirements and register usage.
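For illustration, the limits listed above can be written as a small host-side check. This is only a sketch: BlockDim is a hypothetical stand-in for CUDA's dim3 so that the code compiles without the toolkit, the function name is made up, and the constants are the compute capability 1.x figures quoted in this reply.

```cpp
// Hypothetical stand-in for CUDA's dim3 (not the real type).
struct BlockDim { unsigned x, y, z; };

// Per-dimension and total limits quoted above for compute capability 1.x.
const unsigned kMaxX = 512, kMaxY = 512, kMaxZ = 64, kMaxThreads = 512;

// Returns true if the block shape respects every limit.
bool block_dim_is_legal(BlockDim b) {
    // A zero-sized dimension is not a usable block shape.
    if (b.x == 0 || b.y == 0 || b.z == 0) return false;
    // Each dimension has its own cap.
    if (b.x > kMaxX || b.y > kMaxY || b.z > kMaxZ) return false;
    // The product is computed in 64 bits so large inputs cannot overflow.
    unsigned long long total = 1ULL * b.x * b.y * b.z;
    return total <= kMaxThreads;
}
```

With these limits, a block of {16, 16, 2} passes (512 threads) while {32, 32, 1} fails (1024 threads). Passing this check does not guarantee a successful launch, since register and shared memory usage can lower the real limit for a given kernel.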

To be more precise, the maximum number of threads per block is 512 for compute capability 1.x. For 2.x it has been increased to 1024. See Appendix G.1 in the programming guide for more information.
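As a rough sketch of how that answers the original question, the two limits can be put in a lookup keyed on the major compute capability. The function names here are hypothetical, and only the two values mentioned in this thread are covered. In real code it is usually better to call cudaGetDeviceProperties and read the maxThreadsPerBlock field than to hard-code numbers.

```cpp
// Per-block thread limit by major compute capability, using only the
// figures given in this thread. Returns 0 for versions not covered here.
unsigned max_threads_per_block(int cc_major) {
    switch (cc_major) {
        case 1:  return 512;   // compute capability 1.x
        case 2:  return 1024;  // compute capability 2.x
        default: return 0;     // unknown to this sketch
    }
}

// True if a one-dimensional block of `threads` threads fits the limit.
bool launch_size_is_legal(unsigned threads, int cc_major) {
    unsigned limit = max_threads_per_block(cc_major);
    return limit != 0 && threads >= 1 && threads <= limit;
}
```

Under these assumptions, foo<<<x, 1024>>> is rejected on a 1.x device and accepted on a 2.x device, subject to the kernel's register and shared memory usage.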

Thank you.

But what does 2.x mean? I have the Programming Guide for CUDA 2.3, but it does not include Appendix G.1.

It means compute capability 2.x, i.e. the new Fermi architecture. CUDA 2.3 predates its announcement and contains no support for it. All currently available cards are compute capability 1.0, 1.1, 1.2, or 1.3.