What is the maximum number of threads that can be launched on a GPU device at once?

Hi all

I want to know how many threads can be launched in a single kernel invocation. Is there a limit? I need to generate nearly 10 billion threads.

Check the limits in Appendix A. The max grid size is 65535 x 65535 blocks and the max block size is 512 threads. That works out to about 2.2 trillion threads (65535 x 65535 x 512), far more than 10 billion.
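To check the arithmetic and read the limits for your own card, something like the sketch below should work. The helper names max_threads_2d and print_launch_limits are made up for this post; cudaGetDeviceProperties and the cudaDeviceProp fields are from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: upper bound on threads in one launch of a 2D grid.
// 64-bit math is needed because the product overflows 32 bits.
unsigned long long max_threads_2d(unsigned long long gridX,
                                  unsigned long long gridY,
                                  unsigned long long threadsPerBlock) {
    return gridX * gridY * threadsPerBlock;
}

// Hypothetical helper: print the launch limits the runtime reports for a device.
// Returns 0 on success, -1 if the query fails.
int print_launch_limits(int device) {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return -1;
    }
    printf("device: %s\n", prop.name);
    printf("max grid size: %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    printf("max threads per block: %d\n", prop.maxThreadsPerBlock);
    return 0;
}
```

With the limits quoted above, max_threads_2d(65535, 65535, 512) gives 2198956147200.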

That depends on the device. On my 9800GT, the max grid size is 65535 x 65535 and the max block size is 512. You don't have to use every thread in a block; the surplus threads can simply do nothing. You have to compute the proper memory address from blockIdx and threadIdx.
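Here is a rough sketch of that address calculation for a 2D grid of 1D blocks. The kernel scale and the helper grid_for are just illustrative names, and the 65535 limit is the one quoted in this thread, so check it against your own device.

```cuda
#include <cuda_runtime.h>

// Per-dimension grid limit quoted in this thread; other devices may differ.
const unsigned int kMaxGridDim = 65535;

// Hypothetical helper: pick a 2D grid with enough blocks to cover n elements.
// The caller should still verify that the returned y does not exceed the limit.
dim3 grid_for(unsigned long long n, unsigned int threadsPerBlock) {
    unsigned long long blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    if (blocks == 0) blocks = 1;
    unsigned long long gx = blocks < kMaxGridDim ? blocks : kMaxGridDim;
    unsigned long long gy = (blocks + gx - 1) / gx;
    return dim3((unsigned int)gx, (unsigned int)gy, 1);
}

// Illustrative kernel: each thread handles at most one element.
__global__ void scale(float* data, unsigned long long n, float factor) {
    // Flatten the 2D block index, then add the thread offset within the block.
    unsigned long long blockId =
        (unsigned long long)blockIdx.y * gridDim.x + blockIdx.x;
    unsigned long long i = blockId * blockDim.x + threadIdx.x;
    // The grid is rounded up, so threads past the end do nothing.
    if (i < n) data[i] *= factor;
}
```

A launch would then look like scale<<<grid_for(n, 512), 512>>>(d_data, n, 2.0f), where d_data is a device pointer you allocated yourself. For n of 10 billion and 512 threads per block, grid_for returns a 65535 x 299 grid.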

I find that it’s often easier to launch a fixed number of threads (based on the maximum number that can execute at the same time) and have those threads process multiple data elements. This approach circumvents the various grid size and block size limitations. Here’s an example:…for_each.inl#45
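For anyone who can't follow the link, the general idea can be sketched as a loop inside the kernel where each thread strides over the data. This is not the linked code, only a minimal illustration. The names add_one and add_one_on_gpu are invented here, and the block size of 256 and 4 blocks per multiprocessor are guesses you would want to tune.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Each thread starts at its global index and jumps ahead by the total
// number of threads in the grid until it runs off the end of the data.
__global__ void add_one(int* data, size_t n) {
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
         i < n; i += stride) {
        data[i] += 1;
    }
}

// Hypothetical host wrapper: adds 1 to every element on device 0.
// Returns false if any CUDA call fails.
bool add_one_on_gpu(std::vector<int>& host) {
    if (host.empty()) return true;
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return false;

    const int threads = 256;                          // assumed block size
    const int blocks = prop.multiProcessorCount * 4;  // assumed blocks per multiprocessor

    int* dev = nullptr;
    size_t bytes = host.size() * sizeof(int);
    if (cudaMalloc((void**)&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, host.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        // The grid size is fixed by the hardware, not by the data size.
        add_one<<<blocks, threads>>>(dev, host.size());
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host.data(), dev, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

Because the loop covers whatever the grid does not, the same launch configuration works for any n that fits in device memory, so the grid and block limits never come into play.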