Hi all,
I want to know how many threads can be launched in one kernel invocation. Is there a limit? I ask because I need to launch nearly 10 billion threads.
Check the limits in Appendix A. The max grid size is 65535 x 65535 blocks and the max block size is 512 threads. That works out to 65535 x 65535 x 512, or roughly 2.2 trillion threads, which is far more than 10 billion.
That depends on the device. On my 9800GT, the max grid size is 65535 x 65535 and the max block size is 512. You don't have to use every thread in a block; the extra ones can simply do nothing. Each thread has to compute its own memory address from blockIdx and threadIdx.
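To make the indexing concrete, here is a minimal sketch, assuming a 2D grid of 1D blocks. The names `fill_index` and `check_fill_index` are hypothetical and only for illustration, and 256 threads per block is an arbitrary choice. Note that 10 billion does not fit in a 32-bit integer, so the index is computed in 64 bits.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes its own global index into out[].
__global__ void fill_index(unsigned long long *out, unsigned long long n)
{
    // Flatten the 2D grid into one block number, then add the thread offset.
    unsigned long long block = (unsigned long long)blockIdx.y * gridDim.x + blockIdx.x;
    unsigned long long i = block * blockDim.x + threadIdx.x;

    // The grid is rounded up, so threads past the end of the data do nothing.
    if (i < n)
        out[i] = i;
}

// Hypothetical host helper: returns the number of wrong elements, or -1 on a CUDA error.
long long check_fill_index(unsigned long long n)
{
    if (n == 0) return 0;

    const unsigned int threads = 256;  // assumed block size, below the 512 limit
    unsigned long long blocks = (n + threads - 1) / threads;

    // Split the blocks over x and y so that neither dimension exceeds 65535.
    unsigned int gx = blocks < 65535ULL ? (unsigned int)blocks : 65535u;
    unsigned long long gy = (blocks + gx - 1) / gx;
    assert(gy <= 65535ULL);
    dim3 grid(gx, (unsigned int)gy);

    unsigned long long *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(unsigned long long)) != cudaSuccess) return -1;

    fill_index<<<grid, threads>>>(dev, n);
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(dev); return -1; }

    // Copy back and count elements that do not hold their own index.
    std::vector<unsigned long long> host(n, ~0ULL);
    if (cudaMemcpy(host.data(), dev, n * sizeof(unsigned long long),
                   cudaMemcpyDeviceToHost) != cudaSuccess) { cudaFree(dev); return -1; }
    cudaFree(dev);

    long long wrong = 0;
    for (unsigned long long i = 0; i < n; ++i)
        if (host[i] != i) ++wrong;
    return wrong;
}
```

Calling `check_fill_index(17000000)` exercises the case where the blocks spill into a second grid row.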
I find that it’s often easier to launch a fixed number of threads (based on the maximum number that can execute at the same time) and have each thread process multiple data elements. This approach sidesteps the grid size and block size limits entirely. Here’s an example:
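A minimal sketch of this pattern (not the poster's original code), assuming a hypothetical kernel `scale` that multiplies an array in place and a hypothetical host helper `run_scale`. The launch size of 4 blocks per multiprocessor with 256 threads each is a guess to be tuned per device, not a recommendation.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread starts at its global index and then
// jumps ahead by the total number of launched threads until it passes n.
__global__ void scale(float *data, unsigned long long n, float factor)
{
    unsigned long long stride = (unsigned long long)gridDim.x * blockDim.x;
    for (unsigned long long i = (unsigned long long)blockIdx.x * blockDim.x + threadIdx.x;
         i < n; i += stride)
        data[i] *= factor;
}

// Hypothetical host helper: returns the number of wrong elements, or -1 on a CUDA error.
long long run_scale(unsigned long long n)
{
    if (n == 0) return 0;

    // Size the launch from the device, not from n (assumed heuristic).
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return -1;
    const int threads = 256;
    const int blocks = prop.multiProcessorCount * 4;
    assert(blocks > 0);

    std::vector<float> host(n, 1.0f);
    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMemcpy(dev, host.data(), n * sizeof(float),
                   cudaMemcpyHostToDevice) != cudaSuccess) { cudaFree(dev); return -1; }

    scale<<<blocks, threads>>>(dev, n, 2.0f);
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(dev); return -1; }

    if (cudaMemcpy(host.data(), dev, n * sizeof(float),
                   cudaMemcpyDeviceToHost) != cudaSuccess) { cudaFree(dev); return -1; }
    cudaFree(dev);

    // Every element started at 1.0f, so each should now be exactly 2.0f.
    long long wrong = 0;
    for (unsigned long long i = 0; i < n; ++i)
        if (host[i] != 2.0f) ++wrong;
    return wrong;
}
```

Because the loop bound is `n` and not the grid size, the same launch configuration works for a thousand elements or for billions, as long as the data fits in device memory.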