How to apply thread concept to a FOR loop of one million points

piyush.goel · July 28, 2009, 6:04am

I have a loop that runs a million times. To port the code to CUDA I have to use threads. As far as I can tell, the data the calculations work on is data parallel.
When I call the kernel, I can't create one million threads. At most I can launch 60,000 to 65,000 threads. Please tell me how to apply the thread concept in this case and how to port my normal C code to CUDA.
All the examples I have seen so far have fewer than 60,000 points in total, so there was no problem applying threads and launching the kernel function.
Can anyone please help me with this?

Thanks in advance

avidday · July 28, 2009, 7:02am

You can have 65535x65535 blocks, each containing 512 threads. That is about 2.2e12 total threads, i.e. over two million million.
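
A minimal sketch of how that maps onto a one-million-iteration loop, assuming every iteration is independent of the others. The loop body here (squaring each element) is a made-up stand-in for the real calculation, and the names `squareKernel` and `blocksFor` are hypothetical, not from the original code. Each thread rebuilds the loop index from its block and thread IDs and handles exactly one iteration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Assumed original C loop (illustrative only):
//   for (i = 0; i < n; ++i) out[i] = in[i] * in[i];
__global__ void squareKernel(const float *in, float *out, int n)
{
    // One thread per loop iteration: recover the loop index i.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                  // the last block may be only partly used
        out[i] = in[i] * in[i];
}

// Number of blocks needed to cover n points, rounded up.
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int n = 1000000;
    const int threadsPerBlock = 256;
    const int blocks = blocksFor(n, threadsPerBlock);
    const size_t bytes = (size_t)n * sizeof(float);

    // Host buffers; small input values keep the squares exact in float.
    float *hIn  = (float *)malloc(bytes);
    float *hOut = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i)
        hIn[i] = (float)(i % 1000);

    // Device buffers and upload.
    float *dIn = NULL, *dOut = NULL;
    cudaMalloc((void **)&dIn, bytes);
    cudaMalloc((void **)&dOut, bytes);
    cudaMemcpy(dIn, hIn, bytes, cudaMemcpyHostToDevice);

    // Launch blocks * threadsPerBlock threads, at least n in total.
    squareKernel<<<blocks, threadsPerBlock>>>(dIn, dOut, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // This copy waits for the kernel to finish before reading results.
    cudaMemcpy(hOut, dOut, bytes, cudaMemcpyDeviceToHost);

    // Compare against the plain C loop.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hOut[i] != hIn[i] * hIn[i])
            ++mismatches;
    printf("blocks = %d, mismatches = %d\n", blocks, mismatches);

    cudaFree(dIn);
    cudaFree(dOut);
    free(hIn);
    free(hOut);
    return mismatches != 0;
}
```

With 256 threads per block, one million points need 3907 blocks, which fits comfortably in a single grid dimension. A second grid dimension only becomes necessary once the block count exceeds the per-dimension limit.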

_Big_Mac · July 28, 2009, 8:18pm

“You can’t have more than 60k threads” as in “you don’t know how” or “it hangs if you try”?

piyush.goel · July 29, 2009, 8:45am

Thanks for the reply, but it didn't help me.
Can you please help me with sample code or a method for doing it?
Thanks
