In the SDK code example “BlackScholes” (2.1 release)
the kernel calls look like this:
BlackScholesGPU<<<480, 128>>>( parameters… )
and the indexing calculations within the kernel look like this:
const int tid = blockDim.x * blockIdx.x + threadIdx.x;
//Total number of threads in execution grid
const int THREAD_N = blockDim.x * gridDim.x;
//No matter how small is execution grid or how large OptN is,
//exactly OptN indices will be processed with perfect memory coalescing
for(int opt = tid; opt < optN; opt += THREAD_N)
    BlackScholesBodyGPU(
(BlackScholesBodyGPU is a function called from within the kernel. It gets called with index opt, which starts at blockDim.x * blockIdx.x + threadIdx.x and advances by blockDim.x * gridDim.x on each iteration.)
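To check my reading of that loop, here is a minimal host-side C++ sketch (not the SDK code) that replays the same indexing on the CPU and counts how often each option index is visited. The function names simulateGridStride and coversExactlyOnce, and the parameter names gridSize and blockSize, are my own inventions for illustration; they stand in for gridDim.x and blockDim.x.

```cpp
#include <cassert>
#include <vector>

// Replays the kernel's loop on the host: for every (block, thread) pair,
// walk opt = tid, tid + THREAD_N, tid + 2*THREAD_N, ... while opt < optN,
// and count how many times each option index is touched.
// gridSize and blockSize are hypothetical stand-ins for gridDim.x and blockDim.x.
std::vector<int> simulateGridStride(int gridSize, int blockSize, int optN)
{
    assert(gridSize > 0 && blockSize > 0 && optN >= 0);

    std::vector<int> visits(optN, 0);
    const int THREAD_N = blockSize * gridSize;  // total threads in the grid

    for (int block = 0; block < gridSize; ++block) {
        for (int thread = 0; thread < blockSize; ++thread) {
            const int tid = blockSize * block + thread;  // global thread index
            for (int opt = tid; opt < optN; opt += THREAD_N)
                ++visits[opt];
        }
    }
    return visits;
}

// True if every option index in [0, optN) was visited exactly once.
bool coversExactlyOnce(int gridSize, int blockSize, int optN)
{
    const std::vector<int> visits = simulateGridStride(gridSize, blockSize, optN);
    for (int count : visits)
        if (count != 1)
            return false;
    return true;
}
```

If I have this right, coversExactlyOnce(480, 128, optN) holds, but so does any other positive grid and block size, e.g. coversExactlyOnce(7, 32, optN). So the loop itself does not seem to need 480 for correctness, which makes me think the number was chosen for some other reason.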
QUESTION: WHERE DOES “480” COME FROM?
“480” doesn’t appear anywhere else in the code, in any of the files, I think.
It’s suggestive to me personally because I have two Teslas, meaning I have 480
processor cores; I could imagine arranging the processing so that each core
gets 128 threads.
Is it just a convenient number less than 512 threads per block? Is it twice 240 cores
such that each core goes through 2 blocks of 128?
Minny Thank - youse