In the CUDA SDK, there is an example called “simpleMultiGPU”.
Here is how it uses multiple GPUs:
for (i = 0; i < GPU_N; i++)
    threadID[i] = cutStartThread((CUT_THREADROUTINE)solverThread, (void *)(plan + i));
My question is: do these CPU threads run in parallel or not?
It seems like the loop just calls the function “solverThread”, which launches the kernel, once per GPU, one call after another.
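One way to test this outside the SDK is the minimal sketch below. It assumes that cutStartThread only creates an OS thread and returns immediately; I have not confirmed that, so std::thread stands in for it here. Plan, Rendezvous, worker and run_workers are hypothetical names for illustration, not part of the SDK. Each worker waits until all workers have arrived, so the function can only return if the threads really are alive at the same time. If the loop ran the workers one after another, the first one would wait forever.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-GPU plan passed to solverThread.
struct Plan {
    int device;
};

// Shared state for a rendezvous: every worker blocks until all have arrived.
struct Rendezvous {
    std::mutex m;
    std::condition_variable cv;
    int arrived = 0;  // workers that have reached the rendezvous
    int passed = 0;   // workers that have been released from it
};

// Hypothetical worker. A real solverThread would select plan->device
// and launch its kernel here instead of just waiting.
void worker(Plan* plan, Rendezvous* r, int total) {
    (void)plan;  // unused in this sketch
    std::unique_lock<std::mutex> lock(r->m);
    ++r->arrived;
    r->cv.notify_all();
    // Blocks until every worker has started; impossible if they ran in sequence.
    r->cv.wait(lock, [&] { return r->arrived == total; });
    ++r->passed;
}

// Starts n workers in a loop (like the cutStartThread loop), then joins them.
// Returns how many workers got past the rendezvous.
int run_workers(int n) {
    assert(n > 0);
    std::vector<Plan> plans(n);
    std::vector<std::thread> threads;
    Rendezvous r;
    for (int i = 0; i < n; i++) {
        plans[i].device = i;
        // Thread creation returns right away; the loop does not wait for worker().
        threads.emplace_back(worker, &plans[i], &r, n);
    }
    for (auto& t : threads) t.join();  // wait for all workers to finish
    return r.passed;
}
```

Calling run_workers(GPU_N) from main should return GPU_N if the threads overlap. Note that this only says something about std::thread; whether cutStartThread behaves the same way is exactly what I am asking.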
Can anyone help?