multiGPU example in SDK


I tried to write a simple program based on the multiGPU example. So far I have run into two problems:

(1) If I have several arguments for the kernel, how do I pass them to it? It seems I need to first pass the arguments to gpuThread(), then on to the kernel. But how do I pass the arguments to gpuThread()? I don't quite understand the way gpuThread is called: "threads[i] = cutStartThread((CUT_THREADROUTINE)gpuThread, (void *)&threadIds[i]);"

Currently I work around this by making the variables passed to the kernel global.

(2) I have two threads:
thread 0: cudaSetDevice(0), do something
thread 1: cudaSetDevice(1), do something
and then later:
thread 1: cudaSetDevice(0)
At this point, are the results of the previous kernel execution on device 0 still valid?

In the example, the memory allocation/copy and the kernel launch are in the same function, gpuThread(). I'd like to separate the memory allocation/copy from the kernel launch, because after the memory allocation/copy I want to run the same kernel many times, and after each run the two threads (GPU contexts) will exchange some information.

I did this by splitting gpuThread() into two functions, gpuThreadAlloc() and gpuThreadWork(), but it doesn't work that way.

I've attached the code. Thanks a lot,

Yao (2.83 KB)

gpuThread is a pthread, so you can apply anything you know about pthreads here.

I don't think it works correctly.

No. If you exit the thread, then everything related to its context will be cleaned out of memory.