Why doesn't the CUDA API support memory copies in parallel with GPU computation?

xjtusnail · December 10, 2008, 1:39am

First question: from my testing, I can see that while the GPU is executing a kernel, it doesn't support data transfer from the CPU at the same time.
Why doesn't the CUDA API (1.1) support memory copies in parallel with GPU computation?

Second question: the GPU doesn't support CPU-to-GPU transfers in parallel with GPU-to-CPU transfers, but I think DMA should support bidirectional data transmission.
Is it the hardware or the API that doesn't support this?

Can anyone give me an explanation?
Thank you in advance!

E.D_Riedijk · December 10, 2008, 5:46am

First question: from my testing, I can see that while the GPU is executing a kernel, it doesn't support data transfer from the CPU at the same time.
Why doesn't the CUDA API (1.1) support memory copies in parallel with GPU computation?

That is not true. There is an example in the SDK, and deviceQuery tells you whether your device supports it (not all devices do).

xjtusnail · December 10, 2008, 7:42am

Yes, I know that.

But someone said it is a bug in 1.1 (on G80-series hardware); see http://forums.nvidia.com/lofiversion/index.php?t55372.html

So I am very confused about this problem.

pcchen · December 10, 2008, 8:06am

G80 is compute capability 1.0 hardware, so it does not support asynchronous memory copy operations. The fact that it's reported as supported is a bug. However, on 1.1 (and later) hardware (including G92 and the G8x chips other than G80) it is supported.

xjtusnail · December 10, 2008, 2:28pm

Thank you, I got it.

xjtusnail · December 11, 2008, 1:41am

My GPU is a 9600 GT (G94 core). Below are the simpleStreams results:

memcopy: 33.51
kernel: 40.80
non-streamed: 74.86 (74.31 expected)
8 streams: 75.12 (44.99 expected with compute capability 1.1 or later)
Test PASSED

Does this mean it does not support overlapping?
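
As a sanity check on those numbers, the "expected" figures can be reproduced from the two measured times. Below is a minimal sketch in Python, assuming the non-streamed estimate is memcopy + kernel and the streamed estimate is kernel + memcopy / nstreams. That model is my reading of the output and is not stated in this thread. The function names and the 5% tolerance are hypothetical.

```python
def expected_non_streamed(memcopy, kernel):
    """Copy and kernel run back to back, so their times add."""
    return memcopy + kernel


def expected_streamed(memcopy, kernel, nstreams):
    """Assumed model: with full overlap, only the first chunk's copy
    (memcopy / nstreams) is not hidden behind kernel execution."""
    return kernel + memcopy / nstreams


def overlap_observed(measured_streamed, measured_non_streamed, tolerance=0.05):
    """Hypothetical check: report overlap only if the streamed run beats
    the non-streamed run by more than `tolerance` (a fraction)."""
    return measured_streamed < measured_non_streamed * (1.0 - tolerance)


if __name__ == "__main__":
    memcopy, kernel = 33.51, 40.80  # values from the 9600 GT run above
    print(f"non-streamed expected: {expected_non_streamed(memcopy, kernel):.2f}")
    print(f"8 streams expected:    {expected_streamed(memcopy, kernel, 8):.2f}")
    print("overlap observed:", overlap_observed(75.12, 74.86))
```

With the 9600 GT numbers this gives 74.31 and 44.99, matching the printed estimates. The measured 8-stream time (75.12) is no better than the non-streamed time (74.86), so by this check no overlap took place in that run.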

xjtusnail · December 11, 2008, 5:21am

According to my testing, it indeed doesn't support overlapping. The same code run on a GTX 280 (compute capability 1.3) shows very good overlap.

alex_dubinsky · December 11, 2008, 6:07am

Can you tell us the versions of your CUDA toolkit and GPU driver (and those on your GTX 280 machine)?

xjtusnail · December 11, 2008, 7:10am

For the GeForce 9600 GT, the toolkit version is 1.1 and the driver version is 178.15.

For the GTX 280, the driver version is also 178.15, and the toolkit is the same as above.

Is there any problem?

Thank you.
