cudaMemcpy takes too long


I have a CUDA program and it seems to take too long to run.

My program:

I have rfData_h as an input

cudaMalloc((void **) &rfData_d, size_nmpts_nmlne);

cudaMemcpy(rfData_d, rfData_h, size_nmpts_nmlne, cudaMemcpyHostToDevice);

My kernel invocation

cudaMemcpy(rfData_h, rfDataMODE_d, size_nmpts_nmlne, cudaMemcpyDeviceToHost);
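Putting the fragments together, the full host-side sequence looks roughly like this (the kernel name `myKernel` and the `grid`/`block` launch configuration are assumptions; they are not given in the post):

```cpp
// Sketch of the posted sequence; myKernel, grid, and block are placeholders.
float *rfData_d, *rfDataMODE_d;
cudaMalloc((void **) &rfData_d, size_nmpts_nmlne);
cudaMalloc((void **) &rfDataMODE_d, size_nmpts_nmlne);

cudaMemcpy(rfData_d, rfData_h, size_nmpts_nmlne, cudaMemcpyHostToDevice);
myKernel<<<grid, block>>>(rfData_d, rfDataMODE_d);   // launch is asynchronous
cudaMemcpy(rfData_h, rfDataMODE_d, size_nmpts_nmlne, cudaMemcpyDeviceToHost);
```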

My kernel is quick, so no problem there (<300 µs), but the memory transfers take too long (>100 ms).
Can I use shared memory instead of local memory to make this faster?

I need to have the copy -> kernel -> retrieve <1ms.

Shared memory can't be used outside your kernel function.

How do you know your kernel takes less than 300 µs?

What you probably see is that the kernel launch is asynchronous, and therefore you only measure the launch overhead of the kernel, not its actual run time.

The memcpy then implicitly calls cudaThreadSynchronize, and therefore you think your copy takes so much time.

Make sure you time the code correctly by putting a sync after the kernel and then measuring the time; also check for errors returned from the kernel.
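A minimal timing sketch of that advice, using CUDA events (the kernel name and launch configuration are placeholders; on newer toolkits cudaDeviceSynchronize replaces cudaThreadSynchronize):

```cpp
// Time the kernel with events so the async launch doesn't skew the result.
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);

cudaEventRecord(start, 0);
myKernel<<<grid, block>>>(rfData_d, rfDataMODE_d);   // placeholder launch
cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);          // wait until the kernel has really finished

float ms = 0.0f;
cudaEventElapsedTime(&ms, start, stop);
printf("kernel time: %.3f ms\n", ms);

// Check for launch or runtime errors from the kernel.
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess)
    printf("CUDA error: %s\n", cudaGetErrorString(err));
```

With the sync in place, the event interval covers the kernel's full execution, so the subsequent cudaMemcpy no longer absorbs the kernel's run time.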

You can probably find more info about this in the programming guide or the best practices manual recently released by NVIDIA.


OK, thanks, now I see what's going on!