Device to Host memcpy: how do I make this faster?

markbasham · February 6, 2008, 4:20pm

Hi all

Could anyone let me know what I'm doing wrong here? I seem to be getting really slow transfer speeds from the device to the host.

The program basically writes some data into two arrays, and I then need to get that data back to the host.

I'm currently using the following code to do this:

CUDA_SAFE_CALL(cudaMemcpy(Liquid,Liquidd,memsize,cudaMemcpyDeviceToHost));

CUDA_SAFE_CALL(cudaMemcpy(Crystal,Crystald,memsize,cudaMemcpyDeviceToHost));

Each array is 2048 x 2048 x sizeof(int) bytes, and I'm transferring two of them.

Now if I just run the program, I get a 4.75 second pause while it executes those two lines of code, but if I pause before them, with a scan statement for example, the delay is reduced.

Any ideas how I can do this a bit quicker? As far as I can see I'm transferring 32 MB of data (two arrays of 2048 x 2048 x 4 bytes = 16 MB each), and it's taking nearly 5 seconds.

I'm pretty sure the device-to-host bandwidth isn't around 7 MB/s, so I guess I'm doing something wrong.

Thanks

Mark

kuisma · February 6, 2008, 4:29pm

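I guess the delay you are seeing is your kernel still executing. Kernel launches are asynchronous, so control returns to the host straight away and the first cudaMemcpy has to wait for the kernel to finish before it can start copying.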
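One way to check this is to time the kernel and the copies separately with CUDA events. The following is only a rough sketch, not the original program: fillKernel, its iteration count, the CHECK macro and mibPerSecond are made-up stand-ins, and only the pointer names and memsize follow the post above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-check helper (stands in for CUDA_SAFE_CALL).
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Hypothetical stand-in for the real kernel: busy work, then two writes.
__global__ void fillKernel(int *liquid, int *crystal, size_t n, int iters)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned v = (unsigned)i;
    for (int k = 0; k < iters; ++k) v = v * 1664525u + 1013904223u;
    liquid[i] = (int)v;
    crystal[i] = (int)~v;
}

// Convert a byte count and an elapsed time in milliseconds to MiB/s.
static float mibPerSecond(size_t bytes, float ms)
{
    return (bytes / (1024.0f * 1024.0f)) / (ms / 1000.0f);
}

int main()
{
    const size_t n = 2048 * 2048;
    const size_t memsize = n * sizeof(int);

    int *Liquid = (int *)malloc(memsize);
    int *Crystal = (int *)malloc(memsize);
    if (!Liquid || !Crystal) return EXIT_FAILURE;
    int *Liquidd = NULL, *Crystald = NULL;
    CHECK(cudaMalloc(&Liquidd, memsize));
    CHECK(cudaMalloc(&Crystald, memsize));

    cudaEvent_t start, afterKernel, afterCopy;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&afterKernel));
    CHECK(cudaEventCreate(&afterCopy));

    CHECK(cudaEventRecord(start));
    // The launch returns to the host immediately; the GPU keeps working.
    fillKernel<<<(unsigned)((n + 255) / 256), 256>>>(Liquidd, Crystald, n, 10000);
    CHECK(cudaGetLastError());
    // This event completes on the GPU only once the kernel has finished.
    CHECK(cudaEventRecord(afterKernel));

    // Each cudaMemcpy waits for earlier work on the device before copying.
    CHECK(cudaMemcpy(Liquid, Liquidd, memsize, cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(Crystal, Crystald, memsize, cudaMemcpyDeviceToHost));
    CHECK(cudaEventRecord(afterCopy));
    CHECK(cudaEventSynchronize(afterCopy));

    float kernelMs = 0.0f, copyMs = 0.0f;
    CHECK(cudaEventElapsedTime(&kernelMs, start, afterKernel));
    CHECK(cudaEventElapsedTime(&copyMs, afterKernel, afterCopy));
    printf("kernel: %.2f ms\n", kernelMs);
    printf("copies: %.2f ms (%.1f MiB/s)\n", copyMs,
           mibPerSecond(2 * memsize, copyMs));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(afterKernel));
    CHECK(cudaEventDestroy(afterCopy));
    CHECK(cudaFree(Liquidd));
    CHECK(cudaFree(Crystald));
    free(Liquid);
    free(Crystal);
    return 0;
}
```

If the kernel time accounts for most of the pause, the copy itself is fine and it is the kernel that needs speeding up. Waiting for the device to go idle before starting a host-side timer around the cudaMemcpy calls should give a similar picture.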

– Kuisma

markbasham · February 6, 2008, 4:50pm

Ah that would explain it all.

Thanks for the quick response.
