cudaMemcpyDeviceToHost taking much time?

CUDAkk · July 11, 2009, 12:43pm

Hi All,

Is there any difference in time between a cudaMemcpy with cudaMemcpyHostToDevice and one with cudaMemcpyDeviceToHost?

seibert · July 11, 2009, 2:00pm

There can be a difference between those two copy directions, but it depends on the motherboard. On the systems I have access to, pageable memory copies are 15% slower in the DeviceToHost direction, compared to HostToDevice.
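For anyone who wants to check this on their own machine, here is a minimal sketch that times both copy directions for pageable and pinned host memory using CUDA events. The 64 MiB buffer size and the helper name `time_copy_ms` are assumptions made for illustration, not something from this thread, and the numbers it prints will vary with the motherboard and GPU.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper: time a single cudaMemcpy with CUDA events.
// Returns elapsed milliseconds, or -1.0f if anything fails.
static float time_copy_ms(void* dst, const void* src, size_t bytes,
                          cudaMemcpyKind kind) {
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }
    cudaEventRecord(start, 0);
    cudaError_t err = cudaMemcpy(dst, src, bytes, kind);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);  // wait until the stop event has completed

    float ms = -1.0f;
    if (err == cudaSuccess) cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB, an arbitrary test size

    void* dev = nullptr;
    void* pageable = std::malloc(bytes);
    void* pinned = nullptr;
    if (!pageable || cudaMalloc(&dev, bytes) != cudaSuccess ||
        cudaMallocHost(&pinned, bytes) != cudaSuccess) {
        std::fprintf(stderr, "allocation failed\n");
        return 1;
    }
    std::memset(pageable, 1, bytes);
    std::memset(pinned, 1, bytes);

    // Warm-up copy so one-time context setup does not pollute the timings.
    cudaMemcpy(dev, pinned, bytes, cudaMemcpyHostToDevice);

    const char* names[2] = {"pageable", "pinned"};
    void* hosts[2] = {pageable, pinned};
    for (int i = 0; i < 2; ++i) {
        float h2d = time_copy_ms(dev, hosts[i], bytes, cudaMemcpyHostToDevice);
        float d2h = time_copy_ms(hosts[i], dev, bytes, cudaMemcpyDeviceToHost);
        // bytes / ms / 1e6 gives GB/s (1 GB = 1e9 bytes).
        std::printf("%-8s  H2D %.3f ms (%.2f GB/s)  D2H %.3f ms (%.2f GB/s)\n",
                    names[i], h2d, bytes / h2d / 1e6, d2h, bytes / d2h / 1e6);
    }

    cudaFreeHost(pinned);
    cudaFree(dev);
    std::free(pageable);
    return 0;
}
```

Since no kernel is running here, the timings reflect the transfers alone, which makes the two directions directly comparable.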

CUDAkk · July 12, 2009, 6:34am

But in my case, cudaMemcpy() takes 0.332 ms to copy data of size 8006008 from host to device, while the same function takes 21.3 ms to copy data of size 240018008 from device to host (which should not be acceptable in CUDA).

Hardware details: Quadro CX; CUDA 2.2 drivers, toolkit, and SDK.

Is it because of the large memory size, or could there be another reason?

I would appreciate an answer from an NVIDIA team member.

k9quaint · July 15, 2009, 9:19pm

Make sure you call cudaThreadSynchronize() after you launch your kernel and before you stop the timer for measurement. The kernel launch is asynchronous and returns very quickly, so you are probably only measuring the time it takes to launch the kernel. The memcpy from device to host synchronizes implicitly, waiting for the kernel to finish, so you might be measuring both kernel execution time and memcpy time.
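To illustrate the point, here is a rough sketch that times the same kernel and copy twice: once without synchronizing, so the device-to-host copy absorbs the kernel's run time, and once with an explicit synchronization in between. The kernel `busy_kernel`, the helper `ms_since`, and the problem size are all hypothetical. The sketch uses cudaDeviceSynchronize(), which replaced cudaThreadSynchronize() in later toolkits; on CUDA 2.2 you would call cudaThreadSynchronize() instead.

```cuda
#include <chrono>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel that just burns time on the GPU.
__global__ void busy_kernel(float* data, int n, int iters) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < iters; ++k) x = x * 1.0001f + 0.5f;
        data[i] = x;
    }
}

// Milliseconds elapsed on the host since t0.
static double ms_since(std::chrono::steady_clock::time_point t0) {
    return std::chrono::duration<double, std::milli>(
               std::chrono::steady_clock::now() - t0).count();
}

int main() {
    const int n = 1 << 22;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    std::vector<float> host(n, 1.0f);
    float* dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return 1;
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    // Warm-up launch so one-time startup cost is excluded from the timings.
    busy_kernel<<<blocks, threads>>>(dev, n, 10);
    cudaDeviceSynchronize();

    // Case 1: no sync. The launch returns almost immediately, and the
    // copy has to wait for the kernel, so it appears to be slow.
    auto t0 = std::chrono::steady_clock::now();
    busy_kernel<<<blocks, threads>>>(dev, n, 2000);
    double launch_only_ms = ms_since(t0);
    t0 = std::chrono::steady_clock::now();
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    double copy_plus_kernel_ms = ms_since(t0);

    // Case 2: sync before stopping the kernel timer, then time the copy.
    t0 = std::chrono::steady_clock::now();
    busy_kernel<<<blocks, threads>>>(dev, n, 2000);
    cudaDeviceSynchronize();
    double kernel_ms = ms_since(t0);
    t0 = std::chrono::steady_clock::now();
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    double copy_only_ms = ms_since(t0);

    std::printf("no sync:   launch %.3f ms, copy %.3f ms\n",
                launch_only_ms, copy_plus_kernel_ms);
    std::printf("with sync: kernel %.3f ms, copy %.3f ms\n",
                kernel_ms, copy_only_ms);

    cudaFree(dev);
    return 0;
}
```

If the "no sync" copy time is much larger than the "with sync" copy time, the difference is kernel execution being billed to the memcpy rather than a slow transfer.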
