multiple switch between host and device

Soraya · August 21, 2011, 7:39am

Hi,

I have a problem with the running time of my CUDA code.
My application's input is a data stream. For each pulse of data I do the following:

1. Copy a pulse of data to the device (30 µs)
2. Do some processing (150 µs)
3. Copy the result to the host (60 µs)

Each of the three steps above takes very little time, but when I measure the entire time it is much larger (15 ms). Could you please let me know why this happens?

I guess a lot of time may be consumed by switching between host and device. Is this true? How can I avoid it?

Many thanks.

seibert · August 21, 2011, 3:44pm

How are you measuring the time of each step?
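
For reference, here is a minimal sketch of one way to time each step and the whole loop with the same clock, using CUDA events. It is not the original poster's code: the kernel `process`, the pulse size `n`, and the iteration count `iters` are all hypothetical placeholders. A warm-up pass runs first so that any one-time initialization cost is not counted in the totals.

```cuda
#include <chrono>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the "do some processing" step.
__global__ void process(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

int main() {
    const int n = 4096;      // assumed pulse size (not from the thread)
    const int iters = 100;   // assumed number of pulses
    const size_t bytes = n * sizeof(float);

    std::vector<float> h_in(n, 1.0f), h_out(n);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc((void**)&d_in, bytes);
    cudaMalloc((void**)&d_out, bytes);

    cudaEvent_t ev[4];
    for (int i = 0; i < 4; ++i) cudaEventCreate(&ev[i]);

    // Warm-up pass: keeps one-time startup costs out of the measurements.
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    process<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    float h2d = 0.0f, kern = 0.0f, d2h = 0.0f;
    auto wall_start = std::chrono::steady_clock::now();
    for (int it = 0; it < iters; ++it) {
        cudaEventRecord(ev[0]);
        cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(ev[1]);
        process<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
        cudaEventRecord(ev[2]);
        cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
        cudaEventRecord(ev[3]);
        cudaEventSynchronize(ev[3]);  // wait until all three steps finished

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, ev[0], ev[1]); h2d  += ms;
        cudaEventElapsedTime(&ms, ev[1], ev[2]); kern += ms;
        cudaEventElapsedTime(&ms, ev[2], ev[3]); d2h  += ms;
    }
    auto wall_end = std::chrono::steady_clock::now();
    double wall_ms =
        std::chrono::duration<double, std::milli>(wall_end - wall_start).count();

    // Per-pulse averages in microseconds.
    printf("H2D copy : %.1f us\n", 1000.0f * h2d  / iters);
    printf("kernel   : %.1f us\n", 1000.0f * kern / iters);
    printf("D2H copy : %.1f us\n", 1000.0f * d2h  / iters);
    printf("wall/iter: %.1f us\n", 1000.0 * wall_ms / iters);

    for (int i = 0; i < 4; ++i) cudaEventDestroy(ev[i]);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

If the per-step averages add up to roughly the wall-clock time per iteration, the steps themselves account for the total. A large gap would point at something outside the three steps, such as work done between iterations or costs that only occur on the first call.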
