About Stream control

darot · March 26, 2009, 5:29am

Hi, I’m trying to use streams to overlap memory copies with kernel execution.

The programming guide gives this example:

for (int i = 0; i < 2; ++i)
    cudaMemcpyAsync(inputDevPtr + i * size, hostPtr + i * size,
                    size, cudaMemcpyHostToDevice, stream[i]);

for (int i = 0; i < 2; ++i)
    myKernel<<<100, 512, 0, stream[i]>>>
        (outputDevPtr + i * size, inputDevPtr + i * size, size);

for (int i = 0; i < 2; ++i)
    cudaMemcpyAsync(hostPtr + i * size, outputDevPtr + i * size,
                    size, cudaMemcpyDeviceToHost, stream[i]);

cudaThreadSynchronize();

In this example, will the kernel launch in stream[1] wait for the copy in stream[1] to finish?

Or do we have to do something ourselves to make the kernel launch wait for the preceding copy, so that all of the data has been copied before the kernel runs?

kristleifur · March 26, 2009, 10:39am

AFAIK this is precisely what streams are intended to do: work issued to the same stream executes in the order it was issued.
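For anyone who wants to check this on their own machine, here is a minimal sketch, not taken from the programming guide. It queues the same copy-in, kernel, copy-out pattern on two streams and then verifies the results on the host. Since the kernel in stream[i] is queued after the host-to-device copy in stream[i], it should only run once that copy has finished, with no extra synchronization call in between. The names scaleKernel and runTwoStreamPipeline, the CHECK macro, and the buffer size are made up for illustration. The sketch assumes a recent toolkit, so it uses cudaDeviceSynchronize() in place of the older cudaThreadSynchronize(), and it allocates the host buffer with cudaMallocHost because asynchronous copies need page-locked host memory.

```cuda
#include <cuda_runtime.h>

// Sketch-level error handling: bail out with -1 on any CUDA error
// (resources are not released on the error path).
#define CHECK(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// Hypothetical stand-in for myKernel: doubles each input element.
__global__ void scaleKernel(float *out, const float *in, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) out[idx] = 2.0f * in[idx];
}

// Runs copy-in -> kernel -> copy-out on two streams.
// Returns the number of wrong output elements, or -1 on a CUDA error.
int runTwoStreamPipeline() {
    const int n = 1 << 20;                    // elements per stream (assumed)
    const size_t bytes = n * sizeof(float);   // bytes per stream
    float *hostPtr = nullptr, *inputDevPtr = nullptr, *outputDevPtr = nullptr;
    cudaStream_t stream[2];

    // Page-locked host memory, required for truly asynchronous copies.
    CHECK(cudaMallocHost((void **)&hostPtr, 2 * bytes));
    CHECK(cudaMalloc((void **)&inputDevPtr, 2 * bytes));
    CHECK(cudaMalloc((void **)&outputDevPtr, 2 * bytes));
    for (int i = 0; i < 2 * n; ++i) hostPtr[i] = (float)i;
    for (int i = 0; i < 2; ++i) CHECK(cudaStreamCreate(&stream[i]));

    // Same issue order as the guide's example. Offsets are in elements
    // because the pointers are float*; the copy size is in bytes.
    for (int i = 0; i < 2; ++i)
        CHECK(cudaMemcpyAsync(inputDevPtr + i * n, hostPtr + i * n,
                              bytes, cudaMemcpyHostToDevice, stream[i]));
    for (int i = 0; i < 2; ++i)
        scaleKernel<<<(n + 255) / 256, 256, 0, stream[i]>>>(
            outputDevPtr + i * n, inputDevPtr + i * n, n);
    CHECK(cudaGetLastError());
    for (int i = 0; i < 2; ++i)
        CHECK(cudaMemcpyAsync(hostPtr + i * n, outputDevPtr + i * n,
                              bytes, cudaMemcpyDeviceToHost, stream[i]));

    // Wait for all queued work in both streams before reading hostPtr.
    CHECK(cudaDeviceSynchronize());

    // Each element should have been doubled exactly once.
    int mismatches = 0;
    for (int i = 0; i < 2 * n; ++i)
        if (hostPtr[i] != 2.0f * (float)i) ++mismatches;

    for (int i = 0; i < 2; ++i) cudaStreamDestroy(stream[i]);
    cudaFree(inputDevPtr);
    cudaFree(outputDevPtr);
    cudaFreeHost(hostPtr);
    return mismatches;
}
```

Calling runTwoStreamPipeline() from main() should return 0 if each kernel saw fully copied input. Ordering is only guaranteed within a stream; whether work in stream[0] and stream[1] actually overlaps depends on the device.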
