Timing With Streams

AFlare1 · October 2, 2008, 2:17pm

Hi,

I am writing a piece of code that uses streams to help hide the overhead of host-device memory transfers.

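// Assumption: streams[] is an array of cudaStream_t, each created with cudaStreamCreate(&streams[i]),
// and a is pinned host memory (cudaMallocHost), which cudaMemcpyAsync needs to be truly asynchronous.
for (int i = 0; i < MaxStreams; i++)
{
    cudaMemcpyAsync(d_a, a, nbytes, cudaMemcpyHostToDevice, streams[i]);  // [A]
    kernel<<<blocks, threads, 0, streams[i]>>>(d_a, value);               // [B]
    cudaMemcpyAsync(a, d_a, nbytes, cudaMemcpyDeviceToHost, streams[i]);  // [C]
}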

As far as I understand, this will do the following:

[Stream 0] A_B_C
[Stream 1] A__B_C
[Stream 2] A___B_C
…

Is this correct, and if so, how do I set up CUDA events to show the time taken to complete A through C for each stream?
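
One way to get per-stream timings is to record a pair of events in each stream, one before A and one after C, and read the difference with cudaEventElapsedTime. The following is only a minimal sketch under several assumptions that are not in the post: the kernel body, the names kMaxStreams, kN and timeStreams, the use of int data, and giving each stream its own slice of the host and device buffers (so the streams do not overwrite each other) are all hypothetical.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel in the post: adds value to each element.
__global__ void kernel(int *d, int value, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) d[idx] += value;
}

const int kMaxStreams = 4;   // assumed number of streams
const int kN = 1 << 20;      // assumed number of elements per stream

// Fills ms[i] with the elapsed time in milliseconds between the start of A
// and the end of C in stream i. Returns the last CUDA error seen, if any.
cudaError_t timeStreams(float *ms)
{
    cudaStream_t streams[kMaxStreams];
    cudaEvent_t start[kMaxStreams], stop[kMaxStreams];
    int *a = nullptr, *d_a = nullptr;
    const size_t nbytes = kN * sizeof(int);

    cudaMallocHost(&a, kMaxStreams * nbytes);   // pinned host memory
    cudaMalloc(&d_a, kMaxStreams * nbytes);
    memset(a, 0, kMaxStreams * nbytes);

    for (int i = 0; i < kMaxStreams; i++) {
        cudaStreamCreate(&streams[i]);
        cudaEventCreate(&start[i]);
        cudaEventCreate(&stop[i]);
    }

    for (int i = 0; i < kMaxStreams; i++) {
        int *h = a + (size_t)i * kN;     // this stream's slice of host memory
        int *d = d_a + (size_t)i * kN;   // this stream's slice of device memory
        cudaEventRecord(start[i], streams[i]);
        cudaMemcpyAsync(d, h, nbytes, cudaMemcpyHostToDevice, streams[i]);  // [A]
        kernel<<<(kN + 255) / 256, 256, 0, streams[i]>>>(d, 1, kN);         // [B]
        cudaMemcpyAsync(h, d, nbytes, cudaMemcpyDeviceToHost, streams[i]);  // [C]
        cudaEventRecord(stop[i], streams[i]);
    }

    for (int i = 0; i < kMaxStreams; i++) {
        cudaEventSynchronize(stop[i]);   // wait until C has finished in stream i
        cudaEventElapsedTime(&ms[i], start[i], stop[i]);
    }

    cudaError_t err = cudaGetLastError();
    for (int i = 0; i < kMaxStreams; i++) {
        cudaEventDestroy(start[i]);
        cudaEventDestroy(stop[i]);
        cudaStreamDestroy(streams[i]);
    }
    cudaFree(d_a);
    cudaFreeHost(a);
    return err;
}

int main()
{
    float ms[kMaxStreams];
    if (timeStreams(ms) != cudaSuccess) return 1;
    for (int i = 0; i < kMaxStreams; i++)
        printf("stream %d: A-C took %.3f ms\n", i, ms[i]);
    return 0;
}
```

Each start event fires when its stream reaches that point on the device, so the reported time includes any waiting on a copy engine or on the GPU that is shared with the other streams. How much the streams actually overlap depends on the device and on the host buffer being pinned.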
