Streams and Kernel Execution Order

Aaron_V · August 19, 2010, 3:20pm

I have a question about the order of kernel calls when using streams. For the following code:

for (i = 0; i < 10; i++) {
    kernel_a<<<1,1,0,stream[i]>>>(i);
    kernel_b<<<1,1,0,stream[i]>>>(i);
}

will the kernel execution order (i.e., the order in which the kernels actually run on the GPU) be guaranteed to be

kernel_a<<<1,1,0,stream[0]>>>(0)
kernel_b<<<1,1,0,stream[0]>>>(0)
kernel_a<<<1,1,0,stream[1]>>>(1)
kernel_b<<<1,1,0,stream[1]>>>(1)
kernel_a<<<1,1,0,stream[2]>>>(2)
kernel_b<<<1,1,0,stream[2]>>>(2)
...
kernel_a<<<1,1,0,stream[9]>>>(9)
kernel_b<<<1,1,0,stream[9]>>>(9)

So that’s one form of my question. More generally, I’m trying to find out whether the asynchronous API provides any guarantees about kernel execution order on the GPU. In particular: is the order of kernel calls deterministic?

I have code that uses streams, and empirically it produces the same results on every run even though the order matters, so the kernel order seems deterministic. I’m trying to work out whether that is a fluke of the kernel scheduling or of how my code behaves, or whether the driver actually enforces a deterministic order of kernel execution.

Thanks for any help.

MisterAnderson42 · August 19, 2010, 6:50pm

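Kernel ordering is deterministic only within each stream: kernels launched into the same stream run in the order they were launched. The driver is free to reorder kernels and/or run them in parallel if they are from separate streams, so the order you listed is not guaranteed across streams.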
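
To make that concrete, here is a minimal sketch (not from the original posts) of the loop in the question. The kernel bodies, the d_slots buffer, the done[] events, and the CHECK macro are all assumptions made up for illustration. Within each stream, kernel_b can rely on kernel_a having finished. If you really need the strict stream-by-stream order listed in the question, one option is to chain the streams with events via cudaStreamWaitEvent, at the cost of serializing everything:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N_STREAMS 10

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Hypothetical stand-ins for kernel_a / kernel_b from the question.
__global__ void kernel_a(int *slot, int i) { *slot = i; }
// kernel_b reads what kernel_a wrote, so it is only correct if it runs second.
__global__ void kernel_b(int *slot) { *slot = *slot * 2 + 1; }

int main() {
    cudaStream_t stream[N_STREAMS];
    cudaEvent_t done[N_STREAMS];
    int h_slots[N_STREAMS];
    int *d_slots = NULL;

    CHECK(cudaMalloc(&d_slots, N_STREAMS * sizeof(int)));
    for (int i = 0; i < N_STREAMS; i++) {
        CHECK(cudaStreamCreate(&stream[i]));
        CHECK(cudaEventCreateWithFlags(&done[i], cudaEventDisableTiming));
    }

    for (int i = 0; i < N_STREAMS; i++) {
        // Optional cross-stream ordering: stream[i] waits until stream[i-1]
        // has finished its kernel_b. Remove this line to let the streams
        // run independently (and possibly concurrently or reordered).
        if (i > 0) CHECK(cudaStreamWaitEvent(stream[i], done[i - 1], 0));

        // Same stream: kernel_b is guaranteed to start after kernel_a ends.
        kernel_a<<<1, 1, 0, stream[i]>>>(d_slots + i, i);
        kernel_b<<<1, 1, 0, stream[i]>>>(d_slots + i);
        CHECK(cudaEventRecord(done[i], stream[i]));
    }
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaMemcpy(h_slots, d_slots, sizeof(h_slots), cudaMemcpyDeviceToHost));

    // Each slot should hold 2*i + 1 because of in-stream ordering alone.
    int ok = 1;
    for (int i = 0; i < N_STREAMS; i++) {
        if (h_slots[i] != 2 * i + 1) ok = 0;
    }
    printf("%s\n", ok ? "in-stream order held" : "unexpected result");

    for (int i = 0; i < N_STREAMS; i++) {
        CHECK(cudaEventDestroy(done[i]));
        CHECK(cudaStreamDestroy(stream[i]));
    }
    CHECK(cudaFree(d_slots));
    return ok ? 0 : 1;
}
```

The per-slot check passes with or without the cudaStreamWaitEvent line, because each slot is only touched by one stream. The event chain matters only when work in one stream depends on results produced in another; getting the same answer every run without it is not evidence of a guaranteed order.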

