Question about control flow divergence

lee222 · July 24, 2008, 7:04am

Suppose that each thread in a block executes the following loop.

// tid is the thread ID
for (i = 0; i < f(tid); i++) {
…
}

If 16 threads in a warp execute 8 iterations (f(tid) == 8) and the other 16 execute 10 iterations, which of the following is true?

1) Through the 8th iteration, all threads in the warp run in parallel, and then half of the warp executes the remaining 2 iterations in parallel.
or
2) All 32 threads diverge: thread 0 executes 8 iterations, then thread 1 executes 8, …, and finally thread 31 executes 10 iterations.

(Assume the loop body is large enough that predication cannot be used.)

Sarnath · July 24, 2008, 11:15am

To my knowledge, “1)” is true, i.e., all threads will go hand in hand for 8 iterations. There will be divergence only after that.

lee222 · July 24, 2008, 3:41pm

Thanks, but I’m wondering how this happens. This means that the runtime system checks for control divergence at every branch instruction, which puts the divergence check on the critical path of the GPU hardware.

SPWorley · July 24, 2008, 4:12pm

Yep, it checks for divergence at every branch instruction. Wild, when you think about it! But remember that the hardware and scheduling software have been DESIGNED to do this, so it’s extremely efficient.

Nonetheless, you should still avoid divergence if you can, but the nice part is how it Just Works when you do use it.
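
To make the per-branch check concrete, here is a minimal host-side C++ sketch that models one warp running the loop from the question. It is only an illustration of the lockstep behavior described in this thread, not how the hardware is implemented. The names WarpTrace, run_warp, and serialized_issues are made up for this example, and f(tid) is assumed to return 8 for threads 0-15 and 10 for threads 16-31.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kWarpSize = 32;
constexpr std::uint32_t kFullMask = 0xFFFFFFFFu;

// Assumed trip count from the question: half the warp runs 8 iterations,
// the other half runs 10.
int f(int tid) { return tid < 16 ? 8 : 10; }

struct WarpTrace {
    int body_issues = 0;                      // times the loop body is issued for the warp
    int first_partial_issue = -1;             // first iteration with some threads masked off
    std::uint32_t last_mask = 0;              // active mask during the final issue
    std::array<int, kWarpSize> iterations{};  // iterations completed per thread
};

// Model of option 1: the loop branch is evaluated for every thread on each
// pass, and the body is issued once for whichever threads are still active.
WarpTrace run_warp() {
    WarpTrace trace;
    for (int i = 0;; ++i) {
        std::uint32_t active = 0;
        for (int tid = 0; tid < kWarpSize; ++tid)
            if (i < f(tid)) active |= (1u << tid);
        if (active == 0) break;  // every thread has left the loop
        if (active != kFullMask && trace.first_partial_issue < 0)
            trace.first_partial_issue = i;
        ++trace.body_issues;
        trace.last_mask = active;
        for (int tid = 0; tid < kWarpSize; ++tid)
            if (active & (1u << tid)) ++trace.iterations[tid];
    }
    // Sanity check: each thread ran exactly its own trip count.
    for (int tid = 0; tid < kWarpSize; ++tid)
        assert(trace.iterations[tid] == f(tid));
    return trace;
}

// Cost of option 2, where each thread would run its whole loop alone.
int serialized_issues() {
    int total = 0;
    for (int tid = 0; tid < kWarpSize; ++tid) total += f(tid);
    return total;
}
```

Under these assumptions, run_warp() issues the loop body 10 times: 8 times with all 32 threads active, then 2 more times with only threads 16-31 active. serialized_issues() gives 288 for option 2, which is why the difference between the two options matters.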

lee222 · July 24, 2008, 4:30pm

Thanks a lot.
