In my CUDA kernel I have the following for loop, in which each thread in a warp begins at a different index; the index then wraps back around to the first index once it goes past 32 (WARP_SIZE). Will looping like this cause warp divergence?
Note that I only launch one warp per block, so the thread index is the same as the lane index within the warp.
tid = threadIdx().x                      # 1-based lane index (one warp per block)
for j in tid:(tid + WARP_SIZE - 1)       # each lane starts at its own index
    wrapped_j_idx = ((j - 1) & (WARP_SIZE - 1)) + 1  # wrap back into 1:WARP_SIZE (modulo)
    val = foo()
    # No race conditions, as threads in the warp execute in lockstep
    forces[tid + offset, :] += val
    forces[wrapped_j_idx, :] -= val
end
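
In case the stripped-down pseudocode above is ambiguous, here is a minimal self-contained sketch of the pattern I mean, written against CUDA.jl. Everything here (pair_kernel!, compute_force, offset, the Float32 values, and the array sizes) is a placeholder standing in for my real foo() and data, not the actual code:

using CUDA

const WARP_SIZE = 32

# Placeholder for foo(): returns a 3-component force as a tuple.
@inline compute_force(i, j) = (0.1f0, 0.2f0, 0.3f0)

function pair_kernel!(forces, offset)
    tid = threadIdx().x                                  # lane index, 1:32 (one warp per block)
    for j in tid:(tid + WARP_SIZE - 1)
        wrapped_j_idx = ((j - 1) & (WARP_SIZE - 1)) + 1  # wrap back into 1:WARP_SIZE
        val = compute_force(tid, wrapped_j_idx)
        for d in 1:3                                     # explicit component loop instead of slicing with :
            forces[tid + offset, d] += val[d]
            forces[wrapped_j_idx, d] -= val[d]
        end
    end
    return nothing
end

# Example launch: a single block of one warp, with forces sized so tid + offset stays in bounds.
forces = CUDA.zeros(Float32, 2 * WARP_SIZE, 3)
@cuda threads=WARP_SIZE blocks=1 pair_kernel!(forces, WARP_SIZE)

The part I care about is the control flow: every lane runs the same fixed number of iterations, and only the starting value of j and the wrapped index differ per lane. Is that per-lane indexing alone enough to make the warp diverge?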