I have some examples that are giving me strange headaches:
I provoke thread divergence, but I cannot figure out which branch, or which statements, are executed first.
First example:
I have the following kernel, which I launch with 2 threads in 1 block,
with a[0]=0 and a[1]=0.
__global__ void branchTest_kernel( float* a){
    int tx = threadIdx.x;
    if(tx==0){ // or tx==1
        a[1] = a[0] + 1; // (a)
    }else if(tx==1){ // or tx==0
        a[0] = a[1] + 1; // (b)
    }
}
[b]Output
a[0] = 1
a[1] = 2[/b]
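For anyone who wants to reproduce this, here is a minimal host-side sketch of how the first kernel can be launched. The host code (the names h_a and d_a, the error check, the printf) is my own assumption for illustration and was not part of the original test. Because both threads read and write the same two elements without any synchronization, the printed values depend on the order in which the two branches happen to run, so they may differ between GPUs and compiler versions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel from the first example: each thread reads the element
// the other thread writes, with no synchronization in between.
__global__ void branchTest_kernel(float* a) {
    int tx = threadIdx.x;
    if (tx == 0) {
        a[1] = a[0] + 1;  // (a)
    } else if (tx == 1) {
        a[0] = a[1] + 1;  // (b)
    }
}

int main() {
    // Hypothetical host buffers, initialised as in the post: a[0]=0, a[1]=0.
    float h_a[2] = {0.0f, 0.0f};
    float* d_a = nullptr;

    cudaMalloc(&d_a, sizeof(h_a));
    cudaMemcpy(d_a, h_a, sizeof(h_a), cudaMemcpyHostToDevice);

    // 1 block, 2 threads, so both threads belong to the same warp.
    branchTest_kernel<<<1, 2>>>(d_a);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_a);
        return 1;
    }

    cudaMemcpy(h_a, d_a, sizeof(h_a), cudaMemcpyDeviceToHost);
    printf("a[0] = %g\na[1] = %g\n", h_a[0], h_a[1]);

    cudaFree(d_a);
    return 0;
}
```

This should build with something like nvcc branch_test.cu -o branch_test (the file name is arbitrary). Swapping in the kernel body from the second example gives the other variant.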
I assume that because the two threads are in one warp, they execute in lockstep, and that this is the reason the branches get serialized, with (b) first and (a) second. But why this way around? (b) also comes first if I swap the conditions, so that the first if tests tx==1 and the else if tests tx==0…
Second example:
Exactly the same as the first, but now with the else if replaced by a plain else:
__global__ void branchTest_kernel( float* a){
    int tx = threadIdx.x;
    if(tx==0){
        a[1] = a[0] + 1; // (a)
    }else{
        a[0] = a[1] + 1; // (b)
    }
}
[b]Output
a[0] = 2
a[1] = 1[/b]
What causes this behaviour, where now suddenly (a) is first and (b) second?
Can somebody explain what the precedence rules for branches are, or where to find such information?
Thanks a lot!
I encountered this example while implementing a Gauss-Seidel solver: