I am observing an odd divergent branching phenomenon that seems to be tied to shared memory variables and the powf() function. I have narrowed the problem down to two code blocks. I need your help finding out what causes it, since I could not find an appropriate answer on the forum or elsewhere on the net.
The algorithm simulates biological processes. Each process has a fixed number of input variables (varIn) and output variables (varOut), given by maxIn and maxOut respectively. varIn and varOut change over time according to the following state equations:
[list=1]
[*]varIn[n] = varIn[n-1] - x
[*]varOut[n] = varOut[n-1] + x
[/list]
where “x” is the change factor. Starting from x = 1, the change factor is calculated as a product over all input variables:
for(iii=0;iii<maxIn;iii++)
    x = x * powf(varIn[iii],beta[iii]);
varIn is an element of R[sub]0[/sub][sup]+[/sup]
beta is an element of R
Each thread simulates one process, and maxIn and maxOut are the same across all threads. The variables are loaded into shared memory. Here are the two code blocks causing the branching problem:
[codebox]
__global__ void genericprocess(…)
{
    // Initialize variables, set up shared memory space and load variables from global memory to shared memory
    float x = 1.0f;
    …
    // Block 1 - Calculate the process' change factor
    for (int iii = 0; iii < maxIn; iii++) {
        x = x * powf(s_varIn[iii + offsetInShared], s_beta[iii + offsetInShared]);
    }
    // Block 2 - Update the process' input and output variables
    for (int iii = 0; iii < maxIn; iii++) {
        s_varIn[iii + offsetInShared] = s_varIn[iii + offsetInShared] - x;
    }
    for (int iii = 0; iii < maxOut; iii++) {
        s_varOut[iii + offsetOutShared] = s_varOut[iii + offsetOutShared] + x;
    }
}
[/codebox]
Analyzing the program with the profiler reveals maxIn divergent branches, which it attributes to the “change factor for-loop” (Block 1). If I comment out either of the two blocks, no divergent branching occurs. The same is true if the calculation of x depends on neither s_varIn nor s_beta.
Any ideas what is causing this behaviour?
Thanks in advance!
George