I think I’ve narrowed down a problem and isolated what is probably a compiler bug. I could file a bug report, but I thought I would ask here first in case it’s my mistake. It seems that the compiler is optimizing my for-loop condition in a way that makes the same condition evaluate incorrectly when it is tested again after the loop. The code is below, but here’s a simpler explanation.
The kernel traverses an octree, using an array of loop counters to keep track of its progress at each depth. If the count for a node is -1, it breaks out of the loop early and jumps to the next depth. The current counter value is saved, so when the kernel returns to this depth later it can continue from the same place.
This is the loop:
for ( ; m2[d] < 8; m2[d]++)
After the loop, if the counter is 8, I know I have finished parsing this depth. If the counter is less than 8, I know I must have broken out early, so I jump down a level, and restart from the top.
The problem is this. In the case where the final node (counter = 7) of any depth is -1, once I break out of the loop, the counter compares EQUAL to 7 but NOT LESS than 8.
The check:
if (m2[d] == 7) printf("equals 7!\n");
if (m2[d] < 8) printf("less than 8!\n");
results in “equals 7!” but does not print “less than 8!”. Because of this, it does not enter either if block, stays at the current depth, and gets stuck in an infinite loop on this node. You can imagine my frustration trying to track this one down.
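For reference, this is the behaviour I would expect from standard C semantics, written as a plain host-side function. The name `scan_cells` and the `counts` array are made up for this illustration and are not part of the kernel. When a for-loop exits through `break`, the increment expression is skipped, so the counter still holds the index of the cell that triggered the break.

```c
#include <assert.h>

/* Hypothetical helper, not from the kernel: scans up to 8 cells starting at
 * `start` and stops early at the first cell whose count is -1.
 * Returns the counter value as it stands after the loop. */
int scan_cells(const int counts[8], int start)
{
    assert(start >= 0 && start <= 8);   /* precondition for this sketch */
    int m = start;
    for ( ; m < 8; m++) {
        if (counts[m] == -1)
            break;                      /* the m++ step is skipped on break */
    }
    return m;                           /* 8 if no -1 was found, else its index */
}
```

With -1 in the last cell the function returns 7, so a later test of `m == 7` and a later test of `m < 8` should both succeed.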
As far as I can tell, the optimizer is not accounting for the ‘break’. It assumes the loop condition must be false immediately after the loop and does not re-evaluate it. Other conditions with other values, like (m2[d] == 7) and (m2[d] < 9), work fine.
Adding printfs in specific places, or changing the order of things, seems to prevent the problem. Declaring m2 as volatile also fixes it. I don’t see why I should have to declare it volatile, since no other threads access this value. The kernel reproduces this error every time when launched with a single block and a single thread.
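A further workaround that avoids re-testing the loop condition at all would be to record the early exit in a separate flag. The sketch below is plain host-side C with made-up names (`scan_result`, `scan_with_flag`); whether it sidesteps the problem on the device is an assumption, not something verified here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
struct scan_result {
    int  counter;      /* counter value after the loop */
    bool broke_early;  /* true if the loop exited through break */
};

/* Same scan as the kernel's inner loop, but the reason for leaving the
 * loop is stored explicitly instead of being inferred from counter < 8. */
struct scan_result scan_with_flag(const int counts[8], int start)
{
    assert(start >= 0 && start <= 8);   /* precondition for this sketch */
    struct scan_result r = { start, false };
    for ( ; r.counter < 8; r.counter++) {
        if (counts[r.counter] == -1) {
            r.broke_early = true;       /* remember why we left the loop */
            break;
        }
    }
    return r;
}
```

The caller would then branch on `broke_early` to decide whether to descend, and would no longer depend on how the compiler treats a second evaluation of the loop condition.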
int d = 0; // current depth
int m2[10]; // current target cell at each depth
int m2_pos[10]; // target cell position in start/count array
m2[0] = 0; // start at first cell
m2_pos[0] = 0;
for (int j = 0; j < 8; j++) { // verify c_count is working
    printf("%d = %d\n", j, pt2.c_count[0][j]);
}
int i = 0;
for (i = 0; i < 20; i++) { // until search is complete
    printf("depth %d, starting at %d+%d\n", d, m2_pos[d], m2[d]);
    for ( ; m2[d] < 8; m2[d]++) { // for each cell at this depth
        int pos = m2_pos[d] + m2[d];
        int m2_count = pt2.c_count[d][pos];
        // target cell points to next depth
        if (m2_count == -1 && d == 0) {
            m2_pos[d+1] = pt2.c_start[d][m2_pos[d] + m2[d]];
            break; // jump to next depth in this cell
        }
    }
    printf("m2[%d] = %d\n", d, m2[d]);
    if (m2[d] == 7) printf("equals 7!\n");
    if (m2[d] < 8) printf("less than 8!\n");
    //else printf("not less than 8!\n");
    printf("m2[%d] = %d\n", d, m2[d]);
    if (m2[d] < 8) { // work to do at next depth
        printf("jumping down to depth %d\n", d+1);
        m2[d]++; // this cell done
        d++; // jump to next depth
        m2[d] = 0; // start fresh at next depth
        //continue;
    }
    if (m2[d] == 8) { // finished this depth
        if (d == 0) break; // at top level: all done!
        // jump back to previous depth and continue where we left off
        d--;
    }
}
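To compare against the device output, the same control flow can be run on the host. This is only a sketch: `struct tree`, `walk_reference`, `MAX_DEPTH` and `MAX_CELLS` are hypothetical stand-ins, since the real layout of `pt2` is not shown above, and the printfs are left out. It returns the number of outer iterations used, or -1 if the iteration cap is reached, which is what the stuck case would look like.

```c
#include <assert.h>

#define MAX_DEPTH 2    /* assumption: two levels are enough for a test case */
#define MAX_CELLS 16   /* assumption: width of each per-depth array */

/* Hypothetical stand-in for pt2; the real structure is not shown. */
struct tree {
    int c_count[MAX_DEPTH][MAX_CELLS];
    int c_start[MAX_DEPTH][MAX_CELLS];
};

/* Host-side copy of the kernel's traversal loop. */
int walk_reference(const struct tree *pt2, int max_iters)
{
    int d = 0;              /* current depth */
    int m2[10] = {0};       /* zero-initialised here; the kernel sets only m2[0] */
    int m2_pos[10] = {0};

    for (int i = 0; i < max_iters; i++) {
        for ( ; m2[d] < 8; m2[d]++) {
            int pos = m2_pos[d] + m2[d];
            assert(pos < MAX_CELLS);            /* bounds check for the sketch */
            if (pt2->c_count[d][pos] == -1 && d == 0) {
                m2_pos[d + 1] = pt2->c_start[d][pos];
                break;                          /* descend from this cell */
            }
        }
        if (m2[d] < 8) {    /* broke out early: work to do at next depth */
            m2[d]++;
            d++;
            m2[d] = 0;
        }
        if (m2[d] == 8) {   /* finished this depth */
            if (d == 0)
                return i + 1;                   /* top level done */
            d--;
        }
    }
    return -1;              /* cap reached without finishing */
}
```

With -1 only in the last top-level cell and every other count non-negative, this version finishes in 3 outer iterations: one to break at cell 7 and descend, one to scan depth 1, and one to detect completion at depth 0. A device run that keeps repeating the same depth and cell until the cap would point at the post-loop comparison.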