Unspecified launch failure with for loops?

I’m a little confused about why I’m getting this error. The following code (it’s a global function) produces NO errors:

for (int i = 0; i < 4; i++)
{
	float amount = round(error/float(i+1))*increment;
	//*neighbors[i] = amount;
}

float amount;
amount = round(error/float(1)) * increment;
*neighbors[0] = amount;
amount = round(error/float(2)) * increment;
*neighbors[1] = amount;
amount = round(error/float(3)) * increment;
*neighbors[2] = amount;
amount = round(error/float(4)) * increment;
*neighbors[3] = amount;

Now if I uncomment the commented line inside the loop, the code FAILS with an unspecified launch failure:

for (int i = 0; i < 4; i++)
{
	float amount = round(error/float(i+1))*increment;
	*neighbors[i] = amount;
}

float amount;
amount = round(error/float(1)) * increment;
*neighbors[0] = amount;
amount = round(error/float(2)) * increment;
*neighbors[1] = amount;
amount = round(error/float(3)) * increment;
*neighbors[2] = amount;
amount = round(error/float(4)) * increment;
*neighbors[3] = amount;

I’m not sure what’s going on. Any ideas? Could it have something to do with how I’m compiling this? The flags I’m using are -use_fast_math -O3.

Any help would be greatly appreciated, thanks.

Out-of-bounds memory access?

tmurray: While you’ve answered similar questions hundreds of times and seem to know your stuff, I’ve got to say I don’t get it either. It looks to me like the functionality is identical: one loop, and the same loop unrolled.

Skyd: Run under the debugger and report back :)
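Before reaching for the debugger, it may help to pin down which launch actually fails by checking the error state right after the kernel. The sketch below is only an illustration: `write_neighbors` and `launch_checked` are hypothetical names standing in for the poster's kernel and launch code, and the single-thread launch configuration is an assumption.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the poster's __global__ function.
__global__ void write_neighbors(float **neighbors, float error, float increment)
{
    for (int i = 0; i < 4; i++) {
        float amount = roundf(error / float(i + 1)) * increment;
        *neighbors[i] = amount;   // faults if neighbors[i] is not a valid device address
    }
}

// Launches the kernel and returns the first error seen.
cudaError_t launch_checked(float **d_neighbors, float error, float increment)
{
    write_neighbors<<<1, 1>>>(d_neighbors, error, increment);

    // Errors detected at launch time (e.g. a bad launch configuration).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        return err;

    // Errors raised while the kernel was running (e.g. an invalid address)
    // only surface once the host waits for the kernel to finish.
    return cudaDeviceSynchronize();
}
```

Passing the returned code to `cudaGetErrorString` gives a readable message. Running the program under a memory checker such as `compute-sanitizer` should also report the faulting access if the cause really is an out-of-bounds write.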

This is a little curious. What is the declaration of neighbors? How is it initialized? Double pointers in device code are usually a red flag for possible problems.
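To show why double pointers are easy to get wrong: both the pointer table and every address stored in it have to live in device memory. Since the declaration of neighbors was never posted, the following is only a guess at one valid setup; `write_neighbors`, `run_example`, `d_cells`, and `d_neighbors` are made-up names.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the poster's __global__ function.
__global__ void write_neighbors(float **neighbors, float error, float increment)
{
    for (int i = 0; i < 4; i++) {
        float amount = roundf(error / float(i + 1)) * increment;
        *neighbors[i] = amount;
    }
}

// Builds a device-side table of four device pointers, runs the kernel,
// and copies the four written values back into out[0..3].
cudaError_t run_example(float error, float increment, float out[4])
{
    float *d_cells = nullptr;       // the four floats the kernel writes to
    float **d_neighbors = nullptr;  // table of pointers, itself in device memory

    cudaError_t err = cudaMalloc(&d_cells, 4 * sizeof(float));
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_neighbors, 4 * sizeof(float *));
    if (err != cudaSuccess) { cudaFree(d_cells); return err; }

    // Fill the table on the host with DEVICE addresses, then copy it over.
    // The host never dereferences these pointers; it only stores them.
    float *h_table[4];
    for (int i = 0; i < 4; i++) h_table[i] = d_cells + i;
    err = cudaMemcpy(d_neighbors, h_table, sizeof(h_table), cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        write_neighbors<<<1, 1>>>(d_neighbors, error, increment);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(out, d_cells, 4 * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_neighbors);
    cudaFree(d_cells);
    return err;
}
```

If the table were left in host memory, or if any entry held a host address or an uninitialized value, the write through `neighbors[i]` would be an invalid access, which is one common way to end up with an unspecified launch failure.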

That for loop would be unrolled by the compiler anyway, so both versions should compile to the same code. Could you check whether the number of registers differs between the two cases?
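One way to compare register counts is to compile with `nvcc -Xptxas -v`, which prints the registers used by each kernel. They can also be queried at run time, as in the rough sketch below. The kernel names `rolled` and `unrolled` and the helper `registers_used` are hypothetical, and the `#pragma unroll` hints are there only to force the two variants apart for comparison.

```cuda
#include <cuda_runtime.h>

// Variant that asks the compiler to keep the loop as a real loop.
__global__ void rolled(float **neighbors, float error, float increment)
{
    #pragma unroll 1
    for (int i = 0; i < 4; i++) {
        float amount = roundf(error / float(i + 1)) * increment;
        *neighbors[i] = amount;
    }
}

// Variant that requests full unrolling of the same loop.
__global__ void unrolled(float **neighbors, float error, float increment)
{
    #pragma unroll
    for (int i = 0; i < 4; i++) {
        float amount = roundf(error / float(i + 1)) * increment;
        *neighbors[i] = amount;
    }
}

// Returns the registers per thread used by a kernel, or -1 if the query fails.
template <typename Kernel>
int registers_used(Kernel kernel)
{
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, kernel) != cudaSuccess)
        return -1;
    return attr.numRegs;
}
```

Calling `registers_used(rolled)` and `registers_used(unrolled)` and comparing the two numbers would show whether unrolling changes the register usage for this loop.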