Very strange behavior: reading a variable is causing a crash

I am getting a very strange error that I hope someone can explain to me.

I have code similar to this:

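__device__ void foo()
{
    bool isTrue = false;
    int i = 0;

    do
    {
        if (i >= 0)  // stand-in for any condition that ends up true
            isTrue = true;
        i++;
    } while (i < 10);

    if (isTrue)
        doSomething();
}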

I realize that this example is silly because ‘isTrue’ will always end up ‘true’, but it illustrates the problem nicely…

The problem is that if I set ‘isTrue’ inside an ‘if’ in the do-while loop and then read it outside of the loop, my kernel slows down by a full order of magnitude, even to the point where the kernel fails to run. If I comment out the ‘if(isTrue)’ part and change nothing else, then the kernel runs at full speed. If I set ‘isTrue’ to ‘true’ after the loop has finished executing, then it also runs at full speed. I don’t understand why accessing the variable should cause such a slowdown compared with all of the other calculations I am doing. Also, in my code ‘doSomething()’ simply changes the value of another variable, so it is not as if I am getting major divergence or anything.

Also, my code has nested ‘if’ statements in the do-while loop and if I try to set ‘isTrue’ in a deeper ‘if’ statement the problem gets even worse.

I am at a total loss right now because my code runs really fast… until I try to read anything that I may have set in the loop, and it only happens if I read the variable, not if I write it first.


OK, new information after testing. The code, as written above, works fine. However, if I replace the condition of the ‘if’ inside the loop with:

if(data[i].type == 2)

where data is an array stored in device memory, then the problem occurs as described above.

I am still completely confused as to why this would happen.

That is expected. Reading from device memory is the slowest thing you can do. See the manual for info on latency. You should cache the data array in shared memory.
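
As a rough sketch of what caching the array in shared memory could look like: the kernel below copies the elements once per block and then runs the loop from the question against the shared copy. The `Item` struct, the `kItems` constant, the kernel name and the host helper are all made up for illustration (the thread only shows a `type` field and a loop bound of 10), and it assumes each block has at least `kItems` threads.

```cuda
#include <cuda_runtime.h>
#include <vector>

struct Item { int type; };   // hypothetical layout; only 'type' appears in the thread

constexpr int kItems = 10;   // assumed size, taken from the loop bound in the question

__global__ void flagKernel(const Item* data, int* out)
{
    __shared__ Item cache[kItems];

    // The first kItems threads of the block each copy one element
    // from device memory into shared memory (needs blockDim.x >= kItems).
    if (threadIdx.x < kItems)
        cache[threadIdx.x] = data[threadIdx.x];
    __syncthreads();  // make the copies visible to every thread in the block

    bool isTrue = false;
    int i = 0;
    do {
        if (cache[i].type == 2)   // reads shared memory, not device memory
            isTrue = true;
        i++;
    } while (i < kItems);

    // Write the flag out so the loop has an observable effect.
    out[blockIdx.x * blockDim.x + threadIdx.x] = isTrue ? 1 : 0;
}

// Host helper (hypothetical): runs the kernel and counts the threads whose flag was set.
// 'items' must hold at least kItems elements.
int runFlagKernel(const std::vector<Item>& items, int blocks, int threads)
{
    const int n = blocks * threads;
    Item* dData = nullptr;
    int* dOut = nullptr;
    cudaMalloc(&dData, kItems * sizeof(Item));
    cudaMalloc(&dOut, n * sizeof(int));
    cudaMemcpy(dData, items.data(), kItems * sizeof(Item), cudaMemcpyHostToDevice);

    flagKernel<<<blocks, threads>>>(dData, dOut);

    std::vector<int> out(n);
    cudaMemcpy(out.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dOut);

    int count = 0;
    for (int v : out) count += v;
    return count;
}
```

Each element is read from device memory once per block instead of once per thread. Whether that helps in practice depends on how much shared memory the rest of the kernel already uses.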


Right, I understand that device memory is very slow, but my issue was that adding the ‘if(isTrue)’ caused the program to slow down.

I think that I have figured out the problem, and it looks like I am a victim of the optimizer. When I check the value of ‘isTrue’, the compiler knows that the if statement that sets it is needed and leaves it as is. But if I remove every later reference to ‘isTrue’, or overwrite its value before using it, then the compiler is smart enough to optimize out (totally remove) the earlier if statement, and with it the read from device memory, because it no longer affects any data.
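
To make the optimizer effect concrete, here is a small sketch with two kernels that differ only in whether the flag is used afterwards. The `Item` struct, the kernel names and the host helper are hypothetical. In `flagUnused` nothing depends on ‘isTrue’, so the compiler is free to drop the loop and its device-memory reads. In `flagUsed` the result is stored, so the reads have to stay.

```cuda
#include <cuda_runtime.h>
#include <vector>

struct Item { int type; };   // hypothetical layout; only 'type' appears in the thread

// The flag is computed but never read: the loop is dead code and
// the compiler may remove it together with the loads from 'data'.
__global__ void flagUnused(const Item* data, int n, int* out)
{
    bool isTrue = false;
    for (int i = 0; i < n; ++i)
        if (data[i].type == 2)
            isTrue = true;
    out[threadIdx.x] = 1;               // does not depend on isTrue
}

// The flag is stored to memory: the loop and its loads must be kept.
__global__ void flagUsed(const Item* data, int n, int* out)
{
    bool isTrue = false;
    for (int i = 0; i < n; ++i)
        if (data[i].type == 2)
            isTrue = true;
    out[threadIdx.x] = isTrue ? 1 : 0;  // depends on isTrue
}

using Kernel = void (*)(const Item*, int, int*);

// Host helper (hypothetical): launches one block of 32 threads and
// returns the value written by thread 0.
int launchAndReadFirst(Kernel k, const std::vector<Item>& items)
{
    const int n = static_cast<int>(items.size());
    Item* dData = nullptr;
    int* dOut = nullptr;
    cudaMalloc(&dData, n * sizeof(Item));
    cudaMalloc(&dOut, 32 * sizeof(int));
    cudaMemcpy(dData, items.data(), n * sizeof(Item), cudaMemcpyHostToDevice);

    k<<<1, 32>>>(dData, n, dOut);

    int first = -1;
    cudaMemcpy(&first, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dOut);
    return first;
}
```

Timing the two kernels against each other therefore compares "loop plus memory reads" with "possibly nothing at all", which would explain the order-of-magnitude gap. To check what the compiler actually kept, one option is to look at the generated code, for example with `nvcc -ptx` or `cuobjdump -sass`.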

I am currently trying to put as much data into texture memory as possible because I am already using shared memory for something else.
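
For anyone reading this later, a lighter-weight alternative to binding textures is the read-only data cache. This is a swapped-in technique, not the texture setup described above: on devices of compute capability 3.5 and later, `__ldg()` loads through the read-only cache, which on many GPUs is the same hardware path that texture fetches use. The sketch below assumes such a device. The `Item` struct, the kernel name and the host helper are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

struct Item { int type; };   // hypothetical layout; only 'type' appears in the thread

// 'data' is never written by the kernel, so it is marked const __restrict__
// and read with __ldg(), which loads through the read-only data cache.
__global__ void flagViaReadOnlyCache(const Item* __restrict__ data, int n,
                                     int* __restrict__ out)
{
    bool isTrue = false;
    for (int i = 0; i < n; ++i)
        if (__ldg(&data[i].type) == 2)
            isTrue = true;
    out[blockIdx.x * blockDim.x + threadIdx.x] = isTrue ? 1 : 0;
}

// Host helper (hypothetical): launches one block of 32 threads and
// returns the flag written by thread 0.
int firstThreadFlag(const std::vector<Item>& items)
{
    const int n = static_cast<int>(items.size());
    Item* dData = nullptr;
    int* dOut = nullptr;
    cudaMalloc(&dData, n * sizeof(Item));
    cudaMalloc(&dOut, 32 * sizeof(int));
    cudaMemcpy(dData, items.data(), n * sizeof(Item), cudaMemcpyHostToDevice);

    flagViaReadOnlyCache<<<1, 32>>>(dData, n, dOut);

    int first = -1;
    cudaMemcpy(&first, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dOut);
    return first;
}
```

This leaves shared memory free for whatever else the kernel uses it for. Whether it is actually faster than plain global loads would need to be measured on the target GPU.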