By “lazy evaluation”, I mean the following: when you have a condition like this:
if(a && b)
Then, if a is false, b won’t be evaluated. This is the standard behaviour in C, and it can be a little tricky when the boolean expressions have side effects.
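To make the side-effect issue concrete, here is a minimal sketch in plain C++. The counter and function names are hypothetical, introduced only so the evaluation of the right-hand operand becomes observable:

```cpp
// Hypothetical counter, used only to make the side effect visible.
static int calls = 0;

// Hypothetical right-hand operand: it bumps the counter every time it runs.
static bool b_with_side_effect() {
    ++calls;
    return true;
}

// Returns how many times the right-hand operand was evaluated
// for a given value of the left-hand operand `a`.
int count_rhs_evaluations(bool a) {
    calls = 0;
    if (a && b_with_side_effect()) {
        // Body intentionally empty; only the evaluation count matters.
    }
    return calls;
}
```

Calling `count_rhs_evaluations(false)` should report zero evaluations of the right-hand operand, and `count_rhs_evaluations(true)` exactly one.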
So, I developed a C++-like compiler last year that generates assembly code, and the code generation for boolean expressions with “lazy evaluation” had to involve conditional jumps. That made me wonder how nvcc behaves with CUDA kernels, because in my mind conditional jumps are synonymous with “divergent branches”!
I suppose I’m totally wrong, but I can’t find anything on the web! (Perhaps nvcc checks expressions for side effects before generating the PTX output?)
Ah, OK. I’ve normally heard this called “short-circuit evaluation” in this context, but I see that it is a specific case of lazy evaluation.
I just did a few tests with nvcc and some dummy kernels, and it does appear that the compiler does some kind of side-effect analysis to decide how to generate the PTX so that it satisfies the short-circuit rules of C/C++. When the second expression had no side effects (it was a simple equality test in my case), it was always evaluated. However, when I replaced the second expression with a device function call that has a side effect, the generated PTX contained a branch after the evaluation of the first expression.
So yes, if the second expression has side effects, you have the potential for branch divergence within the evaluation of the boolean condition.
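As a rough illustration of why the compiler is allowed to treat the two cases differently, here is a sketch in plain C++ rather than the actual test kernels. All function names are hypothetical. Whether nvcc really emits a branch for a given kernel depends on the compiler version and optimization settings, so it is worth checking the generated PTX (for example with `nvcc -ptx`).

```cpp
// Short-circuit form, as written in the source.
bool cond_short_circuit(int x, int y) {
    return x > 0 && y == 3;
}

// Branch-free form: both comparisons are always computed, then combined.
// With side-effect-free operands the result is identical, so a compiler
// may pick this form without breaking the short-circuit rule.
bool cond_both_evaluated(int x, int y) {
    return (x > 0) & (y == 3);
}

// Hypothetical operand with a side effect: it writes through a pointer.
bool bump_and_test(int *counter) {
    ++(*counter);
    return *counter > 1;
}

// Here the write must not happen when x <= 0, so the evaluation of the
// second operand has to be guarded, for example by a conditional jump.
int side_effect_rhs(int x, int *counter) {
    int out = 0;
    if (x > 0 && bump_and_test(counter)) {
        out = 1;
    }
    return out;
}
```

The first two functions agree for every input, which is the situation where always evaluating the second expression is harmless. In `side_effect_rhs`, evaluating both operands unconditionally would change the value left in `*counter`, so that rewrite is not available.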