Divergent branches

Hi all,

I was wondering whether lazy evaluation (inherited from C) can produce divergent branches. (I’m thinking as an assembly programmer.)

Thank you! (No CUDA card to test here :'( )

What do you mean by “lazy evaluation” in C?

Hello, and thank you for your time.

By “lazy evaluation” I mean this: when you have a condition like the following,

if(a && b)

If a is false, b won’t be evaluated. This is standard behaviour in C, and it can be a little tricky when the boolean expressions have side effects.
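To make the rule concrete, here is a minimal plain-C sketch (host code, not a CUDA kernel). The names `calls`, `touch` and `count_rhs_evaluations` are made up for illustration; the counter only exists so the side effect can be observed.

```c
#include <stdbool.h>

/* Hypothetical counter, used only to make the side effect observable. */
static int calls = 0;

/* Hypothetical helper with a side effect: it bumps the counter
   and then returns its argument unchanged. */
static bool touch(bool value)
{
    calls++;
    return value;
}

/* Evaluates `a && touch(b)` once and reports how many times
   the right-hand operand was actually evaluated. */
static int count_rhs_evaluations(bool a, bool b)
{
    calls = 0;
    bool taken = a && touch(b);  /* touch() runs only if a is true */
    (void)taken;                 /* result itself is not needed here */
    return calls;
}
```

Calling `count_rhs_evaluations(false, true)` should give 0 and `count_rhs_evaluations(true, true)` should give 1, which is exactly the skipped evaluation I am asking about.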

I developed a C++-like compiler last year that generates assembly code, and the code generation for boolean expressions with “lazy evaluation” had to involve conditional jumps. So I wondered how nvcc behaves with CUDA kernels, because in my mind conditional jumps are synonymous with “divergent branches”!

I suppose I’m totally wrong, but I can’t find anything on the web! (Perhaps nvcc checks expressions for side effects before generating the PTX output?)

Ah, ok, I’ve normally heard this called “short circuit evaluation” in this context, but I see that it is a specific case of lazy evaluation.

I just did a few tests with nvcc and some dummy kernels, and it does appear that the compiler performs some kind of side-effect analysis to decide how to generate the PTX while still satisfying the short-circuit rules of C/C++. When the second expression had no side effects (a simple equality test in my case), it was always evaluated. However, when I replaced the second expression with a device function call that has a side effect, the generated PTX contained a branch after the evaluation of the first expression.

So yes, if the second expression has side effects, you have the potential for branch divergence within the evaluation of the boolean condition.
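For illustration only, here is a plain-C sketch of the two code shapes described above. This is not nvcc output, and the function names (`and_eager`, `store_and_test`, `and_guarded`) are hypothetical; it just shows why evaluating both operands is safe without side effects and why a guard is needed with them.

```c
#include <stdbool.h>

/* Branch-free shape: both operands are always evaluated and then
   combined. This matches `a && b` only because evaluating the
   operands has no side effects. */
static bool and_eager(int a, int b)
{
    return (a != 0) & (b != 0);
}

/* Hypothetical right-hand operand with a side effect: it writes to *slot. */
static bool store_and_test(int *slot, int v)
{
    *slot = v;
    return v != 0;
}

/* Guarded shape: the write must not happen when a is false, so the
   right-hand operand sits behind a conditional jump. */
static bool and_guarded(bool a, int *slot, int v)
{
    if (!a) {
        return false;  /* skip the side effect entirely */
    }
    return store_and_test(slot, v);
}
```

In the guarded shape, threads of the same warp that disagree on `a` would take different paths at that jump, which is where the divergence can come from.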

Thanks*1000, this is very helpful !
