Branching optimization: manual or automatic?

Hello,

I’ve just finished a program in CUDA.
However, in some parts of the code I feel that I have too much branching for very simple tasks, like:

i) Find the maximum of x and y.
I’m using if (x < y).

ii) Identify the opposite and common vertices of two adjacent triangles.
Example: triangle1 = {0, 1, 2}, triangle2 = {3, 2, 1}
common = 1, 2
opVertex = 0, 3

As you can imagine, the brute-force solution for task ii) involves many if (x == y) tests just to get that little piece of information.

In general, for the first version I took the if-else approach, but now it is time for optimization.

My question is the following:
Does the compiler internally optimize these instructions to avoid branching?

If not, I will look for better methods. Any pointers are welcome.

Best regards,
Cristobal
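
For task ii) above, one possible way to reduce the number of separate if (x == y) tests is to combine the comparisons into 0/1 membership flags and pick the result with the ternary operator. This is only a sketch under stated assumptions: the function name `opposite_vertex` is hypothetical, the two triangles are assumed to share exactly two vertices, and the code is plain C that should also work as CUDA device code if the function is marked `__device__`. Whether the compiler emits it without branches depends on the compiler version and target.

```c
/* Hypothetical helper: returns the vertex of triangle a that does not
 * appear in triangle b. Assumes a and b share exactly two vertices. */
static int opposite_vertex(const int a[3], const int b[3])
{
    /* Each flag is 1 if the vertex of a also occurs in b, else 0.
     * Bitwise | is used instead of || so all comparisons are evaluated
     * without short-circuiting. */
    int in0 = (a[0] == b[0]) | (a[0] == b[1]) | (a[0] == b[2]);
    int in1 = (a[1] == b[0]) | (a[1] == b[1]) | (a[1] == b[2]);

    /* If neither a[0] nor a[1] is missing from b, it must be a[2]. */
    return !in0 ? a[0] : (!in1 ? a[1] : a[2]);
}
```

With triangle1 = {0, 1, 2} and triangle2 = {3, 2, 1}, calling it as opposite_vertex(triangle1, triangle2) gives 0, and swapping the arguments gives 3, matching the opVertex values in the example.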

Hello.

Sometimes it’s possible to convert something like this:

if (condition)
    a = expression1;
else
    a = expression2;

into this:

a = expression1*condition + expression2*(!condition);

Why would you ever want to do that? C has an operator to explicitly do this:

a = condition ? expression1 : expression2;

It’s worth using: for reasons that are unclear to me, the compiler sometimes produces much better code with this than with the original if clause.

Also, PTX has special instructions for minimum and maximum. With the ?: operator (but not with the if clause), the compiler is able to automatically optimize expressions into these instructions. Still, it’s probably a good idea to use min() and max() explicitly.
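
As a rough illustration of the maximum/minimum case, here is a minimal sketch of the ternary form in plain C. The names `max_ternary` and `min_ternary` are hypothetical; in CUDA device code you would more likely call the built-in min() and max() directly, as suggested above. Whether these compile down to the PTX min/max instructions is up to the compiler, so it is worth checking the generated PTX (for example with nvcc -ptx) rather than assuming it.

```c
/* Hypothetical helpers showing the ?: form of max and min.
 * In CUDA these could be marked __device__, or replaced by the
 * built-in max() / min() functions. */
static int max_ternary(int x, int y)
{
    return (x < y) ? y : x;   /* same test as the original if (x < y) */
}

static int min_ternary(int x, int y)
{
    return (x < y) ? x : y;
}
```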

If arithmetic coding is possible then I think it’d be faster because there is no branching.

PTX has support for making this kind of operation fast, right? Too bad that doesn’t work so well for branching over large chunks of code :)

Sorry for the late response.

I have been experimenting with what you guys recommended, and changed almost all the if-elses to generic code using the technique you suggested:

int a = (condition * value1) + (!condition * value2);

To my surprise, I’m getting slower performance, about 10% slower than before. I tried to find out what was really happening, and in the Visual Profiler I found that I’m getting many more divergent branches than before…

I thought that branching was associated with if-elses, but in the end I have almost three times as many branches as before.

Does anyone know why this happens? Thanks in advance.

Cristobal

That is because you just got bad advice. There really is no advantage in writing

int a = (condition * value1) + (!condition * value2);

and it might even create two conditional branches (one for condition and one for !condition) where previously there was only one, if the compiler isn’t optimizing really well.
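
To make the comparison concrete, here is a small sketch of the two forms side by side in plain C (the function names `select_arith` and `select_ternary` are hypothetical). Both return the same value, but the arithmetic form needs the condition normalized to 0 or 1, evaluates the condition and its negation, and adds two multiplies and an add, while the ternary form states the selection directly. What the compiler actually generates for either one depends on the compiler version and target.

```c
/* Arithmetic selection: only correct if the condition is exactly 0 or 1,
 * so it is normalized first. */
static int select_arith(int condition, int value1, int value2)
{
    int c = (condition != 0);             /* force to 0 or 1 */
    return (c * value1) + (!c * value2);  /* !c is 1 when c is 0 */
}

/* Ternary selection: any nonzero condition picks value1. */
static int select_ternary(int condition, int value1, int value2)
{
    return condition ? value1 : value2;
}
```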

Check my earlier advice and see how that scores.

Thanks dude,

the ternary operator ended up working as well or sometimes better, so cool :D

Cristobal