Ternary operator in device code

Hi,

I need lots of if conditions in one of my device functions. Is it any faster to use the ternary operator ?: instead of nested if conditions?

-Oj

As far as I know, the compiler will use predication for small blocks or conditions; however, that is not very transparent to the user. I am not sure, but I think the ternary operator might be a hint to use predication.

All I can say for sure is: test your kernel both ways, and see which is faster.
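A minimal sketch of what "both ways" could look like, for anyone who wants to try it. The clamp logic and the names clamp_if, clamp_ternary and clamp_kernel are made up for illustration and are not from the original question. The macro guard at the top is only there so the same file also builds with a plain C++ compiler.

```cpp
// Let the file compile as ordinary C++ too (hypothetical convenience guard).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Version 1: nested if statements.
__host__ __device__ inline float clamp_if(float v, float lo, float hi) {
    if (v < lo) {
        return lo;
    } else {
        if (v > hi) {
            return hi;
        }
        return v;
    }
}

// Version 2: the same logic written with nested ternary operators.
__host__ __device__ inline float clamp_ternary(float v, float lo, float hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

#ifdef __CUDACC__
// One kernel per variant, selected at compile time so the choice itself
// adds no runtime branch.
template <bool UseTernary>
__global__ void clamp_kernel(const float* in, float* out, int n,
                             float lo, float hi) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = UseTernary ? clamp_ternary(in[i], lo, hi)
                            : clamp_if(in[i], lo, hi);
    }
}
#endif
```

Launching clamp_kernel<false> and clamp_kernel<true> on the same data and timing each with cudaEventRecord / cudaEventElapsedTime gives a direct comparison. For code this small the compiler may well emit identical instructions for both; looking at the output of nvcc -ptx or cuobjdump -sass would show whether it did.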

Hi,

this is not directly an answer to your question, but it might fit the topic:

Bitwise, non-short-circuiting logical operators, which should cause less branching, are sometimes faster than short-circuiting operators.

if (x < y & conditionPassed(shared_array[threadIdx.x]))

may be faster than

if (x < y && conditionPassed(shared_array[threadIdx.x]))

The reasoning is that if a warp has to go through a calculation anyway because some of its threads need it, it might be worth simply having all the threads perform the calculation, which gets rid of some branching.
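Here is a small sketch of the two forms side by side. The body of conditionPassed is a made-up placeholder (the original is not shown), and pick_short_circuit, pick_bitwise and flag_kernel are hypothetical names. Two things to keep in mind: < binds tighter than &, so x < y & f() already means (x < y) & f(); and the & form always evaluates the right-hand side, so it is only a safe swap when that side has no side effects, returns a bool, and is valid to evaluate for every thread.

```cpp
// Let the file compile as ordinary C++ too (hypothetical convenience guard).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Placeholder predicate; the real conditionPassed is not shown in the post.
__host__ __device__ inline bool conditionPassed(int v) {
    return (v & 1) == 0;  // assumption: "passes" when v is even
}

// Short-circuiting: conditionPassed runs only when x < y is true.
__host__ __device__ inline int pick_short_circuit(int x, int y, int v) {
    if (x < y && conditionPassed(v)) {
        return 1;
    }
    return 0;
}

// Non-short-circuiting: both operands are always evaluated, then ANDed.
// The parentheses are redundant but make the precedence explicit.
__host__ __device__ inline int pick_bitwise(int x, int y, int v) {
    if ((x < y) & conditionPassed(v)) {
        return 1;
    }
    return 0;
}

#ifdef __CUDACC__
// Each thread evaluates the bitwise form on its own element.
__global__ void flag_kernel(const int* xs, const int* ys, const int* vs,
                            int* flags, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        flags[i] = pick_bitwise(xs[i], ys[i], vs[i]);
    }
}
#endif
```

With a side-effect-free predicate like this one, both functions return the same value for every input, so any difference would only show up in timing.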

I tested this in my code, and I think I saw a slight speed increase. It was a long sequence of potentially branching instructions, though; I have no idea if it’s worth it for less …

Yes, I think that would be the case provided the overhead of thread divergence is more than the extra computation on all threads.

-Oj

If you try it and see any meaningful difference, I’d be pretty happy to know about it. If only to feel less like the crazy guy at the office that wants to go faster by doing more :)