Branching in kernels: What if all threads take the same route?

Dear all,

As stated in the programming manual, a branch in a kernel incurs a penalty. The manual says that both routes might be executed, with only the results of the proper path being kept, so the execution time would be the sum of both paths.

But I wonder what happens if all threads of a warp (or block) need to take the same route when branching.

Assume something like:

__global__ void myKernel()
{
  ...     // do some stuff

  ...     // evaluate n

  if (n > 0)
  {
    // compute function A
  }
  else
  {
    // compute function B
  }

  ...     // do some stuff
}

Suppose that, for one warp, every thread evaluates n to 1, so all threads take the same path. Are both paths, function A and function B, still executed?


That situation is also addressed in the programming manual. Both execution paths are executed only if the compiler chooses predicated execution (in which case there is no branch at all; the two paths are executed with complementary predicates). When a real branch is emitted, there is no divergence penalty if every thread in the warp evaluates the condition the same way, because the warp does not diverge. Take a look at the relevant section in the programming manual. It is clearly explained :)
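If you want to check whether your own data makes a warp diverge on a condition like (n > 0), here is a minimal sketch. It is not from the manual. The names countDivergentWarps and divergentWarpCount are made up for illustration, and it assumes CUDA 9 or later (for __ballot_sync), a 1D launch with exactly 32 threads per block, and an element count that is a multiple of 32. Error checking is left out to keep it short.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical kernel: counts the warps whose threads disagree on (n > 0).
// Assumes every launched thread has a valid element in 'n'.
__global__ void countDivergentWarps(const int *n, int *divergentWarps)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    bool takeA = n[tid] > 0;   // same test as the if (n > 0) in the question

    // Bit i of 'votes' is set if lane i of this warp has takeA == true.
    unsigned votes = __ballot_sync(0xffffffffu, takeA);

    // The warp is uniform if no lane or every lane would take path A.
    bool uniform = (votes == 0u) || (votes == 0xffffffffu);

    // Lane 0 of each warp reports a divergent warp.
    if ((threadIdx.x % warpSize) == 0 && !uniform)
        atomicAdd(divergentWarps, 1);
}

// Hypothetical host helper: returns how many warps would diverge.
// No CUDA error checking is done here (an omission made for brevity).
int divergentWarpCount(const int *hostN, int count)
{
    assert(count > 0 && count % 32 == 0);   // assumption: whole warps only

    int *devN = nullptr;
    int *devCount = nullptr;
    int result = 0;

    cudaMalloc(&devN, count * sizeof(int));
    cudaMalloc(&devCount, sizeof(int));
    cudaMemcpy(devN, hostN, count * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(devCount, 0, sizeof(int));

    // One warp (32 threads) per block.
    countDivergentWarps<<<count / 32, 32>>>(devN, devCount);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, devCount, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(devN);
    cudaFree(devCount);
    return result;
}
```

Calling divergentWarpCount on an array where every n is 1 should report no divergent warps, which is the case asked about above. Note that this only tells you whether the threads of a warp agree on the condition; whether the compiler emits a real branch or predicated code for the if/else is a separate matter that you would have to check in the generated code.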

I must have missed this. I’ll re-read that section then.