Branching in kernels: What if all kernels take the same route?

lord_jake · June 29, 2007, 3:14pm

Dear all,

As stated in the programming manual, a branch in a kernel incurs a penalty. The manual says that both paths may be executed, with only the results of the proper path being kept, so the execution time would be the sum of both paths.

But I wonder what happens if all threads of a warp (or block) need to take the same path when branching.

Assume something like:

__global__ void myKernel()
{
  ...     // do some stuff

  ...     // evaluate n
  if (n > 0)
  {
    // compute function A
    ...
  }
  else
  {
    // compute function B
    ...
  }
  ...     // do some stuff
}

Suppose that, for one warp, every thread evaluates n to 1, so all threads take the same path. Does that still mean that both paths, function A and function B, are executed?

Jake

javier1 · June 29, 2007, 3:56pm

The programming manual addresses that situation as well. Both paths are executed only if the compiler chooses predicated execution (in which case there is no branch at all; the two paths are executed with complementary predicates). With a real branch, there is no penalty if every thread in the warp evaluates the condition the same way (that is, there is no divergence within the warp). Take a look at Section 5.1.1.2 of the programming manual. It is clearly explained :)
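To make the distinction concrete, here is a minimal sketch of the two cases. It is not taken from the manual: functionA, functionB, both kernel names, and countMismatches are hypothetical names made up for illustration, and the host-side check assumes a warp size of 32. In uniformBranch the condition depends only on the warp index, so each warp follows a single path. In divergentBranch neighbouring threads disagree, so each warp has to go through both paths.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical stand-ins for "function A" and "function B" from the question.
__device__ float functionA(float x) { return 2.0f * x + 1.0f; }
__device__ float functionB(float x) { return x * x - 3.0f; }

// The condition depends only on the warp index, so all threads of a warp
// agree on it: each warp executes exactly one of the two paths.
__global__ void uniformBranch(const float* in, float* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if ((threadIdx.x / warpSize) % 2 == 0)
        out[i] = functionA(in[i]);
    else
        out[i] = functionB(in[i]);
}

// The condition alternates between neighbouring threads, so every warp
// diverges and has to execute both paths.
__global__ void divergentBranch(const float* in, float* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (threadIdx.x % 2 == 0)
        out[i] = functionA(in[i]);
    else
        out[i] = functionB(in[i]);
}

// Runs one of the kernels and returns how many outputs differ from a CPU
// reference, or -1 if a CUDA call fails (for example, no usable GPU).
int countMismatches(bool divergent)
{
    const int n = 1 << 16;
    const int threads = 256;
    assert(n % threads == 0);  // grid covers the array exactly; kernels do no bounds check

    float* in = nullptr;
    float* out = nullptr;
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(in);
        return -1;
    }
    // Small integer inputs keep the float arithmetic exact on CPU and GPU.
    for (int i = 0; i < n; ++i) in[i] = float(i % 16);

    if (divergent)
        divergentBranch<<<n / threads, threads>>>(in, out);
    else
        uniformBranch<<<n / threads, threads>>>(in, out);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(in);
        cudaFree(out);
        return -1;
    }

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        int t = i % threads;  // threadIdx.x of the thread that wrote out[i]
        // Assumption: warp size is 32 (mirrors warpSize on the device).
        bool takeA = divergent ? (t % 2 == 0) : ((t / 32) % 2 == 0);
        float x = in[i];
        float expected = takeA ? 2.0f * x + 1.0f : x * x - 3.0f;
        if (out[i] != expected) ++bad;
    }
    cudaFree(in);
    cudaFree(out);
    return bad;
}
```

Both kernels produce correct results, so countMismatches should return 0 either way. The difference is only in how much work each warp does: with the warp-uniform condition a warp runs one path, while with the per-thread condition it runs both.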

lord_jake · June 29, 2007, 5:21pm

I must have missed this. I’ll re-read that section then.

Thanks!
