Thread divergence due to IF

Two questions:

  1. Will the threads of the following kernel diverge because of the “if” statement?

__global__ void kernel(int param)
{
    for (int i = 0; i < param; i++) {
        if (i < (param >> 1)) {
            // do something
        } else {
            // do something else
        }
    }
}

  2. My guess is that there will be no divergence within a half-warp, but there will be divergence between half-warps. What would the impact on performance be?

I am seeing a severe performance penalty from such an “if” and I don’t understand why. After all, my understanding is that only half-warps are executed in lockstep, not full warps, so why is it so painful?

If that is really your kernel, there will be no divergence. param is the same for all threads in the block, and the loop counter i takes the same values in every thread, so on each iteration all threads in the block follow the same branch of the if statement, and they all run the same number of iterations of the for loop.
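To make the distinction concrete, here is a minimal sketch (not the original poster's code) that contrasts a branch on values shared by all threads with a branch on the thread index. The kernel names, the `acc` stand-ins for the two branch bodies, and the host helper `runBranchDemo` are all hypothetical, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Condition depends only on i and param, which hold the same values in
// every thread, so all threads of a warp take the same path each iteration.
__global__ void uniformBranch(int *out, int param)
{
    int acc = 0;
    for (int i = 0; i < param; i++) {
        if (i < (param >> 1)) {
            acc += 1;   // stand-in for "do something"
        } else {
            acc += 2;   // stand-in for "do something else"
        }
    }
    out[blockIdx.x * blockDim.x + threadIdx.x] = acc;
}

// Hypothetical variant: the condition depends on threadIdx.x, so even and
// odd threads of the same warp want different paths. This is the pattern
// that can cause divergence.
__global__ void perThreadBranch(int *out, int param)
{
    int acc = 0;
    for (int i = 0; i < param; i++) {
        if ((threadIdx.x & 1) == 0) {
            acc += 1;
        } else {
            acc += 2;
        }
    }
    out[blockIdx.x * blockDim.x + threadIdx.x] = acc;
}

// Hypothetical host helper: launches one block and copies the results back.
// Error checking is left out to keep the sketch short.
std::vector<int> runBranchDemo(bool uniform, int param, int threads)
{
    int *d_out = nullptr;
    cudaMalloc(&d_out, threads * sizeof(int));
    if (uniform) {
        uniformBranch<<<1, threads>>>(d_out, param);
    } else {
        perThreadBranch<<<1, threads>>>(d_out, param);
    }
    std::vector<int> h_out(threads);
    cudaMemcpy(h_out.data(), d_out, threads * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

With param = 8, every thread of uniformBranch ends with 12, while perThreadBranch gives 8 for even threads and 16 for odd threads. For bodies this small the compiler may well use predication rather than a real branch, so treat the second kernel as an illustration of the pattern rather than a benchmark.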

Full warps suffer from divergence as well, as far as I know. Half-warps are only used for memory coalescing.

Correct. The performance impact depends on your code. Think of it this way: both branches of the if statement have to be executed by the threads of a divergent warp (except that parameters are not fetched and results are not written by the threads that don’t satisfy the condition for a particular branch).

Paulius
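As a rough illustration of the point about divergent warps, the sketch below splits the work 50/50 in two different ways. It assumes a one-dimensional block, where a warp is made of consecutive threadIdx.x values. The functions pathA and pathB, both kernels, and the host helper runSplitDemo are hypothetical names introduced only for this example.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical stand-ins for the two branch bodies.
__device__ float pathA(float x) { return x * 2.0f; }
__device__ float pathB(float x) { return x + 1.0f; }

// Neighbouring threads disagree, so every warp holds threads for both
// paths. Such a warp has to go through pathA and then pathB, with the
// threads that did not take a path sitting idle while it runs.
__global__ void splitByLane(float *data)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (threadIdx.x % 2 == 0) {
        data[tid] = pathA(data[tid]);
    } else {
        data[tid] = pathB(data[tid]);
    }
}

// Same 50/50 split overall, but all threads of a given warp agree, so
// each warp goes through only one of the two paths.
__global__ void splitByWarp(float *data)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int warp = threadIdx.x / warpSize;
    if (warp % 2 == 0) {
        data[tid] = pathA(data[tid]);
    } else {
        data[tid] = pathB(data[tid]);
    }
}

// Hypothetical host helper: fills data[i] = i, runs one block of the
// chosen kernel, and returns the results. Error checking is omitted.
std::vector<float> runSplitDemo(bool byWarp, int threads)
{
    std::vector<float> h(threads);
    for (int i = 0; i < threads; i++) {
        h[i] = static_cast<float>(i);
    }
    float *d = nullptr;
    cudaMalloc(&d, threads * sizeof(float));
    cudaMemcpy(d, h.data(), threads * sizeof(float), cudaMemcpyHostToDevice);
    if (byWarp) {
        splitByWarp<<<1, threads>>>(d);
    } else {
        splitByLane<<<1, threads>>>(d);
    }
    cudaMemcpy(h.data(), d, threads * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;
}
```

Both kernels do the same total amount of useful work. If the branch bodies are expensive, timing the two against each other (for example with cudaEvent timers) is one way to estimate how much a divergent “if” costs in a given kernel. With bodies as tiny as these the difference is likely to be negligible.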