Is it better to have multiple kernels without if statements, or one kernel with many if statements? What would the performance impact be?
As a rule, it is better to get rid of branches. All the threads in a warp have to wait for the slowest thread in that warp, so the best situation is when all the threads have the same execution time. Branches in the code usually make the threads diverge: when threads of the same warp take different paths, the warp executes each path in turn while the other threads sit idle, so the warp as a whole takes longer.
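To make the divergence point concrete, here is a minimal sketch. The kernel names and the arithmetic are made up for illustration, and it assumes a 1D block whose size is a multiple of 32. In the first kernel, even and odd threads of the same warp take different paths, so each warp has to run both. In the second, the condition is the same for every thread of a warp, so no warp diverges even though the kernel still contains an if. Note that the two kernels do not produce the same output; they only illustrate the two branching patterns.

```cuda
#include <cuda_runtime.h>

// Hypothetical example kernels, not taken from the posts above.

// Neighbouring threads take different paths, so every warp diverges
// and executes both branches one after the other.
__global__ void divergent(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (threadIdx.x % 2 == 0)
        out[i] = sinf((float)i);
    else
        out[i] = cosf((float)i);
}

// The condition depends only on the warp index within the block, so all
// threads of a warp agree and each warp executes a single branch.
__global__ void warp_uniform(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if ((threadIdx.x / warpSize) % 2 == 0)
        out[i] = sinf((float)i);
    else
        out[i] = cosf((float)i);
}
```

Both kernels can be launched the same way, for example `divergent<<<(n + 255) / 256, 256>>>(d_out, n);`. For very short branches the compiler may use predicated instructions instead of a real branch, so the difference between the two patterns may be smaller than this sketch suggests.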
Thank you. That answers my question.
In theory, divergent warps are bad. In practice, the hardware handles them very well. Optimizing for divergent warps is almost the lowest on my priority list (the only thing lower is shared memory bank conflicts). I've seen cases where adding an if() to avoid some calculations, at the cost of divergent branches, increased performance significantly (10-15%). I've also seen cases where adding an if() to avoid some calculations reduced performance. So it really comes down to a case-by-case basis and you just have to try both ways!
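One way to "try both ways" is to time the two variants with CUDA events. The following is only a sketch under stated assumptions: the workload, the kernel names (`always_compute`, `skip_zeros`), the helpers (`time_kernel`, `run_comparison`) and the pattern of zero weights are all hypothetical, and the result will depend on your GPU and your data.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical workload: the expensive term only matters where w[i] != 0.

// Variant A: every thread does the full calculation, no data-dependent branch.
__global__ void always_compute(const float *w, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = w[i] * expf(sinf((float)i));
}

// Variant B: an if() skips the calculation for zero weights, at the cost
// of possible divergence inside a warp.
__global__ void skip_zeros(const float *w, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float wi = w[i];
        float r = 0.0f;
        if (wi != 0.0f)
            r = wi * expf(sinf((float)i));
        out[i] = r;
    }
}

typedef void (*kernel_t)(const float *, float *, int);

// Returns the average time of one launch in milliseconds.
float time_kernel(kernel_t k, const float *d_w, float *d_out, int n, int reps)
{
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    k<<<blocks, threads>>>(d_w, d_out, n);      // warm-up launch, not timed
    cudaEventRecord(start);
    for (int r = 0; r < reps; ++r)
        k<<<blocks, threads>>>(d_w, d_out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);                 // wait until all launches finish

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms / reps;
}

// Prints both timings and returns true if the two variants agree.
bool run_comparison(int n, int reps)
{
    std::vector<float> h_w(n), h_a(n), h_b(n);
    for (int i = 0; i < n; ++i)
        h_w[i] = (i % 4 == 0) ? 1.0f : 0.0f;    // assumed pattern: change to match your data

    size_t bytes = (size_t)n * sizeof(float);
    float *d_w, *d_a, *d_b;
    cudaMalloc(&d_w, bytes);
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_w, h_w.data(), bytes, cudaMemcpyHostToDevice);

    printf("always_compute: %.4f ms\n", time_kernel(always_compute, d_w, d_a, n, reps));
    printf("skip_zeros:     %.4f ms\n", time_kernel(skip_zeros, d_w, d_b, n, reps));

    cudaMemcpy(h_a.data(), d_a, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_b.data(), d_b, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_w);
    cudaFree(d_a);
    cudaFree(d_b);

    bool same = true;
    for (int i = 0; i < n; ++i)
        if (fabsf(h_a[i] - h_b[i]) > 1e-5f)
            same = false;
    return same;
}
```

Calling `run_comparison(1 << 20, 100)` from `main()` prints the average time per launch for each variant. The weight pattern used here makes every warp diverge in `skip_zeros`; with long runs of zeros most warps would skip the work entirely, which is exactly why the answer changes from case to case.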