Hi all
I have just one simple question: when I profile my application with the Visual Profiler and look at the “Profiler Counter Plot”, I see that most of the time spent executing my kernel is attributed to “instructions” (about 80%). In second place is “branches” (about 10%).
My question is: to optimize my application, do I need to focus on branches? Or, in other words, the “instructions” counter only says how much time is spent processing data, am I right?
It is not very clear to me why this value is reported as a percentage… How should these percentages be used to optimize the code? Is it bad to have a lot of branches? If they are not divergent, I thought they should not be a problem, right?