Nsight Compute profiling: divergent branch measurement standards

Hi, I am measuring thread divergence for performance analysis.
I have a question about how Avg. Divergent Branches is measured.

It is described as “incremented only when there are two or more active threads with different branch target (This counter metric represents the avg across all sub-unit instances)”

As I understand it, thread divergence is a phenomenon that occurs within a warp when its threads evaluate a branch condition differently.

So I guess that divergent branches are measured as follows:

  • divergent branch count divided by the number of warps

If my reasoning is correct, the value of Avg. Divergent Branches should be constant regardless of the workload size when I use the same algorithm.

Because thread divergence occurs within a warp, the value of Avg. Divergent Branches would mean the number of divergent branches in a single warp (is that right?)

I am wondering about the relationship between Avg. Divergent Branches and the occupancy of the kernel, and about how Avg. Divergent Branches is measured.

No, it is not constant with respect to the size of the workload. The metric behind Avg. Divergent Branches is smsp__sass_branch_targets_threads_divergent.avg; you can find this by hovering over the metric in the UI or by inspecting the SourceCounters.section file. The metrics structure documentation explains what the individual parts of the metric name mean. In particular, this metric is collected at the SM sub-partition (SMSP) level.
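As a rough illustration of that naming structure, the sketch below splits the metric name into a unit, a counter, and a rollup. The helper split_metric_name is hypothetical, and it assumes the simple unit__counter.rollup shape of this particular metric rather than the full naming grammar from the documentation.

```python
def split_metric_name(name):
    """Split 'unit__counter.rollup' into its parts (hypothetical helper).

    Assumes exactly the shape used by the metric discussed here; it is not
    a parser for every Nsight Compute metric name.
    """
    base, _, rollup = name.rpartition(".")   # text after the last '.' is the rollup
    unit, _, counter = base.partition("__")  # text before the first '__' is the unit
    return {"unit": unit, "counter": counter, "rollup": rollup}


parts = split_metric_name("smsp__sass_branch_targets_threads_divergent.avg")
print(parts)
```

Here the unit is smsp and the rollup is avg, which is why the value is an average over SMSP instances rather than a per-warp figure.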

If you use a simple test kernel, you can see how the metric value changes with the launch configuration:

$ ncu --metrics smsp__sass_branch_targets_threads_divergent.avg,smsp__sass_branch_targets_threads_divergent.sum ./a.out
...
  Diverge(float *) (32, 1, 1)x(32, 1, 1), Context 1, Stream 7, Device 0, CC 8.9
    Section: Command line profiler metrics
    ----------------------------------------------- ----------- ------------
    Metric Name                                     Metric Unit Metric Value
    ----------------------------------------------- ----------- ------------
    smsp__sass_branch_targets_threads_divergent.avg                     0.11
    smsp__sass_branch_targets_threads_divergent.sum                       32
    ----------------------------------------------- ----------- ------------


$ ncu --metrics smsp__sass_branch_targets_threads_divergent.avg,smsp__sass_branch_targets_threads_divergent.sum ./a.out
...
  Diverge(float *) (32, 1, 1)x(64, 1, 1), Context 1, Stream 7, Device 0, CC 8.9
    Section: Command line profiler metrics
    ----------------------------------------------- ----------- ------------
    Metric Name                                     Metric Unit Metric Value
    ----------------------------------------------- ----------- ------------
    smsp__sass_branch_targets_threads_divergent.avg                     0.21
    smsp__sass_branch_targets_threads_divergent.sum                       64
    ----------------------------------------------- ----------- ------------
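The arithmetic behind these two runs can be sketched as follows. This is only an illustration using the numbers printed above. It assumes that the launch is printed as (grid)x(block), that a warp has 32 threads, and that .avg is .sum divided by the number of SMSP instances on the device. The instance count itself is not shown in the output, so the sketch only searches for counts that are consistent with the rounded values.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs

# (blocks, threads per block, reported .sum, reported .avg) from the two runs above
runs = [
    (32, 32, 32, 0.11),
    (32, 64, 64, 0.21),
]


def consistent_instance_counts(total, rounded_avg, max_instances=1024):
    """Instance counts n for which total / n rounds to the reported average."""
    return {n for n in range(1, max_instances + 1)
            if round(total / n, 2) == rounded_avg}


per_warp = []
candidates = None
for blocks, threads, total, avg in runs:
    # Warps launched: blocks times warps per block (rounded up).
    warps = blocks * ((threads + WARP_SIZE - 1) // WARP_SIZE)
    per_warp.append(total / warps)
    counts = consistent_instance_counts(total, avg)
    candidates = counts if candidates is None else candidates & counts

print("divergent branches per warp:", per_warp)
print("SMSP instance counts consistent with both runs:",
      min(candidates), "to", max(candidates))
```

Under these assumptions the per-warp count is the same in both runs, while .avg doubles: the total doubles with the number of warps, but the divisor (the number of SMSP instances) is a property of the GPU and does not change with the launch configuration.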

Thank you for answering and correcting my misunderstanding, but I still have a question.

What is the SMSP (SM sub-partition) level?
Does it mean a warp, or something else?