Each range can be thought of as a node in a tree. CUPTI supports nested range profiling, where a (Push/Pop) range can be added inside another range.
To understand both parameters, consider the following case:
cuptiProfilerPushRange("RangeA0")             // Push RangeA0 at nesting level 1
    Launch kernel A
    cuptiProfilerPushRange("RangeB0")         // Push RangeB0 at nesting level 2
        Launch kernel B0
        cuptiProfilerPushRange("RangeC0")     // Push RangeC0 at nesting level 3
            Launch kernel C0
        cuptiProfilerPopRange()               // Pop RangeC0
    cuptiProfilerPopRange()                   // Pop RangeB0
    cuptiProfilerPushRange("RangeB1")         // Push RangeB1 at nesting level 2
        Launch kernel B1
    cuptiProfilerPopRange()                   // Pop RangeB1
cuptiProfilerPopRange()                       // Pop RangeA0
We can visualize the range structure as a tree in which every range is a node:
A0
|----------B0
|          |----------C0
|
|----------B1
^          ^          ^
(1)        (2)        (3)   <----- Nesting Level
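To make the nesting levels concrete, here is a minimal sketch that replays the push/pop sequence above and records the level of each range. It does not call CUPTI; `RangeNode` and `replayPushPop` are hypothetical names used only for illustration, and an empty string stands in for a pop.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a range: its name and its 1-based nesting level.
struct RangeNode {
    std::string name;
    int level;
};

// Replays a push/pop sequence. A non-empty string models a push of that
// range name; an empty string models a pop. This only models the call
// order; it is not CUPTI code.
std::vector<RangeNode> replayPushPop(const std::vector<std::string>& events) {
    std::vector<RangeNode> ranges;  // ranges in the order they were pushed
    int depth = 0;                  // number of ranges currently open
    for (const std::string& e : events) {
        if (!e.empty()) {
            ++depth;                        // a push opens one more level
            ranges.push_back({e, depth});
        } else {
            assert(depth > 0 && "pop without a matching push");
            --depth;                        // a pop closes the innermost range
        }
    }
    assert(depth == 0 && "unbalanced push/pop sequence");
    return ranges;
}
```

Feeding it the sequence `{"RangeA0", "RangeB0", "RangeC0", "", "", "RangeB1", "", ""}` yields four ranges at levels 1, 2, 3 and 2, which matches the tree above.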
While profiling, two parameters of the cuptiProfilerSetConfig API control which levels are collected: minNestingLevel and numNestingLevel (spelled numNestingLevels in the CUpti_Profiler_SetConfig_Params struct). CUPTI profiles numNestingLevel consecutive levels, starting at minNestingLevel.
Case 1: minNestingLevel = 1 and numNestingLevel = 3 ← CUPTI profiles all the ranges.
Case 2: minNestingLevel = 1 and numNestingLevel = 2 ← CUPTI profiles all the ranges in levels 1 and 2, so C0 is ignored.
Case 3: minNestingLevel = 2 and numNestingLevel = 2 ← CUPTI ignores all the ranges whose level is less than 2, so only B0, B1 and C0 are profiled and A0 is skipped.
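The three cases can be checked with a small model of the selection rule. This is a sketch of the behaviour described here, not CUPTI code: `Range`, `exampleRanges` and `profiledRanges` are hypothetical names, and the rule assumed is that a range is profiled when its level lies in [minNestingLevel, minNestingLevel + numNestingLevel - 1].

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a range: name plus 1-based nesting level.
struct Range {
    std::string name;
    int level;
};

// The example tree from this answer, listed in push order.
std::vector<Range> exampleRanges() {
    return {{"A0", 1}, {"B0", 2}, {"C0", 3}, {"B1", 2}};
}

// Returns the names of the ranges that fall inside the profiled levels.
// Assumption: numNestingLevel consecutive levels are profiled, starting
// at minNestingLevel.
std::vector<std::string> profiledRanges(const std::vector<Range>& ranges,
                                        int minNestingLevel,
                                        int numNestingLevel) {
    assert(minNestingLevel >= 1 && numNestingLevel >= 1);
    const int deepestLevel = minNestingLevel + numNestingLevel - 1;
    std::vector<std::string> names;
    for (const Range& r : ranges) {
        if (r.level >= minNestingLevel && r.level <= deepestLevel) {
            names.push_back(r.name);
        }
    }
    return names;
}
```

With this model, `profiledRanges(exampleRanges(), 2, 2)` returns B0, C0 and B1 (in push order), which is Case 3.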
CUpti_Profiler_CounterDataImageOptions is used to create the counter data image, which stores the range data. For Case 3, we can set maxNumRanges = 3, since we are profiling 3 ranges, but maxNumRangeTreeNodes must be 4. CUPTI checks whether the root node is available before profiling the child nodes. Based on the minNestingLevel value, CUPTI skips profiling A0, but to build the tree structure it still needs to know the maximum number of range tree nodes, and A0 counts as one of them.
If you consider Case 2, both maxNumRanges and maxNumRangeTreeNodes will be 3.
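The two values can be worked out for each case with a short sketch. It encodes one reading of the explanation above and is not taken from CUPTI itself: maxNumRanges counts the ranges inside the profiled levels, while maxNumRangeTreeNodes additionally counts the skipped ranges above them (such as A0 in Case 3). `Range`, `CounterDataSizes` and `estimateSizes` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a range: name plus 1-based nesting level.
struct Range {
    std::string name;
    int level;
};

// Hypothetical holder for the two counter data image sizes discussed here.
struct CounterDataSizes {
    int maxNumRanges;
    int maxNumRangeTreeNodes;
};

// Assumed rule (one reading of the text, not a CUPTI guarantee):
//  - ranges deeper than the last profiled level are not counted at all;
//  - every remaining range is a tree node;
//  - only those at or below minNestingLevel in the tree are profiled.
CounterDataSizes estimateSizes(const std::vector<Range>& ranges,
                               int minNestingLevel,
                               int numNestingLevel) {
    assert(minNestingLevel >= 1 && numNestingLevel >= 1);
    const int deepestLevel = minNestingLevel + numNestingLevel - 1;
    CounterDataSizes sizes{0, 0};
    for (const Range& r : ranges) {
        if (r.level > deepestLevel) {
            continue;                      // beyond the profiled levels
        }
        ++sizes.maxNumRangeTreeNodes;      // profiled range or skipped ancestor
        if (r.level >= minNestingLevel) {
            ++sizes.maxNumRanges;          // actually profiled
        }
    }
    return sizes;
}
```

For the example tree (A0 at level 1, B0 and B1 at level 2, C0 at level 3), this gives 3 and 4 for Case 3 and 3 and 3 for Case 2, in line with the numbers quoted above.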
Note that all of the above concepts are essential when doing nested range profiling. In Auto range profiling no nesting happens, so both parameters have the same value, equal to the number of ranges / kernel launches profiled (in Auto range mode each kernel is treated as a range).