I am running a DeepStream pipeline with nvinfer and a couple of pre- and post-processing steps, which include other plugins and probes.
I am measuring a whole cycle that includes the pre- and post-processing custom probes and nvinfer, and that spans multiple plugins.
How can I query the nsys profile in the Nsight Systems UI to see what percentage of time is spent in the other NVTX ranges nested inside the CYCLE NVTX range, for each cycle?
If I understand correctly, you want a report of the NVTX range hierarchy, and what percentage of time a sub-range or sub-tree of ranges took within a parent event?
Unfortunately, we do not provide that directly, but the Stats system does have a report that will display the NVTX push/pop range hierarchy. The nvtx_pushpop_trace report will show all NVTX ranges, arranged by enclosing times, PID, and TID. It also shows a range’s duration, along with the total duration of all sub-ranges. If you were to pull this report into a spreadsheet or other numeric environment, I suspect it wouldn’t be too difficult to calculate the data you’re looking for.
You can get a basic idea of the data available with the GUI display, or with the CLI command
nsys stats --report nvtx_pushpop_trace <nsys-rep or sqlite file>
To output the report to a CSV file rather than the console, just add the option --output . (including the trailing dot), which will write to the default filename.
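As a rough illustration of the spreadsheet-style calculation mentioned above, here is a minimal Python sketch. It assumes the CSV has columns named "Name", "Start (ns)", "End (ns)" and "TID"; please check those headers against your own file, since they may differ between Nsight Systems versions. The function names and the CYCLE default are hypothetical, chosen only for this example. It treats a range as belonging to a CYCLE when it lies entirely inside that CYCLE's time span on the same thread.

```python
import csv
from collections import defaultdict


def load_ranges(csv_path):
    """Read an nvtx_pushpop_trace CSV into a list of dicts.

    The column headers used here are assumptions; adjust them to match
    the header row of your own CSV file.
    """
    with open(csv_path, newline="") as handle:
        return [
            {
                "name": row["Name"],
                "start": int(row["Start (ns)"]),
                "end": int(row["End (ns)"]),
                "tid": row["TID"],
            }
            for row in csv.DictReader(handle)
        ]


def share_within_parent(ranges, parent_name="CYCLE"):
    """Return one dict per parent occurrence: {range name: percent of parent time}.

    A range counts toward a parent if it is on the same thread and lies
    entirely within the parent's [start, end] interval.
    """
    ranges = list(ranges)
    parents = sorted(
        (r for r in ranges if r["name"] == parent_name),
        key=lambda r: r["start"],
    )
    report = []
    for parent in parents:
        duration = parent["end"] - parent["start"]
        totals = defaultdict(int)
        for r in ranges:
            if r is parent or r["tid"] != parent["tid"]:
                continue
            if r["start"] >= parent["start"] and r["end"] <= parent["end"]:
                totals[r["name"]] += r["end"] - r["start"]
        if duration > 0:
            report.append({n: 100.0 * t / duration for n, t in totals.items()})
        else:
            report.append({})
    return report
```

Note that every enclosed range is counted against the CYCLE, including ranges nested several levels deep, so the percentages for one cycle can add up to more than 100. If you only want direct children, you would need to filter on the nesting level as well.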
Hi @jkreibich. I investigated the report generated by
nsys stats --report nvtx_pushpop_trace <nsys-rep or sqlite file>
But I don’t find many of the NVTX ranges in it. For example, the buffer_process NVTX range of nvinfer is missing, although other nvinfer ranges such as GstNvInfer: UID=1:queueInput batch_num=5 are present. Which parameters should I use to control the push/pop trace so that it includes the ranges I want to investigate?
There are two kinds of ranges you can have in NVTX. Push-pop ranges (nested ranges that start and end in the same thread) and start-end ranges (ranges that are global to the process and are not restricted to a single thread).
Is there a chance that your code is mixing the two kinds?
Any range that starts in one thread and ends in another has to be a start/end range, rather than a push/pop. There is the nvtx_startend_sum report, but there is no start/end trace report, largely because the raw SQLite data is about as much data as we have. Because of the flexible nature of start/end ranges, there isn’t a concept of a “tree” or enclosing ranges like there is with push/pop. If you want to see all the push/pop data directly, we can help you with an SQL query for that.
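As a starting point for such a query, here is a minimal sketch in Python using the standard sqlite3 module. It rests on several assumptions that you should verify against your own export: that the NVTX data is in a table named NVTX_EVENTS with columns start, end, eventType, text, textId and globalTid, that names not stored in text are looked up through a StringIds table, and that eventType 59 marks push/pop ranges and 60 marks start/end ranges. The function name fetch_ranges is hypothetical.

```python
import sqlite3

# Assumed event type codes; verify them against the schema of your export.
PUSH_POP_RANGE = 59
START_END_RANGE = 60

# "end" is quoted because END is also an SQL keyword.
QUERY = """
SELECT COALESCE(e.text, s.value) AS name,
       e.start,
       e."end",
       e.globalTid
FROM NVTX_EVENTS AS e
LEFT JOIN StringIds AS s ON s.id = e.textId
WHERE e.eventType = ?
  AND e."end" IS NOT NULL
ORDER BY e.start
"""


def fetch_ranges(conn, event_type=PUSH_POP_RANGE):
    """Return (name, start, end, globalTid) rows for one kind of NVTX range."""
    return conn.execute(QUERY, (event_type,)).fetchall()
```

You would call it with something like fetch_ranges(sqlite3.connect("report.sqlite")), where the file name is a placeholder for your exported SQLite file. Passing START_END_RANGE instead lists the start/end ranges, which is a quick way to check whether a range that is missing from the push/pop trace was recorded as a start/end range.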