Correlate different NVTX ranges with a chosen Range

Good day!

I am running a DeepStream pipeline with nvinfer and a couple of pre- and post-processing steps, which include other plugins and probes.
I am measuring a whole cycle that includes the pre- and post-processing custom probes and nvinfer, and spans multiple plugins.

How can I query an nsys profile in the Nsight Systems UI to see what percentage of time is spent in the other NVTX ranges that fall inside the CYCLE NVTX range, for each cycle?

@jkreibich can you help with this?

If I understand correctly, you want a report of the NVTX range hierarchy, and what percentage of time a sub-range or sub-tree of ranges took within a parent event?

Unfortunately, we do not provide that directly, but the Stats system does have a report that will display the NVTX push/pop range hierarchy. The nvtx_pushpop_trace report will show all NVTX ranges, arranged by enclosing times, PID, and TID. It also shows a range’s duration, along with the total duration of all sub-ranges. If you were to pull this report into a spreadsheet or other numeric environment, I suspect it wouldn’t be too difficult to calculate the data you’re looking for.
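As a rough sketch of the spreadsheet-style calculation suggested above, the following Python reads the report's CSV export and, for every range named CYCLE, computes the share of its duration taken by each direct child range. It only applies if CYCLE shows up in the report as a push/pop range. The column names (Name, Duration (ns), RangeId, ParentId) are assumptions about the CSV header and should be checked against your own file; the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Column names are assumptions; check them against the header of your CSV.
COL_NAME = "Name"
COL_DUR = "Duration (ns)"
COL_ID = "RangeId"
COL_PARENT = "ParentId"


def child_percentages(csv_file, parent_name="CYCLE"):
    """Return {parent RangeId: {child name: percent of parent duration}}."""
    rows = list(csv.DictReader(csv_file))
    # Duration of every range whose name matches the parent of interest.
    parents = {
        r[COL_ID]: int(r[COL_DUR]) for r in rows if r[COL_NAME] == parent_name
    }
    result = {pid: defaultdict(float) for pid in parents}
    for r in rows:
        pid = r[COL_PARENT]
        # Only direct children of a matching parent contribute.
        if pid in parents and parents[pid] > 0:
            result[pid][r[COL_NAME]] += 100.0 * int(r[COL_DUR]) / parents[pid]
    return {pid: dict(children) for pid, children in result.items()}


# Made-up sample data standing in for the exported report.
SAMPLE = """Name,Duration (ns),RangeId,ParentId
CYCLE,1000,1,
preprocess,200,2,1
infer,500,3,1
CYCLE,2000,4,
infer,1500,5,4
"""

if __name__ == "__main__":
    for cycle_id, shares in child_percentages(io.StringIO(SAMPLE)).items():
        print(cycle_id, shares)
```

To run it on a real export, pass an open file instead of the in-memory sample, for example child_percentages(open("report.csv", newline="")), where report.csv is a placeholder filename.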

You can get a basic idea of the data available with the GUI display, or with the CLI command nsys stats --report nvtx_pushpop_trace <nsys-rep or sqlite file>. To output the report to a CSV file rather than the console, add the option --output . (the dot is part of the option), which will write to the default filename.

Thanks, @jkreibich. I will try it out.

Hi @jkreibich. I tried to investigate the report generated by

 nsys stats --report nvtx_pushpop_trace <nsys-rep or sqlite file>

But I can’t find many of the NVTX ranges. For example, I don’t see nvinfer’s buffer_process range, but I do see other nvinfer ranges such as GstNvInfer: UID=1:queueInput batch_num=5. Which parameters should I use to control the push/pop trace so that it includes the ranges I am interested in?

There are two kinds of ranges you can have in NVTX. Push-pop ranges (nested ranges that start and end in the same thread) and start-end ranges (ranges that are global to the process and are not restricted to a single thread).

Is there a chance that your code is mixing the two?

See User Guide — nsight-systems 2024.6 documentation (direct link to the NVTX portion of the Nsys docs).

Yes, I use both types of ranges. Do the nvinfer ranges also differ in type, and is that why I don’t see buffer_process while I do see the queueInput ranges?

At least the CYCLE range starts in one thread and ends in another.
Is it possible to match between them?

It should work if you make sure that the CYCLE range is defined as a start-end range.

I’m going to ask one of the NVTX experts to comment further.

Any range that starts in one thread and ends in another has to be a start/end range, rather than a push/pop. There is the nvtx_startend_sum report, but there is no start/end trace report, largely because the raw SQLite data is about as much data as we have. Because of the flexible nature of start/end ranges, there isn’t a concept of a “tree” or enclosing ranges like there is with push/pop. If you want to see all the push/pop data directly, we can help you with an SQL query for that.
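Since start/end ranges have no enclosing tree, one way to relate them to push/pop ranges is by time containment in the raw SQLite data. The sketch below is not an official report; it assumes an NVTX_EVENTS table with start, end, eventType, text and textId columns, a StringIds table for registered strings, and eventType codes 59 (push/pop) and 60 (start/end). All of these should be verified against your own export. The demo database is hypothetical and only mimics that layout.

```python
import sqlite3

# Assumed numeric codes for NVTX_EVENTS.eventType; verify them against the
# ENUM_NSYS_EVENT_TYPE table in your own export.
PUSH_POP = 59
START_END = 60

# For each start/end range with the given name, sum the durations of all
# push/pop ranges that lie fully inside it in time, grouped by range name.
QUERY = """
SELECT c.rowid                   AS cycle_row,
       COALESCE(r.text, rs.value) AS range_name,
       SUM(r."end" - r.start)     AS total_ns,
       100.0 * SUM(r."end" - r.start) / (c."end" - c.start) AS percent
FROM NVTX_EVENTS AS c
LEFT JOIN StringIds AS cs ON cs.id = c.textId
JOIN NVTX_EVENTS AS r
  ON r.eventType = :push_pop
 AND r.start >= c.start
 AND r."end" <= c."end"
LEFT JOIN StringIds AS rs ON rs.id = r.textId
WHERE c.eventType = :start_end
  AND COALESCE(c.text, cs.value) = :cycle_name
GROUP BY c.rowid, c.start, c."end", range_name
ORDER BY c.start, percent DESC
"""


def cycle_breakdown(conn, cycle_name="CYCLE"):
    """Return (cycle_row, range_name, total_ns, percent) tuples."""
    params = {
        "push_pop": PUSH_POP,
        "start_end": START_END,
        "cycle_name": cycle_name,
    }
    return conn.execute(QUERY, params).fetchall()


def make_demo_db():
    """Build a tiny in-memory database with made-up events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE StringIds (id INTEGER PRIMARY KEY, value TEXT)")
    conn.execute(
        'CREATE TABLE NVTX_EVENTS (start INTEGER, "end" INTEGER, '
        "eventType INTEGER, text TEXT, textId INTEGER, globalTid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO NVTX_EVENTS VALUES (?, ?, ?, ?, ?, ?)",
        [
            (0, 1000, START_END, "CYCLE", None, 1),
            (100, 300, PUSH_POP, "preprocess", None, 1),
            (400, 900, PUSH_POP, "infer", None, 2),
            (2000, 4000, START_END, "CYCLE", None, 1),
            (2500, 3500, PUSH_POP, "infer", None, 2),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in cycle_breakdown(make_demo_db()):
        print(row)
```

Two caveats with this approach: it matches on time only, so push/pop ranges from unrelated threads or processes that happen to overlap a cycle are counted too, and nested push/pop ranges are each counted in full, so the percentages within one cycle can add up to more than 100.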
