Strictly speaking, this is a question about NVTX rather than Nsight Systems. Since I could not find an NVTX category on this forum, I am posting it under Nsight Systems, which is where I currently use NVTX.
My understanding is that NVTX offers two sets of APIs: a C++-only one (#include <nvtx3/nvtx3.hpp>) and a C/C++ one (#include <nvtx3/nvToolsExt.h>). I know that NVTX3_FUNC_RANGE() marks the whole body of a function as a range. In my use case, I have many different OPs, each with its own Compute function. The problem with this approach is that each range is named after the enclosing function, so I cannot tell the Compute functions of different OPs apart in the Nsight profile.
To work around this, I considered using nvtxRangePushA and nvtxRangePop to give these Compute functions names of my own. That raised a different issue: a function may have conditional returns, and unless nvtxRangePop is called before every one of them, the range is not recorded correctly.
Is there an easy way to (1) attach user-defined text to each profiled function and (2) ensure the range ends when the function returns?
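To make the question concrete, here is a minimal sketch of the problem and of the kind of behavior I am after. It is an illustration under assumptions, not a verified NVTX recipe: range_push and range_pop are hypothetical stand-ins for nvtxRangePushA and nvtxRangePop so that the sketch compiles without the NVTX headers, and ScopedRange, conv_compute_manual, conv_compute_scoped, and the "Conv::Compute" label are names made up for this example. The C++ header may already ship an equivalent scoped-range type; I have not confirmed that here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for nvtxRangePushA / nvtxRangePop. In real code,
// include <nvtx3/nvToolsExt.h> and call the NVTX functions instead.
static std::vector<std::string> g_open_ranges;    // ranges pushed but not yet popped
static std::vector<std::string> g_closed_ranges;  // names of completed ranges

inline void range_push(const char* name) { g_open_ranges.emplace_back(name); }

inline void range_pop() {
    assert(!g_open_ranges.empty() && "pop without a matching push");
    g_closed_ranges.push_back(g_open_ranges.back());
    g_open_ranges.pop_back();
}

// Hypothetical RAII guard: pushes a named range on construction and pops it
// on destruction, so every return path (and exception) closes the range.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) { range_push(name); }
    ~ScopedRange() { range_pop(); }
    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;
};

// Manual push/pop: the early return skips the pop and leaves the range open.
int conv_compute_manual(int x) {
    range_push("Conv::Compute");
    if (x < 0) return -1;  // missing range_pop() on this path
    range_pop();
    return x * 2;
}

// Guard-based version: one line at the top covers all return paths.
int conv_compute_scoped(int x) {
    ScopedRange range("Conv::Compute");
    if (x < 0) return -1;  // destructor still pops the range here
    return x * 2;
}
```

With the guard, each OP's Compute function would need only a single line carrying its own label, and no return statement would have to be touched.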