Is there a way to use an NVTX range to get the runtime of everything enclosed between the start and the end of that range?
nvtxRangeId_t tp_nvtxrange = nvtxRangeStartA("cudaFree_71");
The myFunc() above is a function with a mix of CPU and GPU activity, meaning it does some CPU work and also launches GPU kernels.
I am looking for a way to get the runtime of myFunc() using NVTX. Does an NVTX range have any property that holds the runtime of myFunc()?
According to the discussion in the following thread, it was suggested that I use the “duration” of the NVTX range. I am not quite sure what this duration is:
Does this mean that I should use some standard C++ API to measure this “duration” myself? Or does NVTX have some functionality for getting the duration of a range?
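To make the first option concrete, here is a minimal sketch of what I mean by measuring the duration myself with standard C++. It assumes the elapsed time has to come from std::chrono on the host, since as far as I can tell nvtxRangeStartA() and nvtxRangeEnd() only mark the range and do not return a time. The helper name timeRangeMs, the USE_NVTX macro, and the sleeping myFunc() stand-in are all hypothetical names I made up for illustration.

```cpp
#include <chrono>
#include <thread>

#ifdef USE_NVTX                  // hypothetical opt-in macro, not part of NVTX
#include <nvtx3/nvToolsExt.h>    // NVTX v3 header from the CUDA toolkit
#endif

// Hypothetical helper: wraps fn() in an NVTX range and returns the host-side
// wall-clock time in milliseconds. The number comes from std::chrono, not NVTX.
template <typename Fn>
double timeRangeMs(const char* name, Fn&& fn) {
#ifdef USE_NVTX
    nvtxRangeId_t id = nvtxRangeStartA(name);   // open the range
#else
    (void)name;                                 // range name unused without NVTX
#endif
    auto t0 = std::chrono::steady_clock::now();
    fn();
    // Kernel launches are asynchronous: if fn() launches GPU work, a
    // cudaDeviceSynchronize() would be needed here so the host timer
    // also covers the kernels finishing.
    auto t1 = std::chrono::steady_clock::now();
#ifdef USE_NVTX
    nvtxRangeEnd(id);                           // close the range
#endif
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Stand-in for the real myFunc(): just sleeps so there is something to time.
void myFunc() {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
```

Calling timeRangeMs("cudaFree_71", myFunc) would then give me the elapsed milliseconds on the host, while the same range should still show up in the profiler timeline when built with USE_NVTX defined.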
Please let me know!