NVTX Ranges in CUDA Context in SQLite file

When I add NVTX ranges to my CUDA example, the ranges are shown in both the host context and the GPU context, as shown in the image.

However, when I read the generated SQLite file, it contains only the host-side NVTX ranges. Is there a way to obtain the NVTX range values in the GPU context from the SQLite file as well?

The NVTX ranges only actually exist on the CPU side. The Nsys tool then projects them onto the GPU as well, so that you can see them in relation to the kernels spawned.

@jkreibich do we have a script that will allow the user to duplicate that projection?


There are two stats reports that deal with mapping NVTX ranges to CUDA kernels: nvtxgpuproj and nvtxkernsum.

For more information, see nsys stats --help-report <report-name>. To see a list of all available reports, run nsys stats --help-reports.

To run the reports, use the command nsys stats --report <report-name> <input-file>
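If you want to run both reports from a script, here is a minimal Python sketch. It assumes nsys is on your PATH. The helper names and the input file name are hypothetical, and it uses only the --report option quoted above. Report names can differ between Nsight Systems versions, so check nsys stats --help-reports for the names your install provides.

```python
import subprocess

# Report names as given in the reply above; your nsys version may name them differently.
NVTX_REPORTS = ("nvtxgpuproj", "nvtxkernsum")


def build_stats_command(report_name, input_file):
    """Return the argv list for: nsys stats --report <report-name> <input-file>."""
    return ["nsys", "stats", "--report", report_name, input_file]


def run_report(report_name, input_file):
    """Run one report and return its stdout (assumes nsys is installed)."""
    result = subprocess.run(
        build_stats_command(report_name, input_file),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if nsys exits with an error
    )
    return result.stdout
```

For example, looping over NVTX_REPORTS and calling run_report(name, "report.nsys-rep") would print each report for a hypothetical capture named report.nsys-rep.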

If you’re curious about the SQLite queries used to generate these reports, see the .../reports/<report-name>.py files in the Nsys install directory. The basic approach is to map each CUDA kernel execution on the GPU to the CUDA API call on the CPU side that launched that kernel, and then map that API call to an NVTX range.
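To illustrate that kernel -> API call -> NVTX range mapping, here is a rough Python sketch. It is not the query the shipped reports use. It assumes the export contains the tables NVTX_EVENTS, CUPTI_ACTIVITY_KIND_RUNTIME and CUPTI_ACTIVITY_KIND_KERNEL with the columns named below, that the range name is stored in the text column, and that an API call belongs to a range when it starts and ends inside that range on the same thread. Check these assumptions against the schema of your own file. The function name is hypothetical.

```python
import sqlite3

# Assumed schema (verify against your own export):
#   NVTX_EVENTS(start, "end", text, globalTid)
#   CUPTI_ACTIVITY_KIND_RUNTIME(start, "end", correlationId, globalTid)
#   CUPTI_ACTIVITY_KIND_KERNEL(start, "end", correlationId)
PROJECTION_QUERY = """
SELECT
    n.text       AS range_name,
    MIN(k.start) AS gpu_start,     -- first kernel launched inside the range
    MAX(k."end") AS gpu_end,       -- last kernel launched inside the range
    COUNT(*)     AS kernel_count
FROM NVTX_EVENTS AS n
JOIN CUPTI_ACTIVITY_KIND_RUNTIME AS r
    ON  r.globalTid = n.globalTid  -- API call made on the same thread
    AND r.start >= n.start         -- and fully inside the NVTX range
    AND r."end" <= n."end"
JOIN CUPTI_ACTIVITY_KIND_KERNEL AS k
    ON  k.correlationId = r.correlationId  -- kernel launched by that API call
WHERE n."end" IS NOT NULL          -- skip markers and unclosed ranges
GROUP BY n.rowid
ORDER BY gpu_start
"""


def project_nvtx_to_gpu(conn):
    """Return (range_name, gpu_start, gpu_end, kernel_count) per NVTX range.

    Ranges that launched no kernels are omitted, since they have no GPU extent.
    """
    return conn.execute(PROJECTION_QUERY).fetchall()
```

You could call it as project_nvtx_to_gpu(sqlite3.connect("report.sqlite")), where report.sqlite is a hypothetical exported file. Each row gives the GPU-side start and end of a range, taken from the first and last kernel launched inside it.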

