CUDA graph trace export

I am using nsys to profile an application that uses CUDA graphs to execute a batch of kernels. I use a command similar to this:

usr/local/cuda/nsight-systems-2023.2.3/target-linux-x64/nsys profile --nic-metrics=true --gpu-metrics-device=none --trace=cuda,mpi,nvtx --force-overwrite true -o prof_run_%q{OMPI_COMM_WORLD_RANK}

The trace is recorded correctly, and I can visualize the graph execution in the timeline. However, I am interested in CUDA graph execution time statistics (percentiles). I wrote a small script to query the exported SQLite database, but I can't find the table or events related to CUDA graph execution. Can someone point me to what I should query to get these durations?

Does the created report contain tables CUDA_GRAPH_NODE_EVENTS and CUDA_GRAPH_EVENTS?

You might need to recollect the report with Nsight Systems version 2024.1 to get the most up to date schema of the aforementioned tables.

See also the SQLite schema reference in the documentation. The most up-to-date information, though, comes from inspecting the schema of the SQLite file itself.
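As a quick way to inspect what a given export actually contains, here is a minimal Python sketch that lists every table in the SQLite file together with its column names. It uses only the standard `sqlite3` module; the helper names `list_tables` and `graph_tables` are made up for illustration and are not part of any Nsight Systems tooling.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for every table in the SQLite file."""
    con = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {
            name: [col[1] for col in con.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
    finally:
        con.close()


def graph_tables(db_path):
    """Keep only tables whose name mentions GRAPH (hypothetical filter)."""
    return {name: cols for name, cols in list_tables(db_path).items()
            if "GRAPH" in name.upper()}


# Example usage (the file name is the one mentioned in this thread):
# for name, cols in graph_tables("1to8_sender_graph.sqlite").items():
#     print(name, cols)
```

If `graph_tables` comes back empty, the export was most likely produced by a version that does not write the CUDA graph tables.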

Hi!

Thanks for getting back to me. I tried looking for those tables, but

sqlite3 1to8_sender_graph.sqlite "select * FROM CUDA_GRAPH_NODE_EVENTS"
Error: in prepare, no such table: CUDA_GRAPH_NODE_EVENTS

sqlite3 1to8_sender_graph.sqlite "select * FROM CUDA_GRAPH_EVENTS"
Error: in prepare, no such table: CUDA_GRAPH_EVENTS

they don't seem to exist.

I am using

NVIDIA Nsight Systems version 2023.2.3.1001-32894139v0

to record the traces, and

Version: 2023.2.1.122-32598524v0 OSX.

to export and visualize them. I am still not clear on why I would need to update. The timeline view displays the graph execution correctly, so it looks like the graph data is recorded but simply not exported. If I upgrade the version I use to export and visualize to 2024.1, will those tables be created?

The functionality to collect and display information about CUDA graphs was added in one version of Nsys. Then, a release or two later, we saw that people wanted to do data mining on those fields, and we added them to the export.

Basically you have a version where we supported the functionality, but not the export.

You don’t have to recollect; just get a newer version and re-export the SQLite file.
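Once the re-exported file contains the graph tables, the percentiles asked about above can be computed along these lines. This is only a sketch under stated assumptions: it assumes the table has `start` and `end` timestamp columns (presumably in nanoseconds, as in other Nsight Systems tables) and that each row is one graph execution. Check the actual schema of your file first; the function names here are hypothetical.

```python
import sqlite3


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in [0, 100]) of an ascending list."""
    if not sorted_vals:
        raise ValueError("no samples")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def graph_duration_percentiles(db_path, table="CUDA_GRAPH_EVENTS", ps=(50, 90, 99)):
    """Return {p: duration} for the rows of `table`.

    ASSUMPTION: the table has `start` and `end` columns; verify against your export.
    """
    con = sqlite3.connect(db_path)
    try:
        cols = {row[1] for row in con.execute(f'PRAGMA table_info("{table}")')}
        if not {"start", "end"} <= cols:
            raise RuntimeError(
                f"{table} is missing or has no start/end columns; found {sorted(cols)}")
        # Quote the column names because END is an SQL keyword.
        durations = sorted(
            row[0] for row in con.execute(f'SELECT "end" - "start" FROM "{table}"'))
    finally:
        con.close()
    return {p: percentile(durations, p) for p in ps}


# Example usage (file name taken from this thread):
# print(graph_duration_percentiles("1to8_sender_graph.sqlite"))
```

The column check is there so that an export from an older version fails with a clear message instead of a raw SQL error. Pass `table="CUDA_GRAPH_NODE_EVENTS"` to look at the other table, under the same column assumption.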
