I need to automate the timing analysis of TensorRT ExecutionContext::execute calls.
I went through the tables of the SQLite database.
I found the TensorRT NVTX events listed in the NvtxPushPopRanges table.
But when I view the Nsight timeline and right-click the StatefulPartitionedCall/… event for each layer, I get the Kernel Call and the Kernel corresponding to it. The Kernel Call and the Kernel share a correlation ID, but I don’t see how to link the Kernel Call to the NVTX events.
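The correlation-ID half of this link can be written as a plain join. Below is a minimal Python sketch, assuming the database contains tables named `CUPTI_ACTIVITY_KIND_RUNTIME` (the Kernel Calls) and `CUPTI_ACTIVITY_KIND_KERNEL` (the Kernels), each with `start`, `end` and `correlationId` columns. Those table and column names are assumptions and should be checked against the schema of the actual database; the function name `kernel_calls_with_kernels` is hypothetical.

```python
import sqlite3

# Assumed table names; verify them against your own database schema.
RUNTIME_TABLE = "CUPTI_ACTIVITY_KIND_RUNTIME"
KERNEL_TABLE = "CUPTI_ACTIVITY_KIND_KERNEL"


def kernel_calls_with_kernels(conn):
    """Pair each CUDA API call with the kernel that shares its correlation ID.

    Returns rows of (correlationId, call_start, call_duration,
    kernel_start, kernel_duration), ordered by call start time.
    """
    # "end" is quoted because END is also an SQL keyword.
    query = f"""
        SELECT r.correlationId,
               r.start,
               r."end" - r.start,
               k.start,
               k."end" - k.start
        FROM {RUNTIME_TABLE} AS r
        JOIN {KERNEL_TABLE} AS k
          ON k.correlationId = r.correlationId
        ORDER BY r.start
    """
    return conn.execute(query).fetchall()
```

This only covers the Kernel Call to Kernel link that the timeline already shows; it does not attach either of them to an NVTX range.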
I would like to write this hierarchy to a file, with the properties of each event, for every “execute” invocation in the TensorRT application.
In the end, I would like to get this information for each ExecutionContext::execute call, like this:
NVTX TensorRT # ExecutionContext::execute duration
NVTX TensorRT # _ |-- StatefulPartitionedCall/model/layer1 duration
CUDA API ____ # ___ |-- Call to Kernel duration
CUDA API ____ # _____ |-- Kernel duration
CUDA API ____ # ___ |-- Call to Kernel duration
CUDA API ____ # _____ |-- Kernel duration
NVTX TensorRT # _ |-- StatefulPartitionedCall/model/layer2 duration
CUDA API ____ # ___ |-- Call to Kernel duration
CUDA API ____ # _____ |-- Kernel duration
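To make the target layout concrete, here is a minimal Python sketch that builds such a tree from events already loaded into memory. It rests on assumptions that are not confirmed anywhere above: NVTX ranges and Kernel Calls are nested by time containment (which only makes sense for events recorded on the same thread), and each Kernel is placed under the call with the same correlation ID. The `Event` type, its `kind` labels and the `render_tree` function are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    kind: str                 # "NVTX", "CALL" or "KERNEL" (hypothetical labels)
    name: str
    start: int                # timestamps in nanoseconds
    end: int
    correlation_id: Optional[int] = None


def render_tree(events):
    """Return indented text lines, one per event, each ending with its duration."""
    # Kernels run asynchronously on the GPU, so they are matched by
    # correlation ID rather than by time.
    kernels = {e.correlation_id: e for e in events if e.kind == "KERNEL"}
    # Host-side events sorted by start time; on ties the longer one comes
    # first so that an enclosing range precedes the events inside it.
    host = sorted((e for e in events if e.kind != "KERNEL"),
                  key=lambda e: (e.start, -e.end))
    lines = []
    stack = []  # NVTX ranges that currently enclose the event being visited
    for e in host:
        # Drop ranges that do not fully contain this event.
        while stack and not (stack[-1].start <= e.start and e.end <= stack[-1].end):
            stack.pop()
        indent = "  " * len(stack)
        lines.append(f"{indent}|-- {e.name} {e.end - e.start}")
        if e.kind == "NVTX":
            stack.append(e)
        elif e.correlation_id in kernels:
            k = kernels[e.correlation_id]
            lines.append(f"{indent}  |-- {k.name} {k.end - k.start}")
    return lines
```

Writing the result to a file is then a matter of joining the returned lines with newlines; filling the `Event` list from the database is left out because it depends on the actual table layout.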