Profiling TensorRT Inference


We are deploying deep learning models on the Jetson NX and need a way to profile them so that we can iterate quickly on new network architectures.
We want to analyze which parts of the network are fast and which are slow.
Our workflow looks like this: TensorFlow model -> ONNX model -> TensorRT engine.
Currently we are using the profiling option of trtexec to get some timing information (not tested on Jetson yet).

Is this the recommended way, or are there better tools to visualize the profiling results?



TensorRT Version: 7.1.3
GPU Type: Jetson NX, RTX 2080Ti
TensorFlow Version: 2.3

Edit: Fixed GPU Name

Hi, please share your model and script so that we can help you better.

Alternatively, you can try running your model with the trtexec command.


Hi, thanks for your fast reply. I cannot share the model at the moment (it is a detection model), and we are already using trtexec, as I described.

The question is whether trtexec is the best option for profiling TensorRT engines, especially on the Jetson NX, or whether there are other tools that also visualize the profiling results.


trtexec Command: ./trtexec --onnx=<model_path.onnx> --int8 --shapes=input_1:1x704x1280x3 --exportTimes=trace.json --dumpProfile --exportProfile=prof.json
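
For anyone who wants a quick ranking of slow layers from the file written by --exportProfile, here is a minimal Python sketch. It assumes prof.json is a JSON array whose per-layer records carry "name", "averageMs" and "percentage" keys, and that records without a "name" key (such as a leading count entry) are metadata. That layout may differ between TensorRT versions, so check your own file first. The function names are made up for this example and are not part of trtexec.

```python
import json


def load_layer_times(path):
    """Load a profile JSON file and keep only the per-layer records."""
    with open(path) as f:
        records = json.load(f)
    # Assumption: entries without a "name" key are metadata, not layers.
    return [r for r in records if "name" in r]


def slowest_layers(layers, top=10):
    """Return up to `top` layers sorted by average time, slowest first."""
    return sorted(layers, key=lambda r: r.get("averageMs", 0.0), reverse=True)[:top]


def format_report(layers):
    """Render one line per layer: average ms, share of total time, layer name."""
    lines = []
    for r in layers:
        avg = r.get("averageMs", 0.0)
        pct = r.get("percentage", 0.0)
        lines.append(f"{avg:8.3f} ms  {pct:5.1f} %  {r['name']}")
    return "\n".join(lines)


# Example usage (assumes prof.json was written by the trtexec command above):
#   print(format_report(slowest_layers(load_layer_times("prof.json"))))
```

Sorting by average time per inference rather than total time keeps the numbers comparable between runs with different iteration counts.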

Hi @jean.wanka,

Yes. Apart from trtexec, you can refer to the built-in TensorRT profiling.
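
As a rough illustration of what built-in profiling can look like from Python, the sketch below subclasses the IProfiler interface and accumulates the per-layer times that TensorRT reports through report_layer_time. The class name LayerTimeProfiler and its summary method are made up for this example. The fallback to a plain object base is only there so the accumulation logic can be tried on a machine without TensorRT installed. Treat it as a starting point under these assumptions, not as the recommended implementation.

```python
from collections import defaultdict

try:
    import tensorrt as trt
    _ProfilerBase = trt.IProfiler
except ImportError:
    # Assumption: fall back to a plain base class when TensorRT is not installed,
    # so the accumulation logic can still be exercised.
    _ProfilerBase = object


class LayerTimeProfiler(_ProfilerBase):
    """Accumulates the per-layer execution times reported by TensorRT."""

    def __init__(self):
        _ProfilerBase.__init__(self)
        self.total_ms = defaultdict(float)
        self.calls = defaultdict(int)

    def report_layer_time(self, layer_name, ms):
        # Called once per layer for each profiled inference.
        self.total_ms[layer_name] += ms
        self.calls[layer_name] += 1

    def summary(self):
        """Return (layer name, average ms) pairs, slowest first."""
        rows = [(name, self.total_ms[name] / self.calls[name]) for name in self.total_ms]
        return sorted(rows, key=lambda row: row[1], reverse=True)


# Example usage (assumes `context` is an existing IExecutionContext and
# `bindings` is the list of device buffer addresses for your engine):
#   profiler = LayerTimeProfiler()
#   context.profiler = profiler
#   context.execute_v2(bindings)
#   for name, avg_ms in profiler.summary():
#       print(f"{avg_ms:8.3f} ms  {name}")
```

The layer names reported here are those of the optimized engine, so fused layers will show up as a single entry.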

Thank you.