We are deploying deep learning models on Jetson NX and need a way to profile them so that we can iterate quickly on new network architectures.
We want to analyze which parts of the network are fast and which are slow.
Our workflow looks like this: TensorFlow model -> ONNX model -> TensorRT engine.
Currently we are using the profiling option of trtexec to get per-layer timing information (not yet tested on the Jetson).
Is this the recommended way, or are there better tools for visualizing the profiling results?
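For context, here is a minimal sketch of how we could rank layers by time from a profile exported with something like `trtexec --onnx=model.onnx --dumpProfile --exportProfile=profile.json` (`model.onnx` and `profile.json` are placeholder names). It assumes the exported JSON is a list of per-layer records with `name` and `averageMs` fields, possibly preceded by a `{"count": ...}` entry; that layout is an assumption and may differ between TensorRT versions. The sample data below is made up for illustration.

```python
import json


def load_profile(path):
    """Load a profile JSON file exported by trtexec (path is a placeholder)."""
    with open(path) as f:
        return json.load(f)


def slowest_layers(profile_records, top_n=5):
    """Return the top_n (layer name, average ms) pairs, slowest first."""
    # Assumption: each per-layer record has "name" and "averageMs" keys.
    # Records without them (e.g. a leading {"count": ...} entry) are skipped.
    layers = [
        (record["name"], float(record["averageMs"]))
        for record in profile_records
        if isinstance(record, dict)
        and "name" in record
        and "averageMs" in record
    ]
    layers.sort(key=lambda item: item[1], reverse=True)
    return layers[:top_n]


# Hypothetical sample data, not real measurements.
sample = [
    {"count": 100},
    {"name": "conv1", "timeMs": 120.0, "averageMs": 1.2, "percentage": 60.0},
    {"name": "relu1", "timeMs": 20.0, "averageMs": 0.2, "percentage": 10.0},
    {"name": "fc", "timeMs": 60.0, "averageMs": 0.6, "percentage": 30.0},
]

for name, avg_ms in slowest_layers(sample, top_n=2):
    print(f"{name}: {avg_ms:.3f} ms")
```

With a real export, `slowest_layers(load_profile("profile.json"))` would replace the sample list. This only gives a sorted text summary, which is why we are asking about proper visualization tools.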
TensorRT Version: 7.1.3
GPU Type: Jetson NX, RTX 2080Ti
TensorFlow Version: 2.3
Edit: Fixed GPU Name