TensorRT performance measurement

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.10.0
[1]DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
[1] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[1] DRIVE AGX Orin Developer Kit (not sure of its part number)
other

SDK Manager Version
2.1.0
[1]other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Issue Description
Hi team, I want to measure the performance metrics of tensorRT used in object detection and digit recognition. Is there any tool to do so?
Apart from this, can I compare two samples to monitor the performance of using tensorRT in one sample and without using tensorRT in the same sample?
Please help!

DW doesn’t have any tools for model perf evaluation.
You can use the trtexec tool to profile your model.

We don’t have any sample for this requirement. You can write a script to load the ONNX model and get perf metrics, then use the TensorRT APIs to convert the ONNX model to a TRT model and get its perf metrics. You can refer to the TensorRT samples for API usage.
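The timing part of such a script can be sketched generically. Everything below is an illustrative assumption, not a DriveWorks or TensorRT API: the inference callable is a placeholder that a real script would replace with an onnxruntime session run (for the ONNX model) or a TensorRT execution-context invocation (for the engine).

```python
# Minimal latency-harness sketch for comparing two inference backends.
# The `infer` callable is a placeholder (assumption); wire in your own
# ONNX-runtime or TensorRT inference call in a real script.
import statistics
import time


def profile(infer, warmup=5, iters=50):
    """Time a zero-argument inference callable; return latency stats in ms."""
    for _ in range(warmup):  # warm-up iterations are excluded from stats
        infer()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return {
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "p99_ms": samples[int(0.99 * (len(samples) - 1))],
    }


if __name__ == "__main__":
    # Dummy workload standing in for a model's forward pass.
    stats = profile(lambda: sum(i * i for i in range(10_000)))
    print({k: round(v, 3) for k, v in stats.items()})
```

Running the same harness once per backend gives directly comparable mean/median/p99 latencies.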

Dear @SivaRamaKrishnaNV
My doubt is: can I use a non-TensorRT model in a DriveWorks sample?

No. The DW DNN APIs can load only TRT models. You need to use the TensorRT optimization tool to generate a TRT model.

Dear @SivaRamaKrishnaNV, thanks for the info.
Could you please share the command to measure TensorRT performance using trtexec?
This would be really helpful!

Dear @akshay.tupkar ,
You can use the --dumpProfile and --verbose flags to get layer profiling and verbose info.
Please see TensorRT/samples/trtexec at release/8.6 · NVIDIA/TensorRT · GitHub for details.
Also, check trtexec -h for details on the other parameters.
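As a concrete illustration (the model and engine file names below are placeholders; substitute your own paths):

```shell
# Build a TRT engine from the ONNX model (file names are placeholders)
./trtexec --onnx=mnist.onnx --saveEngine=mnist.engine
# Profile the engine: per-layer timings plus verbose build/run logs
./trtexec --loadEngine=mnist.engine --dumpProfile --verbose
```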

Dear @SivaRamaKrishnaNV
I wanted to compare the performance of a TensorRT and a non-TensorRT model. I executed the command
"./trtexec --loadEngine=mnist.engine --dumpProfile --verbose" to measure the TensorRT performance.
To measure the non-TensorRT model performance, I executed the following command, in which I only specified the MNIST model in ONNX format:
"./trtexec --onnx=mnist.onnx --dumpProfile --verbose"
After executing both commands, I noticed the performance remains the same. Why is this happening?
Please help!

As clarified earlier, we don’t have a sample to check perf metrics of a non-TRT model. trtexec converts an ONNX model to a TRT engine and profiles the TRT model.

Passing --onnx will trigger the builder and compile an engine plan during the command execution, then profile that plan. No ONNX-level profile is actually obtained, which is why you see the same profile in both cases.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.