Trtexec performance not close to benchmarks

Hi,

I am running benchmark tests on the Jetson AGX Orin DevKit configured to emulate an Orin NX 16GB.
For benchmarking I am using the trtexec command:

/usr/src/tensorrt/bin/trtexec --onnx=resnet50-v1-12.onnx --fp16

As the model I am using the most recent ResNet50 model from the ONNX Model Zoo.
However, the inference performance summary reports a maximum throughput of only 466 qps, which is only a fraction of the official benchmark results.
Why does trtexec perform so poorly on my ONNX model?

Please find the console output attached.
resnet50_trtexec.log (34.5 KB)

For reference, I am using NVIDIA JetPack 5.1.2 with TensorRT 8.5.2.2, CUDA 11.4, and L4T 35.4.1.
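For what it's worth, the official Jetson benchmark figures are typically measured with the board locked to maximum clocks and with INT8 precision and larger batch sizes, so an FP16 batch-1 run is expected to come in far lower. A hedged sketch of a closer-to-benchmark run is below; it assumes you run it on the Jetson itself, that nvpmodel mode 0 is the MAXN power mode on this board, and that the model's dynamic batch input is named `data` (verify with `trtexec --onnx=... --verbose`):

```shell
# Select the maximum power mode and lock all clocks to their maximum.
# Assumption: mode 0 is MAXN on this board (check with: sudo nvpmodel -q).
sudo nvpmodel -m 0
sudo jetson_clocks

# Rebuild and time the engine with INT8 enabled (FP16 kept as a fallback
# for layers without INT8 support) and a larger batch size.
# Assumption: the ONNX input tensor is named "data".
/usr/src/tensorrt/bin/trtexec --onnx=resnet50-v1-12.onnx \
    --int8 --fp16 \
    --shapes=data:32x3x224x224 \
    --useCudaGraphs --useSpinWait
```

Throughput is then reported in the same performance summary; divide qps by the batch size to compare latency-per-image against the batch-1 run.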

Thanks in advance

Hi,

We do have a benchmark script.
Please give it a check:

Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.