Orin Nano/NX ResNet-50 benchmark on R36.4.3 (JetPack 6.2)

Hi,

Is there an official benchmark for ResNet-50 on Orin NX and Orin Nano in Super / MAXN SUPER power mode?
Or is there a tool I can use to run the benchmark myself?
The jetson_benchmarks repository is no longer supported.

Thx
Yen

Hi,

You can try the MLC benchmark to compare the difference with Super Mode enabled.

Thanks.

Hi,

We would like to run ResNet-50, not an LLM or any other language model.
Please help us get FPS and FPS/W on Orin NX / Orin Nano in Super / MAXN SUPER power mode.

Thx
Yen

Hi,

We don’t have an official table of ResNet-50 results for Super Mode, but you can get the score with the trtexec binary directly:

$ /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx
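Regarding FPS/W: this is not an official procedure, but one option is to read the module power from tegrastats while trtexec is running and divide the reported throughput by the measured watts (the power rail names printed vary by board):

$ sudo tegrastats --interval 1000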

Thanks.

Hi

Do you know how to calculate the FPS from the trtexec result below?
(Orin NX 16GB, R36.4.3)

[03/05/2025-15:01:38] [I] === Performance summary ===
[03/05/2025-15:01:38] [I] Throughput: 292.075 qps
[03/05/2025-15:01:38] [I] Latency: min = 3.43835 ms, max = 3.51733 ms, mean = 3.46588 ms, median = 3.46564 ms, percentile(90%) = 3.47736 ms, percentile(95%) = 3.48071 ms, percentile(99%) = 3.4942 ms
[03/05/2025-15:01:38] [I] Enqueue Time: min = 0.76413 ms, max = 0.911865 ms, mean = 0.810784 ms, median = 0.801819 ms, percentile(90%) = 0.85376 ms, percentile(95%) = 0.875977 ms, percentile(99%) = 0.89624 ms
[03/05/2025-15:01:38] [I] H2D Latency: min = 0.0372314 ms, max = 0.0876465 ms, mean = 0.0432859 ms, median = 0.0419312 ms, percentile(90%) = 0.0477295 ms, percentile(95%) = 0.0490723 ms, percentile(99%) = 0.0560303 ms
[03/05/2025-15:01:38] [I] GPU Compute Time: min = 3.39075 ms, max = 3.44867 ms, mean = 3.41752 ms, median = 3.41748 ms, percentile(90%) = 3.42749 ms, percentile(95%) = 3.43127 ms, percentile(99%) = 3.43945 ms
[03/05/2025-15:01:38] [I] D2H Latency: min = 0.00366211 ms, max = 0.00688171 ms, mean = 0.00507489 ms, median = 0.00512695 ms, percentile(90%) = 0.00585938 ms, percentile(95%) = 0.00592041 ms, percentile(99%) = 0.00640869 ms
[03/05/2025-15:01:38] [I] Total Host Walltime: 3.0095 s
[03/05/2025-15:01:38] [I] Total GPU Compute Time: 3.004 s
[03/05/2025-15:01:38] [I] Explanations of the performance metrics are printed in the verbose logs.
[03/05/2025-15:01:38] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v100300] # /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx

Thx
Yen

Hi,

qps indicates the number of inferences (queries) per second.
So if you run N frames per inference (batch size = N), then fps = N * qps.
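For example, a minimal sketch (assuming the trtexec output is saved to trtexec.log and the engine runs batch size 1, as in the log above):

$ /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx | tee trtexec.log
$ BATCH=1
$ grep "Throughput" trtexec.log | awk -v n=$BATCH '{print "fps =", n * $(NF-1)}'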

Thanks.

Hi

So my FPS is about 292, which is much lower than the official benchmark result of 2517.
Why is my FPS so low?

Thx
Yen

Hi,

Have you maximized the device performance first?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

The precision used for the official benchmark results is INT8, so please try running trtexec with the --int8 flag (a sketch follows below).
You can find the benchmark source in the link below:
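A sketch of an INT8 throughput run (assumptions: without a calibration cache, trtexec uses placeholder scales, which is fine for measuring throughput but not accuracy; the sample ONNX's input tensor name and whether it exposes a dynamic batch dimension are not confirmed here, so <input_name> and the batch of 16 are placeholders):

$ /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --int8
$ # Optionally increase the batch if the model has a dynamic batch dimension:
$ /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --int8 --shapes=<input_name>:16x3x224x224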

Thanks.
