Orin low performance on MobileNetV1 SSD

I ran some benchmarks on the Orin dev kit with GitHub - NVIDIA-AI-IOT/jetson_benchmarks: Jetson Benchmark
Power mode: MAXN
TensorRT 8.4 in JetPack 5
The throughput is less than 300 fps, while Xavier NX reaches 800 fps and AGX Xavier 1,500 fps.
What went wrong? Please advise.

/usr/src/tensorrt/bin/trtexec --onnx=/home/andy/nvidia/jetson_benchmarks/models/ssd-mobilenet-v1-bs16.onnx --useSpinWait --useCudaGraph --int8 --workspace=4096 --avgRuns=100 --duration=180

[05/06/2022-20:57:24] [I] === Performance summary ===
[05/06/2022-20:57:24] [I] Throughput: 296.952 qps
[05/06/2022-20:57:24] [I] Latency: min = 4.01562 ms, max = 12.0008 ms, mean = 4.78139 ms, median = 4.70898 ms, percentile(99%) = 6.21875 ms
[05/06/2022-20:57:24] [I] Enqueue Time: min = 0 ms, max = 0.867188 ms, mean = 0.0241754 ms, median = 0.0200195 ms, percentile(99%) = 0.0664062 ms
[05/06/2022-20:57:24] [I] H2D Latency: min = 0.503906 ms, max = 3.08443 ms, mean = 1.1139 ms, median = 1.13281 ms, percentile(99%) = 1.1875 ms
[05/06/2022-20:57:24] [I] GPU Compute Time: min = 3.20312 ms, max = 10.6925 ms, mean = 3.36628 ms, median = 3.27734 ms, percentile(99%) = 4.72656 ms
[05/06/2022-20:57:24] [I] D2H Latency: min = 0.140625 ms, max = 0.869156 ms, mean = 0.300852 ms, median = 0.300781 ms, percentile(99%) = 0.3125 ms
[05/06/2022-20:57:24] [I] Total Host Walltime: 180.009 s
[05/06/2022-20:57:24] [I] Total GPU Compute Time: 179.941 s
[05/06/2022-20:57:24] [W] * GPU compute time is unstable, with coefficient of variance = 9.40032%.
[05/06/2022-20:57:24] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.

Hi,

Have you maximized the device performance first?

$ sudo jetson_clocks

We are going to try to reproduce this issue internally
and will share more results from our side with you later.

Thanks.

Hi,

We tested the same command on Xavier and got about 100 qps.
It seems that Orin still delivers much better performance.

Could you share how you got the 1,500 fps on the AGX Xavier?
Thanks.

@AastaLLL, I found the Xavier performance numbers in this link: Compare NVIDIA Jetson Xavier NX with Jetson TX2 Developer Kits - Latest Open Tech From Seeed

Hi,

Please note that the throughput shared on that page takes the two extra DLAs into account.

Moreover, the unit in the table is frames per second, while the TensorRT qps counts inferences.
In this use case, the model has batch size 16, which means 16 images are processed per inference.
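The conversion above can be sketched with a quick calculation; the 296.952 qps figure is taken from the Orin log earlier in this thread, and the batch size of 16 is inferred from the ssd-mobilenet-v1-bs16.onnx model name:

```python
# Convert trtexec throughput ("qps", i.e. inferences per second)
# into images per second by multiplying by the batch size.
qps = 296.952      # reported by trtexec on Orin (from the log above)
batch_size = 16    # assumed from the ssd-mobilenet-v1-bs16.onnx filename

images_per_sec = qps * batch_size
print(f"{images_per_sec:.0f} images/sec")  # ~4751 images/sec
```

So on this per-image basis, Orin is well above the 1,500 fps figure quoted for AGX Xavier.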

Thanks.

What do you mean by taking the GPU + 2 extra DLAs into account? I want to understand the scenario/scheduling you used to test the networks:

1- The GPU + 2 DLAs run the same network with different executions at the same time, and you sum the throughputs at the end?
2- The GPU + 2 DLAs run the same network with different executions in different time intervals, so that there is no contention, and you sum the throughputs at the end?
3- Any other possible scenario?

Any clarification would be greatly appreciated.

Hi Splendor027,

Please open a new topic for your issue. Thanks.