Jetson AGX Orin DLA doesn't work normally: inference is slower than without DLA

feng871345432 · April 10, 2025, 9:04am

Hi there,

I am trying to use DLA to speed up inference. Following the guide, I used trtexec to convert my ONNX model to a TensorRT engine, once with the option --useDLACore=0 and once without it. The engine built without DLA is much faster than the one built with DLA. Could you help explain why? The ONNX model is attached.
example.zip (23.4 MB)
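
For reference, the two builds described above could be scripted roughly as follows. This is only a sketch: the file names are hypothetical placeholders, and adding --fp16 and --allowGPUFallback is an assumption about a typical DLA build (DLA runs in FP16 or INT8, and the fallback flag lets layers that DLA cannot run execute on the GPU instead).

```python
import shlex


def trtexec_cmd(onnx_path, engine_path, dla_core=None, fp16=True):
    """Build a trtexec argument list for a GPU-only or a DLA build."""
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        # DLA needs reduced precision; FP16 is also used for the GPU build
        # here so that the two engines are compared on equal terms.
        cmd.append("--fp16")
    if dla_core is not None:
        cmd.append(f"--useDLACore={dla_core}")
        # Layers unsupported on DLA fall back to the GPU.
        cmd.append("--allowGPUFallback")
    return cmd


# Hypothetical file names, for illustration only.
gpu_cmd = trtexec_cmd("model.onnx", "model_gpu.engine")
dla_cmd = trtexec_cmd("model.onnx", "model_dla0.engine", dla_core=0)

print(shlex.join(gpu_cmd))
print(shlex.join(dla_cmd))
```

Each printed line can be run on the Jetson directly, or passed to subprocess.run as a list.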

AastaLLL · April 11, 2025, 3:14am

Hi,

DLA is intended to offload work from the GPU rather than to deliver higher performance.
Please see below for more information:

Q: Why does my network run slower when using DLA than without DLA?

A: DLA was designed to maximize energy efficiency. Depending on the features supported by DLA and the features supported by the GPU, either implementation can be more performant. Your chosen implementation depends on your latency or throughput requirements and power budget. Since all DLA engines are independent of the GPU and each other, you could also use both implementations to increase the throughput of your network further.
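
To make the last point concrete, here is a back-of-the-envelope sketch of what running the GPU and DLA engines side by side could give. The per-engine throughput numbers are hypothetical, not measurements, and the simple sum assumes the engines do not slow each other down; since they share memory bandwidth in practice, treat the result as an upper bound.

```python
def combined_throughput(gpu_qps, dla_qps, num_dla_engines):
    """Idealized aggregate throughput when every engine runs independently."""
    return gpu_qps + num_dla_engines * dla_qps


# Hypothetical per-engine throughput in inferences per second.
gpu_qps = 100.0
dla_qps = 40.0

total_qps = combined_throughput(gpu_qps, dla_qps, num_dla_engines=2)
print(f"GPU only: {gpu_qps} qps, GPU + 2 DLA engines: {total_qps} qps")
```

Note that this raises aggregate throughput only. The latency of a single inference is still that of whichever engine runs it, so a request served by a slower DLA engine does not finish any sooner.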

Thanks.

feng871345432 · April 14, 2025, 2:17am

Hi,
I am concerned about both the GPU and the DLA. What I want is low latency when running inference with my model.
I tried the Orin NX and the Orin Nano and found that the Orin Nano is even faster than the Orin NX, which surprised me. Comparing the datasheets of the two devices, I found that the GPU frequency is 918 MHz on the Orin Nano (25W) but 408 MHz on the Orin NX, so the Orin NX is much slower than the Orin Nano. Since the Orin NX has DLA, I tried to use it to speed up inference, which is why I asked this question. However, the Orin NX with DLA seems even slower than the Orin NX without DLA.
So my question is: how can I get better throughput (lower latency) on the Orin NX? It should be at least better than the Orin Nano.

AastaLLL · April 14, 2025, 5:59am

Hi,

Please note that the Orin Nano doesn’t have DLA.
So all inference runs on the GPU.

Have you tried MaxN mode on the Orin NX? Its clock rate is 918 MHz, much higher than in the 25W mode.
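
A quick way to confirm which power mode is active before benchmarking is to query nvpmodel. The sketch below parses the output of nvpmodel -q; the sample string is an assumption about what that output looks like, and the exact format and mode names can differ between modules and JetPack releases.

```python
import re
import subprocess


def parse_power_mode(text):
    """Extract the mode name from `nvpmodel -q` style output, or None."""
    match = re.search(r"NV Power Mode:\s*(\S+)", text)
    return match.group(1) if match else None


def current_power_mode():
    """Query the active mode on the device (may need to be run with sudo)."""
    result = subprocess.run(
        ["nvpmodel", "-q"], capture_output=True, text=True, check=True
    )
    return parse_power_mode(result.stdout)


# Assumed sample output, used here so the parser can be tried off-device.
sample_output = "NV Power Mode: MAXN\n0\n"
print(parse_power_mode(sample_output))
```

On the Jetson itself, call current_power_mode() instead of parsing the sample string.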

Thanks.

feng871345432 · April 14, 2025, 6:04am

Thanks for your reply, @AastaLLL.
Yes, I know that the Orin Nano has no DLA, so it runs entirely on the GPU.
I tried MaxN mode on the Orin NX, and its speed is similar to that of the Orin Nano.
That is why I am confused. I chose the Orin NX instead of the Orin Nano to get more compute power, but even in MaxN mode its speed is almost the same. So I doubt whether I need the Orin NX at all, because it doesn’t provide more throughput for my model.
Do you have any advice on how to choose between the Orin NX and the Orin Nano? In which scenarios should I use the Orin NX, and in which is the Orin Nano enough?

Thanks so much

AastaLLL · April 24, 2025, 6:28am

Hi,

Sorry for the late update.

According to our documentation for r36.4.3:

Orin NX Super:
8x CPU @ 1984 MHz, GPU @ 1173 MHz, EMC @ 3200 MHz

Orin Nano Super:
6x CPU @ 1728 MHz, GPU @ 1020 MHz, EMC @ 2133 MHz

The clocks on the Orin NX are higher than those on the Orin Nano.
So the Orin NX is expected to deliver better performance/throughput.
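
As a rough illustration, the clock figures quoted above (treated as MHz) can be turned into ratios. This is only arithmetic on those numbers: clock ratios say nothing about how a particular model is bottlenecked, so they should not be read as a predicted speedup.

```python
# Clock figures as quoted in this post, in MHz.
clocks_mhz = {
    "Orin NX Super": {"cpu": 1984, "gpu": 1173, "emc": 3200},
    "Orin Nano Super": {"cpu": 1728, "gpu": 1020, "emc": 2133},
}


def clock_ratios(device_a, device_b):
    """Per-domain clock ratio of device_a over device_b."""
    return {domain: device_a[domain] / device_b[domain] for domain in device_a}


ratios = clock_ratios(clocks_mhz["Orin NX Super"], clocks_mhz["Orin Nano Super"])
for domain, ratio in ratios.items():
    print(f"{domain}: {ratio:.2f}x")
```

The GPU and CPU clocks differ by about 15%, while the memory (EMC) clock differs by about 50%.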

Thanks.
