DLA-v2 is slower than DLA-v1

Splendor027 · May 24, 2022, 6:47pm

Hi,

I’m testing the NVDLA on AGX Orin. Neural network inference is significantly slower than on Xavier AGX and Xavier NX. I’m building the engines by following the instructions in the dusty-nv/jetson-inference repository on GitHub (the Hello AI World guide). I can provide more details if necessary.

System details:
Jetson AGX Orin Developer Kit
Power mode: MAXN
nv_tegra_release output: # R34 (release), REVISION: 1.0
TensorRT 8.4.0

Results comparison (Orin vs. Xavier AGX):
AlexNet: 78 ms vs. 30 ms
GoogLeNet: 8.02 ms vs. 5.5 ms
VGG-19: 49.8 ms vs. 22 ms
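
To put the reported latencies in relative terms, here is a minimal sketch that computes the Orin-to-Xavier slowdown for each network. The numbers are copied from the comparison above; the names `reported_ms` and `slowdown` are illustrative only and do not come from the original post.

```python
# Latencies reported in the post, in milliseconds: (Orin, Xavier AGX).
reported_ms = {
    "AlexNet": (78.0, 30.0),
    "GoogLeNet": (8.02, 5.5),
    "VGG-19": (49.8, 22.0),
}


def slowdown(orin_ms, xavier_ms):
    """Return how many times slower the Orin run is than the Xavier AGX run."""
    return orin_ms / xavier_ms


ratios = {name: slowdown(orin, xavier) for name, (orin, xavier) in reported_ms.items()}

for name, ratio in ratios.items():
    print(f"{name}: Orin is {ratio:.2f}x slower than Xavier AGX")
```

With these figures the slowdown works out to roughly 1.5x to 2.6x, depending on the network.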

AastaLLL · May 25, 2022, 7:33am

Hi,

We can confirm that the performance difference is reproducible internally.

We are checking this issue with our internal team.
Will share more information with you later.

Splendor027 · May 25, 2022, 4:07pm

Thank you for confirming! Also, I’m using TensorRT scripts rather than the jetson-inference repo. I’m attaching my script here for reference.

building_engine_gpu_or_dla.py (1.7 KB)

The command that I use to run the engine with trtexec after building it: trtexec --iterations=100 --warmUp=2000 --batch=1 --useDLACore=0 --dumpProfile --loadEngine=alexnet_batch1_dla.engine
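
For anyone who wants to script the same measurement, the following is a small sketch that assembles the trtexec invocation above as an argument list. The helper name `trtexec_run_args` and its parameters are hypothetical; only the flags and values are taken from the command in this post.

```python
import shlex


def trtexec_run_args(engine_path, dla_core=0, iterations=100, warmup_ms=2000, batch=1):
    """Build the argument list for profiling a prebuilt engine with trtexec."""
    return [
        "trtexec",
        f"--iterations={iterations}",
        f"--warmUp={warmup_ms}",  # warm-up duration in milliseconds
        f"--batch={batch}",
        f"--useDLACore={dla_core}",
        "--dumpProfile",
        f"--loadEngine={engine_path}",
    ]


args = trtexec_run_args("alexnet_batch1_dla.engine")
print(shlex.join(args))

# On a Jetson where trtexec is on PATH, the list can be executed directly:
# import subprocess
# subprocess.run(args, check=True)
```

Keeping the arguments in a list makes it easy to loop over several engines or DLA cores without string concatenation.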

Another question: as far as I can tell, we can’t install any JetPack version other than 5.0.1/5.0, so there is no TensorRT version available other than 8.4.0. May I also ask whether this issue is reproducible with older TensorRT versions? I’m asking because a newer JetPack version may take some time to be released, which is quite understandable.

jaybdub · May 25, 2022, 7:33pm

Hi Splendor027,

If you haven’t already, I would recommend testing the models with INT8 precision enabled.

On a related note, we now have the following GitHub project to help you get started with the DLA. It covers defining and profiling a model, tweaking the model for better DLA compatibility, and performing INT8 calibration. It may help introduce you to some concepts related to working with the DLA.

Please let me know if you have any questions, or feel free to open an issue with feedback on the tutorial if you do take a look.

Best,
John

Splendor027 · May 25, 2022, 9:34pm

Hello @jaybdub,

Your work seems pretty awesome. I had been planning to create such a repo for a while since there is no detailed one. Thanks for sharing this publicly!

However, I did not observe any significant change on AGX Orin with your repo either. Could you please post any neural network results you have on AGX Orin?

AastaLLL · May 26, 2022, 2:55am

Hi,

We have checked this issue with our internal team and this is expected.
Due to the difference in hardware specification, a relative increase in latency is expected when running FP16 conv operations on Orin DLA as compared to running on Xavier DLA.

Thanks.

Splendor027 · May 30, 2022, 9:35pm

This is totally understandable. I appreciate your help on this, @AastaLLL. One more note on my initial post: the power results on Orin AGX seem to be as expected compared to Xavier AGX.

May I ask for further details on the Orin DLA? Doesn’t being 5-10x slower than the GPU on Orin make the DLA unusable for any software purposes?

Additionally, the Orin DLA was introduced to us as offering a 9x speed-up. May I ask whether there is any application domain in which we can use the DLA with such performance? Sorry for the frequent questions, but the prospect of better DLA performance motivated me a lot for my research!

AastaLLL · June 16, 2022, 5:34am

Hi,

We got further details from our internal team.

Orin’s DLA has more INT8 dense TOPS but fewer FP16 TOPS.
So if you run the model in INT8 mode, you can expect better performance than on Xavier’s DLA.

Thanks.
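
As a rough illustration of switching a build from FP16 to INT8 on the DLA, here is a minimal sketch using the TensorRT Python builder config. The helper `configure_for_dla` is hypothetical and is not taken from the attached script. It assumes a TensorRT 8.x installation, and an INT8 build additionally needs a calibrator or per-tensor dynamic ranges, which this sketch does not set up.

```python
def configure_for_dla(config, trt, precision="int8", dla_core=0):
    """Point a TensorRT builder config at a DLA core with the given precision.

    `config` is an IBuilderConfig and `trt` is the imported tensorrt module.
    Passing the module in keeps this helper importable without TensorRT.
    """
    flags = {"int8": trt.BuilderFlag.INT8, "fp16": trt.BuilderFlag.FP16}
    if precision not in flags:
        raise ValueError(f"DLA runs only int8 or fp16, got {precision!r}")
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = dla_core
    config.set_flag(flags[precision])
    # Let layers the DLA cannot run fall back to the GPU instead of failing the build.
    config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
    return config


if __name__ == "__main__":
    try:
        import tensorrt as trt
    except ImportError:
        print("TensorRT is not installed; run this on the Jetson.")
    else:
        builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
        config = configure_for_dla(builder.create_builder_config(), trt, "int8")
```

Building the same network once with "fp16" and once with "int8" and timing both engines is one way to check the FP16 versus INT8 difference described above on your own model.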

system · July 6, 2022, 7:38am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
