Is using the DLA for inference really more energy efficient?

Hi, I'm finding it hard to believe that the DLA is more energy efficient.

I used trtexec to measure the inference time and energy consumption of a model.
I used the ResNet-50 Caffe prototxt, but I removed the last layer (softmax) and renamed the last fully connected layer to prob, to avoid GPU fallback.
The commands I used were something like this:

./trtexec --avgRuns=100 --deploy=../models/Resnet_without_prob.prototxt --fp16 --batch=8 --iterations=10000 --output=prob --useDLACore=0 --useSpinWait
./trtexec --avgRuns=100 --deploy=../models/Resnet_without_prob.prototxt --fp16 --batch=8 --iterations=10000 --output=prob --useSpinWait

I measured power consumption with tegrastats and calculated (img/sec)/(total power consumption).
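Roughly, the calculation was like this (a minimal sketch; the rail names and numbers in the sample tegrastats line are illustrative, and the rails reported differ between Jetson models, so the regex may need adjusting for your board):

```python
import re

def parse_rails(line):
    # tegrastats reports each power rail as "NAME current/average" in milliwatts;
    # rail names (VDD_GPU, VDD_SOC, VDDRQ, ...) vary between Jetson models
    return {name: int(cur)
            for name, cur, avg in re.findall(r"(VDD\w*) (\d+)/(\d+)", line)}

def imgs_per_joule(images_per_sec, rails_mw):
    # (img/sec) / (total watts) = images per joule; sum sub-rails only,
    # or use the single input rail alone, to avoid double counting
    return images_per_sec / (sum(rails_mw.values()) / 1000.0)

# illustrative tegrastats-style line (not real measurements)
sample = ("RAM 4722/15692MB GR3D_FREQ 99% "
          "VDD_GPU 3100/3000 VDD_CPU 1200/1150 VDD_SOC 900/880 VDDRQ 700/690")
rails = parse_rails(sample)
print(rails)
print(imgs_per_joule(590.0, rails))
```

Averaging many tegrastats samples over the whole trtexec run, rather than reading a single line, gives a steadier number.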

But using the GPU with FP16 was always more efficient than the DLA with FP16. (I tried MAXN mode and 15W mode, both tested after running sudo jetson_clocks.)

Of course the DLA's own power consumption was low, but the VDDRQ, SOC, and CPU rails also draw power, and since the DLA's inference time was slower, the total energy per image came out worse.

Could you give me an example where I can verify that the DLA is more energy efficient?

Thanks.

=====================================
I’m really sorry.
I found that with other networks, such as GoogLeNet, using the DLA can be more energy efficient.

Thanks.

In my experience, the main benefit of the DLA is when you can run it in 8-bit mode. Assuming your model still performs well at that precision, the DLA can run reasonably fast and efficiently.
If you don't care about the CPU at all, try taking all but one CPU core offline and turning down the memory/GPU clocks. You can actually define your own power profiles in the /etc/nvpmodel.conf file and apply them.
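As an illustration, a custom profile in /etc/nvpmodel.conf looks something like the fragment below. The mode ID 8, the name, and the frequency values here are made up for this example, and the exact clause names vary between Jetson models, so copy one of the stock profiles on your board and edit it rather than typing this verbatim:

< POWER_MODEL ID=8 NAME=CUSTOM_DLA >
CPU_ONLINE CORE_0 1
CPU_ONLINE CORE_1 0
CPU_ONLINE CORE_2 0
CPU_ONLINE CORE_3 0
GPU MAX_FREQ 520000000
EMC MAX_FREQ 1065600000

You can then switch to the custom profile with sudo nvpmodel -m 8 and confirm it with sudo nvpmodel -q.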

Wow! Is it possible to use Xavier's DLA in 8-bit mode?

It's supposed to be, and NVIDIA has previously promised future support in TensorRT, but the published benchmarks only run in 16-bit mode:
https://devtalk.nvidia.com/default/topic/1044598/jetson-agx-xavier/jetson-agx-xavier-deep-learning-inference-benchmarks/

You should be able to use the 8-bit options in TensorRT, though.
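For example, a trtexec run requesting INT8 on the DLA would look something like this (the --int8 and --allowGPUFallback flags exist in trtexec today, but whether the DLA actually accepts INT8 layers depends on your JetPack/TensorRT version; without calibration data, trtexec uses dummy dynamic ranges, so this measures speed, not accuracy):

./trtexec --avgRuns=100 --deploy=../models/Resnet_without_prob.prototxt --int8 --batch=8 --iterations=10000 --output=prob --useDLACore=0 --allowGPUFallback --useSpinWait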

Support for INT8 on the Xavier DLA is coming soon and will be included in the next JetPack release. See this announcement thread:

https://devtalk.nvidia.com/default/topic/1055632/jetson-agx-xavier/new-jetson-software-modules-and-pricing/

Thanks!
I'll wait for the next JetPack!