Performance of iGPU and DLA

wang_chen2 · September 10, 2021, 2:49am

Software Version: DRIVE OS Linux 5.2.0 and DriveWorks 3.5
Target Operating System: Linux
Hardware Platform: NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
SDK Manager Version: 1.6.0.8170
Host Machine Version: native Ubuntu 18.04

Recently, I tested the performance of Xavier. I converted a model to run on the iGPU, DLA0, and DLA1.
The time cost is as follows:
igpu1: 105000ms
igpu2: 183000ms
igpu4: 350000ms
dla0: 245000ms
igpu1+dla0: 324000ms
igpu*2+dla0+dla1: 355000ms

So, first, the DLA is much slower than the iGPU.
Second, when I use the iGPU and the DLA together in two threads, it takes more time than using only the iGPU.

Not all layers of my model can run on the DLA; some need to fall back to the GPU. Does this seriously affect the iGPU?
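
For readers comparing the configurations, the timings above can be normalized to a relative throughput. This is only a rough sketch: it assumes each label corresponds to that many concurrent threads (for example, `igpu2` means two threads, and `igpu*2+dla0+dla1` means four) and that every thread processes the same workload. Neither assumption is stated in the post, and the function name `relative_throughput` is made up for illustration.

```python
# Timings copied from the post, in the units reported there.
# The thread counts are an ASSUMPTION inferred from the labels.
timings = {
    "igpu1": (1, 105000),
    "igpu2": (2, 183000),
    "igpu4": (4, 350000),
    "dla0": (1, 245000),
    "igpu1+dla0": (2, 324000),
    "igpu*2+dla0+dla1": (4, 355000),
}


def relative_throughput(timings, baseline="igpu1"):
    """Return work completed per unit time, relative to the baseline entry.

    Assumes every thread does the same amount of work, so the work done
    by a configuration is proportional to its thread count.
    """
    base_threads, base_time = timings[baseline]
    base_rate = base_threads / base_time
    return {
        name: (threads / elapsed) / base_rate
        for name, (threads, elapsed) in timings.items()
    }


if __name__ == "__main__":
    for name, ratio in relative_throughput(timings).items():
        print(f"{name:>18}: {ratio:.2f}x")
```

Under these assumptions, a ratio below 1.0 means the configuration completes less work per unit time than a single iGPU thread, which is the situation described for `igpu1+dla0`.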

SivaRamaKrishnaNV · September 10, 2021, 3:05am

Dear @wang_chen2,
The iGPU has more DL TOPS (performance) than the DLA. We generally notice that the iGPU takes less time than the DLA.
When GPU fallback is enabled, the layers that can’t run on the DLA move back to the iGPU. This involves an additional data transfer of the intermediate layer output, which increases the overall execution time.

wang_chen2 · September 10, 2021, 3:11am

Hi @SivaRamaKrishnaNV,
So, are my timings reasonable?
In the best case, will using the DLA not affect the iGPU?
Are there any suggestions for using the DLA to improve performance?

AastaLLL · September 10, 2021, 3:57am

Hi,

DLA is designed for low power rather than performance.
Usually, we recommend DLA when users want to free up the GPU or increase throughput.

I am not sure I understand your benchmark result correctly.
When you run the GPU along with the DLA, you should get 2x throughput, although the latency increases.

Also, could you try setting the environment variable below to see if it helps?

$ export CUDA_DEVICE_MAX_CONNECTIONS=32

You can find more details about this variable in the below topic:

Thanks.

wang_chen2 · September 10, 2021, 5:24am

Hi,
Yes, when I run the GPU along with the DLA, I get 2x throughput, but it takes more time than running only the GPU at the same throughput.

I exported this variable and there is no change.

Thank you very much.

AastaLLL · September 13, 2021, 5:40am

Hi,

Could you check the GPU fallback ratio of your model?
You can run a model on DLA and monitor the GPU utilization.

If the model depends on GPU a lot, the data transfer between GPU and DLA may cause a performance issue.

Thanks.
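
One way to see which layers would fall back is to ask the builder configuration directly. The sketch below assumes the TensorRT Python bindings are available and that `network` is an already parsed `INetworkDefinition` and `config` an `IBuilderConfig` set up for DLA; the helper name `dla_fallback_layers` is made up for illustration. Where only the C++ API is available, `IBuilderConfig::canRunOnDLA` serves the same purpose.

```python
def dla_fallback_layers(network, config):
    """Return the names of layers the builder config cannot place on DLA.

    `network` is expected to behave like a TensorRT INetworkDefinition
    (num_layers, get_layer) and `config` like an IBuilderConfig
    (can_run_on_DLA). Both objects are assumed to be created elsewhere.
    """
    fallback = []
    for index in range(network.num_layers):
        layer = network.get_layer(index)
        if not config.can_run_on_DLA(layer):
            fallback.append(layer.name)
    return fallback


# Hypothetical usage, assuming `import tensorrt as trt` and an existing
# builder, network, and config:
#   config.default_device_type = trt.DeviceType.DLA
#   config.DLA_core = 0
#   config.set_flag(trt.BuilderFlag.FP16)
#   print(dla_fallback_layers(network, config))
```

An empty list suggests the whole network can stay on the DLA, so no intermediate tensors need to move between the DLA and the GPU.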

wang_chen2 · September 13, 2021, 5:46am

Hi AastaLLL,
Yes, there are 5 layers that need to fall back to the GPU. I am trying to eliminate the fallback and will then test the performance again.

AastaLLL · September 15, 2021, 7:19am

Hi,

Do you want to update the layers to be DLA compatible?
This would be a better way to remove the dependency between the DLA and the GPU.

Thanks.

wang_chen2 · September 17, 2021, 1:31am

Yes, I have updated the layers to be DLA compatible and it works.
Thank you very much.
