DLA and GPU running at the same time - performance question

Hi,

Sorry for the late update.

The threading code can be found in this file:
/usr/src/tensorrt/samples/common/sampleInference.cpp

// One worker thread per stream when --threads is enabled, otherwise a single thread drives all streams.
int threadsNum = inference.threads ? inference.streams : 1;
int streamsPerThread = inference.streams / threadsNum;

std::vector<std::thread> threads;
for (int t = 0; t < threadsNum; ++t)
{
    threads.emplace_back(makeThread(inference, iEnv, sync, t, streamsPerThread, device, trace));
}
for (auto& th : threads)
{
    th.join();
}

....
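
A minimal sketch of the same idea outside the sample, assuming the two engines (one built for DLA with GPU fallback, one for GPU) are already deserialized and their binding buffers allocated; the engine and buffer names here are hypothetical:

#include <thread>
#include <cuda_runtime_api.h>
#include <NvInfer.h>

void runEngine(nvinfer1::ICudaEngine* engine, void** bindings, int iterations)
{
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    for (int i = 0; i < iterations; ++i)
    {
        // Asynchronous enqueue; the DLA-built engine runs on the DLA
        // (unsupported layers fall back to the GPU), the other engine on the GPU.
        context->enqueueV2(bindings, stream, nullptr);
    }
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    context->destroy();
}

// One thread per engine so both devices are fed independently:
// std::thread tDla(runEngine, dlaEngine, dlaBindings, 1000);
// std::thread tGpu(runEngine, gpuEngine, gpuBindings, 1000);
// tDla.join(); tGpu.join();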

Thanks.

Hi @AastaLLL,
I've used the threading code you supplied above (it's from TensorRT 7, right?).
I still don't see any better performance; it's as if they interfere with each other, especially with a network where not all layers can run on the DLA and which falls back to the GPU at the beginning and end of the network.

I see a ~30% performance penalty on both the GPU and the DLA compared to when each of them runs alone.

thanks
Eyal

@AastaLLL,
Thanks for the assistance. I'm still not seeing the DLA and GPU running at the same time. Could it be that I must use TensorRT 7 and JetPack 4.4? I'm currently using JetPack 4.3 and TensorRT 5.1.

And another question, please: if I recall correctly, DLA operations do NOT appear in NVVP, right? So if I run my network on the DLA but still see ~40-50% of the NVVP timeline showing (some?) GPU operations, does that mean the DLA is falling back to the GPU?

thanks
Eyal

Hi,

You can find the detailed support matrix here:

Not all TensorRT layers have a DLA implementation.
Unsupported layers fall back to the GPU and consume GPU resources.

So if your model has some fallback layers, lower performance is expected when a GPU pipeline is running at the same time.
Since Jetson has only one GPU, the fallback layers and the GPU engine have to take turns waiting for the GPU.
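
To see which layers will fall back before building the engine, here is a minimal sketch assuming the TensorRT 7 IBuilderConfig API; the helper name reportDlaFallback is hypothetical, and network/config are the usual builder objects from your setup:

#include <iostream>
#include <NvInfer.h>

void reportDlaFallback(nvinfer1::INetworkDefinition* network, nvinfer1::IBuilderConfig* config)
{
    // Request DLA for the whole network and allow GPU fallback.
    config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    config->setDLACore(0);
    config->setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);

    // List the layers the DLA cannot run; these will occupy the GPU at runtime.
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        if (!config->canRunOnDLA(layer))
        {
            std::cout << "Falls back to GPU: " << layer->getName() << std::endl;
        }
    }
}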

Thanks.