Multiple models on DLAs in AGX Xavier 32TOPs

Can the DLAs on the AGX Xavier run multiple models (networks) at the same time? Or are the DLAs bound to a specific network and impractical to swap in at a fast enough rate? Given there are 2 DLA cores, how will it work as the number of models in use on the Xavier increases?

Hi,

It’s recommended to keep the same model deployed on the same DLA (rather than swapping models in and out of a DLA core).

Since there are two DLAs on the Xavier, you can deploy two models, one on each DLA.
You can use setDLACore(…) to specify which DLA core to use.
See: TensorRT: nvinfer1::IRuntime Class Reference
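For example, a minimal sketch of the runtime-side call, assuming you have your own ILogger implementation and a plan file that was already built for DLA (the function and variable names here are just placeholders):

#include <NvInfer.h>
#include <cstddef>

// Sketch: select the DLA core on the IRuntime before deserializing a plan
// that was built for DLA. `logger`, `planData` and `planSize` stand in for
// your own ILogger and serialized engine data.
nvinfer1::ICudaEngine* loadOnDla(nvinfer1::ILogger& logger,
                                 const void* planData, std::size_t planSize,
                                 int dlaCore /* 0 or 1 on Xavier */)
{
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    runtime->setDLACore(dlaCore);
    return runtime->deserializeCudaEngine(planData, planSize);
}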

Thanks

Thank you, but this does not quite answer the question. Assume I want to run inference with 4 models at the same time. Can the DLAs be used for all 4 models?
For example:
model1.setDLACore(0)
model2.setDLACore(1)
model3.setDLACore(0)
model4.setDLACore(1)
Then perform inference using all 4 models.

Hi,

This depends on the use case.
There is some overhead each time a model is deployed on a DLA.
To give a more specific suggestion, we would need to know the exact execution time of each model first.

By the way, how about the GPU?
If the GPU is not occupied, it’s recommended to use the following setting for minimal latency.

model1: DLA 0
model2: DLA 1
model3: GPU 0
model4: GPU 0

Also, you can choose which models to put on the DLAs based on the ratio of layers the DLA supports (unsupported layers fall back to the GPU).
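For reference, a minimal sketch of how a build can be pointed at a DLA core or at the GPU through the TensorRT C++ builder config (network creation and parsing are omitted, so this shows only the device-selection part):

#include <NvInfer.h>

// Sketch: route an engine build to DLA core `dlaCore`, or to the GPU when
// dlaCore < 0. DLA builds require FP16 or INT8, and kGPU_FALLBACK lets the
// layers the DLA cannot run fall back to the GPU.
void targetDevice(nvinfer1::IBuilderConfig& config, int dlaCore)
{
    if (dlaCore >= 0)
    {
        config.setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
        config.setDLACore(dlaCore);
        config.setFlag(nvinfer1::BuilderFlag::kFP16);          // DLA needs FP16/INT8
        config.setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);  // unsupported layers -> GPU
    }
    else
    {
        config.setDefaultDeviceType(nvinfer1::DeviceType::kGPU);
    }
}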
Thanks.

@AastaLLL ,
I have one model (the PeopleNet TLT model), and I want to run the same model on the GPU and on both DLAs.
Do I need to load the model three times for this?
Is it possible to load it only twice, one copy for the GPU and one shared copy for the two DLAs? When I load three copies of the model, one for each, the memory of the Jetson Xavier NX runs out, but with only two copies the memory is sufficient.

Hi,
You need to create and load an engine per DLA/GPU. So in your case you should create 3 engines, load a plan file into each one of them, and execute the 3 engines.
You also need to build and run inference for each of the hardware components on a different CPU thread and CUDA stream (as far as I could tell). A minimal sketch of the loading part is below.
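A minimal sketch of the “one engine per device” part, assuming one plan file per target was already built (the file names here are hypothetical) and you have your own ILogger; error handling is omitted:

#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Sketch: deserialize one engine per hardware unit. A plan built for a DLA
// core must be loaded through a runtime whose DLA core matches.
nvinfer1::ICudaEngine* loadEngine(nvinfer1::IRuntime& runtime,
                                  const std::string& planPath)
{
    std::ifstream file(planPath, std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());
    return runtime.deserializeCudaEngine(blob.data(), blob.size());
}

void loadAll(nvinfer1::ILogger& logger)
{
    nvinfer1::IRuntime* gpuRt  = nvinfer1::createInferRuntime(logger);
    nvinfer1::IRuntime* dla0Rt = nvinfer1::createInferRuntime(logger);
    nvinfer1::IRuntime* dla1Rt = nvinfer1::createInferRuntime(logger);
    dla0Rt->setDLACore(0);
    dla1Rt->setDLACore(1);

    nvinfer1::ICudaEngine* gpuEngine  = loadEngine(*gpuRt,  "peoplenet_gpu.plan");
    nvinfer1::ICudaEngine* dla0Engine = loadEngine(*dla0Rt, "peoplenet_dla0.plan");
    nvinfer1::ICudaEngine* dla1Engine = loadEngine(*dla1Rt, "peoplenet_dla1.plan");

    // Create one IExecutionContext per engine and run each from its own
    // CPU thread and CUDA stream (see the sketch further down the thread).
    (void)gpuEngine; (void)dla0Engine; (void)dla1Engine;
}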

thanks
Eyal

Thanks, @eyalhir74 ,
Did you test with deepstream-python-apps? If yes, do I have to run three Python apps with three different configs in separate terminals?

Hi,
Sorry, no. I’ve only done C++ and the C++ TensorRT API.

thanks
Eyal

@eyalhir74 ,
Is it possible to share your code or a reference GitHub repo?

Sorry, it’s proprietary.
However, you can look at the code of trtexec under /usr/src/tensorrt/samples/trtexec.
Basically, create a CPU thread + TRT objects (builder/context/engine/runtime) + a CUDA stream per hardware component you want (GPU, DLA 0, DLA 1) and you’re set. A sketch of that structure is below.
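A minimal sketch of that structure, assuming the engines were already loaded (one per device) and that `bindings` points at device buffers you allocated for each engine’s inputs/outputs; the frame loop and error checking are placeholders:

#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <thread>

// Sketch: one CPU thread + one CUDA stream + one execution context per
// hardware unit (GPU, DLA 0, DLA 1).
void inferenceWorker(nvinfer1::ICudaEngine* engine, void** bindings)
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    for (int i = 0; i < 100; ++i)   // replace with your own frame loop
    {
        context->enqueueV2(bindings, stream, nullptr);
        cudaStreamSynchronize(stream);
    }
    cudaStreamDestroy(stream);
}

void runAll(nvinfer1::ICudaEngine* gpuEngine,  void** gpuBindings,
            nvinfer1::ICudaEngine* dla0Engine, void** dla0Bindings,
            nvinfer1::ICudaEngine* dla1Engine, void** dla1Bindings)
{
    // One worker thread per hardware component; they run concurrently.
    std::thread tGpu (inferenceWorker, gpuEngine,  gpuBindings);
    std::thread tDla0(inferenceWorker, dla0Engine, dla0Bindings);
    std::thread tDla1(inferenceWorker, dla1Engine, dla1Bindings);
    tGpu.join();
    tDla0.join();
    tDla1.join();
}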

thanks
Eyal

@eyalhir74,
Did you run TLT models with your suggested solution?

CPU thread + TRT objects (builder/context/engine/runtime) + CUDA stream per Hardware component.

Both a proprietary version of ResNet and a few other public networks, just to test the solution.
However, it doesn’t matter. Any model that you can build for the DLA/GPU and use with trtexec, you can run in the manner I’ve described (unless, I guess, there are memory issues etc., which I didn’t see).

thanks
Eyal

I’m on the Xavier NX.

How about Python? E.g., if I want to run deepstream_test_1_usb.py on a specific DLA or the GPU? Thanks.