DLA usage on my Xavier

Hi,
I am considering using the DLA cores on my AGX Xavier, so I ran trtexec as a test, but I got the following warnings:


I want to ask the following questions:
1. What are the host walltime and percentile time shown in the log?
2. What does "DLA supports only 3 subgraphs per DLA core" mean? Does it mean my DLA version is too old and I need to update?
3. Now I want to run models on the GPU, DLA0, and DLA1 at the same time. What should I do?

Thanks in advance!

Does anyone have a reply?

Hi,

1. Host walltime is the total time measured on the CPU host, and the percentile time is the latency at the given percentile across the measured iterations.
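As a rough illustration of how these two metrics relate, here is a small Python sketch. The function name and the timing values are made up for this example, not taken from trtexec itself:

```python
# Sketch: deriving a host walltime and a percentile latency from
# per-iteration host-side timings (milliseconds). The numbers below
# are illustrative only, not real trtexec output.

def summarize_timings(latencies_ms, percentile=99):
    """Return total host walltime and the latency at `percentile`."""
    host_walltime_ms = sum(latencies_ms)  # total time spent on the host
    ordered = sorted(latencies_ms)
    # Index of the iteration below which `percentile`% of runs fall.
    idx = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
    return host_walltime_ms, ordered[idx]

latencies = [1.9, 2.0, 2.1, 2.0, 5.0]  # one slow outlier iteration
walltime, p99 = summarize_timings(latencies, percentile=99)
print(walltime)  # 13.0 -- total host walltime
print(p99)       # 5.0  -- the 99th percentile is dominated by the outlier
```

This is why the percentile time can be much larger than the average: a single slow iteration (e.g. a cold start) shows up in the high percentiles.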

2. This warning indicates that the model already occupies all of the DLA subgraph resources.
The extra operations are placed on the GPU instead.

3. Each engine can only be attached to one target processor.
You can create three TensorRT engines, from the same or different models, one for each processor.
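As a sketch, the three engines could be built from the same model with trtexec (`model.onnx` and the engine filenames are placeholders, not from this thread):

```shell
# Build one engine per processor from the same model.
trtexec --onnx=model.onnx --saveEngine=model_gpu.engine
# DLA engines: --allowGPUFallback lets unsupported layers run on the GPU.
trtexec --onnx=model.onnx --useDLACore=0 --allowGPUFallback --saveEngine=model_dla0.engine
trtexec --onnx=model.onnx --useDLACore=1 --allowGPUFallback --saveEngine=model_dla1.engine
```

Each engine can then be loaded and run in its own process (or thread with its own execution context) so the GPU and both DLA cores execute concurrently.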

Thanks.

Oh, many thanks for your reply, I am really new here!

Based on your reply, I realize DLA is better suited for small models, and now my questions are:
1. How can I generate models that can run on DLA?
2. How do I bind and specify the DLA device so that I can run the DLA model and get inference results? Any samples?

Thanks a lot!

Hi,

1. You can find the support matrix for DLA layers here:
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#dla_layers

2. You can use --useDLACore=0 for DLA-0 and --useDLACore=1 for DLA-1.
Here is an example for your reference:
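A minimal sketch of such a command (the model filename is a placeholder):

```shell
# Run inference on DLA core 0; unsupported layers fall back to the GPU.
trtexec --onnx=model.onnx --useDLACore=0 --allowGPUFallback
```

Swap in `--useDLACore=1` to target the second DLA core instead.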

Thanks.
