How to tell whether a layer in a TRT engine file runs on DLA or GPU


Dear friends,

Is there any way to tell from a deserialized engine plan file whether a layer (or the whole model) is running on the DLA or the GPU?

I'm experimenting with several models running simultaneously on an Orin board, using its GPU and DLA0/DLA1 resources.
But I can't tell which .trt file (serialized engine plan binary) is running on DLA0, DLA1, or the GPU.

Please let me know guys :)


TensorRT Version: 8.4

Please check the below links, as they might answer your concerns.


Dear NVES,
The documentation only explains how to set up a DL network to run on DLA (or DLA standalone).

What I want to know is whether a deserialized engine plan file is running on the DLA or the GPU.

In the deserialization phase, there are only nvinfer1::ICudaEngine, nvinfer1::IExecutionContext, and nvinfer1::IRuntime.

Is there any way to find out where the execution context is running after the deserialization phase?


EngineInspector will tell you about that to some extent:

If layers are running on DLA, those layers will appear as a single DLA runner layer.
You can also use polygraphy inspect model as a front end for the engine inspector.
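
One way to act on the inspector output is to pull each layer's description as a string and search it for whatever text marks the DLA runner layer in your build. The following is only a minimal sketch under that assumption: the helper names layerMatchesMarker and findMarkedLayers are hypothetical (not TensorRT APIs), and the marker string is left to the caller because the exact text that identifies a DLA layer should be checked against your own engine's output. The TensorRT calls appear only in a comment so the helpers compile without the SDK.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a TensorRT API): true if a layer description
// contains the given marker substring. An empty marker never matches.
bool layerMatchesMarker(const std::string& layerInfo, const std::string& marker)
{
    return !marker.empty() && layerInfo.find(marker) != std::string::npos;
}

// Hypothetical helper: indices of all layer descriptions that contain the marker.
std::vector<std::size_t> findMarkedLayers(const std::vector<std::string>& layerInfos,
                                          const std::string& marker)
{
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < layerInfos.size(); ++i) {
        if (layerMatchesMarker(layerInfos[i], marker)) {
            hits.push_back(i);
        }
    }
    return hits;
}

// With TensorRT available, layerInfos could be filled roughly like this
// (engine is an nvinfer1::ICudaEngine* obtained from deserialization):
//   nvinfer1::IEngineInspector* inspector = engine->createEngineInspector();
//   for (int32_t i = 0; i < engine->getNbLayers(); ++i)
//       layerInfos.emplace_back(inspector->getLayerInformation(
//           i, nvinfer1::LayerInformationFormat::kONELINE));
```

If findMarkedLayers returns an empty list for the marker you expect, that would suggest either that no layer was placed on the DLA or that the engine was built with too little profiling verbosity for the inspector to show it.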

Thank you.

Thank you @spolisetty.

I think the engine inspector is not giving the needed info, even though the engine was serialized using the kDETAILED config flag.

Is there actually any method that reports the HW info (whether a layer is running on the GPU or the DLA)?

Two follow-up questions:

Suppose I serialized my model using config->setProfilingVerbosity(nvinfer1::ProfilingVerbosity::kDEFAULT);
If I want to see the detailed information (kDETAILED info),
is re-serializing with kDETAILED the only way to see it?

If I serialize the ONNX model using kDETAILED, is it as fast as the one serialized with kDEFAULT?


Sorry for the delayed response.

Maybe your network layers are not running on the DLA. Could you please let us know which platform you're using?

Q1: Yes, an engine built with kDETAILED will have more info stored.
Q2: Yes, the speed should be the same.

Thank you.
