Is there any way to tell, from a deserialized engine plan file, whether a layer (or the whole model) is running on the DLA or the GPU?
I’m experimenting with several models running simultaneously on an Orin board, utilizing its GPU and DLA0/DLA1 resources.
But I’m confused about which .trt (serialized engine plan binary) is running on DLA0, DLA1, or the GPU.
Please let me know guys :)
TensorRT Version: 8.4
GPU Type: ORIN
Nvidia Driver Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
Steps To Reproduce
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered
Please check the below links, as they might answer your concerns.
The documentation only describes how to configure a network to run on the DLA (or DLA standalone).
What I want to know is whether a deserialized engine plan file is running on the DLA or the GPU.
In the deserialization phase, there are only nvinfer1::ICudaEngine, nvinfer1::IExecutionContext, and nvinfer1::IRuntime.
Is there any way to know where the execution context is running after the deserialization phase?
EngineInspector will tell you about that to some extent:
if layers are running on the DLA, those layers will appear as a single DLA runner layer.
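If you have trtexec available, one way to surface this layer information is sketched below. This assumes TensorRT 8.4's trtexec flags; `model.plan` is a hypothetical file name, and per-layer detail is only available if the engine was built with detailed profiling verbosity.

```shell
# Load an existing engine and dump its layer information.
# Layers offloaded to the DLA appear as a single DLA runner entry.
trtexec --loadEngine=model.plan \
        --dumpLayerInfo \
        --profilingVerbosity=detailed
```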
You can also use
polygraphy inspect model as a front end for the engine inspector.
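For example (a sketch; `model.plan` is a hypothetical file name, and the exact flags may vary across Polygraphy versions):

```shell
# Inspect a serialized TensorRT engine and list its layers.
polygraphy inspect model model.plan \
    --model-type engine \
    --show layers
```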
Thank you @spolisetty.
I don’t think the engine inspector gives the needed info, even when the engine is serialized with the
kDETAILED profiling verbosity.
Is there really no method that gives the hardware info (whether a layer is running on the GPU or the DLA)?
A couple of follow-up questions:
Q1: If I want to see the detailed (kDETAILED) information, is re-serializing the engine with
kDETAILED the only way to get it?
Q2: If I serialize the ONNX model with
kDETAILED, is its speed the same as one serialized with
kDEFAULT?
Sorry for the delayed response.
Maybe your network layers are not actually running on the DLA. Could you please let us know which platform you’re using?
Q1: Yes, the re-serialized engine will have more info stored.
Q2: Yes, the inference speed should be the same.
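For Q1, rebuilding from the ONNX model with detailed verbosity might look like this (a sketch assuming TensorRT 8.4's trtexec; `model.onnx` and the output name are placeholders):

```shell
# Build a DLA engine that retains detailed per-layer information,
# falling back to the GPU for layers the DLA cannot run.
trtexec --onnx=model.onnx \
        --useDLACore=0 --allowGPUFallback \
        --profilingVerbosity=detailed \
        --saveEngine=model_detailed.plan
```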
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.