I’m working on a Jetson AGX Orin to run a pre-trained TensorRT model. The model by default runs on the CUDA cores, but I want to free them up for other tasks. I read the following, GitHub - NVIDIA-AI-IOT/jetson_dla_tutorial: A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson, about the use of the DLA hardware, but my understanding is that it is helpful for training models, not for running a ready pre-trained model. Please advise if there is any documentation on how to run a pre-trained model on the Tensor Cores or the DLA.
If you have a pretrained ONNX model, you can use trtexec to build a TensorRT engine with layers offloaded to the DLA. Note that only DLA-supported layers can run on the DLA; unsupported layers fall back to the GPU. Please use --useDLACore to select the DLA core and --allowGPUFallback to enable unsupported layers to run on the GPU.
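As a concrete sketch of the conversion step above (the file names model.onnx and model_dla.engine are placeholders, and this must be run on the Jetson itself with TensorRT installed):

```shell
# Hypothetical example: build a TensorRT engine from an ONNX model with
# layers offloaded to DLA core 0; unsupported layers fall back to the GPU.
# "model.onnx" and "model_dla.engine" are placeholder file names.
trtexec --onnx=model.onnx \
        --saveEngine=model_dla.engine \
        --useDLACore=0 \
        --allowGPUFallback \
        --fp16   # DLA runs in FP16 or INT8, not FP32
```

With --verbose, trtexec logs which layers were placed on the DLA and which fell back to the GPU, so you can check how much of the network actually left the CUDA cores.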
For usage of the trtexec tool: Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
DLA-supported layers: Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
Thank you for the clarification. So if I have a TensorRT pre-trained model, it usually can’t be used directly, since it will execute only on the GPU, and I will need the ONNX → TensorRT conversion through trtexec to enable the use of the DLA?
Yes, you are correct.
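For completeness, loading the rebuilt engine at runtime might look like the sketch below. This assumes a Jetson environment with the TensorRT Python bindings installed; the file name model_dla.engine is a placeholder, and the engine must have been built with DLA enabled as described above.

```python
# Hypothetical sketch: deserialize a DLA-enabled TensorRT engine on Jetson.
# Requires the TensorRT Python bindings (available in JetPack, not a generic
# pip environment); "model_dla.engine" is a placeholder file name.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
runtime.DLA_core = 0  # select which DLA core executes the DLA layers

with open("model_dla.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
# ... allocate device buffers and run inference with the execution context
# as you would for a GPU-only engine; DLA placement is baked into the engine.
```

The key point is that DLA placement is decided at engine-build time, not at load time, which is why the original GPU-only engine cannot simply be redirected to the DLA.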
Also check out the DLA GitHub page for samples and resources: Recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
We have a FAQ page that addresses some common questions that we see developers run into: Deep-Learning-Accelerator-SW/FAQ
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.