Cannot run model exported from TLT on Jetson's DLA

Description

I am using TLT 2.0 (docker image nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3) to do transfer learning with detectnet_v2 resnet18. More precisely, I am following this tutorial: https://github.com/NVIDIA-AI-IOT/face-mask-detection. After training, I export the model with FP16 precision as an .etlt file.

Now I want to run the model with TensorRT on a Jetson AGX Xavier, on the DLA. For that I am using tlt-converter to generate the .engine/.trt file. Because I have TensorRT 6.0, I am using this converter: https://developer.nvidia.com/tlt-converter-trt60. After that, I use trtexec to try to run inference on the DLA. Sadly, the model appears to run only on the GPU.

Environment

TensorRT Version: 6.0
GPU Type: Xavier AGX
Operating System + Version: JetPack 4.3

Steps To Reproduce

  • Exported the trained model with:
tlt-export detectnet_v2 \
            -o resnet18_detector.etlt \
            -m resnet18_detector.tlt \
            -k key \
            --data_type fp16
  • Then on the Jetson, converted the .etlt model to a TensorRT engine with:
tlt-converter -k key \
              -d "3,544,960" \
              -o "output_cov/Sigmoid,output_bbox/BiasAdd" \
              -e resnet18_detector.trt \
              -m 16 \
              -t fp16 \
              resnet18_detector.etlt

But the build log showed that all layers run on the GPU:

[INFO] 
[INFO] --------------- Layers running on DLA: 
[INFO] 
[INFO] --------------- Layers running on GPU: 
[INFO] conv1/convolution + activation_1/Relu, block_1a_conv_1/convolution + block_1a_relu_1/Relu, block_1a_conv_shortcut/convolution, block_1a_conv_2/convolution + add_1/add + block_1a_relu/Relu, block_1b_conv_1/convolution + block_1b_relu_1/Relu, block_1b_conv_2/convolution + add_2/add + block_1b_relu/Relu, block_2a_conv_1/convolution + block_2a_relu_1/Relu, block_2a_conv_shortcut/convolution, block_2a_conv_2/convolution + add_3/add + block_2a_relu/Relu, block_2b_conv_1/convolution + block_2b_relu_1/Relu, block_2b_conv_2/convolution + add_4/add + block_2b_relu/Relu, block_3a_conv_1/convolution + block_3a_relu_1/Relu, block_3a_conv_shortcut/convolution, block_3a_conv_2/convolution + add_5/add + block_3a_relu/Relu, block_3b_conv_1/convolution + block_3b_relu_1/Relu, block_3b_conv_2/convolution + add_6/add + block_3b_relu/Relu, block_4a_conv_1/convolution + block_4a_relu_1/Relu, block_4a_conv_shortcut/convolution, block_4a_conv_2/convolution + add_7/add + block_4a_relu/Relu, block_4b_conv_1/convolution + block_4b_relu_1/Relu, block_4b_conv_2/convolution + add_8/add + block_4b_relu/Relu, output_bbox/convolution, output_cov/convolution, output_cov/Sigmoid, 
[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[INFO] Detected 1 inputs and 2 output network tensors.
  • Finally I tried to run it on the DLA:
trtexec --loadEngine=resnet18_detector.trt --batch=1 --useDLACore=0 --fp16 --verbose

But it appears to be using the GPU (checked with jtop GPU utilization). Also, running without --useDLACore gives exactly the same inference time.

The tutorial mentioned above showed that it is possible to run this model on the DLA. Where am I going wrong, and how can I make it run on the DLA?

Moving this topic into the TLT forum.

I am afraid your tlt-converter does not support DLA. You can run “tlt-converter -h” to check. Note also that DLA placement is decided when the engine is built and baked into the serialized engine, so passing --useDLACore to trtexec when merely loading a GPU-built engine cannot move layers onto the DLA.
Please download the DLA version, https://developer.nvidia.com/assets/TLT/Secure/tlt-converter-7.1-dla.zip, and retry.
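A sketch of what the conversion could look like with that DLA-enabled build (assumption: the 7.1-dla converter exposes a `-u` flag for the DLA core index — verify against its own `tlt-converter -h` output; file names follow the earlier commands):

```shell
# Hypothetical invocation of the DLA-enabled tlt-converter build; -u selects
# the DLA core at engine-build time. Guarded so the sketch degrades
# gracefully on a machine where the converter is not installed.
if command -v tlt-converter >/dev/null 2>&1; then
  tlt-converter -k key \
    -d "3,544,960" \
    -o "output_cov/Sigmoid,output_bbox/BiasAdd" \
    -u 0 \
    -t fp16 \
    -m 16 \
    -e resnet18_detector_dla.trt \
    resnet18_detector.etlt
else
  echo "tlt-converter not found on PATH; run this step on the Jetson"
  CONVERTER_MISSING=1
fi
```

The build log should then list layers under "Layers running on DLA"; anything left on the GPU falls back only if GPU fallback is permitted.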

Or you can deploy your .etlt model with DeepStream directly and make sure DLA is enabled in the config file.

If you want the DLA to run inference, you need to set the following in the DeepStream config file:
enable-dla=1
use-dla-core=1
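For context, a sketch of how those keys might sit in the Gst-nvinfer config section (the model file, key, and dimensions are placeholders taken from the commands earlier in this thread; check the nvinfer config reference for your DeepStream version):

```
[property]
tlt-encoded-model=resnet18_detector.etlt
tlt-model-key=key
network-mode=2          # 0=FP32, 1=INT8, 2=FP16
enable-dla=1
use-dla-core=1
```

On first run, DeepStream builds and serializes the engine from the .etlt file with these settings, so the DLA placement is baked into the generated engine.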


I struggle to find links to the different versions of that tool. Is there a page listing the available tlt-converter versions? Moreover, is there a DLA-enabled tlt-converter for TensorRT 6? Since my machine has TensorRT 6, my guess is that the version you provided is not compatible.

Instead, I suggest you deploy your .etlt model with DeepStream directly and make sure DLA is enabled in the config file.

DeepStream is not necessary; I already have deployment code that uses TensorRT directly, integrated with the rest of the software stack. I just need to generate a valid TensorRT engine with DLA support. I would appreciate it a lot if you could answer my previous questions about tlt-converter links and versions :D

Hey, after running with DeepStream and DLA enabled, the TRT engine will be generated. That is what you expect.
As for a tlt-converter DLA version for TRT 6, I will check further, but I am afraid it is not available.