Layers not using DLA cores on NX

Description

I’m using a simple YOLO model and trying to convert it to a .engine file to run with DeepStream. When I use trtexec to create the engine file, I see the following output:

----------------------------------------------------------------
Input filename:   YOLO.onnx
ONNX IR version:  0.0.4
Opset version:    11
Producer name:    pytorch
Producer version: 1.3
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
[01/14/2021-11:57:14] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/14/2021-11:57:14] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-11:57:14] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-11:57:14] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-11:57:14] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-11:57:15] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-11:57:15] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-11:57:15] [W] [TRT] DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU
[01/14/2021-11:57:16] [I] [TRT]
[01/14/2021-11:57:16] [I] [TRT] --------------- Layers running on DLA:
[01/14/2021-11:57:16] [I] [TRT]
[01/14/2021-11:57:16] [I] [TRT] --------------- Layers running on GPU:
[01/14/2021-11:57:16] [I] [TRT] 1339[Constant], 900[Constant], 461[Constant], (Unnamed Layer* 884) [Identity], (Unnamed Layer* 885) [Shuffle], (Unnamed Layer* 890) [Identity], (Unnamed Layer* 891) [Shuffle], (Unnamed Layer* 1890) [Identity], (Unnamed Layer* 1891) [Shuffle], (Unnamed Layer* 1896) [Identity], (Unnamed Layer* 1897) [Shuffle], (Unnamed Layer* 2896) [Identity], (Unnamed Layer* 2897) [Shuffle], (Unnamed Layer* 2902) [Identity], (Unnamed Layer* 2903) [Shuffle], (Unnamed Layer* 0) [Convolution], (Unnamed Layer* 2) [Activation], (Unnamed Layer* 3) [Convolution], (Unnamed Layer* 5) [Activation], (Unnamed Layer* 6) [Convolution], (Unnamed Layer* 8) [Activation], (Unnamed Layer* 48) [Slice], (Unnamed Layer* 49) [Convolution], (Unnamed Layer* 51) [Activation], (Unnamed Layer* 52) [Convolution], (Unnamed Layer* 54) [Activation], (Unnamed Layer* 56) [Convolution], (Unnamed Layer* 58) [Activation], 141 copy, (Unnamed Layer* 60) [Pooling], (Unnamed Layer* 61) [Convolution], (Unnamed Layer* 63) [Activation], (Unnamed Layer* 103) [Slice], (Unnamed Layer* 104) [Convolution], (Unnamed Layer* 106) [Activation], (Unnamed Layer* 107) [Convolution], (Unnamed Layer* 109) [Activation], (Unnamed Layer* 111) [Convolution], (Unnamed Layer* 113) [Activation], 171 copy, (Unnamed Layer* 115) [Pooling], (Unnamed Layer* 116) [Convolution], (Unnamed Layer* 118) [Activation], (Unnamed Layer* 158) [Slice], (Unnamed Layer* 159) [Convolution], (Unnamed Layer* 161) [Activation], (Unnamed Layer* 162) [Convolution], (Unnamed Layer* 164) [Activation], (Unnamed Layer* 166) [Convolution], (Unnamed Layer* 168) [Activation], 201 copy, (Unnamed Layer* 170) [Pooling], (Unnamed Layer* 171) [Convolution],
....... <and so forth> ........

As you can see, no layers are running on the DLA cores of the NX, which surprised me. Is there a reason why this is happening?
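For reference, the command I ran was along these lines (the input tensor name `input` and the shape values here are placeholders, not my exact invocation):

```shell
# Build an engine targeting DLA core 0, with GPU fallback for unsupported layers.
# DLA requires FP16 or INT8 precision, hence --fp16.
# Note the different min/opt/max shapes -- this is what triggers the
# "DLA requests all profiles have same min, max, and opt value" warning.
trtexec --onnx=YOLO.onnx \
        --saveEngine=YOLO.engine \
        --useDLACore=0 \
        --allowGPUFallback \
        --fp16 \
        --minShapes=input:1x3x416x416 \
        --optShapes=input:1x3x416x416 \
        --maxShapes=input:4x3x416x416
```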

Environment

TensorRT Version: 7.1 (latest JetPack)
GPU Type: Jetson NX
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.3

Hi,

DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU

Based on the log above, you will need to use the same min, opt, and max values in your optimization profile to run layers on the DLA.
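For example, passing identical shapes to all three profile flags should avoid that warning (a sketch only — the input tensor name `input` is a placeholder; check your model's actual input name, e.g. with Netron or the trtexec verbose log):

```shell
# All three profile values identical, as the DLA requires.
# --useDLACore selects the DLA, --allowGPUFallback lets unsupported
# layers run on the GPU, and --fp16 is needed since DLA does not run FP32.
trtexec --onnx=YOLO.onnx \
        --saveEngine=YOLO.engine \
        --useDLACore=0 \
        --allowGPUFallback \
        --fp16 \
        --minShapes=input:1x3x416x416 \
        --optShapes=input:1x3x416x416 \
        --maxShapes=input:1x3x416x416
```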

Thanks.

What do you mean by “the same profiles”? Sorry, I’m not an expert in TensorRT.

OK, I figured it might mean the minShapes, maxShapes, and optShapes arguments. I’ve set them all to the same value, 1x3x416x416, but I’m still seeing the same behaviour:

----------------------------------------------------------------
Input filename:   Model.onnx
ONNX IR version:  0.0.4
Opset version:    11
Producer name:    pytorch
Producer version: 1.3
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[01/14/2021-21:02:42] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/14/2021-21:02:42] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-21:02:42] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-21:02:43] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-21:02:43] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-21:02:43] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-21:02:43] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[01/14/2021-21:02:44] [I] [TRT] 
[01/14/2021-21:02:44] [I] [TRT] --------------- Layers running on DLA: 
[01/14/2021-21:02:44] [I] [TRT] 
[01/14/2021-21:02:44] [I] [TRT] --------------- Layers running on GPU: 
[01/14/2021-21:02:44] [I] [TRT] 1406[Constant], 967[Constant], 528[Constant], (Unnamed Layer* 956) [Identity], (Unnamed Layer* 957) [Shuffle], (Unnamed Layer* 962) [Identity], (Unnamed Layer* 963) [Shuffle], (Unnamed Layer* 1962) [Identity], (Unnamed Layer* 1963) [Shuffle], (Unnamed Layer* 1968) [Identity], (Unnamed Layer* 1969) [Shuffle], (Unnamed Layer* 2968) [Identity], (Unnamed Layer* 2969) [Shuffle], (Unnamed Layer* 2974) [Identity], (Unnamed Layer* 2975) [Shuffle], (Unnamed Layer* 0) [Convolution], (Unnamed Layer* 2) [Activation], (Unnamed Layer* 3) [Convolution], (Unnamed Layer* 5) [Activation], (Unnamed Layer* 6) [Convolution], (Unnamed Layer* 8) [Activation], (Unnamed Layer* 48) [Slice], (Unnamed Layer* 49) [Convolution], (Unnamed Layer* 51) [Activation], (Unnamed Layer* 52) [Convolution], (Unnamed Layer* 54) [Activation], (Unnamed Layer* 56) [Convolution], (Unnamed Layer* 58) [Activation], 141 copy, (Unnamed Layer* 60) [Pooling], (Unnamed Layer* 61) [Convolution], (Unnamed Layer* 63) [Activation], (Unnamed Layer* 103) [Slice], (Unnamed Layer* 104) [Convolution], (Unnamed Layer* 106) [Activation], (Unnamed Layer* 107) [Convolution], (Unnamed Layer* 109) [Activation], (Unnamed Layer* 111) [Convolution], (Unnamed Layer* 113) [Activation], 171 copy, (Unnamed Layer* 115) [Pooling], (Unnamed Layer* 116) [Convolution], (Unnamed Layer* 118) [Activation], (Unnamed Layer* 158) [Slice], (Unnamed Layer* 159) [Convolution], (Unnamed Layer* 161) [Activation], (Unnamed Layer* 162) [Convolution], (Unnamed Layer* 164) [Activation], (Unnamed Layer* 166) [Convolution], (Unnamed Layer* 168) [Activation], 201 copy, (Unnamed Layer* 170) [Pooling], (Unnamed Layer* 171) [Convolution], (Unnamed Layer* 173) [Activation], (Unnamed Layer* 174) [Convolution], (Unnamed Layer* 176) [Activation], (Unnamed Layer* 196) [Shuffle], (Unnamed Layer* 225) [Slice], (Unnamed Layer* 247) [Shuffle], 226 copy, (Unnamed Layer* 249) [Convolution], (Unnamed Layer* 251) [Activation], (Unnamed Layer* 
1180) [Convolution] || (Unnamed Layer* 252) [Convolution], (Unnamed Layer* 552) [Slice], (Unnamed Layer* 527) [Slice], 
... <and so forth> .....

@AastaLLL Any update on this?

Hi,

Could you share your source code (or command) and the model with us, so we can check it directly?

Thanks.

How do I privately share it with you?