Description
I was trying to build a TensorRT engine with DLA enabled when I ran into the error "DLA Node compilation Failed."
The model converts to TensorRT successfully without DLA using the trtexec tool.
Here is my command:
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.trt --fp16 --explicitBatch --workspace=318
However, if I enable DLA, it shows the error message below. I searched online, and most of the suggested solutions involved increasing the workspace size, but that did not work in my case.
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.trt --useDLACore=1 --fp16 --allowGPUFallback --explicitBatch --workspace=2048
I have tried many different workspace values, such as 300, 2048, and 4096.
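(For reference, the error message below points at IBuilderConfig::setMaxWorkspaceSize(), so I believe the --workspace flag maps roughly to the following build through the TensorRT Python API. This is only an untested sketch of what I understand trtexec does; I only ran the trtexec commands above, and the attribute names come from the standard tensorrt 7.x Python bindings.)

# Untested sketch of the equivalent build via the TensorRT Python API (TensorRT 7.1).
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
# Explicit-batch network, same as trtexec --explicitBatch
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.max_workspace_size = 2048 << 20           # --workspace=2048 (MB)
config.set_flag(trt.BuilderFlag.FP16)            # --fp16
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)    # --allowGPUFallback
config.default_device_type = trt.DeviceType.DLA  # run layers on DLA by default
config.DLA_core = 1                              # --useDLACore=1

engine = builder.build_engine(network, config)
if engine is not None:
    with open("model.trt", "wb") as f:
        f.write(engine.serialize())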
Here is the error message:
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.trt --useDLACore=1 --fp16 --allowGPUFallback --explicitBatch --workspace=2048
[08/11/2021-09:10:06] [I] === Model Options ===
[08/11/2021-09:10:06] [I] Format: ONNX
[08/11/2021-09:10:06] [I] Model: model.onnx
[08/11/2021-09:10:06] [I] Output:
[08/11/2021-09:10:06] [I] === Build Options ===
[08/11/2021-09:10:06] [I] Max batch: explicit
[08/11/2021-09:10:06] [I] Workspace: 2048 MB
[08/11/2021-09:10:06] [I] minTiming: 1
[08/11/2021-09:10:06] [I] avgTiming: 8
[08/11/2021-09:10:06] [I] Precision: FP32+FP16
[08/11/2021-09:10:06] [I] Calibration:
[08/11/2021-09:10:06] [I] Safe mode: Disabled
[08/11/2021-09:10:06] [I] Save engine: model.trt
[08/11/2021-09:10:06] [I] Load engine:
[08/11/2021-09:10:06] [I] Builder Cache: Enabled
[08/11/2021-09:10:06] [I] NVTX verbosity: 0
[08/11/2021-09:10:06] [I] Inputs format: fp32:CHW
[08/11/2021-09:10:06] [I] Outputs format: fp32:CHW
[08/11/2021-09:10:06] [I] Input build shapes: model
[08/11/2021-09:10:06] [I] Input calibration shapes: model
[08/11/2021-09:10:06] [I] === System Options ===
[08/11/2021-09:10:06] [I] Device: 0
[08/11/2021-09:10:06] [I] DLACore: 1(With GPU fallback)
[08/11/2021-09:10:06] [I] Plugins:
[08/11/2021-09:10:06] [I] === Inference Options ===
[08/11/2021-09:10:06] [I] Batch: Explicit
[08/11/2021-09:10:06] [I] Input inference shapes: model
[08/11/2021-09:10:06] [I] Iterations: 10
[08/11/2021-09:10:06] [I] Duration: 3s (+ 200ms warm up)
[08/11/2021-09:10:06] [I] Sleep time: 0ms
[08/11/2021-09:10:06] [I] Streams: 1
[08/11/2021-09:10:06] [I] ExposeDMA: Disabled
[08/11/2021-09:10:06] [I] Spin-wait: Disabled
[08/11/2021-09:10:06] [I] Multithreading: Disabled
[08/11/2021-09:10:06] [I] CUDA Graph: Disabled
[08/11/2021-09:10:06] [I] Skip inference: Disabled
[08/11/2021-09:10:06] [I] Inputs:
[08/11/2021-09:10:06] [I] === Reporting Options ===
[08/11/2021-09:10:06] [I] Verbose: Disabled
[08/11/2021-09:10:06] [I] Averages: 10 inferences
[08/11/2021-09:10:06] [I] Percentile: 99
[08/11/2021-09:10:06] [I] Dump output: Disabled
[08/11/2021-09:10:06] [I] Profile: Disabled
[08/11/2021-09:10:06] [I] Export timing to JSON file:
[08/11/2021-09:10:06] [I] Export output to JSON file:
[08/11/2021-09:10:06] [I] Export profile to JSON file:
[08/11/2021-09:10:06] [I]
----------------------------------------------------------------
Input filename: model.onnx
ONNX IR version: 0.0.6
Opset version: 11
Producer name: pytorch
Producer version: 1.5
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[08/11/2021-09:10:08] [W] [TRT] ConvTranspose_134: DLA cores do not support more than 1 groups.
[08/11/2021-09:10:08] [W] [TRT] Default DLA is enabled but layer ConvTranspose_134 is not supported on DLA, falling back to GPU.
[08/11/2021-09:10:08] [W] [TRT] ConvTranspose_142: DLA cores do not support more than 1 groups.
[08/11/2021-09:10:08] [W] [TRT] Default DLA is enabled but layer ConvTranspose_142 is not supported on DLA, falling back to GPU.
[08/11/2021-09:10:08] [W] [TRT] ConvTranspose_146: DLA cores do not support more than 1 groups.
[08/11/2021-09:10:08] [W] [TRT] Default DLA is enabled but layer ConvTranspose_146 is not supported on DLA, falling back to GPU.
[08/11/2021-09:10:08] [W] [TRT] ConvTranspose_158: DLA cores do not support more than 1 groups.
[08/11/2021-09:10:08] [W] [TRT] Default DLA is enabled but layer ConvTranspose_158 is not supported on DLA, falling back to GPU.
[08/11/2021-09:10:08] [W] [TRT] ConvTranspose_162: DLA cores do not support more than 1 groups.
[08/11/2021-09:10:08] [W] [TRT] Default DLA is enabled but layer ConvTranspose_162 is not supported on DLA, falling back to GPU.
[08/11/2021-09:10:08] [W] [TRT] ConvTranspose_166: DLA cores do not support more than 1 groups.
[08/11/2021-09:10:08] [W] [TRT] Default DLA is enabled but layer ConvTranspose_166 is not supported on DLA, falling back to GPU.
[08/11/2021-09:10:09] [I] [TRT]
[08/11/2021-09:10:09] [I] [TRT] --------------- Layers running on DLA:
[08/11/2021-09:10:09] [I] [TRT] {Conv_0,BatchNormalization_1,Relu_2,Conv_3,BatchNormalization_4,Relu_5,Conv_6,BatchNormalization_7,Relu_8,MaxPool_9,Conv_10,BatchNormalization_11,Conv_12,BatchNormalization_13,Relu_14,Conv_15,BatchNormalization_16,Add_17,Relu_18,Conv_19,BatchNormalization_20,Relu_21,Conv_22,BatchNormalization_23,Add_24,Relu_25,Concat_26,Conv_27,BatchNormalization_28,Relu_29,MaxPool_30,MaxPool_31,Conv_32,BatchNormalization_33,Conv_34,BatchNormalization_35,Relu_36,Conv_37,BatchNormalization_38,Add_39,Relu_40,Conv_41,BatchNormalization_42,Relu_43,Conv_44,BatchNormalization_45,Add_46,Relu_47,Concat_48,Conv_49,BatchNormalization_50,Relu_51,Conv_52,BatchNormalization_53,Relu_54,Conv_55,BatchNormalization_56,Add_57,Relu_58,Conv_59,BatchNormalization_60,Relu_61,Conv_62,BatchNormalization_63,Add_64, (skip it){Conv_136,BatchNormalization_137,Relu_138,Conv_143,BatchNormalization_144,Relu_145}, {Conv_148,BatchNormalization_149,Relu_150,Conv_159,BatchNormalization_160,Relu_161}, {Conv_168,BatchNormalization_169,Relu_170}, {Conv_152,BatchNormalization_153,Relu_154,Conv_163,BatchNormalization_164,Relu_165}, {Conv_172,BatchNormalization_173,Relu_174}, {Conv_176,BatchNormalization_177,Relu_178,Conv_179,Relu_180,Conv_181,Conv_182,Relu_183,Conv_184,Conv_185,Relu_186,Conv_187,Conv_188,Relu_189,Conv_190,Conv_191,Relu_192,Conv_193,Conv_194,Relu_195,Conv_196},
[08/11/2021-09:10:09] [I] [TRT] --------------- Layers running on GPU:
[08/11/2021-09:10:09] [I] [TRT] ConvTranspose_134, ConvTranspose_142, ConvTranspose_158, 448 copy, 473 copy, 408 copy, 481 copy, 368 copy, 497 copy, ConvTranspose_146, ConvTranspose_162, 489 copy, 485 copy, 509 copy, 501 copy, ConvTranspose_166, 513 copy, 505 copy,
[08/11/2021-09:10:11] [W] [TRT] DLA Node compilation Failed.
[08/11/2021-09:10:11] [W] [TRT] DLA Node compilation Failed.
[08/11/2021-09:10:11] [E] [TRT] Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
[08/11/2021-09:10:11] [E] [TRT] ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node {Conv_0,BatchNormalization_1,Relu_2,Conv_3,BatchNormalization_4,Relu_5,Conv_6,BatchNormalization_7,Relu_8,MaxPool_9,Conv_10,BatchNormalization_11,Conv_12,BatchNormalization_13,Relu_14,Conv_15,BatchNormalization_16,Add_17,Relu_18,Conv_19,BatchNormalization_20,Relu_21,Conv_22, (skip it),Relu_119,Conv_120,BatchNormalization_121,Relu_122,Conv_123,BatchNormalization_124,Add_125,Relu_126,Concat_127,Conv_128,BatchNormalization_129,Relu_130,Conv_131,BatchNormalization_132,Relu_133,Conv_139,BatchNormalization_140,Relu_141,Conv_155,BatchNormalization_156,Relu_157}.)
[08/11/2021-09:10:11] [E] [TRT] ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 ()
[08/11/2021-09:10:11] [E] Engine creation failed
[08/11/2021-09:10:11] [E] Engine set up failed
The ONNX model size is ~74 MB.
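For what it's worth, the warnings above say those ConvTranspose layers use more than one group, which DLA does not support. A quick way to double-check that directly from the ONNX file (a sketch using the onnx Python package, assuming it is available in the container) is:

# List ConvTranspose nodes whose "group" attribute is greater than 1
# (the DLA fallback warnings above point at exactly these layers).
import onnx

model = onnx.load("model.onnx")
for node in model.graph.node:
    if node.op_type == "ConvTranspose":
        group = next((a.i for a in node.attribute if a.name == "group"), 1)
        if group > 1:
            print(node.name, "group =", group)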
Environment
TensorRT Version: 7.1.3.4
Device: AGX (host running JetPack 4.4.1)
Docker image: nvcr.io/nvidia/l4t-ml:r32.6.1-py3 from NGC
Note: I was running everything inside the Docker container.
I would like to understand what causes this issue. Even if the error cannot be fixed, I want to know the reason behind it. Is it caused by GPU memory, or something else?
Thank you.