Can't build engines with DLA + INT8 on Jetson Xavier NX

Hi.
My English isn't very good, so please feel free to ask if anything is unclear.

I can't build an engine with trtexec on Jetson Xavier NX when combining DLA and INT8.
The following error message is printed and the program exits.

…/builder/tacticOptimizer.cpp (2626) - Assertion Error in exemplar: 0 (pitch == 1 && "requirements impossible to satisfy")

- Device: Jetson Xavier NX (dev kit)
- FP16 + DLA works fine.
- INT8 + GPU works fine.
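
For clarity, the combination I mean corresponds roughly to the following builder settings in the TensorRT Python API (just an illustrative sketch; I actually build the engine with trtexec):

import tensorrt as trt

# Sketch of the INT8 + DLA (with GPU fallback) combination that fails for me.
# I actually use trtexec; this only shows which settings are involved.
logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)            # INT8 precision
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)    # let unsupported layers run on the GPU
config.default_device_type = trt.DeviceType.DLA  # run on the DLA by default
config.DLA_core = 0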

Thank you in advance.

Regards,

Hi,

We need more information about this error.
Could you run trtexec with the --verbose flag and share the log with us?
(Just the INT8 + DLA case is enough.)

Thanks.

Hi.
I really appreciate your reply.

Here is the log of the execution with the --verbose flag.

Since the log for the actual model would be very long, I am using a different model here.
The model being run in this log is the first part of the actual model.
(The error message is the same as for the actual model.)

&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=/home/jetson/work/slash.onnx --workspace=2048 --useDLACore=0 --allowGPUFallback --int8 --verbose
[11/13/2020-14:00:08] [I] === Model Options ===
[11/13/2020-14:00:08] [I] Format: ONNX
[11/13/2020-14:00:08] [I] Model: /home/jetson/work/slash.onnx
[11/13/2020-14:00:08] [I] Output:
[11/13/2020-14:00:08] [I] === Build Options ===
[11/13/2020-14:00:08] [I] Max batch: 1
[11/13/2020-14:00:08] [I] Workspace: 2048 MB
[11/13/2020-14:00:08] [I] minTiming: 1
[11/13/2020-14:00:08] [I] avgTiming: 8
[11/13/2020-14:00:08] [I] Precision: FP32+INT8
[11/13/2020-14:00:08] [I] Calibration: Dynamic
[11/13/2020-14:00:08] [I] Safe mode: Disabled
[11/13/2020-14:00:08] [I] Save engine: 
[11/13/2020-14:00:08] [I] Load engine: 
[11/13/2020-14:00:08] [I] Builder Cache: Enabled
[11/13/2020-14:00:08] [I] NVTX verbosity: 0
[11/13/2020-14:00:08] [I] Inputs format: fp32:CHW
[11/13/2020-14:00:08] [I] Outputs format: fp32:CHW
[11/13/2020-14:00:08] [I] Input build shapes: model
[11/13/2020-14:00:08] [I] Input calibration shapes: model
[11/13/2020-14:00:08] [I] === System Options ===
[11/13/2020-14:00:08] [I] Device: 0
[11/13/2020-14:00:08] [I] DLACore: 0(With GPU fallback)
[11/13/2020-14:00:08] [I] Plugins:
[11/13/2020-14:00:08] [I] === Inference Options ===
[11/13/2020-14:00:08] [I] Batch: 1
[11/13/2020-14:00:08] [I] Input inference shapes: model
[11/13/2020-14:00:08] [I] Iterations: 10
[11/13/2020-14:00:08] [I] Duration: 3s (+ 200ms warm up)
[11/13/2020-14:00:08] [I] Sleep time: 0ms
[11/13/2020-14:00:08] [I] Streams: 1
[11/13/2020-14:00:08] [I] ExposeDMA: Disabled
[11/13/2020-14:00:08] [I] Spin-wait: Disabled
[11/13/2020-14:00:08] [I] Multithreading: Disabled
[11/13/2020-14:00:08] [I] CUDA Graph: Disabled
[11/13/2020-14:00:08] [I] Skip inference: Disabled
[11/13/2020-14:00:08] [I] Inputs:
[11/13/2020-14:00:08] [I] === Reporting Options ===
[11/13/2020-14:00:08] [I] Verbose: Enabled
[11/13/2020-14:00:08] [I] Averages: 10 inferences
[11/13/2020-14:00:08] [I] Percentile: 99
[11/13/2020-14:00:08] [I] Dump output: Disabled
[11/13/2020-14:00:08] [I] Profile: Disabled
[11/13/2020-14:00:08] [I] Export timing to JSON file: 
[11/13/2020-14:00:08] [I] Export output to JSON file: 
[11/13/2020-14:00:08] [I] Export profile to JSON file: 
[11/13/2020-14:00:08] [I] 
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::Proposal version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::Split version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[11/13/2020-14:00:08] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
----------------------------------------------------------------
Input filename:   /home/jetson/work/slash.onnx
ONNX IR version:  0.0.7
Opset version:    12
Producer name:    tf2onnx
Producer version: 1.7.0
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::Proposal version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::Split version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:202: Adding network input: input:0 with dtype: float32, dimensions: (1, 32, 512, 1)
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: input:0 for ONNX tensor: input:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:90: Importing initializer: new_shape__9
[11/13/2020-14:00:09] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D_weights_fused_bn
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D_bias_fused_bn
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5 [Reshape]
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: input:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: new_shape__9
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5 [Reshape] inputs: [input:0 -> (1, 32, 512, 1)], [new_shape__9 -> (4)], 
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:141: Registering layer: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5 for ONNX node: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5:0 for ONNX tensor: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:179: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5 [Reshape] outputs: [StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5:0 -> (1, 1, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D [Conv]
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D_weights_fused_bn
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D_bias_fused_bn
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D [Conv] inputs: [StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5:0 -> (1, 1, 32, 512)], [StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D_weights_fused_bn -> (32, 1, 3, 3)], [StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D_bias_fused_bn -> (32)], 
[11/13/2020-14:00:09] [V] [TRT] builtin_op_importers.cpp:450: Convolution input dimensions: (1, 1, 32, 512)
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:141: Registering layer: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D for ONNX node: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D
[11/13/2020-14:00:09] [V] [TRT] builtin_op_importers.cpp:533: Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32
[11/13/2020-14:00:09] [V] [TRT] builtin_op_importers.cpp:534: Convolution output dimensions: (1, 32, 32, 512)
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: StatefulPartitionedCall/default/ResNet_bn0_1/FusedBatchNormV3:0 for ONNX tensor: StatefulPartitionedCall/default/ResNet_bn0_1/FusedBatchNormV3:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:179: StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D [Conv] outputs: [StatefulPartitionedCall/default/ResNet_bn0_1/FusedBatchNormV3:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/default/ResNet_relu0_1/Relu [Relu]
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/ResNet_bn0_1/FusedBatchNormV3:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/default/ResNet_relu0_1/Relu [Relu] inputs: [StatefulPartitionedCall/default/ResNet_bn0_1/FusedBatchNormV3:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:141: Registering layer: StatefulPartitionedCall/default/ResNet_relu0_1/Relu for ONNX node: StatefulPartitionedCall/default/ResNet_relu0_1/Relu
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: StatefulPartitionedCall/default/ResNet_relu0_1/Relu:0 for ONNX tensor: StatefulPartitionedCall/default/ResNet_relu0_1/Relu:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:179: StatefulPartitionedCall/default/ResNet_relu0_1/Relu [Relu] outputs: [StatefulPartitionedCall/default/ResNet_relu0_1/Relu:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/default/softmax/transpose [Transpose]
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/ResNet_relu0_1/Relu:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/default/softmax/transpose [Transpose] inputs: [StatefulPartitionedCall/default/ResNet_relu0_1/Relu:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:141: Registering layer: StatefulPartitionedCall/default/softmax/transpose for ONNX node: StatefulPartitionedCall/default/softmax/transpose
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: StatefulPartitionedCall/default/softmax/transpose:0 for ONNX tensor: StatefulPartitionedCall/default/softmax/transpose:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:179: StatefulPartitionedCall/default/softmax/transpose [Transpose] outputs: [StatefulPartitionedCall/default/softmax/transpose:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/default/softmax/Softmax [Softmax]
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/softmax/transpose:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/default/softmax/Softmax [Softmax] inputs: [StatefulPartitionedCall/default/softmax/transpose:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:141: Registering layer: StatefulPartitionedCall/default/softmax/Softmax for ONNX node: StatefulPartitionedCall/default/softmax/Softmax
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: StatefulPartitionedCall/default/softmax/Softmax:0 for ONNX tensor: StatefulPartitionedCall/default/softmax/Softmax:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:179: StatefulPartitionedCall/default/softmax/Softmax [Softmax] outputs: [StatefulPartitionedCall/default/softmax/Softmax:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/default/softmax/transpose_1 [Transpose]
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/default/softmax/Softmax:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/default/softmax/transpose_1 [Transpose] inputs: [StatefulPartitionedCall/default/softmax/Softmax:0 -> (1, 32, 32, 512)], 
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:141: Registering layer: StatefulPartitionedCall/default/softmax/transpose_1 for ONNX node: StatefulPartitionedCall/default/softmax/transpose_1
[11/13/2020-14:00:09] [V] [TRT] ImporterContext.hpp:116: Registering tensor: Identity:0_1 for ONNX tensor: Identity:0
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:179: StatefulPartitionedCall/default/softmax/transpose_1 [Transpose] outputs: [Identity:0 -> (1, 32, 512, 32)], 
[11/13/2020-14:00:09] [V] [TRT] ModelImporter.cpp:507: Marking Identity:0_1 as output: Identity:0
 ----- Parsing of ONNX model /home/jetson/work/slash.onnx is Done ---- 
[11/13/2020-14:00:09] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for input:0 to [-2,2]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5:0 to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for StatefulPartitionedCall/default/ResNet_bn0_1/FusedBatchNormV3:0 to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for StatefulPartitionedCall/default/ResNet_relu0_1/Relu:0 to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for StatefulPartitionedCall/default/softmax/transpose:0 to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for (Unnamed Layer* 4) [Shuffle]_output to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for (Unnamed Layer* 5) [Softmax]_output to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for StatefulPartitionedCall/default/softmax/Softmax:0 to [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Setting dynamic range for Identity:0 to [-4,4]
[11/13/2020-14:00:09] [W] [TRT] Default DLA is enabled but layer StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5 is not supported on DLA, falling back to GPU.
[11/13/2020-14:00:09] [W] [TRT] Default DLA is enabled but layer StatefulPartitionedCall/default/softmax/transpose is not supported on DLA, falling back to GPU.
[11/13/2020-14:00:09] [W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 4) [Shuffle] is not supported on DLA, falling back to GPU.
[11/13/2020-14:00:09] [W] [TRT] Default DLA is enabled but layer StatefulPartitionedCall/default/softmax/Softmax is not supported on DLA, falling back to GPU.
[11/13/2020-14:00:09] [W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 6) [Shuffle] is not supported on DLA, falling back to GPU.
[11/13/2020-14:00:09] [W] [TRT] Default DLA is enabled but layer StatefulPartitionedCall/default/softmax/transpose_1 is not supported on DLA, falling back to GPU.
[11/13/2020-14:00:09] [W] [TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-2,2]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] User overriding scale with dynamic range [-4,4]
[11/13/2020-14:00:09] [V] [TRT] Original: 8 layers
[11/13/2020-14:00:09] [V] [TRT] After dead-layer removal: 8 layers
[11/13/2020-14:00:09] [V] [TRT] After DLA optimization: 7 layers
[11/13/2020-14:00:09] [V] [TRT] Fusing StatefulPartitionedCall/default/softmax/transpose with (Unnamed Layer* 4) [Shuffle]
[11/13/2020-14:00:09] [V] [TRT] Fusing (Unnamed Layer* 6) [Shuffle] with StatefulPartitionedCall/default/softmax/transpose_1
[11/13/2020-14:00:09] [V] [TRT] After Myelin optimization: 5 layers
[11/13/2020-14:00:09] [V] [TRT] After scale fusion: 5 layers
[11/13/2020-14:00:09] [V] [TRT] After vertical fusions: 5 layers
[11/13/2020-14:00:09] [V] [TRT] After final dead-layer removal: 5 layers
[11/13/2020-14:00:09] [V] [TRT] After tensor merging: 5 layers
[11/13/2020-14:00:09] [V] [TRT] After concat removal: 5 layers
[11/13/2020-14:00:09] [V] [TRT] Configuring builder for Int8 Mode completed in 0.0101641 seconds.
[11/13/2020-14:00:09] [V] [TRT] Graph construction and optimization completed in 0.0113474 seconds.
[11/13/2020-14:00:09] [I] [TRT] 
[11/13/2020-14:00:09] [I] [TRT] --------------- Layers running on DLA: 
[11/13/2020-14:00:09] [I] [TRT] {StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D,StatefulPartitionedCall/default/ResNet_relu0_1/Relu}, 
[11/13/2020-14:00:09] [I] [TRT] --------------- Layers running on GPU: 
[11/13/2020-14:00:09] [I] [TRT] StatefulPartitionedCall/default/ResNet_conv0_1/Conv2D__5, StatefulPartitionedCall/default/softmax/transpose + (Unnamed Layer* 4) [Shuffle], StatefulPartitionedCall/default/softmax/Softmax, (Unnamed Layer* 6) [Shuffle] + StatefulPartitionedCall/default/softmax/transpose_1, 
[11/13/2020-14:00:13] [V] [TRT] Constructing optimization profile number 0 [1/1].
[11/13/2020-14:00:13] [V] [TRT] Builder timing cache: created 0 entries, 0 hit(s)
[11/13/2020-14:00:13] [E] [TRT] ../builder/tacticOptimizer.cpp (2626) - Assertion Error in exemplar: 0 (pitch == 1 && "requirements impossible to satisfy")
[11/13/2020-14:00:13] [E] Engine creation failed
[11/13/2020-14:00:13] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=/home/jetson/work/slash.onnx --workspace=2048 --useDLACore=0 --allowGPUFallback --int8 --verbose

Best regards,

Hi,

Could you share the model with us?

Based on the log, this error is caused by a layer that cannot be supported due to a special setting (e.g. stride, padding, …).
However, an unsupported layer should fall back to the GPU; instead, it somehow results in an exception.

If you can share the model with us, we can investigate the error further.
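
In the meantime, and only as an unverified sketch, you could try explicitly assigning the layers that the builder does not report as DLA-capable to the GPU via the API, rather than relying on automatic fallback, and see whether the engine then builds. For example, with the Python API (model path and dummy dynamic ranges taken from your trtexec run):

import tensorrt as trt

# Unverified sketch: build with INT8 + DLA, but pin every layer that the builder
# does not consider DLA-capable to the GPU explicitly instead of relying on
# automatic fallback. The dummy dynamic ranges below mimic what trtexec uses
# when no calibration cache is provided.
logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("slash.onnx", "rb") as f:              # path to your ONNX model
    if not parser.parse(f.read()):
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.max_workspace_size = 2048 << 20           # 2048 MB, as in your trtexec command
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
config.default_device_type = trt.DeviceType.DLA
config.DLA_core = 0

# Dummy dynamic ranges (input [-2, 2], all other tensors [-4, 4]), as trtexec sets them
for i in range(network.num_inputs):
    network.get_input(i).set_dynamic_range(-2.0, 2.0)
for i in range(network.num_layers):
    layer = network.get_layer(i)
    for j in range(layer.num_outputs):
        layer.get_output(j).set_dynamic_range(-4.0, 4.0)

# Explicitly move non-DLA-capable layers to the GPU
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if not config.can_run_on_DLA(layer):
        config.set_device_type(layer, trt.DeviceType.GPU)
        print("Assigned to GPU:", layer.name)

engine = builder.build_engine(network, config)
print("Engine built" if engine is not None else "Engine build failed")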

Thanks.

Hi.
I really appreciate your reply.

I understand.
I’ll share the ONNX model for testing.
This is the model we were using when we posted the log above.

Best regards,
trt_test.zip (1.9 KB)

Hi,

Thanks for the model.

This issue can be reproduced in our environment and is under investigation.
We will post updates here as we make progress.

Hi,

Thank you for your continued support.
Please let me know if there is anything else you would like to ask.

Hi,

Thanks for your patience.

This issue has been fixed in our internal TensorRT branch.
We will let you know when the fix is released.

Thanks

Hi,

Thank you for your kind cooperation.

Best regards,

Hi @AastaLLL,
I’m seeing the same error and same behaviour (FP16 on DLA ok, INT8 on DLA crashes with trtexec).
Is there any way to download a fix for this issue?

thanks
Eyal

Hi,

Sorry, you will need to wait for our next TensorRT release for the fix.
We will update this thread once a new JetPack/package is available.

Thanks.

Can you confirm this issue is fixed in JetPack 4.5?

I haven't tried JetPack 4.5, but I hit this error on JetPack 4.4.1.

JetPack 4.5 and 4.4.1 ship the same TensorRT version, 7.1.3, so I assume nothing has changed.
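
If it helps, one quick way to confirm which TensorRT build is installed (assuming the Python bindings are present) is:

# Print the TensorRT version bundled with the installed JetPack
import tensorrt as trt
print(trt.__version__)   # prints 7.1.3.x on both JetPack 4.4.1 and 4.5, as far as I know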