TensorRT 7.2: Assertion `a.biasWeights.count() == a.widthB' failed

Description

Hey, I get the following error while building my model (the full output is at the end):

python3.8: ../builder/baseMLPBuilder.cpp:104: void nvinfer1::builder::baseMLPBuilder::createConstants(const nvinfer1::MLPParameters&, nvinfer1::rt::DeviceWeightsHunk&, nvinfer1::rt::DeviceWeightsHunk&, nvinfer1::builder::GlobWriter&, const string&, const nvinfer1::rt::CommonContext&, bool, int, int): Assertion `a.biasWeights.count() == a.widthB' failed.

My model consists of multiple MLPs that are stacked to allow for parallel execution (see the attached ONNX file). I suspect that this stacking is what triggers the error.
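For reference, here is a minimal sketch of how the stacked MLPs are put together (a reconstruction for illustration only; the shapes and the _stacked_weights/_stacked_biases names match the verbose log below, everything else is hypothetical):

import torch
import torch.nn as nn

class StackedMLP(nn.Module):
    """Several independent MLPs evaluated in parallel via batched matmuls."""
    def __init__(self, n_stacks=5, d_in=32, d_hidden=100, d_out=56):
        super().__init__()
        dims = [d_in, d_hidden, d_hidden, d_hidden, d_out]
        # one (n_stacks, in, out) weight and one (n_stacks, 1, out) bias per layer
        self._stacked_weights = nn.ParameterList(
            [nn.Parameter(torch.randn(n_stacks, i, o)) for i, o in zip(dims[:-1], dims[1:])])
        self._stacked_biases = nn.ParameterList(
            [nn.Parameter(torch.zeros(n_stacks, 1, o)) for o in dims[1:]])

    def forward(self, x):                       # x: (n_stacks, batch, d_in)
        for i, (w, b) in enumerate(zip(self._stacked_weights, self._stacked_biases)):
            x = torch.matmul(x, w) + b          # batched MatMul + broadcast Add
            if i < len(self._stacked_weights) - 1:
                x = torch.relu(x)               # ReLU on all but the last layer
        return x

torch.onnx.export(StackedMLP(), torch.randn(5, 100, 32), "model.onnx",
                  opset_version=11, input_names=["input"], output_names=["output"])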

Environment

TensorRT Version: 7.2.2.3
GPU Type: Quadro P5000
Nvidia Driver Version: 460.39
CUDA Version: 11.2
CUDNN Version: 8.0.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.7.1
Baremetal or Container (if container which image + tag):

Relevant Files

model.zip (535.2 KB)

Steps To Reproduce

I use the following code to load the model and build the engine:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

# initialize TensorRT engine and parse ONNX model
builder = trt.Builder(TRT_LOGGER)
explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(explicit_batch)
parser = trt.OnnxParser(network, TRT_LOGGER)

# parse ONNX and report any parser errors
with open(onnx_file_path, 'rb') as model:
    print('Beginning ONNX file parsing')
    if not parser.parse(model.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
print('Completed parsing of ONNX file')

# allow TensorRT to use up to 1GB of GPU memory for tactic selection
builder.max_workspace_size = 1 << 30
# we have only one image in batch
builder.max_batch_size = 1
# use FP16 mode if possible
if builder.platform_has_fast_fp16:
    builder.fp16_mode = True

print(network.get_layer(0))

# generate TensorRT engine optimized for the target platform
print('Building an engine...')
engine = builder.build_cuda_engine(network)
context = engine.create_execution_context()
print("Completed creating Engine")

return engine, context
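
The same error should also be reproducible without the Python API, e.g. with trtexec (assuming the extracted model is named model.onnx):

trtexec --onnx=model.onnx --workspace=1024 --fp16 --verbose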

Full output:

[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchor_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::NMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Reorg_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Region_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Clip_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::LReLU_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PriorBox_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Normalize_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::RPROI_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::FlattenConcat_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::CropAndResize version 1
[TensorRT] VERBOSE: Registered plugin creator - ::DetectionLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Proposal version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ProposalLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ResizeNearest_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Split version 1
[TensorRT] VERBOSE: Registered plugin creator - ::SpecialSlice_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::InstanceNormalization_TRT version 1
[TensorRT] VERBOSE: ModelImporter.cpp:202: Adding network input: input with dtype: float32, dimensions: (5, 100, 32)
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: input for ONNX tensor: input
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_biases.0
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_biases.1
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_biases.2
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_biases.3
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_weights.0
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_weights.1
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_weights.2
[TensorRT] VERBOSE: ModelImporter.cpp:90: Importing initializer: _stacked_weights.3
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Cast_0 [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: input
[TensorRT] VERBOSE: ModelImporter.cpp:125: Cast_0 [Cast] inputs: [input -> (5, 100, 32)], 
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Cast_0 for ONNX node: Cast_0
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 49 for ONNX tensor: 49
[TensorRT] VERBOSE: ModelImporter.cpp:179: Cast_0 [Cast] outputs: [49 -> (5, 100, 32)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: MatMul_1 [MatMul]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 49
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_weights.0
[TensorRT] VERBOSE: ModelImporter.cpp:125: MatMul_1 [MatMul] inputs: [49 -> (5, 100, 32)], [_stacked_weights.0 -> (5, 32, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_weights.0 for ONNX initializer: _stacked_weights.0
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: MatMul_1 for ONNX node: MatMul_1
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 50 for ONNX tensor: 50
[TensorRT] VERBOSE: ModelImporter.cpp:179: MatMul_1 [MatMul] outputs: [50 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Add_2 [Add]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 50
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_biases.0
[TensorRT] VERBOSE: ModelImporter.cpp:125: Add_2 [Add] inputs: [50 -> (5, 100, 100)], [_stacked_biases.0 -> (5, 1, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_biases.0 for ONNX initializer: _stacked_biases.0
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Add_2 for ONNX node: Add_2
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 51 for ONNX tensor: 51
[TensorRT] VERBOSE: ModelImporter.cpp:179: Add_2 [Add] outputs: [51 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Relu_3 [Relu]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 51
[TensorRT] VERBOSE: ModelImporter.cpp:125: Relu_3 [Relu] inputs: [51 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Relu_3 for ONNX node: Relu_3
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 52 for ONNX tensor: 52
[TensorRT] VERBOSE: ModelImporter.cpp:179: Relu_3 [Relu] outputs: [52 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: MatMul_4 [MatMul]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 52
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_weights.1
[TensorRT] VERBOSE: ModelImporter.cpp:125: MatMul_4 [MatMul] inputs: [52 -> (5, 100, 100)], [_stacked_weights.1 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_weights.1 for ONNX initializer: _stacked_weights.1
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: MatMul_4 for ONNX node: MatMul_4
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 53 for ONNX tensor: 53
[TensorRT] VERBOSE: ModelImporter.cpp:179: MatMul_4 [MatMul] outputs: [53 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Add_5 [Add]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 53
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_biases.1
[TensorRT] VERBOSE: ModelImporter.cpp:125: Add_5 [Add] inputs: [53 -> (5, 100, 100)], [_stacked_biases.1 -> (5, 1, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_biases.1 for ONNX initializer: _stacked_biases.1
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Add_5 for ONNX node: Add_5
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 54 for ONNX tensor: 54
[TensorRT] VERBOSE: ModelImporter.cpp:179: Add_5 [Add] outputs: [54 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Relu_6 [Relu]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 54
[TensorRT] VERBOSE: ModelImporter.cpp:125: Relu_6 [Relu] inputs: [54 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Relu_6 for ONNX node: Relu_6
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 55 for ONNX tensor: 55
[TensorRT] VERBOSE: ModelImporter.cpp:179: Relu_6 [Relu] outputs: [55 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: MatMul_7 [MatMul]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 55
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_weights.2
[TensorRT] VERBOSE: ModelImporter.cpp:125: MatMul_7 [MatMul] inputs: [55 -> (5, 100, 100)], [_stacked_weights.2 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_weights.2 for ONNX initializer: _stacked_weights.2
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: MatMul_7 for ONNX node: MatMul_7
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 56 for ONNX tensor: 56
[TensorRT] VERBOSE: ModelImporter.cpp:179: MatMul_7 [MatMul] outputs: [56 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Add_8 [Add]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 56
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_biases.2
[TensorRT] VERBOSE: ModelImporter.cpp:125: Add_8 [Add] inputs: [56 -> (5, 100, 100)], [_stacked_biases.2 -> (5, 1, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_biases.2 for ONNX initializer: _stacked_biases.2
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Add_8 for ONNX node: Add_8
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 57 for ONNX tensor: 57
[TensorRT] VERBOSE: ModelImporter.cpp:179: Add_8 [Add] outputs: [57 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Relu_9 [Relu]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 57
[TensorRT] VERBOSE: ModelImporter.cpp:125: Relu_9 [Relu] inputs: [57 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Relu_9 for ONNX node: Relu_9
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 58 for ONNX tensor: 58
[TensorRT] VERBOSE: ModelImporter.cpp:179: Relu_9 [Relu] outputs: [58 -> (5, 100, 100)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: MatMul_10 [MatMul]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 58
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_weights.3
[TensorRT] VERBOSE: ModelImporter.cpp:125: MatMul_10 [MatMul] inputs: [58 -> (5, 100, 100)], [_stacked_weights.3 -> (5, 100, 56)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_weights.3 for ONNX initializer: _stacked_weights.3
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: MatMul_10 for ONNX node: MatMul_10
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: 59 for ONNX tensor: 59
[TensorRT] VERBOSE: ModelImporter.cpp:179: MatMul_10 [MatMul] outputs: [59 -> (5, 100, 56)], 
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: Add_11 [Add]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 59
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: _stacked_biases.3
[TensorRT] VERBOSE: ModelImporter.cpp:125: Add_11 [Add] inputs: [59 -> (5, 100, 56)], [_stacked_biases.3 -> (5, 1, 56)], 
[TensorRT] VERBOSE: ImporterContext.hpp:150: Registering constant layer: _stacked_biases.3 for ONNX initializer: _stacked_biases.3
[TensorRT] VERBOSE: ImporterContext.hpp:154: Registering layer: Add_11 for ONNX node: Add_11
[TensorRT] VERBOSE: ImporterContext.hpp:120: Registering tensor: output_0 for ONNX tensor: output
[TensorRT] VERBOSE: ModelImporter.cpp:179: Add_11 [Add] outputs: [output -> (5, 100, 56)], 
[TensorRT] VERBOSE: ModelImporter.cpp:510: Marking output_0 as output: output
'Completed parsing of ONNX file'
<tensorrt.tensorrt.ILayer object at 0x7fa2269f4970>
'Building an engine...'
[TensorRT] VERBOSE: Applying generic optimizations to the graph for inference.
[TensorRT] VERBOSE: Original: 20 layers
[TensorRT] VERBOSE: After dead-layer removal: 20 layers
[TensorRT] VERBOSE: After Myelin optimization: 20 layers
[TensorRT] VERBOSE: After scale fusion: 20 layers
[TensorRT] VERBOSE: BinaryFusion: Fusing MatMul_1 with Add_2
[TensorRT] VERBOSE: BinaryFusion: Fusing 1-layer MLP: MatMul_1 -> Relu_3 with Relu_3
[TensorRT] VERBOSE: BinaryFusion: Fusing 2-layer MLP: MatMul_1 -> MatMul_4 with MatMul_4
[TensorRT] VERBOSE: BinaryFusion: Fusing Add_5 with Relu_6
[TensorRT] VERBOSE: BinaryFusion: Fusing MatMul_7 with Add_8
[TensorRT] VERBOSE: BinaryFusion: Fusing 1-layer MLP: MatMul_7 -> Relu_9 with Relu_9
[TensorRT] VERBOSE: After vertical fusions: 14 layers
[TensorRT] VERBOSE: After dupe layer removal: 9 layers
[TensorRT] VERBOSE: After final dead-layer removal: 9 layers
[TensorRT] VERBOSE: After tensor merging: 9 layers
[TensorRT] VERBOSE: After concat removal: 9 layers
[TensorRT] VERBOSE: Graph construction and optimization completed in 0.00830084 seconds.
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.0.5 but loaded cuDNN 8.0.2
[TensorRT] VERBOSE: Constructing optimization profile number 0 [1/1].
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(1,32,3200) -> Float(1,32,3200) ***************
[TensorRT] VERBOSE: --------------- Timing Runner: Cast_0 (Reformat)
[TensorRT] VERBOSE: Tactic: 1002 time 0.004096
[TensorRT] VERBOSE: Tactic: 0 time 0.002976
[TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.002976
[TensorRT] VERBOSE: --------------- Timing Runner: Cast_0 (Cast)
[TensorRT] VERBOSE: Cast has no valid tactics for this config, skipping
[TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
[TensorRT] VERBOSE: 
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(1,32,3200) -> Float(1,100,10000) ***************
[TensorRT] VERBOSE: --------------- Timing Runner: 2-layer MLP: MatMul_1 -> MatMul_4 (CudnnMLPMM)
python3.8: ../builder/baseMLPBuilder.cpp:104: void nvinfer1::builder::baseMLPBuilder::createConstants(const nvinfer1::MLPParameters&, nvinfer1::rt::DeviceWeightsHunk&, nvinfer1::rt::DeviceWeightsHunk&, nvinfer1::builder::GlobWriter&, const string&, const nvinfer1::rt::CommonContext&, bool, int, int): Assertion `a.biasWeights.count() == a.widthB' failed.

I solved it.
Apparently the builder wasn't happy that my data/layers weren't batch-first (for broadcasting reasons, presumably). I restructured my network and it works now.
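
To make "batch first" a bit more concrete, here is a hypothetical illustration of the kind of reordering I mean, relative to the stacked-MLP sketch above (not necessarily the exact change I made): give the activations a leading batch axis and let the stacked weights and biases broadcast against it.

def forward(self, x):                       # x: (batch, n_stacks, 1, d_in) instead of (n_stacks, batch, d_in)
    for i, (w, b) in enumerate(zip(self._stacked_weights, self._stacked_biases)):
        x = torch.matmul(x, w) + b          # w: (n_stacks, d_in, d_out) broadcasts over the batch axis
        if i < len(self._stacked_weights) - 1:
            x = torch.relu(x)
    return x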

Best,
Sebastian
