Terminate when parsing ONNX graph (nvinfer1::AssertionFailure)

Description

When parsing the ONNX graph, the program terminates early with “terminate called after throwing an instance of ‘nvinfer1::AssertionFailure’”. I am trying to develop two custom operations, correlation and grid_sampler, but the failure does not seem to be in either of them…yet. The parser gets past the first correlation node correctly, but it fails before it reaches the grid_sampler. The correlation node is implemented as an IPluginV2DynamicExt, since I previously had errors telling me I needed to use that. Here is an excerpt of the last few messages:

UNKNOWN: ModelImporter.cpp:103: Parsing node: Tile_3508 [Tile]
UNKNOWN: ModelImporter.cpp:119: Searching for input: 5468
UNKNOWN: ModelImporter.cpp:119: Searching for input: 5473
UNKNOWN: ModelImporter.cpp:125: Tile_3508 [Tile] inputs: [5468 -> (-1)], [5473 -> (3)], 
UNKNOWN: ImporterContext.hpp:141: Registering layer: Tile_3508 for ONNX node: Tile_3508
terminate called after throwing an instance of 'nvinfer1::AssertionFailure'
  what():  std::exception
Aborted (core dumped)
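
Since the process aborts without naming the failing node, the last "Parsing node" line in the verbose log is the best hint about where the parser stopped. A minimal sketch for pulling that out of a saved log follows; the helper name `last_parsed_node` and the regular expression are illustrative assumptions based only on the line format shown above, not part of TensorRT or the ONNX parser.

```python
import re

# Matches verbose parser lines such as
# "UNKNOWN: ModelImporter.cpp:103: Parsing node: Tile_3508 [Tile]".
# The pattern is an assumption derived from the excerpt above.
NODE_RE = re.compile(r"Parsing node: (\S+) \[([^\]]+)\]")


def last_parsed_node(log_text):
    """Return (node_name, op_type) from the last 'Parsing node' line, or None."""
    last = None
    for line in log_text.splitlines():
        match = NODE_RE.search(line)
        if match:
            last = (match.group(1), match.group(2))
    return last


sample = """\
UNKNOWN: ModelImporter.cpp:103: Parsing node: correlation_3454 [correlation]
UNKNOWN: ModelImporter.cpp:119: Searching for input: 3568
UNKNOWN: ModelImporter.cpp:103: Parsing node: Tile_3508 [Tile]
UNKNOWN: ModelImporter.cpp:119: Searching for input: 5468
terminate called after throwing an instance of 'nvinfer1::AssertionFailure'
"""

print(last_parsed_node(sample))
```

On the excerpt above this points at Tile_3508, which is the node to look at in Netron first.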

Here’s an excerpt from the correlation layer, just in case.

UNKNOWN: ModelImporter.cpp:103: Parsing node: correlation_3454 [correlation]
UNKNOWN: ModelImporter.cpp:119: Searching for input: 3568
UNKNOWN: ModelImporter.cpp:119: Searching for input: 5379
UNKNOWN: ModelImporter.cpp:125: correlation_3454 [correlation] inputs: [3568 -> (-1, 384, 27, 8)], [5379 -> (-1, 384, 27, 8)], 
INFO: ModelImporter.cpp:135: No importer registered for op: correlation. Attempting to import as plugin.
INFO: builtin_op_importers.cpp:3659: Searching for plugin: correlation, plugin_version: 1, plugin_namespace: 
INFO: builtin_op_importers.cpp:3676: Successfully created plugin: correlation
UNKNOWN: ImporterContext.hpp:141: Registering layer: correlation_3454 for ONNX node: correlation_3454
UNKNOWN: ImporterContext.hpp:116: Registering tensor: 5406 for ONNX tensor: 5406
UNKNOWN: ModelImporter.cpp:179: correlation_3454 [correlation] outputs: [5406 -> (-1, 81, 27, 8)], 

Environment

TensorRT Version: 7.1.3.4-1+cuda10.2
GPU Type: GTX1070 Ti
Nvidia Driver Version: 440.33.01
CUDA Version: cuda 10.2
CUDNN Version: 8.0.0.180-1+cuda10.2
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): 1.6.0
Baremetal or Container (if container which image + tag): Baremetal

Relevant Files

Code can be found here.

ONNX file can be found here. I’ve inspected the graph with Netron and it looks fine.

Steps To Reproduce

Compile and run the executable, I guess. You may need to change some path settings for the location of the ONNX graph and the class IDs file (which is just a text file; put in 18 rows of gibberish and you should be fine if there are any issues).

Hi @frenzi,
Thanks for sharing the details. We will try to look into it and get back with an update.

Thanks

I just got my 3090 and (painfully) reinstalled and upgraded everything to the latest, i.e. TensorRT 7.2.1 + CUDA 11.1. Now when I build the graph, it gives me a more descriptive error:

UNKNOWN: ImporterContext.hpp:154: Registering layer: Tile_3508 for ONNX node: Tile_3508
INTERNAL_ERROR: Assertion failed: equalIfKnown(a, b)
../builder/Layers.cpp:121
Aborting...
terminate called after throwing an instance of 'nvinfer1::AssertionFailure'
  what():  std::exception
Aborted (core dumped)
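
Going only by its name, the equalIfKnown(a, b) assertion reads like a check that two dimensions agree whenever both are known, with dynamic dimensions (printed as -1 in the log) treated as unknown. The sketch below illustrates that reading. It is a guess from the name, not TensorRT's actual implementation, and `equal_if_known` and `shapes_compatible` are hypothetical helpers.

```python
UNKNOWN = -1  # dynamic dimensions appear as -1 in the parser log


def equal_if_known(a, b):
    """True if either dimension is unknown, or if both are known and equal.

    Assumed semantics, inferred from the assertion's name only.
    """
    return a == UNKNOWN or b == UNKNOWN or a == b


def shapes_compatible(shape_a, shape_b):
    """Apply equal_if_known to every pair of dims in two same-rank shapes."""
    if len(shape_a) != len(shape_b):
        return False
    return all(equal_if_known(x, y) for x, y in zip(shape_a, shape_b))


# The two correlation inputs from the log agree on every known dimension.
print(shapes_compatible((-1, 384, 27, 8), (-1, 384, 27, 8)))

# A known/known mismatch (384 vs 81) is the kind of case such a check rejects.
print(shapes_compatible((-1, 384, 27, 8), (-1, 81, 27, 8)))
```

If that reading is right, the failure at Tile_3508 would mean two known dimensions disagreed somewhere around that node, but the log alone does not say which ones.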

Hi @frenzi ,
This looks like an ONNX model issue.
Can you please raise it in the respective forum?

Thanks!

Hi @AakankshaS
Once PyTorch 1.7 was released a few days later (Oct 28), I moved to it, as it has native CUDA 11 support (rather than recompiling master). It seems to build the ONNX graph in a slightly different way, and consequently everything works fine now.
Cheers