I'm using TensorRT-8.2.5.1 and the sample in TensorRT-8.2.5.1/samples/sampleOnnxMnist.
I replaced the default model with my own quantized ONNX models: one quantized with ONNX Runtime (onnxRT) and one with PyTorch.
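For reference, the ONNX Runtime quantization was post-training quantization roughly like this minimal sketch (file names, input shape, and the calibration reader are simplified placeholders, not my exact script):

# Sketch: ONNX Runtime post-training static quantization in QDQ format (placeholder names).
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class DummyCalibrationReader(CalibrationDataReader):
    # A real reader would iterate over representative calibration data.
    def __init__(self, input_name="input.0", num_batches=8):
        self.batches = iter(
            {input_name: np.random.rand(1, 3, 224, 224).astype(np.float32)}
            for _ in range(num_batches)
        )

    def get_next(self):
        return next(self.batches, None)

quantize_static(
    model_input="model_fp32.onnx",        # placeholder file name
    model_output="model_quant_ort.onnx",  # placeholder file name
    calibration_data_reader=DummyCalibrationReader(),
    quant_format=QuantFormat.QDQ,         # emit QuantizeLinear/DequantizeLinear pairs
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)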
When building the network and running inference, I get errors like the following for the onnxRT-quantized ONNX model:
[10/24/2022-16:33:13] [E] [TRT] classifier.1.bias_DequantizeLinear_dequantize_scale_node: only activation types allowed as input to this layer.
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:773: While parsing node number 0 [DequantizeLinear -> "classifier.1.bias"]:
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:775: input: "classifier.1.bias_quantized"
input: "classifier.1.bias_quantized_scale"
input: "classifier.1.bias_quantized_zero_point"
output: "classifier.1.bias"
name: "classifier.1.bias_DequantizeLinear"
op_type: "DequantizeLinear"
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - classifier.1.bias_DequantizeLinear
classifier.1.bias_DequantizeLinear_dequantize_scale_node: only activation types allowed as input to this layer.
And the following for the PyTorch-quantized ONNX model:
[10/24/2022-16:08:33] [I] [TRT] No importer registered for op: FakeQuantize. Attempting to import as plugin.
[10/24/2022-16:08:33] [I] [TRT] Searching for plugin: FakeQuantize, plugin_version: 1, plugin_namespace:
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:773: While parsing node number 2 [FakeQuantize -> "100"]:
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:775: input: "input.0"
input: "98"
input: "99"
input: "98"
input: "99"
output: "100"
name: "FakeQuantize_2"
op_type: "FakeQuantize"
attribute {
  name: "levels"
  i: 256
  type: INT
}
domain: "org.openvinotoolkit"
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:4871 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
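A short script like the one below (a sketch; the model file name is a placeholder) can list the quantization nodes in either exported model and show whether each input is an initializer (weight/bias) or an activation, which is how the nodes named in the errors above can be located:

# Sketch: list quantization nodes in an exported ONNX model and what feeds them.
import onnx

model = onnx.load("model_quant_ort.onnx")  # placeholder file name
initializers = {init.name for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type in ("QuantizeLinear", "DequantizeLinear", "FakeQuantize"):
        print(node.domain or "ai.onnx", node.op_type, node.name)
        for inp in node.input:
            kind = "initializer (weight/bias)" if inp in initializers else "activation"
            print(f"    {inp}: {kind}")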
To use a quantized model in TensorRT, do I have to quantize the model with TensorRT's own tooling instead?
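For context, this is a minimal Python sketch of the build step that fails, equivalent to what sampleOnnxMnist does in C++ (the ONNX file name is a placeholder):

# Sketch: Python equivalent of the sample's engine-build step (TensorRT 8.2 API).
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model_quant_ort.onnx", "rb") as f:  # placeholder file name
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))  # the errors quoted above show up here
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30      # 1 GiB
config.set_flag(trt.BuilderFlag.INT8)    # the model carries its own Q/DQ scales

serialized_engine = builder.build_serialized_network(network, config)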