TensorRT inference engine with a quantized ONNX model does not work

I’m using TensorRT-8.2.5.1,
and TensorRT-8.2.5.1/samples/sampleOnnxMnist

I replaced the default model with my quantized ONNX model, where the model was quantized with ONNX Runtime in one case and with PyTorch in the other.
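
For reference, the ONNX Runtime side was a static quantization in QDQ format, roughly like the sketch below (the file paths and the random calibration reader are placeholders, not my exact script):

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader, QuantFormat, QuantType, quantize_static
)

class RandomCalibReader(CalibrationDataReader):
    """Feeds a few random batches to the quantizer (placeholder for real calibration data)."""
    def __init__(self, input_name, shape, n=8):
        self._data = iter(
            {input_name: np.random.rand(*shape).astype(np.float32)} for _ in range(n)
        )

    def get_next(self):
        # Return None when calibration data is exhausted.
        return next(self._data, None)

quantize_static(
    "model_fp32.onnx",                        # placeholder input path
    "model_int8_qdq.onnx",                    # placeholder output path
    calibration_data_reader=RandomCalibReader("input.0", (1, 1, 28, 28)),
    quant_format=QuantFormat.QDQ,             # emits QuantizeLinear/DequantizeLinear pairs
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)
```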

When building the network and performing inference, I get errors. For the ONNX Runtime-quantized model they look like the following:

[10/24/2022-16:33:13] [E] [TRT] classifier.1.bias_DequantizeLinear_dequantize_scale_node: only activation types allowed as input to this layer.
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:773: While parsing node number 0 [DequantizeLinear -> "classifier.1.bias"]:
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:775: input: "classifier.1.bias_quantized"
input: "classifier.1.bias_quantized_scale"
input: "classifier.1.bias_quantized_zero_point"
output: "classifier.1.bias"
name: "classifier.1.bias_DequantizeLinear"
op_type: "DequantizeLinear"

[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - classifier.1.bias_DequantizeLinear
classifier.1.bias_DequantizeLinear_dequantize_scale_node: only activation types allowed as input to this layer.

and for the PyTorch-quantized ONNX model:

[10/24/2022-16:08:33] [I] [TRT] No importer registered for op: FakeQuantize. Attempting to import as plugin.
[10/24/2022-16:08:33] [I] [TRT] Searching for plugin: FakeQuantize, plugin_version: 1, plugin_namespace: 
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:773: While parsing node number 2 [FakeQuantize -> "100"]:
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:775: input: "input.0"
input: "98"
input: "99"
input: "98"
input: "99"
output: "100"
name: "FakeQuantize_2"
op_type: "FakeQuantize"
attribute {
  name: "levels"
  i: 256
  type: INT
}
domain: "org.openvinotoolkit"

[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:4871 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

To use a quantized model in TensorRT, do I have to quantize my model within the TensorRT framework itself?

Hi @AnotherChubby ,
You should register the node (as a custom plugin) to have a successful run.
Please refer to the links below on custom plugin implementation and a sample:

While the IPluginV2 and IPluginV2Ext interfaces are still supported for backward compatibility with TensorRT 5.1 and 6.0.x respectively, we recommend that you write new plugins or refactor existing ones to target the IPluginV2DynamicExt or IPluginV2IOExt interfaces instead.
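
As a quick check from the Python API, you can also query the plugin registry to confirm whether a creator for the failing op is registered at all (a minimal sketch; the op name below is taken from your log):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
# Load the stock TensorRT plugins into the registry first.
trt.init_libnvinfer_plugins(logger, "")

registry = trt.get_plugin_registry()

# List every registered creator (name, version, namespace).
for creator in registry.plugin_creator_list:
    print(creator.name, creator.plugin_version, creator.plugin_namespace)

# Look up the op that failed to import; None means you need to
# implement and register a custom plugin for it.
fq_creator = registry.get_plugin_creator("FakeQuantize", "1", "")
print("FakeQuantize creator found:", fq_creator is not None)
```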

Thanks!


@AnotherChubby Have you solved this problem yet?
@AakankshaS I encountered the same problem. However, I checked the onnx-tensorrt support matrix, and the DequantizeLinear operator is indeed supported. Do I also need to write a plugin in this case?

Based on the error message it seems the problem is the data type, but I tried different types and it still does not work.
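
One way to narrow this down (a sketch, with a placeholder model path) is to walk the ONNX graph and check which DequantizeLinear nodes consume initializers such as weights or biases rather than activations, since the error complains about the bias input, not about the operator itself:

```python
import onnx

model = onnx.load("model_int8_qdq.onnx")   # placeholder path
graph = model.graph
initializer_names = {init.name for init in graph.initializer}

for node in graph.node:
    if node.op_type != "DequantizeLinear":
        continue
    data_input = node.input[0]
    kind = "initializer (weight/bias)" if data_input in initializer_names else "activation"
    print(f"{node.name}: dequantizes {data_input} -> {kind}")
```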