TensorRT inference engine with a quantized ONNX model does not work

I’m using TensorRT-8.2.5.1 and the sample under TensorRT-8.2.5.1/samples/sampleOnnxMnist.

I replaced the default model with my own quantized ONNX models: one quantized with ONNX Runtime (onnxRT) and one with PyTorch.
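
For context, the onnxRT quantization is essentially the standard ONNX Runtime static-quantization path that emits Q/DQ nodes. A minimal sketch of that step (the file names, input name and random calibration data are placeholders, not my exact script):

```python
# Minimal sketch of the ONNX Runtime static-quantization step.
# File names, the input name and the calibration data are placeholders.
import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantFormat,
                                      QuantType, quantize_static)


class DummyCalibrationReader(CalibrationDataReader):
    """Feeds a few dummy batches to the calibrator; replace with real data."""

    def __init__(self, input_name="input.0", n_batches=8):
        self.batches = iter(
            {input_name: np.random.rand(1, 1, 28, 28).astype(np.float32)}
            for _ in range(n_batches))

    def get_next(self):
        return next(self.batches, None)


quantize_static(
    model_input="model_fp32.onnx",        # placeholder
    model_output="model_int8_qdq.onnx",   # placeholder
    calibration_data_reader=DummyCalibrationReader(),
    quant_format=QuantFormat.QDQ,         # emit QuantizeLinear/DequantizeLinear pairs
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
    per_channel=True,
)
```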

When building the network and running inference, I get errors. For the onnxRT-quantized ONNX model they look like the following:

[10/24/2022-16:33:13] [E] [TRT] classifier.1.bias_DequantizeLinear_dequantize_scale_node: only activation types allowed as input to this layer.
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:773: While parsing node number 0 [DequantizeLinear -> "classifier.1.bias"]:
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:775: input: "classifier.1.bias_quantized"
input: "classifier.1.bias_quantized_scale"
input: "classifier.1.bias_quantized_zero_point"
output: "classifier.1.bias"
name: "classifier.1.bias_DequantizeLinear"
op_type: "DequantizeLinear"

[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[10/24/2022-16:33:13] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - classifier.1.bias_DequantizeLinear
classifier.1.bias_DequantizeLinear_dequantize_scale_node: only activation types allowed as input to this layer.

and for the PyTorch-quantized ONNX model:

[10/24/2022-16:08:33] [I] [TRT] No importer registered for op: FakeQuantize. Attempting to import as plugin.
[10/24/2022-16:08:33] [I] [TRT] Searching for plugin: FakeQuantize, plugin_version: 1, plugin_namespace: 
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:773: While parsing node number 2 [FakeQuantize -> "100"]:
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:775: input: "input.0"
input: "98"
input: "99"
input: "98"
input: "99"
output: "100"
name: "FakeQuantize_2"
op_type: "FakeQuantize"
attribute {
  name: "levels"
  i: 256
  type: INT
}
domain: "org.openvinotoolkit"

[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[10/24/2022-16:08:33] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:4871 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

To use a quantized model in TensorRT, do I have to quantize my model within the TensorRT framework itself (e.g. TensorRT's own INT8 calibration, roughly like the sketch below)?
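
(By "quantize under the TensorRT framework" I mean something like TensorRT's own post-training INT8 calibration of the unquantized model, as in this rough sketch; the model file name and the random calibration data are placeholders.)

```python
# Rough sketch of TensorRT's own INT8 path: parse the unquantized FP32 ONNX
# model and let the builder calibrate it. Model file name and data are placeholders.
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class RandomCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds a few dummy batches to the calibrator; replace with real data."""

    def __init__(self, shape=(1, 1, 28, 28), n_batches=64):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.shape = shape
        self.n_batches = n_batches
        self.count = 0
        self.d_input = cuda.mem_alloc(int(np.prod(shape)) * 4)  # float32 buffer

    def get_batch_size(self):
        return self.shape[0]

    def get_batch(self, names):
        if self.count >= self.n_batches:
            return None  # no more calibration data
        batch = np.random.rand(*self.shape).astype(np.float32)
        cuda.memcpy_htod(self.d_input, np.ascontiguousarray(batch))
        self.count += 1
        return [int(self.d_input)]

    def read_calibration_cache(self):
        return None

    def write_calibration_cache(self, cache):
        pass


logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model_fp32.onnx", "rb") as f:  # placeholder: unquantized model
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = RandomCalibrator()
engine_bytes = builder.build_serialized_network(network, config)
```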

Hi @bobyfoo ,
You should register the node to have a successful run.
Please refer to the links below for custom plugin implementation details and a sample:

While the IPluginV2 and IPluginV2Ext interfaces are still supported for backward compatibility with TensorRT 5.1 and 6.0.x respectively, we recommend that you write new plugins or refactor existing ones to target the IPluginV2DynamicExt or IPluginV2IOExt interfaces instead.
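
For illustration only: once such a plugin is implemented against IPluginV2DynamicExt (its creator must report the name "FakeQuantize" and version "1" that the parser searched for in your log) and built into a shared library, loading and registering it before parsing looks roughly like this in the Python API. The library and model file names below are hypothetical:

```python
# Sketch of loading a custom plugin library so the ONNX parser can find the op.
# "libfakequantize_plugin.so" and "model_quantized.onnx" are hypothetical names.
import ctypes
import tensorrt as trt

ctypes.CDLL("libfakequantize_plugin.so")  # loading the .so triggers REGISTER_TENSORRT_PLUGIN
logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")   # expose all registered plugins to the parser

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model_quantized.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
```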

Thanks!
