I made a quantized TFLite model using the TensorFlow Model Optimization Toolkit. I then converted it to an ONNX file and tried to convert that ONNX file to a TensorRT engine, but I hit the error below. How can I solve this problem?

Also, is this pipeline possible at all?
tflite → onnx → trt (TensorRT engine)

I understand that a TFLite model does not run the way I want on Windows. Can I measure the inference speed-up from quantization on Windows through the process above?
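For reference, the two conversion steps I ran look roughly like this (the model file names are placeholders; I used tf2onnx for the TFLite → ONNX step and trtexec from the TensorRT tools for the ONNX → engine step):

```shell
# Step 1: TFLite → ONNX with tf2onnx
# (the --tflite input flag requires tf2onnx >= 1.8)
python -m tf2onnx.convert \
    --tflite efficientnetb0_quant.tflite \
    --output efficientnetb0_quant.onnx \
    --opset 13

# Step 2: ONNX → TensorRT engine with trtexec
# --int8 enables INT8 precision for the QuantizeLinear/DequantizeLinear nodes
trtexec --onnx=efficientnetb0_quant.onnx \
        --int8 \
        --saveEngine=efficientnetb0_quant.trt
```

The error below is printed during Step 2, while the ONNX parser imports the model.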
ModelImporter.cpp:119: Searching for input: scale__162
ModelImporter.cpp:119: Searching for input: zero_point__163
ModelImporter.cpp:125: sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars [DequantizeLinear] inputs: [sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars;sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars/ReadVariableOp/resource;sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars/ReadVariableOp_1/resource → ()], [scale__162 → ()], [zero_point__163 → ()],
onnx2trt_utils.cpp:286: TensorRT currenly supports only zero shifts values for QuatizeLinear/DequantizeLinear ops
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
ImporterContext.hpp:120: Registering tensor: sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars for ONNX tensor: sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars
ModelImporter.cpp:179: sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars [DequantizeLinear] outputs: [sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars → ()],
ModelImporter.cpp:103: Parsing node: sequential/efficientnetb0/quant_normalization/sub [Sub]
ModelImporter.cpp:119: Searching for input: sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
ModelImporter.cpp:119: Searching for input: sequential/efficientnetb0/quant_normalization/Reshape
ModelImporter.cpp:125: sequential/efficientnetb0/quant_normalization/sub [Sub] inputs: [sequential/efficientnetb0/quantize_layer/AllValuesQuantize/FakeQuantWithMinMaxVars → ()], [sequential/efficientnetb0/quant_normalization/Reshape → (1, 1, 1, 3)],
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd;sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/BiasAdd/ReadVariableOp/resource_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
sequential/efficientnetb0/quant_block7a_se_expand/Conv2D;sequential/efficientnetb0/quant_block7a_se_expand/LastValueQuant/FakeQuantWithMinMaxVars: invalid weights type of Int8
ERROR: onnx2trt_utils.cpp:680 In function elementwiseHelper:
[8] Assertion failed: tensor_ptr->getDimensions().nbDims == maxNbDims && "Failed to broadcast tensors elementwise!"
Assertion failed: tensor_ptr->getDimensions().nbDims == maxNbDims && "Failed to broadcast tensors elementwise!"
Building Cuda Engine
Network must have at least one output
Network validation failed.