Description
Building a TensorRT engine from a pruned, quantized ONNX model fails while parsing a ConvTranspose node: the parser asserts in convDeconvMultiInput because, after pruning, the layer's input channel count no longer equals its output channel count.
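For reference, this is roughly how the model is exported after calibration (a minimal sketch, assuming a plain calibrated model object named model; the real script has more steps):

import torch
from pytorch_quantization import nn as quant_nn

# Emit real QuantizeLinear/DequantizeLinear nodes instead of fake-quant ops
quant_nn.TensorQuantizer.use_fb_fake_quant = True
model.eval()
dummy = torch.randn(1, 3, 3456, 480)  # the model's expected input shape
torch.onnx.export(model, dummy, 'quantized_depthnet.onnx', opset_version=13)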
Environment
TensorRT Version: 8.0.1.6
GPU Type: 2080
Nvidia Driver Version: 470.63.01
CUDA Version: 11.3
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.9
Relevant Files
I successfully calibrated my pruned model using the pytorch-quantization toolkit and exported it to an ONNX file.
However, when I try to build the engine from it, I receive the following error:
[TensorRT] VERBOSE: Parsing node: QuantizeLinear_694 [QuantizeLinear]
[TensorRT] VERBOSE: Searching for input: uplayer3.weight
[TensorRT] VERBOSE: Searching for input: 1270
[TensorRT] VERBOSE: Searching for input: 1396
[TensorRT] VERBOSE: QuantizeLinear_694 [QuantizeLinear] inputs: [uplayer3.weight -> (64, 55, 2, 2)[FLOAT]], [1270 -> (55)[FLOAT]], [1396 -> (55)[INT8]],
[TensorRT] VERBOSE: Registering layer: uplayer3.weight for ONNX node: uplayer3.weight
[TensorRT] VERBOSE: Registering tensor: 1273 for ONNX tensor: 1273
[TensorRT] VERBOSE: QuantizeLinear_694 [QuantizeLinear] outputs: [1273 -> (64, 55, 2, 2)[FLOAT]],
[TensorRT] VERBOSE: Parsing node: DequantizeLinear_695 [DequantizeLinear]
[TensorRT] VERBOSE: Searching for input: 1273
[TensorRT] VERBOSE: Searching for input: 1270
[TensorRT] VERBOSE: Searching for input: 1396
[TensorRT] VERBOSE: DequantizeLinear_695 [DequantizeLinear] inputs: [1273 -> (64, 55, 2, 2)[FLOAT]], [1270 -> (55)[FLOAT]], [1396 -> (55)[INT8]],
[TensorRT] VERBOSE: Registering tensor: 1274 for ONNX tensor: 1274
[TensorRT] VERBOSE: DequantizeLinear_695 [DequantizeLinear] outputs: [1274 -> (64, 55, 2, 2)[FLOAT]],
[TensorRT] VERBOSE: Parsing node: ConvTranspose_696 [ConvTranspose]
[TensorRT] VERBOSE: Searching for input: 1269
[TensorRT] VERBOSE: Searching for input: 1274
[TensorRT] VERBOSE: ConvTranspose_696 [ConvTranspose] inputs: [1269 -> (1, 64, 864, 120)[FLOAT]], [1274 -> (64, 55, 2, 2)[FLOAT]],
[TensorRT] VERBOSE: Convolution input dimensions: (1, 64, 864, 120)
Traceback (most recent call last):
File "/vayaalgo/Work/Pruning/Convert2TRT/main.py", line 247, in <module>
import_ONNX(cfg)
File "/vayaalgo/Work/Pruning/Convert2TRT/utils/models_quant.py", line 15, in import_ONNX
engine = backend.prepare(depthnet, verbose=True, device='CUDA:0')
File "/usr/local/lib/python3.7/dist-packages/onnx_tensorrt-8.0.1-py3.7.egg/onnx_tensorrt/backend.py", line 236, in prepare
File "/usr/local/lib/python3.7/dist-packages/onnx_tensorrt-8.0.1-py3.7.egg/onnx_tensorrt/backend.py", line 68, in __init__
RuntimeError: While parsing node number 696:
onnx2trt_utils.cpp:2128 In function convDeconvMultiInput:
[6] Assertion failed: (nChannel == -1 || C * ngroup == nChannel) && "The attribute group and the kernel shape misalign with the channel size of the input tensor. "
The problem comes from the ConvTranspose node: its input tensor has shape 1x64x864x120, while its weights have shape 64x55x2x2, i.e. the input channel count (64) is not equal to the output channel count (55).
I already found the same issue reported here: Assertion failed : QTA onnx imported by tensorrt 8 get error · Issue #1293 · NVIDIA/TensorRT · GitHub
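For context, the failing layer is equivalent to the following transposed convolution (a minimal sketch, assuming a plain nn.ConvTranspose2d with stride 2; the real uplayer3 is wrapped by the quantization toolkit):

import torch
import torch.nn as nn

# After pruning, the layer has 64 input channels but only 55 output channels,
# so its weight tensor is (in_channels, out_channels, kH, kW) = (64, 55, 2, 2).
deconv = nn.ConvTranspose2d(in_channels=64, out_channels=55, kernel_size=2, stride=2)
print(deconv.weight.shape)  # torch.Size([64, 55, 2, 2]) -- same shape as in the verbose log
print(deconv(torch.randn(1, 64, 864, 120)).shape)  # torch.Size([1, 55, 1728, 240])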
Steps To Reproduce
My code:
import onnx
import onnx_tensorrt.backend as backend
import numpy as np

# Parse the quantized ONNX model and build a TensorRT engine from it
depthnet = onnx.load('quantized_depthnet.onnx')
engine = backend.prepare(depthnet, verbose=True, device='CUDA:0')

# Run a random input of the model's expected shape through the engine
depthnet_input_tensor = np.random.random(size=(1, 3, 3456, 480)).astype(np.float32)
output_data = engine.run(depthnet_input_tensor)
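The failure should also be reproducible without the onnx_tensorrt backend, using the native TensorRT Python API (a sketch under that assumption, not my original script; the workspace size is arbitrary):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(TRT_LOGGER)
# Q/DQ (explicit quantization) networks require an explicit-batch network
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)
with open('quantized_depthnet.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))  # should report the convDeconvMultiInput assertion
config = builder.create_builder_config()
config.max_workspace_size = 1 << 30
config.set_flag(trt.BuilderFlag.INT8)  # required for networks with Q/DQ nodes
engine = builder.build_engine(network, config)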
My model is also attached: quantized_depthnet.onnx (7.7 MB)
I would like to stress that the ConvTranspose block's input channel count not being equal to its output channel count is a result of pruning the model.