TensorRT fails to build QAT (pytorch-quantization) int8 3D operators

Description

After training a quantized int8 MONAI BasicUNet 3D semantic segmentation model with the pytorch-quantization library and exporting it to ONNX, building the engine with the trtexec command fails. If the BasicUNet model is the quantized 2D version, building the TensorRT engine with trtexec succeeds.

There are two kinds of errors in the trtexec build log:

1. ConvTranspose3d:

[01/28/2022-07:16:16] [E] Error[2]: [optimizer.cpp::isPolymorphic::1072] Error Code 2: Internal Error (Assertion asCopyingLeafNode() ? candidateChoices.empty() : !candidateChoices.empty() failed. upcat_4.upsample.deconv.weight + QuantizeLinear_215_quantize_scale_node + ConvTranspose_217)

2. Conv3d or MaxPool3d:

[01/28/2022-07:26:32] [E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node MaxPool_42.)

and

[E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9.)

A simple piece of code can reproduce the errors above.

Environment

Run in a Docker container based on pytorch:21.12-py3

TensorRT Version: 8.2.1.8
GPU Type: A100
Nvidia Driver Version: 510.39.01
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.11.0a0+b6df043
Baremetal or Container (if container which image + tag): container, based on pytorch:21.12-py3
pytorch_quantization: 2.1.2

Relevant Files

Code to produce a Conv3d or ConvTranspose3d ONNX model:

import torch
print('torch',torch.__version__)
import torch.nn as nn
import pytorch_quantization
print('pytorch_quantization', pytorch_quantization.__version__)
from pytorch_quantization import quant_modules
from pytorch_quantization import nn as quant_nn
from pytorch_quantization.tensor_quant import QuantDescriptor


class DeconvTest(nn.Module):
    def __init__(self, dim):
        super().__init__()
        if dim == 3:
            self.dc = nn.ConvTranspose3d(1, 128, (3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
        elif dim == 2:
            self.dc = nn.ConvTranspose2d(1, 128, (3, 3), stride=(1, 1), padding=(1, 1))
        elif dim == 1:
            self.dc = nn.ConvTranspose1d(1, 128, (3,), stride=(1,), padding=(1,))

    def forward(self, x):
        return self.dc(x)


class ConvTest(nn.Module):
    def __init__(self):
        super().__init__()
        self.op = nn.Conv3d(1, 32, (3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
        self.op1 = nn.Conv3d(32, 64, (3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))

    def forward(self, x):
        x = self.op(x)
        x = self.op1(x)
        return x
    
quant_desc_weight = QuantDescriptor(num_bits=8, axis=0)
quant_nn.QuantConvTranspose3d.set_default_quant_desc_weight(quant_desc_weight)

quant_modules.initialize()

dim=3
# model = DeconvTest(dim=dim)
model = ConvTest()
model.cuda()
print(model)

if dim == 3:
    shapes = (1, 1, 128, 128, 128)
elif dim == 2:
    shapes = (1, 1, 128, 128)
elif dim == 1:
    shapes = (1, 1, 128)

dummy_input = torch.randn(shapes, device='cuda')

input_names = ['actual_input_1']
output_names = ['output']
quant_nn.TensorQuantizer.use_fb_fake_quant = True
model.eval()
torch.onnx.export(model, dummy_input, 'pyt_test_' + str(dim) + 'd.onnx',
                  input_names=input_names, output_names=output_names,
                  verbose=False, opset_version=13)
quant_nn.TensorQuantizer.use_fb_fake_quant = False
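Before handing the file to trtexec, it can help to confirm that the fake-quant export actually produced QuantizeLinear/DequantizeLinear pairs around the Conv nodes. A minimal, stdlib-only sketch (the helper name `qdq_op_counts` is hypothetical; the `onnx` usage shown in the comment assumes that package is available, as it is in the NGC PyTorch container):

```python
from collections import Counter

def qdq_op_counts(nodes):
    """Tally ONNX node op types; `nodes` is e.g. onnx.load(path).graph.node."""
    return Counter(node.op_type for node in nodes)

# Usage (assumes the `onnx` package is installed):
#   import onnx
#   print(qdq_op_counts(onnx.load('pyt_test_3d.onnx').graph.node))
# A correct fake-quant export shows QuantizeLinear/DequantizeLinear pairs
# feeding each Conv/ConvTranspose node.
```

If the counts show no Q/DQ nodes at all, the problem is in the export step rather than in the TensorRT build.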

Use trtexec to test the ONNX model:

#! /bin/bash
MODEL_NAME=pyt_test_3d
PRECISION=int8

trtexec --onnx=${MODEL_NAME}.onnx \
        --${PRECISION}   --workspace=8192 --noBuilderCache --verbose \
        > ${MODEL_NAME}.log 2>&1

Produced ONNX file and trtexec log:
pyt_test_3d.log (51.2 KB)
pyt_test_3d.onnx (221.7 KB)
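For larger models the verbose trtexec log gets long; a small stdlib-only filter (a hypothetical helper, with sample lines taken from the errors above) can pull out just the nodes for which TensorRT could not find an implementation:

```python
import re

def failing_nodes(log_text):
    """Extract node names from 'Could not find any implementation' errors."""
    return re.findall(r"Could not find any implementation for node (.+?)\.\)",
                      log_text)

sample = (
    "[E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: "
    "Internal Error (Could not find any implementation for node MaxPool_42.)\n"
    "[E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: "
    "Internal Error (Could not find any implementation for node "
    "op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9.)\n"
)
print(failing_nodes(sample))
# → ['MaxPool_42', 'op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9']
```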

Steps To Reproduce

  1. Run the Python code to produce an ONNX model file.
  2. Use the bash script to test the ONNX model.
  3. Edit the Python code to use model = DeconvTest(dim=dim), then repeat step 2.

Hi,

We could reproduce the error. Our team will work on it.

Thank you.