Building INT8 engine fails: ../builder/cudnnBuilderWeightConverters.cpp:97:...

Hi everyone!

We’d like to deploy our Caffe model (based on VGG16, but somewhat larger) as a TensorRT INT8 engine. For FP32 and FP16 the build works straightforwardly.
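For reference, this is roughly what the FP32/FP16 builds look like with the Lite API from that TensorRT version (a minimal sketch; the file names, node names, and input shape are placeholders, not our actual model):

import tensorrt as trt

# Sketch only: deploy.prototxt / model.caffemodel and the node names
# are placeholders. For FP32, use trt.infer.DataType.FLOAT instead.
engine = trt.lite.Engine(framework="c1",  # Caffe
                         deployfile="deploy.prototxt",
                         modelfile="model.caffemodel",
                         max_batch_size=1,
                         input_nodes={"data": (3, 224, 224)},
                         output_nodes=["prob"],
                         data_type=trt.infer.DataType.HALF)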

For INT8 we generated a calibration file with the Python API, based on the tutorial at https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/ (replacing CityScapes with COCO and using our model).
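In condensed form, the calibration setup follows the tutorial (a sketch only: ImageBatchStream and PythonEntropyCalibrator are the helper classes shipped with the tutorial's sample code, calibration_files is our list of COCO image paths, and sub_mean_chw is the preprocessor shown further down):

import calibrator as calib  # helper module from the tutorial's sample code

NUM_IMAGES_PER_BATCH = 5
batchstream = calib.ImageBatchStream(NUM_IMAGES_PER_BATCH,
                                     calibration_files, sub_mean_chw)
int8_calibrator = calib.PythonEntropyCalibrator(["data"], batchstream)

engine = trt.lite.Engine(...,  # same Caffe model as above
                         preprocessors={"data": sub_mean_chw},
                         data_type=trt.infer.DataType.INT8,
                         calibrator=int8_calibrator)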

However, when we use the calibration file to build the INT8 engine, the process fails with the following message:

../builder/cudnnBuilderWeightConverters.cpp:97: std::vector<float>
nvinfer1::cudnn::makeConvolutionInt8Weights(nvinfer1::ConvolutionParameters&, const
nvinfer1::cudnn::EngineTensor&, const nvinfer1::cudnn::EngineTensor&, nvinfer1::CpuMemoryGroup&, float):
Assertion `p.kernelWeights.type == DataType::kFLOAT && p.biasWeights.type == DataType::kFLOAT' failed.
Aborted

We tested our procedure with AlexNet, VGG16, and GoogLeNet, and it works fine for all of them.

Does anyone have any idea what the problem might be?

Thanks in advance!

I solved this issue by rechecking the preprocessing of the inputs when generating the calibration file, so the problem was in the calibration stage. I made sure the preprocessing exactly matches the preprocessing used in the application our DNN runs in.

These are the relevant lines from the example (https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/):

import numpy as np

MEAN = (71.60167789, 82.09696889, 72.30508881)
...
def sub_mean_chw(data):
    data = data.transpose((1, 2, 0))  # CHW -> HWC
    data -= np.array(MEAN)            # broadcast per-channel mean subtraction
    data = data.transpose((2, 0, 1))  # HWC -> CHW
    return data
...
engine = trt.lite.Engine(...
                preprocessors={"data": sub_mean_chw},
                ...
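What finally caught the mismatch for us was a simple side-by-side check of the two preprocessing paths on the same image (a pure NumPy sketch; app_preprocess is a hypothetical stand-in for whatever preprocessing your deployment application applies):

import numpy as np

def check_preprocessing_matches(image_chw, app_preprocess, atol=1e-5):
    # image_chw: one calibration image as a float32 CHW array, exactly
    # as it is fed to the calibrator's batch stream.
    calibrated = sub_mean_chw(image_chw.copy())
    deployed = app_preprocess(image_chw.copy())  # hypothetical: the app's own path
    assert calibrated.shape == deployed.shape, "shape mismatch"
    assert np.allclose(calibrated, deployed, atol=atol), \
        "calibration and deployment preprocessing diverge"

Running a check like this over a handful of calibration images before generating the cache would have caught our mistake immediately.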