Hi everyone!
We’d like to deploy our Caffe model (based on VGG16, but somewhat larger) as a TensorRT INT8 engine. For FP32 and FP16 the engine builds without problems.
For INT8 we generated a calibration file with the TensorRT Python API, following the tutorial at https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/ (replacing Cityscapes with COCO and substituting our own model).
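Our calibrator looks roughly like the sketch below. It is simplified and written against the standalone TensorRT Python API (trt.IInt8EntropyCalibrator2); the class name, the batch loading, and the cache file name are placeholders, and the exact base class differs between TensorRT versions.

import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class CocoEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, batches, cache_file="calibration.cache"):
        super().__init__()
        # 'batches' yields np.float32 arrays of shape (N, C, H, W)
        self.batches = iter(batches)
        self.cache_file = cache_file
        self.current = next(self.batches)
        self.batch_size = self.current.shape[0]
        self.device_input = cuda.mem_alloc(self.current.nbytes)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.current is None:
            return None  # no more calibration data
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(self.current))
        self.current = next(self.batches, None)
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)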
However, when we use the calibration file to build the INT8 engine, the build fails with the following message:
../builder/cudnnBuilderWeightConverters.cpp:97: std::vector<float>
nvinfer1::cudnn::makeConvolutionInt8Weights(nvinfer1::ConvolutionParameters&, const
nvinfer1::cudnn::EngineTensor&, const nvinfer1::cudnn::EngineTensor&, nvinfer1::CpuMemoryGroup&, float):
Assertion `p.kernelWeights.type == DataType::kFLOAT && p.biasWeights.type == DataType::kFLOAT' failed.
Aborted
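For context, the build step is roughly equivalent to the sketch below (simplified; the deploy/model paths, the output blob name "prob", and the batch/workspace sizes are placeholders, and the exact builder calls depend on the TensorRT version):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def build_int8_engine(calibrator):
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.CaffeParser() as parser:
        # Parse the Caffe model with FP32 weights (the INT8 weight converter
        # expects kernel and bias weights to arrive as kFLOAT).
        tensors = parser.parse(deploy="deploy.prototxt",
                               model="model.caffemodel",
                               network=network,
                               dtype=trt.float32)
        network.mark_output(tensors.find("prob"))
        builder.max_batch_size = 8
        builder.max_workspace_size = 1 << 30
        builder.int8_mode = True
        builder.int8_calibrator = calibrator
        return builder.build_cuda_engine(network)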
We tested the same procedure with AlexNet, VGG16, and GoogLeNet, and it works fine for all of them.
Does anyone have any idea what the problem might be?
Thanks in advance!