I have a YOLOv2 model deployed successfully with TensorRT 3, using TensorRT for everything up to the final 1x1 convolutional prediction layer. I’d like to convert the early stages to int8 precision. When I run the conversion process I get this error:
NvPluginYOLO.cu:58: virtual void nvinfer1::plugin::PReLU::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion `mBatchDim == 1' failed.
Initially I thought mBatchDim was just an internal renaming of maxBatchSize, but I’m now unsure what is actually causing this error.
To be clear, the YOLOv2 conversion code works correctly without int8 calibration.
The error occurs whether the nvinfer1::IInt8EntropyCalibrator::getBatchSize() function returns a batch size of 5 or of 1.
My max batch size is set to 1 via IBuilder::setMaxBatchSize().
I was under the impression that, since int8 is not supported by plugin layers, the data is converted from int8 to fp32 and back around each plugin layer. The build log seems to confirm this:
Adding reformat layer: conv2 reformatted input 0 ((Unnamed ITensor* 4)) from Float(1,480,138240,4423680) to Int8(1,480,138240:4,1105920)
Adding reformat layer: relu_conv2 reformatted input 0 (bn_conv2) from Int8(1,480,138240:4,2211840) to Float(1,480,138240,8847360)
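For anyone checking the numbers in those two log lines, here is a minimal sketch of how I read them. It assumes the values in parentheses are strides (W, H, C, N), that ":4" means channels are packed in groups of four, and that the tensors are 32 and 64 channels of 288x480. None of this is confirmed by the TensorRT documentation, and packedStrides is a hypothetical helper, not a TensorRT call.

```cpp
#include <cassert>
#include <cstdint>

// Strides in the order they appear in the log: W, H, C, N.
struct Strides {
    int64_t w, h, c, n;
};

// Hypothetical helper (not part of TensorRT): strides for a C x H x W tensor
// whose channels are packed in groups of `vec`.
// vec == 1 models the plain Float layout, vec == 4 the Int8 ":4" layout.
// Assumption: the batch stride is counted in channel groups, not channels.
Strides packedStrides(int64_t channels, int64_t height, int64_t width, int64_t vec) {
    assert(channels > 0 && height > 0 && width > 0 && vec > 0);
    const int64_t groups = (channels + vec - 1) / vec;  // round up to whole groups
    Strides s;
    s.w = 1;               // adjacent columns are contiguous
    s.h = width;           // one row
    s.c = height * width;  // one channel plane (or one group of planes)
    s.n = s.c * groups;    // one full image in the batch
    return s;
}
```

Under those assumptions, packedStrides(32, 288, 480, 1).n gives 4423680 and packedStrides(32, 288, 480, 4).n gives 1105920, matching the conv2 line; with 64 channels the results are 8847360 and 2211840, matching the relu_conv2 line. So the Float/Int8 reformatting around the plugin layer looks consistent with a per-image batch stride, independent of the calibration batch size.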