I am trying to calibrate Faster R-CNN to INT8.
Most of the code is from https://devtalk.nvidia.com/default/topic/1015387/tensorrt-fails-to-build-fasterrcnn-gie-model-with-using-int8/
Here is the code for calibration:
CHECK(cudaMalloc(&mImInfoInput, batchSize * 3 * sizeof(float)));
float *imInfo = new float[batchSize * 3];
for (int i = 0; i < batchSize; i++) {
    imInfo[i * 3] = height;        // number of rows
    imInfo[i * 3 + 1] = width;     // number of columns
    imInfo[i * 3 + 2] = scale;     // image scale
}
CHECK(cudaMemcpy(mImInfoInput, imInfo, batchSize * 3 * sizeof(float), cudaMemcpyHostToDevice));
delete[] imInfo;
....
bool getBatch(void *bindings[], const char *names[], int nbBindings) override {
    if (!mStream.next()) return false;
    CHECK(cudaMemcpy(mDataInput, mStream.getBatch(), mInputCount * sizeof(float), cudaMemcpyHostToDevice));
    assert(!strcmp(names[0], "data"));
    assert(!strcmp(names[1], "im_info"));
    bindings[0] = mDataInput;
    bindings[1] = mImInfoInput;
    return true;
}
However, the accuracy drops significantly after calibration. I have a few questions about calibration:
- Do I need to set a large batch size for calibration? How many images should be used? Is there any rule of thumb?
- Here is my config for Faster R-CNN: my input images are 1024 x 512. Before forwarding, I resize each input image to 512 x 256, so I put [1024, 512, 0.5] in im_info, and the rois are then scaled by 0.5 to get the correct size. In the same way, when doing calibration for INT8, every image in a batch is 512 x 256 and I put [1024, 512, 0.5] in im_info. Am I doing this right?
Thanks.