Convert Faster R-CNN to INT8

I am trying to calibrate Faster R-CNN for INT8 with TensorRT.

Most of the code is from https://devtalk.nvidia.com/default/topic/1015387/tensorrt-fails-to-build-fasterrcnn-gie-model-with-using-int8/

Here is the code for calibration:

CHECK(cudaMalloc(&mImInfoInput, batchSize * 3 * sizeof(float)));

// Fill one (rows, columns, scale) triple per image on the host, then copy it to the device.
float *imInfo = new float[batchSize * 3];
for (int i = 0; i < batchSize; i++) {
    imInfo[i * 3] = height;     // number of rows
    imInfo[i * 3 + 1] = width;  // number of columns
    imInfo[i * 3 + 2] = scale;  // image scale
}

CHECK(cudaMemcpy(mImInfoInput, imInfo, batchSize * 3 * sizeof(float), cudaMemcpyHostToDevice));
delete[] imInfo;

....

bool getBatch(void *bindings[], const char *names[], int nbBindings) override {
    // Stop calibration when the stream has no more batches.
    if (!mStream.next()) return false;

    CHECK(cudaMemcpy(mDataInput, mStream.getBatch(), mInputCount * sizeof(float), cudaMemcpyHostToDevice));
    // The network has two inputs: the image data and im_info.
    assert(!strcmp(names[0], "data"));
    assert(!strcmp(names[1], "im_info"));
    bindings[0] = mDataInput;
    bindings[1] = mImInfoInput;  // the same im_info buffer is passed for every batch
    return true;
}

However, accuracy drops significantly after INT8 calibration.

I have some questions about calibration:

  1. Do I need to set a large batch size for calibration? How many images should be used? Is there a rule of thumb?

  2. Here is my configuration for Faster R-CNN: my input image is 1024 * 512. Before the forward pass, I resize it to 512 * 256, so I put [1024, 512, 0.5] in imInfo. The ROIs are then scaled by 0.5 to get the correct size.
    In the same way, when calibrating for INT8, every image in a batch is 512 * 256 and I put [1024, 512, 0.5] in imInfo. Am I doing this right?
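
To make question 2 concrete, here is a minimal host-only sketch of the im_info layout I am using. `makeImInfo` is a hypothetical helper name of mine, not a TensorRT function. The (rows, columns, scale) order follows my calibration code above, and whether these are the right values to pass is exactly what I am unsure about.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TensorRT): builds the host-side im_info
// buffer with one (rows, columns, scale) triple per image in the batch.
std::vector<float> makeImInfo(int batchSize, float height, float width, float scale) {
    assert(batchSize > 0);
    std::vector<float> imInfo(static_cast<std::size_t>(batchSize) * 3);
    for (int i = 0; i < batchSize; ++i) {
        imInfo[i * 3]     = height;  // number of rows
        imInfo[i * 3 + 1] = width;   // number of columns
        imInfo[i * 3 + 2] = scale;   // image scale
    }
    return imInfo;
}
```

With the values from question 2, this would be called as `makeImInfo(batchSize, 1024.f, 512.f, 0.5f)`, and the result's `data()` pointer would be copied to `mImInfoInput` with the same `cudaMemcpy` call as in my calibration code.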

Thanks.