*** HELP ME! How do I get the input tensor size (C,H,W), how are inputs/outputs ordered, and how do I handle a dynamic output size?

Hello, I’m trying to convert PyTorch’s Detectron FPN + Mask/Keypoint R-CNN and some SOTA models to TensorRT (https://github.com/dedoogong/tensorrt-detectron).

I have to port around 10 kinds of Python layers to CUDA-based TensorRT plugins, and I can’t use the pre-made merged “RPNPooling” plugin because the whole network has an FPN subnetwork that separates the RPN part from the Pooling part, as below:

  1. TensorRT’s “RPNPooling”:
    RPN->RoIPooling (used only once and no FPN)

  2. My FPN+RPN+Pooling
    GenerateProposal_1 -|--------------------------|->RoiAlign_1-|
    GenerateProposal_2 -|--------------------------|->RoiAlign_2-|
    GenerateProposal_3 -|->CollectAndDistributeFPN-|->RoiAlign_3-|->Concat->BatchPermute->FC->…
    GenerateProposal_4 -|--------------------------|->RoiAlign_4-|
    GenerateProposal_5 -|--------------------------|

First, I’m handling the “generateProposal” layer, which takes 4 inputs and produces 2 output tensors, as below:

const Dtype* scores,          >> tensor size (3,H,W); varies with the input image size
const Dtype* bbox_deltas,     >> tensor size (12,H,W); varies with the input image size
const Dtype* im_info_tensor,  >> tensor size always (1,3) in (rows, columns)
const Dtype* anchors,         >> tensor size always (1,10) in (rows, columns)
Dtype* out_rois,
Dtype* out_rois_probs,
  1. I need to get the shape of the scores tensor. How? From some examples, it seems I can get the shape information of every input tensor via getOutputDimensions(int index, const Dims* inputs, int nbInputDims), right?

If so, suppose the “generateProposal” layer is used 5 times in the whole network with different inputs (so the input sizes also differ) and different hyperparameters,

let’s say, the 5 layers are named
“genProposal_1, genProposal_2, genProposal_3, genProposal_4, genProposal_5”.

then, for example, the “scores” tensor sizes are like
(3,13,19), (3,26,38), (3,52,76), (3,104,152), (3,208,304).

I think I need to check whether the current layer name is “genProposal_X” in order to pass the correct hyperparameters in code (since the TensorRT Caffe parser doesn’t parse custom-layer hyperparameters from the prototxt file), and

I’m not sure whether “const Dims* inputs” is automatically set by TensorRT.
I just made an “unclear assumption” like

inputs[0].d[0]=3  // score's C of generateProposal_1
inputs[0].d[1]=13 // score's H of generateProposal_1
inputs[0].d[2]=19 // score's W of generateProposal_1 
 
inputs[1].d[0]=12 // bbox_deltas's C of generateProposal_1
inputs[1].d[1]=13 // bbox_deltas's H of generateProposal_1
inputs[1].d[2]=19 // bbox_deltas's W of generateProposal_1
 
//... for genProposal_1

inputs[0].d[0]=3   // score's C of generateProposal_2
inputs[0].d[1]=26  // score's H of generateProposal_2
inputs[0].d[2]=38  // score's W of generateProposal_2

inputs[1].d[0]=12  // bbox_deltas's C of generateProposal_2
inputs[1].d[1]=26  // bbox_deltas's H of generateProposal_2
inputs[1].d[2]=38  // bbox_deltas's W of generateProposal_2
 
//... for genProposal_2

and so on.
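If that assumption holds (this matches my reading of the IPlugin contract, so treat the details as a hedge, not gospel), each plugin *object* gets its own getOutputDimensions call, with inputs[i] carrying the dims of its i-th bottom in declaration order. A toy sketch of the indexing I have in mind, using a stand-in Dims struct (the real one lives in NvInfer.h):

```cpp
#include <cassert>

// Stand-in for nvinfer1::Dims, just to illustrate the indexing assumption;
// the real struct also carries nbDims and per-axis type/format fields.
struct FakeDims { int d[3]; };

// Assumption being illustrated: inputs[0] == scores (A,H,W), inputs[1] ==
// bbox_deltas (4A,H,W), and each genProposal_X instance sees its own sizes.
int maxProposals(const FakeDims* inputs) {
    // one candidate proposal per anchor per spatial location
    return inputs[0].d[0] * inputs[0].d[1] * inputs[0].d[2];
}
```

Under this reading, genProposal_1 would compute 3*13*19 rows and genProposal_2 would compute 3*26*38 rows from their own respective inputs[] arrays.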

As I need to calculate the output dimensions for both “out_rois” and “out_rois_probs”, if my assumption is correct (that I can read the correct size of each input tensor from getOutputDimensions’s “const Dims* inputs”), I can use those values in getOutputDimensions as below:

Dims GenerateProposalLayerPlugin::getOutputDimensions(int index, const Dims* inputs, int nbInputDims)
  {
    assert(index == 0 || index == 1);
    // Both outputs must have the same row count (one prob per RoI), and at
    // most C*H*W anchors can survive, so size both from scores (inputs[0]).
    int rows = inputs[0].d[0] * inputs[0].d[1] * inputs[0].d[2];
    return index == 0 ? DimsCHW(rows, 5, 1)    // "out_rois"
                      : DimsCHW(rows, 1, 1);   // "out_rois_probs"
  }
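On the layer-name point above: since the Caffe parser won’t deliver custom hyperparameters, one workaround is to branch on the name that the parser passes to IPluginFactory::createPlugin(const char* layerName, ...). A hypothetical helper for that dispatch (the function and its behavior are my invention, not part of the TensorRT API):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: recover which FPN level a plugin instance serves from
// its layer name, so per-level hyperparameters (feature stride, pre-NMS
// top-N, ...) can be selected when the factory constructs the plugin.
int proposalLevelFromName(const char* layerName) {
    const char* prefix = "genProposal_";
    const size_t n = strlen(prefix);
    if (strncmp(layerName, prefix, n) != 0)
        return -1;                    // not one of our proposal layers
    return layerName[n] - '0';        // "genProposal_3" -> 3
}
```

The factory would then map the returned level to a table of per-level hyperparameters when constructing each plugin instance.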
  1. I think the argument “const Dims* inputs” is handled by the TensorRT engine, so the input dimensions will be calculated automatically, right?

  2. The order of the inputs is also not clear to me. Is my “unclear assumption” above correct, so that I can use that shape info to define the output dimensions?

  3. If the output size is not fixed and can change dynamically during processing, what should I do?
    e.g., the output count of RoIs is 300 at maximum, so it can be any number less than 300. In this case, should I pad the out_rois vector with empty, dummy RoIs to fit the size 300?

Actually, the actual number of RoIs is determined only after the whole custom layer’s computation completes [0 <= # of RoIs <= 300],
so I would need to “reshape”, at the end of the custom layer function, the output dimensions that were first specified by “getOutputDimensions”.

Please help me.

Thank you!