Interpreting output of MaskRCNN from TLT to TRT

See TLT different results - #9 by Morganh
So, please modify the preprocessing to match the settings in deepstream_tlt_apps/pgie_peopleSegNetv2_tlt_config.txt at release/tlt3.0 · NVIDIA-AI-IOT/deepstream_tlt_apps · GitHub:

net-scale-factor=0.017507
offsets=123.675;116.280;103.53
model-color-format=0

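This is similar to the "torch" mode in keras-applications/imagenet_utils.py at master · keras-team/keras-applications · GitHub, except that a single std of 0.224 is used for all three channels (the upstream file uses [0.229, 0.224, 0.225]). That single value is what net-scale-factor encodes: 1 / (0.224 * 255) ≈ 0.017507, and the offsets are the means multiplied by 255.

if mode == 'torch':
    x /= 255.
    mean = [0.485, 0.456, 0.406]
    std = [0.224, 0.224, 0.224]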
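
To check that the two forms agree, here is a minimal NumPy sketch. It assumes the DeepStream normalization y = net-scale-factor * (x - offsets) on pixel values in [0, 255], and that model-color-format=0 means RGB channel order. The function names and the random test image are hypothetical and only for illustration.

```python
import numpy as np

# Values copied from the DeepStream config above.
NET_SCALE_FACTOR = 0.017507
OFFSETS = np.array([123.675, 116.280, 103.53], dtype=np.float32)

# Values from the "torch"-style snippet above (per-channel, RGB order assumed).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.224, 0.224, 0.224], dtype=np.float32)


def normalize_deepstream(img_hwc):
    """y = net-scale-factor * (x - offsets), with x in [0, 255]."""
    return NET_SCALE_FACTOR * (img_hwc.astype(np.float32) - OFFSETS)


def normalize_torch_style(img_hwc):
    """Scale to [0, 1], then subtract the mean and divide by the std."""
    x = img_hwc.astype(np.float32) / 255.0
    return (x - MEAN) / STD


# Hypothetical random HWC image, just to compare the two formulas.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

a = normalize_deepstream(img)
b = normalize_torch_style(img)
max_diff = float(np.abs(a - b).max())
print("max abs difference:", max_diff)
```

The difference should be at the level of float rounding, so either form can be used in a standalone TensorRT inference script as long as the channel order matches.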

Here are the preprocessing steps in TLT.

  1. For a given image, rescale it, keeping its aspect ratio, to the largest rectangle that fits inside the rectangle specified by target_size.
  2. Pad the rescaled image so that its height and width become the smallest multiple of the stride that is greater than or equal to the desired output dimension.
  3. As mentioned above, scale the pixel values to between 0 and 1, then normalize each channel with the mean and std.
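
The three steps can be sketched roughly as follows. This is not the TLT implementation. It uses NumPy only, with nearest-neighbour resizing to avoid extra dependencies. Zero padding at the bottom/right, the padding value, the stride of 32, and the image and target sizes are all assumptions for illustration, so check them against your training spec. All function names are hypothetical.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.224, 0.224, 0.224], dtype=np.float32)


def resize_keep_aspect(img_hwc, target_size):
    """Step 1: largest aspect-preserving rescale that fits inside target_size.

    Nearest-neighbour sampling is used only to keep the sketch NumPy-only.
    """
    target_h, target_w = target_size
    h, w = img_hwc.shape[:2]
    scale = min(target_h / h, target_w / w)
    new_h = min(target_h, max(1, int(round(h * scale))))
    new_w = min(target_w, max(1, int(round(w * scale))))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img_hwc[rows][:, cols]


def pad_to_stride(img_hwc, target_size, stride):
    """Step 2: zero-pad (bottom/right, assumed) to the smallest multiple of
    stride that is >= the target dimension."""
    padded_h = -(-target_size[0] // stride) * stride  # ceiling division
    padded_w = -(-target_size[1] // stride) * stride
    out = np.zeros((padded_h, padded_w, img_hwc.shape[2]), dtype=img_hwc.dtype)
    out[: img_hwc.shape[0], : img_hwc.shape[1]] = img_hwc
    return out


def normalize(img_hwc):
    """Step 3: scale to [0, 1], then normalize each channel."""
    x = img_hwc.astype(np.float32) / 255.0
    return (x - MEAN) / STD


# Illustrative sizes only; use the values from your own spec.
img = np.full((30, 40, 3), 128, dtype=np.uint8)
resized = resize_keep_aspect(img, target_size=(64, 100))
padded = pad_to_stride(resized, target_size=(64, 100), stride=32)
chw = normalize(padded).transpose(2, 0, 1)  # HWC -> CHW for the engine input
print(resized.shape, padded.shape, chw.shape)
```

Note that with this ordering the padded pixels are normalized too, so they end up non-zero. If your results still differ from tlt-infer, the padding value and the order of padding and normalization are worth checking.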

Refer to:

and Discrepancy between results from tlt-infer and trt engine - #8 by Morganh: change the preprocessing call to inference_input = preprocess_input(inf_img.transpose(2, 0, 1), mode="torch")