I am using the TAO framework to train a Mask R-CNN model, and my objective is to reduce inference latency. I am already using a resnet10 backbone and FP16 precision, and I have reduced the number of proposals and outputs of the model. After export, the TensorRT engine produces very good predictions, corresponding to high AP values (above 80%).
To reduce the latency further, I changed the number of anchors generated by the model by changing the list of aspect ratios. By default, the model uses 3 different ratios: aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]". When I keep only one ratio in my configuration file, "[(1.0, 1.0)]", I get almost the same metrics (AP still above 80%) during training and evaluation. However, after exporting the model and running inference with the engine file instead of model.tlt, the results are qualitatively very bad (I estimate the AP would be below 30%). This is unfortunate, since removing these aspect ratios reduces the latency by approximately 20%.
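For reference, here is a rough sketch (not TAO code) of how the anchor count scales with the aspect ratio list. It assumes the usual multi-level anchor layout, where each feature-map location gets num_scales × len(aspect_ratios) anchors. The function names, the image size (832, 1344), the pyramid levels 2 to 6 and num_scales = 1 are illustrative assumptions, not values taken from my configuration.

```python
import ast


def anchors_per_location(aspect_ratios_str, num_scales=1):
    """Anchors placed at each feature-map cell (assumed: scales x ratios)."""
    # The spec stores the ratios as a string holding a Python-style list of tuples.
    ratios = ast.literal_eval(aspect_ratios_str)
    return num_scales * len(ratios)


def total_anchors(aspect_ratios_str, image_size=(832, 1344),
                  min_level=2, max_level=6, num_scales=1):
    """Total anchors over all pyramid levels (all defaults are assumptions)."""
    height, width = image_size
    per_location = anchors_per_location(aspect_ratios_str, num_scales)
    total = 0
    for level in range(min_level, max_level + 1):
        stride = 2 ** level  # assumed feature stride at this level
        total += (height // stride) * (width // stride) * per_location
    return total


three_ratios = "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
one_ratio = "[(1.0, 1.0)]"

print("anchors with 3 ratios:", total_anchors(three_ratios))
print("anchors with 1 ratio: ", total_anchors(one_ratio))
```

Under these assumptions, the single-ratio model has one third as many anchors, so the tensors that depend on the anchor count change shape. Anything in the export path that assumes 3 anchors per location would then decode the outputs incorrectly, which is my guess about what is happening, not something I have confirmed.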
To reproduce this issue, I am using the mask_rcnn docker provided in nvidia-tao version 0.1.19. I don't know if it is relevant here, but my GPU and driver setup is as follows:
GPU Type : RTX3060
Nvidia Driver Version : 495.29.05
CUDA Version : 11.5
The only difference between the working configuration and the failing one is the line defining the anchor aspect ratios in the configuration file: aspect_ratios: "[(1.0, 1.0), (3.0, 0.3), (0.3, 3.0)]" is replaced by aspect_ratios: "[(1.0, 1.0)]".
Is it possible to export a TensorRT engine with a number of aspect ratios other than the 3 provided in the configuration file?