Hi, I searched the TLT blog and docs but didn't find a mAP/latency comparison between the backbones/models available in TLT. Which backbone + model combination should give the highest accuracy (regardless of performance/latency)? Would it be Faster RCNN + EfficientNet B1? Are there plans to publish a comparison table between models?
The TLT user guide provides mAP and FPS information for some purpose-built models, most of which are based on the TLT detectnet_v2 network. See Overview — Transfer Learning Toolkit 3.0 documentation.
The mAP result usually varies with input_size, dataset, pretrained model, network, backbone, etc. For example, pretrained weights trained on the ImageNet dataset tend to provide good accuracy for object detection, but we cannot release pretrained models trained on ImageNet. Instead, we have written blog posts that show the steps to achieve this accuracy with TLT. For example, Preparing State-of-the-Art Models for Classification and Object Detection with the NVIDIA Transfer Learning Toolkit | NVIDIA Developer Blog.
In your case, if you want the most accurate model (regardless of performance/latency), you can follow the blog post above to obtain a pretrained model trained on the ImageNet dataset and then try TLT yolo_v4 or retinanet.
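Since mAP depends on your own dataset and settings, one way to compare candidates such as yolo_v4 and retinanet is to record the mAP from each of your own evaluation runs and rank them. The sketch below is only a bookkeeping helper in plain Python, not part of TLT; `EvalResult`, `rank_by_map` and `format_table` are hypothetical names, and it assumes you copy the mAP values by hand from your own evaluation output.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EvalResult:
    """One evaluation run. Hypothetical record type, not a TLT class."""
    network: str      # e.g. "yolo_v4" or "retinanet"
    backbone: str     # backbone name used for that run
    input_size: str   # e.g. "WIDTHxHEIGHT" as used in your spec
    map_score: float  # mAP you measured on your own validation set


def rank_by_map(results: Iterable[EvalResult]) -> List[EvalResult]:
    """Return the runs sorted from highest to lowest mAP."""
    return sorted(results, key=lambda r: r.map_score, reverse=True)


def format_table(results: Iterable[EvalResult]) -> str:
    """Render the ranked runs as a fixed-width text table."""
    header = f"{'network':<14}{'backbone':<18}{'input':<12}{'mAP':>6}"
    rows = [
        f"{r.network:<14}{r.backbone:<18}{r.input_size:<12}{r.map_score:>6.3f}"
        for r in rank_by_map(results)
    ]
    return "\n".join([header, *rows])
```

To use it, create one `EvalResult` per training/evaluation run, keeping the dataset fixed across runs so the numbers are comparable, and print `format_table(runs)`; the first row is the most accurate combination on your data.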