SSD Mobilenet-v2 benchmark with 960x544 inputs


I would like to replicate the SSD Mobilenet-v2 benchmark with 960x544 sized inputs.
As I understand from the source code, the currently provided sample_unpruned_mobilenet_v2.uff is for 300x300 inputs.
I see that there is a that specifies this input size, but I believe the README.txt that describes how to use it is for ssd_inception_v2_coco_2017_11_17. From the other thread discussions, though, I understand that this Mobilenet v2 is trained on the dataset.

So what is the correct way to replicate the SSD Mobilenet v2 with 960x544 inputs, without manually training with the oxford dataset?

Thanks in advance


If your goal is to improve accuracy with a larger input, please retrain the model with the TensorFlow framework.
If you just want to feed a 960x544 image into the 300x300 network, the general approach is to resize the image directly.
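
As a minimal sketch of the resize step, here is a dependency-free nearest-neighbor resize in Python. The helper name resize_nearest and the dummy frame are made up for illustration and are not part of any NVIDIA sample; in practice you would more likely call a library routine such as cv2.resize. Note that 960x544 is not square, so resizing directly to 300x300 changes the aspect ratio.

```python
def resize_nearest(image, out_w, out_h):
    """Nearest-neighbor resize of an image stored as a list of rows (hypothetical helper)."""
    in_h = len(image)
    in_w = len(image[0])
    resized = []
    for y in range(out_h):
        # Map each output row back to the closest source row
        src_y = min(in_h - 1, y * in_h // out_h)
        row = image[src_y]
        # Map each output column back to the closest source column
        resized.append([row[min(in_w - 1, x * in_w // out_w)] for x in range(out_w)])
    return resized


# Dummy 960x544 frame (width x height); each pixel stores its own (x, y) coordinate
frame = [[(x, y) for x in range(960)] for y in range(544)]

# Shrink to the 300x300 input size expected by the provided network
net_input = resize_nearest(frame, 300, 300)
print(len(net_input[0]), len(net_input))
```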


Hi AastaLLL,

Thanks for the reply.

Yes, I can downsample the 960x544 image to 300x300. But my aim is to first replicate the results announced by Nvidia on the benchmarking page and move on from that point. There it is clearly specified that the given rates are for inference only, which does not include image processing. Yet we see that the Mobilenet v2 for 960x544 is slower than the one for 300x300. From this, I am deducing that a separate network was used for 960x544.
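
A rough arithmetic check of that deduction, assuming (and this is only an assumption) that the cost of the convolutional layers scales roughly with the number of input pixels:

```python
# Pixel counts for the two input sizes discussed in this thread
small = 300 * 300
large = 960 * 544

# Ratio of input pixels; only a rough proxy for relative compute cost
ratio = large / small
print(large, small, round(ratio, 2))
```

A network that really takes 960x544 inputs processes several times more pixels per frame, which would be consistent with a lower inference-only rate.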