Collecting your own detection datasets question

When following the tutorial to “collect your own detection datasets” for re-training with Mobilenet SSD, it says that you can use a tool like CVAT to annotate images if you already have a bunch of images. I want to try that, and I now have CVAT working fine, but I have a question about bounding boxes. In your tutorial you use rectangular bounding boxes, but CVAT also has an option to use polygons. The few articles I have read seem to indicate that you can potentially get better results using polygons instead of rectangles. Polygons also seem to fit some of the objects in my images better: an object may be long and narrow but oriented at a 30-degree angle (for example), so a rectangular bounding box would have to include a lot of extra background that isn’t part of the object. But I am wondering whether Mobilenet SSD can accept non-rectangular bounding boxes with the re-training approach used in the tutorial. Or is there some disadvantage to using non-rectangular bounding boxes? Thank you


You would first need a model that can produce a polygon output tensor.

For Mobilenet SSD, the output is rectangular bounding boxes.
You cannot simply switch it to polygons, since the model architecture would need to be changed as well.
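
If you still prefer drawing polygons in CVAT, one possible workaround is to collapse each polygon to its enclosing axis-aligned rectangle before training, since that is the box shape the detector works with. Below is a minimal sketch, assuming the polygon points arrive as an `x1,y1;x2,y2;...` string (which is how CVAT's XML export appears to store them; check your own export). The function names `parse_points` and `polygon_to_bbox` are made up for illustration and are not part of CVAT or the tutorial.

```python
def parse_points(points_attr):
    """Parse an 'x1,y1;x2,y2;...' string into a list of (x, y) float tuples.

    The string format is an assumption about the annotation export.
    """
    points = []
    for pair in points_attr.split(";"):
        x_str, y_str = pair.split(",")
        points.append((float(x_str), float(y_str)))
    return points


def polygon_to_bbox(points):
    """Return the axis-aligned box (xmin, ymin, xmax, ymax) enclosing the polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    # Hypothetical polygon around a long, narrow, tilted object.
    pts = parse_points("10.0,20.0;50.0,5.0;60.0,40.0")
    print(polygon_to_bbox(pts))
```

Note that the resulting rectangle still contains the extra background around a tilted object; this only makes polygon annotations usable with a box-based detector, it does not give polygon output.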


Thank you - that is exactly what I wanted to know.