How to calculate Anchor Shape Sizes for Frcnn

Hello, I am quite new to the Nvidia TLT toolkit, so I would like to apologize in advance for troubling you. I am currently using the Faster RCNN TLT model (ver 3.0) for object detection. My image resolution is 1920x1080. I have already calculated the average height and width of my ground truth boxes, which are 24 pixels and 40 pixels respectively. May I know how to calculate the scale and ratio parameters under anchor_box_config? I have already referred to the forum discussion "Which detection model will give more accuracy for arial view image detection! - #20 by samjith888" but still could not quite understand it. I look forward to your reply, and many thanks :)

Please refer to the example below, which I also posted in that topic.

For anchor_box_config {
scale: 8.0
scale: 16.0
scale: 32.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}

The generated anchors will be:

array([[[ 8.      ,  8.      ],
        [ 5.656854, 11.313708],
        [11.313708,  5.656854]],

       [[16.      , 16.      ],
        [11.313708, 22.627417],
        [22.627417, 11.313708]],

       [[32.      , 32.      ],
        [22.627417, 45.254833],
        [45.254833, 22.627417]]], dtype=float32)

8*sqrt(1)    = 8
8*sqrt(0.5)  = 5.656854
8*sqrt(2)    = 11.313708
16*sqrt(1)   = 16
16*sqrt(0.5) = 11.313708
16*sqrt(2)   = 22.627417
32*sqrt(1)   = 32
32*sqrt(0.5) = 22.627417
32*sqrt(2)   = 45.254833
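
The numbers above can be reproduced with a few lines of NumPy. This is only a sketch inferred from the printed array, not the actual TLT source: each pair appears to be (scale*sqrt(ratio), scale/sqrt(ratio)), so the product of the two sides is always scale squared. The function name anchor_shapes is made up for illustration, and the thread does not confirm which entry is width and which is height.

```python
import numpy as np

def anchor_shapes(scales, ratios):
    """Hypothetical helper: one (first, second) pair per scale/ratio combination.

    Inferred from the array printed above, not taken from the TLT code:
    first = scale * sqrt(ratio), second = scale / sqrt(ratio).
    """
    scales = np.asarray(scales, dtype=np.float32)
    sqrt_r = np.sqrt(np.asarray(ratios, dtype=np.float32))
    first = scales[:, None] * sqrt_r[None, :]   # shape (num_scales, num_ratios)
    second = scales[:, None] / sqrt_r[None, :]  # same shape
    return np.stack([first, second], axis=-1)   # shape (num_scales, num_ratios, 2)

anchors = anchor_shapes([8.0, 16.0, 32.0], [1.0, 0.5, 2.0])
print(anchors)
print("number of anchor shapes:", anchors.shape[0] * anchors.shape[1])
```

With 3 scales and 3 ratios this gives 3 x 3 = 9 anchor shapes, matching the array above.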

Yes, I have seen that comment, but I don’t quite understand what the different numbers stand for.

Can you elaborate which number you do not understand? Thanks.

What do the numbers for the scale parameter (e.g. 8, 16 and 32) and the numbers for the ratio parameter (e.g. 1.0, 0.5 and 2.0) represent?

The above is just an example. It means that if the end user sets these parameters in anchor_box_config, the anchors listed above will be generated.
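
Going the other way, from a measured box size to a scale and ratio, can be sketched by inverting the same relation. This is an assumption-laden illustration, not an official recipe: it assumes each anchor pair is (scale*sqrt(ratio), scale/sqrt(ratio)) as the example array suggests, the helper name scale_and_ratio is hypothetical, and because the thread does not confirm whether the first entry is width or height, both orderings are shown. If the images are resized before training, the box sizes would presumably need the same rescaling first.

```python
import math

def scale_and_ratio(first, second):
    """Hypothetical inverse of (scale*sqrt(ratio), scale/sqrt(ratio)).

    scale is the geometric mean of the two sides,
    ratio is the first side divided by the second.
    """
    scale = math.sqrt(first * second)
    ratio = first / second
    return scale, ratio

# Mean ground-truth box from the question: width 40 px, height 24 px.
mean_w, mean_h = 40.0, 24.0

# Case A: assuming the pair is ordered (width, height).
scale_a, ratio_a = scale_and_ratio(mean_w, mean_h)
# Case B: assuming the pair is ordered (height, width).
scale_b, ratio_b = scale_and_ratio(mean_h, mean_w)

print(f"(width, height) order: scale={scale_a:.2f}, ratio={ratio_a:.3f}")
print(f"(height, width) order: scale={scale_b:.2f}, ratio={ratio_b:.3f}")
```

The scale is the same in both cases (the geometric mean of 40 and 24, about 31); only the ratio flips between its value and its reciprocal. Since the example config lists each ratio alongside its reciprocal (0.5 and 2.0), doing the same here would cover either ordering.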

Referring to this example, does this mean there will be 9 variations of anchor shape generated?


I see. May I know whether this formula is the same for other TLT models such as SSD and YOLOv4 as well?
And the anchor shapes are given as width then height, am I right? For example, in [ 5.656854, 11.313708], 5.656854 represents the width and 11.313708 represents the height.

The configuration parameters are not the same for SSD and yolo_v4. Please refer to the TLT/TAO user guide.


Ok, thank you so much and sorry for troubling you :)

Hi @Morganh, sorry to disturb you once more. I have already read through the TLT documentation for the SSD model, and the values specified in the scales and aspect_ratio_global parameters seem to follow a different calculation from FRCNN, as the values provided in the sample are much smaller. Do you mind giving me an example of the calculation involving the scales and aspect_ratio_global parameters? Really sorry for the trouble.

Refer to How to set parameters for SSD sample - #2 by Morganh

Hi, I have read the forum topic that you sent, but it still doesn’t explain the calculation involving the aspect_ratio_global and scale parameters. Is it OK if you show the calculation here, using the parameters below as an example?
aspect_ratio_global parameters = [[1.0, 2.0, 0.5], [1.0, 2.0, 0.5]]
my scale = [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]

Refer to SSD with achor box
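
For readers without access to the linked topic, here is a rough sketch of how SSD-style anchors are commonly computed. It follows the convention from the original SSD paper (width = s*sqrt(a), height = s/sqrt(a), with the scale s expressed as a fraction of the image size) and is not confirmed in this thread to be what TLT does. The function name ssd_anchor_sizes, the choice of the shorter image side as reference, and the 300 px input size are all assumptions made for illustration.

```python
import math

def ssd_anchor_sizes(scale, aspect_ratios, image_min_side):
    """Hypothetical helper following the SSD paper's convention.

    scale is a fraction of the image side, so the base size in pixels
    is scale * image_min_side (the reference side is an assumption here).
    Returns a list of (width, height) tuples in pixels.
    """
    base = scale * image_min_side
    return [(base * math.sqrt(ar), base / math.sqrt(ar)) for ar in aspect_ratios]

scales = [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]  # values from the question
aspect_ratios = [1.0, 2.0, 0.5]                       # one inner list from the question
image_min_side = 300                                  # assumed input size, not from the thread

for s in scales:
    sizes = ssd_anchor_sizes(s, aspect_ratios, image_min_side)
    formatted = ", ".join(f"({w:.1f}, {h:.1f})" for w, h in sizes)
    print(f"scale {s}: {formatted}")
```

Under this convention the SSD scales are fractions between roughly 0 and 1 rather than pixel values, which would explain why they look much smaller than the FRCNN scales of 8, 16 and 32.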

