Transfer learning using the freeze_blocks property of the YOLOv4 config with a ResNet-18 architecture

TLT Version → docker_tag: v3.21.08-py3
Network Type → Yolov4

Hi,

I am trying to understand the freeze_blocks property for the ResNet-18 architecture. I trained on my custom dataset for 100 epochs and got an mAP of around 84% without using freeze_blocks. Then I tried to analyse how mAP varies by training with freeze_blocks set to 0, 1, 2 and 3, and got mAPs of 62%, 86%, 70% and 90% respectively. The documentation says: "A general principle to keep in mind is that the smaller the block ID, the closer it is to the model input; the larger the block ID, the closer it is to the model output."

So my question is: how many layers are frozen for a specific freeze_blocks ID? Since I get a different mAP for each block ID, which value should I use to get better accuracy and save computational time? Without freeze_blocks I get an mAP of around 84%, but with freeze_blocks set to 0 it drops to 62%, which is not good. So what is the benefit of using it?

As I am a beginner in this field, I just want to know how the network behaves when I train with different freeze_blocks values. Can you please explain this transfer learning concept in detail so that I can understand it better?

Hi,

I am waiting for a response from your side.

The weights of the layers in those blocks will be frozen during training.
As mentioned in YOLOv4 — TAO Toolkit 3.22.05 documentation
The list of block IDs to be frozen in the model during training. You can choose to freeze some of the CNN blocks in the model to make the training more stable and/or easier to converge. The definition of a block is heuristic for a specific architecture (for example, by stride or by logical blocks in the model). However, the block ID numbers identify the blocks in the model in a sequential order so you don’t have to know the exact locations of the blocks when you do training. A general principle to keep in mind is that the smaller the block ID, the closer it is to the model input; the larger the block ID, the closer it is to the model output.
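
To make "frozen" concrete, here is a minimal toy sketch in plain Python. It is not TAO Toolkit code: the `sgd_step` function, the tiny three-block "model" and the numbers are all made up for illustration. It only shows the general idea that blocks whose IDs are listed are skipped during the weight update, while every other block keeps training.

```python
# Toy illustration only: this is NOT how TAO Toolkit is implemented.
# A "model" here is a list of blocks; each block is a list of weights.

def sgd_step(blocks, grads, freeze_blocks, lr=0.5):
    """Apply one SGD update, skipping every block whose ID is frozen."""
    frozen = set(freeze_blocks)
    updated = []
    for block_id, (weights, block_grads) in enumerate(zip(blocks, grads)):
        if block_id in frozen:
            # Frozen block: weights are carried over unchanged.
            updated.append(list(weights))
        else:
            # Trainable block: ordinary gradient descent update.
            updated.append([w - lr * g for w, g in zip(weights, block_grads)])
    return updated


# Hypothetical weights and gradients; block 0 is the one closest to the input.
blocks = [[1.0, 2.0], [3.0], [4.0, 5.0]]
grads = [[1.0, 1.0], [1.0], [2.0, 2.0]]

new_blocks = sgd_step(blocks, grads, freeze_blocks=[0])
print(new_blocks)
```

With `freeze_blocks=[0]`, block 0 keeps its pretrained values and only blocks 1 and 2 change. Frozen blocks also need no gradient updates, which is where the potential saving in training time comes from.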

Can you elaborate more on this with respect to what I actually asked, beyond what is written in the documentation? Please read my full query so that you can understand what exactly I am trying to analyse.

As mentioned above, the definition of a block is heuristic for a specific architecture (for example, by stride or by logical blocks in the model). However, the block ID numbers identify the blocks in the model in a sequential order so you don’t have to know the exact locations of the blocks when you do training.

How many training images are in your custom dataset? Can this be reproduced every time? I suggest you also try training on a public dataset (the KITTI dataset).

The weights of the layers in those blocks will be frozen during training. The training will be more stable and/or easier to converge.

Hi, so if I set freeze_blocks to 3, does it freeze all the blocks up to 3 (0, 1, 2 and 3), or only the block with block ID 3?

It will freeze only the block with block ID 3.

So what is the way to freeze all blocks? What exactly do I need to write in order to freeze all the block IDs?

Refer to SSD — TAO Toolkit 3.22.05 documentation

freeze_blocks: [0,1,2,3]
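
Since block IDs are not cumulative, every block you want frozen has to be listed explicitly. As a small sketch, the hypothetical helper below (`freeze_blocks_line` is not part of TAO Toolkit) builds the spec line in the same list form as the line above, for either a single block or a range of blocks.

```python
def freeze_blocks_line(block_ids):
    """Format an explicit list of block IDs as a freeze_blocks spec line."""
    ids = sorted(set(block_ids))
    return "freeze_blocks: [" + ",".join(str(i) for i in ids) + "]"


# Listing only 3 freezes block 3 and nothing else.
only_block_3 = freeze_blocks_line([3])

# To freeze blocks 0 through 3, every ID has to appear in the list.
up_to_block_3 = freeze_blocks_line(range(4))

print(only_block_3)   # freeze_blocks: [3]
print(up_to_block_3)  # freeze_blocks: [0,1,2,3]
```

How many block IDs exist depends on the backbone, so check the documentation for the architecture you are using before assuming that 0 to 3 covers everything.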

May I know if you are using a pretrained model?

Yes, I am using pretrained_object_detection_vresnet18 to train on my custom dataset.

If you are using an NGC pretrained model to train on the custom dataset for the first time, please do not set freeze_blocks.
Usually, you freeze blocks only when a pretrained model can already reach an acceptable mAP on your custom dataset.

Yes, I am using an NGC model. OK. So first I need to train without freeze_blocks and check whether it gives an acceptable mAP. Only if it does can I use freeze_blocks. Is that what you are saying?

Train without freeze_blocks first. The resulting model can then serve as the pretrained model when you train on more custom data with freeze_blocks set.

How many layers does it freeze for a single block ID?

For the resnet18 backbone, block 0 is the blue part.

So the green part will be block 1, right?

Yes.
