TLT Version → docker_tag: v3.21.08-py3
Network Type → Yolov4
Hi,
I am trying to understand the concept of the freeze_blocks property for the ResNet-18 architecture. I trained on my custom dataset for 100 epochs and got an mAP of around 84% without using the freeze_blocks property. I then analyzed the mAP variation by training with different freeze_blocks values (0, 1, 2, and 3) and got mAPs of 62%, 86%, 70%, and 90% respectively. The documentation says: "A general principle to keep in mind is that the smaller the block ID, the closer it is to the model input; the larger the block ID, the closer it is to the model output."
So my questions are: how many layers does a specific freeze_blocks ID freeze, and since different IDs give different mAPs, which value is best for getting better accuracy while saving computation time? Without freeze_blocks I get an mAP of around 84%, but with freeze_blocks set to 0 it drops to 62%, which is not good. So what is the benefit of using it?
As I am a beginner in this field, I want to understand how the network behaves when I train with different freeze_blocks values. Can you please explain this transfer learning concept in more detail so that I can understand it better?
The weights of the layers in those blocks will be frozen during training.
As mentioned in the YOLOv4 — TAO Toolkit 3.22.05 documentation:
The list of block IDs to be frozen in the model during training. You can choose to freeze some of the CNN blocks in the model to make the training more stable and/or easier to converge. The definition of a block is heuristic for a specific architecture (for example, by stride or by logical blocks in the model). However, the block ID numbers identify the blocks in the model in a sequential order so you don’t have to know the exact locations of the blocks when you do training. A general principle to keep in mind is that the smaller the block ID, the closer it is to the model input; the larger the block ID, the closer it is to the model output.
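In the training spec file, freeze_blocks is a repeated field, so each block ID to freeze is listed on its own line. A rough sketch of how this might look for a ResNet-18 YOLOv4 model (field names assumed from the standard TAO yolo_v4 spec format; all other required fields omitted):

```
yolov4_config {
  arch: "resnet"
  nlayers: 18
  freeze_bn: false
  # Freeze the first two blocks (those closest to the model input)
  freeze_blocks: 0
  freeze_blocks: 1
}
```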
Can you elaborate more on this with respect to what I actually wrote, beyond what is written in the documentation? Please read my full query so that you can understand what I am trying to analyze.
As mentioned above, the definition of a block is heuristic for a specific architecture (for example, by stride or by logical blocks in the model). However, the block ID numbers identify the blocks in the model in a sequential order so you don’t have to know the exact locations of the blocks when you do training.
How many training images are in your custom dataset? Can this result be reproduced every time? I suggest you also try training on a public dataset (e.g., the KITTI dataset).
The weights of the layers in those blocks will be frozen during training. This makes the training more stable and/or easier to converge.
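To make "frozen" concrete: a frozen block's weights are simply excluded from the gradient update, so they keep their pretrained values while the unfrozen blocks keep learning. A toy pure-Python sketch of the idea (this is an illustration of the general concept, not actual TAO internals):

```python
# Toy illustration of weight freezing: each "block" is one weight here.
# Frozen blocks are skipped by the optimizer; unfrozen blocks are updated.

def train_step(weights, grads, frozen_blocks, lr=0.1):
    """Apply one gradient-descent step, skipping any block in frozen_blocks."""
    return [
        w if i in frozen_blocks else w - lr * g
        for i, (w, g) in enumerate(zip(weights, grads))
    ]

weights = [1.0, 1.0]   # block 0 and block 1, e.g. from a pretrained model
grads = [0.5, 0.5]     # pretend gradients from backpropagation

updated = train_step(weights, grads, frozen_blocks={0})
print(updated)  # block 0 keeps its pretrained value; only block 1 moves
```

This is also why freezing early blocks can either help or hurt: if the pretrained low-level features already suit your data, freezing them stabilizes training; if they do not, freezing prevents the model from adapting them, which can lower mAP.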
Hi, so if I set the freeze block ID to 3, does it freeze all the blocks up to 3 (i.e., 0, 1, 2, and 3), or does it freeze only the block with ID 3?
If you are using an NGC pretrained model to train on your custom dataset for the first time, please do not set freeze_blocks.
Usually, freeze blocks only if the pretrained model can already get an acceptable mAP on your custom dataset.
Yes, I am using an NGC model. OK, so first I should train without freeze_blocks and check whether it gives an acceptable mAP; only if it does should I use freeze_blocks. Is that what you are saying?