Training Custom Object Detector with 6 Classes

Hi neophyte1,
Glad to know you improved the AP for the car class. With resnet18, right? May I know your latest result for it?

You mentioned a new issue with prune and retrain, right? How did you set pth to 5.2e-6? I recall that is not the default value in the notebook.

Hi Morganh,

With resnet-18, after training I am able to achieve 48% precision for the car class. I still do not understand how the tlt-prune tool works. The “pth” value was already given in the Jupyter notebook, hence I used the same value. Surprisingly, it seems to work for the resnet-10 based model, but for the resnet-18 based model it does not.

To give more information: with the resnet-10 based model, I get the following log:

[INFO] iva.common.magnet_prune: Pruning ratio (pruned model / original model): 0.122844558547

With resnet-18 and the same value of “pth”, I get the following log:

[INFO] iva.common.magnet_prune: Pruning ratio (pruned model / original model): 1

I don’t understand why there is such a large difference in the pruning ratio.

Thanks.

See https://devtalk.nvidia.com/default/topic/1065254/transfer-learning-toolkit/how-to-retrain-the-model-after-pruning-/post/5394735/#5394735 for more details about pth.
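
For intuition about the pruning ratio numbers above, here is a minimal Python sketch. It assumes (for illustration only, this is not the real tlt-prune internals) that pruning simply drops channels whose importance score falls below pth and then reports pruned parameters divided by original parameters; a ratio of 1 means the threshold removed nothing from the resnet-18 model.

# Rough illustration only -- not the actual tlt-prune implementation.
# Assume each channel has an importance score; channels below pth are removed,
# and the logged ratio is pruned parameter count / original parameter count.

def pruning_ratio(channel_scores, pth, params_per_channel=1000):
    kept = [s for s in channel_scores if s >= pth]
    return (len(kept) * params_per_channel) / (len(channel_scores) * params_per_channel)

# If every score is above the threshold (as apparently happens for resnet-18
# at pth = 5.2e-6), nothing is removed and the ratio is exactly 1.
print(pruning_ratio([0.3, 0.7, 1.2], pth=5.2e-6))   # -> 1.0
print(pruning_ratio([0.3, 0.7, 1.2], pth=0.5))      # -> ~0.67, two of three channels kept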

Hi Morganh,

I have a few queries:

  1. Based on your suggestion I am trying to tune the class weights for the different classes and find a pattern I can use to progress further. However, I am unable to understand the concept of class weights. My assumption was that the class weight should be inversely proportional to the number of instances, but this assumption falls flat on a number of occasions. Can you please explain the significance and usage of class weights? The only information I have from you so far is that these weights are relative. Does that mean class weights 10, 8, 6 are the same as 5, 4, 3?

  2. Why does the accuracy drop when I increase the size of the training input image? I ran experiments with the same dataset while maintaining the aspect ratio: in one experiment I scaled to 480x480 resolution and in the other to 640x640. Surprisingly, I noticed a drop in accuracy for the 640x640 case. This is counter-intuitive, since object detectors generally perform better on larger image sizes.

  3. In the model_config section of the training spec file there are a couple of parameters, “scale” and “offset”. What do these parameters mean? How should they be modified? What is their significance?

  4. I am using the pretrained resnet10 model. Can you let me know what input training image size it was trained on?

  5. What is the difference between detectnet_v1 and detectnet_v2?

Please help me out.

Hi neophyte1,

  1. The class weight is assigned to the cost of the corresponding target class. It can be used to balance the training dataset across classes. Weights 10, 8, 6 are the same as 5, 4, 3, since the proportion between the classes is the same in both cases (see the sketch after this list).
  2. As we discussed previously, the training image size should be close to the inference input image size. Our experience suggests a rule of thumb: make the training setup as close as possible to the actual deployment conditions.
  3. According to the TLT documentation (Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation), “objective_set” defines which objectives the network is being trained for. For object detection networks, we set it to learn cov and bbox; the “scale” and “offset” you are asking about are parameters of the bbox objective. These parameters should not be altered for the current training pipeline.
  4. Sorry, I do not have that information. In any case, the pretrained model can be used with any input size; it does not depend on the input dimensions.
  5. The name DetectNet_v2 comes from historical reasons within NVIDIA during the development of the DetectNet algorithm. We just want to distinguish it from our internal, older version of the DetectNet algorithm.
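
On point 1, here is a tiny Python sketch of why only the ratio between class weights matters. It assumes (for illustration only, this is not the exact DetectNet_v2 cost function) that each class’s cost is multiplied by its weight divided by the sum of the weights:

# Illustration of why class weights 10, 8, 6 behave like 5, 4, 3:
# only the relative proportions matter once the weighted costs are combined.

def weighted_cost(class_costs, class_weights):
    total_w = sum(class_weights.values())
    return sum(class_costs[c] * class_weights[c] / total_w for c in class_costs)

costs = {"car": 1.2, "person": 0.8, "bicycle": 0.5}
print(weighted_cost(costs, {"car": 10, "person": 8, "bicycle": 6}))
print(weighted_cost(costs, {"car": 5,  "person": 4, "bicycle": 3}))  # same value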

Hi Morganh,

I am trying to use models trained with TLT in the DeepStream SDK. I would like to know how the resizing of the input image is done. Is the aspect ratio maintained while resizing before inference?

Thanks.

Hi neophyte1,
When integrating a DetectNet_v2 model into DeepStream, regarding the aspect ratio, you need to set input-dims correctly in the DeepStream configuration file.

input-dims=c;h;w;0   # where c = number of channels, h = height of the model input, w = width of the model input, 0 implies CHW format.
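
On the resize question, here is a minimal Python sketch (assuming OpenCV and a hypothetical sample_frame.jpg) of a plain stretch-resize to the configured h and w: the frame is forced to exactly those dimensions, which is why input-dims should match the resolution the model was trained at. Whether the aspect ratio is preserved in your pipeline depends on the DeepStream preprocessing settings, so treat this only as an illustration.

# Minimal sketch (assumes OpenCV): a plain resize to the model's input-dims
# stretches the frame to exactly w x h, so a 1920x1080 source fed to a
# 480x480 model changes the aspect ratio of every object in the image.
import cv2

MODEL_W, MODEL_H = 480, 480  # must match the w and h in input-dims

frame = cv2.imread("sample_frame.jpg")           # e.g. a 1920x1080 source frame
resized = cv2.resize(frame, (MODEL_W, MODEL_H))  # no aspect-ratio preservation here
print(frame.shape, "->", resized.shape)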