BYOM approach: model biased towards a single class

Hi,
I trained a MobileNetV2-based classification model on a balanced 5-class flower dataset.
After training, the model shows a strong bias toward a single class.
Evaluation results:

              precision    recall  f1-score   support

       daisy       0.00      0.00      0.00       177
   dandelion       0.24      1.00      0.39       250
       roses       0.00      0.00      0.00       185
  sunflowers       0.00      0.00      0.00       198
      tulips       0.00      0.00      0.00       216

    accuracy                           0.24      1026
   macro avg       0.05      0.20      0.08      1026
weighted avg       0.06      0.24      0.10      1026
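For anyone reading the report above: the numbers are exactly what a classifier produces when it assigns every test image to dandelion. The following is a minimal sketch that reproduces them from the support column alone. The function name `constant_predictor_report` is made up for illustration and is not part of TAO or scikit-learn.

```python
# Support values copied from the evaluation report in this post.
support = {"daisy": 177, "dandelion": 250, "roses": 185,
           "sunflowers": 198, "tulips": 216}
total = sum(support.values())  # 1026 test images


def constant_predictor_report(support, predicted_class):
    """Per-class (precision, recall, f1) when every sample gets predicted_class."""
    n_total = sum(support.values())
    report = {}
    for name, n in support.items():
        if name == predicted_class:
            precision = n / n_total   # only n of all predictions are correct
            recall = 1.0              # every true sample of this class is found
            f1 = 2 * precision * recall / (precision + recall)
        else:
            # Never predicted: precision is 0/0, which reports usually show as 0.00.
            precision = recall = f1 = 0.0
        report[name] = (precision, recall, f1)
    return report


report = constant_predictor_report(support, "dandelion")
macro_f1 = sum(f1 for _, _, f1 in report.values()) / len(report)
weighted_f1 = sum(report[c][2] * support[c] for c in support) / total

for name, (p, r, f1) in report.items():
    print(f"{name:>12} {p:10.2f} {r:10.2f} {f1:10.2f} {support[name]:10d}")
print(f"macro f1: {macro_f1:.2f}, weighted f1: {weighted_f1:.2f}")
```

Since the rounded values match the report, the trained head is effectively producing a constant output, which points at the training setup (weights, preprocessing, learning rate) rather than at a mild class imbalance.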

The train_spec.yaml is:

model_config {
  # BYOM Model Architecture can be chosen
  arch: "mobilenet_v2"
  # Pass the path of the converted BYOM model path
  byom_model: "/workspace/tao-experiments/mobilenet_v2.tltb"
  # If use_imagenet_head is set False, -p should have been
  # passed to tao_byom command
  #use_imagenet_head: False
  resize_interpolation_method: BICUBIC
  # the input image size should match that of your original ONNX model.
  input_image_size: "3,224,224"
}
train_config {
  train_dataset_path: "/workspace/tao-experiments/Dataset/train"
  val_dataset_path: "/workspace/tao-experiments/Dataset/valid"
  #pretrained_model_path: "/workspace/tao-experiments/model/person.tltb"
  # Only ['sgd', 'adam'] are supported for optimizer
  optimizer {
    sgd {
      lr: 0.01
      decay: 0.0001
      momentum: 0.9
      nesterov: False
    }
  }
  batch_size_per_gpu: 16
  n_epochs: 40
  n_workers: 1
  # regularizer
  reg_config {
    # regularizer type can be "L1", "L2" or "None".
    type: "L2"
    # if the type is not "None",
    # scope can be either "Conv2D" or "Dense" or both.
    scope: "Conv2D,Dense"
    # 0 < weight decay < 1
    weight_decay: 0.0001
  }
  # learning_rate
  lr_config {
    cosine {
      learning_rate: 0.01
      soft_start: 0.05
    }
  }
  enable_random_crop: True
  enable_center_crop: True
  enable_color_augmentation: True
  mixup_alpha: 0.2
  label_smoothing: 0.1
  preprocess_mode: "tf"
  image_mean {
    key: 'b'
    value: 103.9
  }
  image_mean {
    key: 'g'
    value: 116.8
  }
  image_mean {
    key: 'r'
    value: 123.7
  }
}
eval_config {
  eval_dataset_path: "/workspace/tao-experiments/Dataset/test"
  model_path: "/workspace/tao-experiments/train/weights/mobilenet_v2_040.hdf5"
  batch_size: 8
  n_workers: 1
  enable_center_crop: True
}

How can I address this issue during training?

Which version of the TAO notebooks are you running? Is it 4.0.2? The spec file you shared looks like the 4.x format. See TAO Toolkit Getting Started | NVIDIA NGC.
Also, do you have to use the third-party ONNX model? Could you please share its link?
If you do not have to use this third-party ONNX model, you can use the latest TAO release and its pretrained models to train on your dataset directly. For classification, you can also use transformer-based models.

ONNX model link:
models/validated/vision/classification/mobilenet/model/mobilenetv2-10.onnx at main · onnx/models
format_version: 3.0
toolkit_version: 5.5.0

Could you please share the official spec file or a reference link compatible with TAO v5.3.0 (or the latest version)?

The spec files and notebooks are available in TAO Toolkit Getting Started | NVIDIA NGC. You can find the different versions there.

Hi,

• Network Type: Mobilenetv2
• TLT Version: TAO v5.3.0
• Framework: TF1

The spec file for training is uploaded below (I'm uploading it as text since yaml is not supported here):
spec_file_training.txt (1.9 KB)

I also faced similar issues earlier with a MobileNet_v2 model pretrained on ImageNet when training on a multi-class dataset with TAO BYOM.

One thing I noticed in the spec file is that [preprocess_mode: "torch"] matches the ImageNet preprocessing. Are we missing any other steps? All of the steps below were followed as specified in the TAO documentation:

  1. Converted the model from ONNX to .tltb with the penultimate layer (global average pooling removed).

  2. Checked the dataset format and the classmap.json.

  3. Training runs without any errors, but the evaluation results are similar to those in the post above.

Please let us know if anything in the above procedure needs to be corrected.
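On the preprocess_mode point: the first post uses "tf" while this spec uses "torch", and the two produce very different input ranges. Below is a rough sketch of what each mode does to a single RGB pixel, assuming TAO's modes follow the Keras `imagenet_utils` conventions ("caffe", "tf", "torch"); that mapping is an assumption here, so please check it against the TAO documentation for your version. The helper `preprocess_pixel` is hypothetical. As far as I know, the ONNX model zoo MobileNetV2 expects the "torch"-style normalization.

```python
# Constants follow the Keras imagenet_utils conventions (assumed to match TAO).
IMAGENET_MEAN_RGB = (0.485, 0.456, 0.406)   # used by "torch" mode, inputs in [0, 1]
IMAGENET_STD_RGB = (0.229, 0.224, 0.225)
CAFFE_MEAN_BGR = (103.939, 116.779, 123.68)  # used by "caffe" mode, inputs in [0, 255]


def preprocess_pixel(rgb, mode):
    """Apply one preprocessing convention to a single (r, g, b) pixel in [0, 255]."""
    r, g, b = (float(v) for v in rgb)
    if mode == "tf":
        # Scale to [-1, 1]; no per-channel statistics.
        return tuple(v / 127.5 - 1.0 for v in (r, g, b))
    if mode == "torch":
        # Scale to [0, 1], then normalize with the ImageNet mean and std per channel.
        return tuple((v / 255.0 - m) / s
                     for v, m, s in zip((r, g, b), IMAGENET_MEAN_RGB, IMAGENET_STD_RGB))
    if mode == "caffe":
        # Reorder to BGR and subtract the per-channel mean; no scaling.
        return tuple(v - m for v, m in zip((b, g, r), CAFFE_MEAN_BGR))
    raise ValueError(f"unknown preprocess mode: {mode}")


white = (255, 255, 255)
for mode in ("tf", "torch", "caffe"):
    print(mode, [round(v, 3) for v in preprocess_pixel(white, mode)])
```

If the backbone was trained with one convention and fine-tuned with another, its pretrained features see out-of-distribution inputs, which is one plausible (not confirmed) way to end up with a collapsed classifier.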

How can we leverage the pretrained model to train on a custom dataset? Please share the link to the available model. Is it the one from the Jupyter notebook provided?

Please check the training log to see how the loss behaves.
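One rough way to read the loss: a classifier that outputs a uniform distribution over K classes has a cross-entropy of ln(K), about 1.609 for 5 classes. The sketch below checks whether the last few epoch losses are stuck near that value. The helper `is_near_chance` and the loss values are made up for illustration, and the loss TAO logs may also include the L2 regularization term, so treat the comparison as approximate.

```python
import math


def is_near_chance(epoch_losses, num_classes, last_n=5, tolerance=0.05):
    """True if the last `last_n` losses all sit within `tolerance` of ln(num_classes)."""
    tail = epoch_losses[-last_n:]
    if not tail:
        return False
    baseline = math.log(num_classes)  # cross-entropy of a uniform prediction
    return all(abs(loss - baseline) <= tolerance for loss in tail)


# Illustrative values only, not taken from any real training log.
stuck = [1.62, 1.61, 1.60, 1.61, 1.61]
learning = [1.60, 1.20, 0.80, 0.50, 0.30]

print(f"chance-level loss for 5 classes: {math.log(5):.3f}")
print("stuck run near chance:", is_near_chance(stuck, 5))
print("learning run near chance:", is_near_chance(learning, 5))
```

A loss that never moves away from the chance level suggests the model is not learning at all, which is a different problem from one that overfits or diverges.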

Yes, you can find the BYOM notebook for the TAO TF2 docker. See tao_tutorials/notebooks/tao_launcher_starter_kit/classification_tf2/byom_voc/byom_classification.ipynb at main · NVIDIA/tao_tutorials · GitHub.

The issue was resolved by switching from TF1 to TF2. Maybe TF1 is deprecated and some of its functionality is not working as expected. Thanks for the input.
