Please provide the following information when requesting support.
• Hardware : GTX 1080
• Network Type Classification
• TLT Version:
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022
I am trying to train a vehicle classification model starting from the VehicleTypeNet pretrained model. I have 10 vehicle classes with roughly 20K images each. After training in TAO for 147 epochs, precision is low for some classes, as the evaluation report shows:
Found 22809 images belonging to 10 classes.
INFO: Processing dataset (evaluation): /workspace/tao-experiments/data/test
Evaluation Loss: 1.7708099206685863
Evaluation Top K accuracy: 0.5250997413144974
Found 22809 images belonging to 10 classes.
INFO: Calculating per-class P/R and confusion matrix. It may take a while...
Confusion Matrix
[[1008 44 1 5 437 101 29 161 2 29]
[ 42 1062 3 7 856 8 7 26 3 91]
[ 1 3 137 1 980 4 9 0 0 3]
[ 142 102 3 3858 2050 12 29 1944 153 64]
[ 0 17 15 2 432 7 0 1 1 2]
[ 32 8 0 5 174 1490 8 8 1 2]
[ 0 0 0 0 0 0 28 0 0 0]
[ 2 0 0 4 3 0 0 501 0 0]
[ 12 9 3 129 217 4 13 215 1098 26]
[ 13 22 2 9 2344 11 51 101 7 2363]]
Classification Report
              precision    recall  f1-score   support
      bigbus       0.81      0.55      0.66      1817
  heavytruck       0.84      0.50      0.63      2105
  lighttruck       0.84      0.12      0.21      1138
    microbus       0.96      0.46      0.62      8357
    midtruck       0.06      0.91      0.11       477
     minibus       0.91      0.86      0.89      1728
   motorbike       0.16      1.00      0.28        28
       sedan       0.17      0.98      0.29       510
         suv       0.87      0.64      0.73      1726
threewheeler       0.92      0.48      0.63      4923
    accuracy                           0.53     22809
   macro avg       0.65      0.65      0.50     22809
weighted avg       0.87      0.53      0.62     22809
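The per-class figures in the report can be recomputed from the confusion matrix as a sanity check. The sketch below assumes rows are ground-truth classes and columns are predictions (each row sum matches the support column), and that the class order follows the report. The variable names are illustrative only.

```python
import numpy as np

# Class order taken from the classification report.
classes = ["bigbus", "heavytruck", "lighttruck", "microbus", "midtruck",
           "minibus", "motorbike", "sedan", "suv", "threewheeler"]

# Confusion matrix copied from the evaluation log.
# Assumption: rows = ground truth, columns = predicted class.
cm = np.array([
    [1008,   44,   1,    5,  437,  101, 29,  161,    2,   29],
    [  42, 1062,   3,    7,  856,    8,  7,   26,    3,   91],
    [   1,    3, 137,    1,  980,    4,  9,    0,    0,    3],
    [ 142,  102,   3, 3858, 2050,   12, 29, 1944,  153,   64],
    [   0,   17,  15,    2,  432,    7,  0,    1,    1,    2],
    [  32,    8,   0,    5,  174, 1490,  8,    8,    1,    2],
    [   0,    0,   0,    0,    0,    0, 28,    0,    0,    0],
    [   2,    0,   0,    4,    3,    0,  0,  501,    0,    0],
    [  12,    9,   3,  129,  217,    4, 13,  215, 1098,   26],
    [  13,   22,   2,    9, 2344,   11, 51,  101,    7, 2363],
])

true_pos = np.diag(cm)
support = cm.sum(axis=1)         # images per ground-truth class (row sums)
predicted = cm.sum(axis=0)       # images assigned to each class (column sums)

recall = true_pos / support      # share of each class that was found
precision = true_pos / predicted # share of each prediction that was right
accuracy = true_pos.sum() / cm.sum()

for name, p, r, s, n in zip(classes, precision, recall, support, predicted):
    print(f"{name:>12}  precision={p:.2f}  recall={r:.2f}  "
          f"support={s}  predicted={n}")
print(f"accuracy={accuracy:.4f}")
```

Read this way, midtruck has only 477 ground-truth images but receives 7493 predictions, so its low precision comes from images of other classes being assigned to it rather than from midtruck images being missed.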
For example, precision for the midtruck, sedan and motorbike classes is very low compared to the others. I have checked both the training and validation datasets but found no inconsistencies. My training spec file is as follows:
model_config {
  arch: "resnet",
  n_layers: 18
  # Setting these parameters to true to match the template downloaded from NGC.
  use_batch_norm: true
  all_projections: true
  freeze_blocks: 0
  freeze_blocks: 1
  input_image_size: "3,224,224"
}
train_config {
  train_dataset_path: "/workspace/tao-experiments/data/train"
  val_dataset_path: "/workspace/tao-experiments/data/val"
  pretrained_model_path: "/workspace/tao-experiments/pretrained_resnet18/resnet18_vehicletypenet.tlt"
  optimizer {
    sgd {
      lr: 0.01
      decay: 0.0
      momentum: 0.9
      nesterov: False
    }
  }
  batch_size_per_gpu: 64
  n_epochs: 200
  n_workers: 16
  preprocess_mode: "caffe"
  enable_random_crop: True
  enable_center_crop: True
  label_smoothing: 0.0
  mixup_alpha: 0.1
  # regularizer
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.00005
  }
  # learning_rate
  lr_config {
    step {
      learning_rate: 0.006
      step_size: 10
      gamma: 0.1
    }
  }
  visualizer {
    enabled: true
  }
}
eval_config {
  eval_dataset_path: "/workspace/tao-experiments/data/test"
  model_path: "/workspace/tao-experiments/classification/output/weights/resnet_097.tlt"
  top_k: 1
  batch_size: 256
  n_workers: 8
  enable_center_crop: True
}
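
To see how quickly the step schedule in lr_config might decay over the 200 epochs, here is a rough sketch. The function `step_lr` and its `step_is_percent` flag are hypothetical helpers, not part of TAO. It assumes that `step_size` is a percentage of the total training duration and that the rate is multiplied by `gamma` at each step. If your TAO version counts `step_size` in epochs instead, pass `step_is_percent=False`.

```python
import math


def step_lr(epoch, n_epochs=200, base_lr=0.006, step_size=10, gamma=0.1,
            step_is_percent=True):
    """Learning rate at a 0-based epoch under a simple step-decay schedule.

    Assumption (not confirmed by the post): step_size is a percentage of
    the whole training run when step_is_percent is True, otherwise it is
    a number of epochs.
    """
    if step_is_percent:
        progress = 100.0 * epoch / n_epochs  # training progress in percent
    else:
        progress = float(epoch)              # training progress in epochs
    n_drops = math.floor(progress / step_size)
    return base_lr * gamma ** n_drops


# Values from the spec above: learning_rate 0.006, step_size 10, gamma 0.1.
for epoch in (0, 20, 40, 97, 147):
    print(f"epoch {epoch:3d}: lr = {step_lr(epoch):.3e}")
```

Under the percentage assumption, the rate drops by a factor of ten every 20 epochs, so by epoch 147 it would be around 6e-10 and the weights would barely change in the later epochs.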