Hi everyone,
I trained a model for classification tasks with TAO.
I used ResNet18 for classification and split my dataset into training, validation, and testing sets (70%, 10%, 20%).
I trained the model several times, and each time the validation and testing set accuracies were higher than the training set accuracy. I shuffled the dataset several times, but the result was always the same: the training set accuracy was lower than the validation and testing set accuracies. I also changed the training/validation/testing split to (40%, 40%, 20%), but the result did not change.
The best result was:
training set accuracy: 0.94
validation set accuracy: 0.98
testing set accuracy: 0.97
classification model => ResNet18
TAO version => v3.21.08-py3
GPU => RTX 2080
Unfortunately, that did not change the results. I also tried the 'relu' and 'swish' activations, but the results did not change either. I am sharing my training config; could you please take a look at it?
Thanks a lot.
Actually, it is normal to get this result, since they are different datasets.
The training dataset is different from the test or validation dataset.
We cannot conclude that for a TAO classification network the training accuracy will always be lower than the validation accuracy.
Sometimes the validation accuracy is higher, and sometimes the training accuracy is higher.
You can try more experiments on other datasets. My result above was obtained on the ImageNet dataset.
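Beyond the datasets simply being different, one common reason training accuracy can read lower than validation accuracy is that train-time-only transformations (the example config below enables random crop, mixup, and L2 regularization) are active when the training metric is computed, while validation images are evaluated clean. A minimal NumPy sketch of that effect, using a hypothetical fixed linear classifier on synthetic two-class data (all names and numbers here are illustrative, not from TAO):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: class means at (-1, -1) and (+1, +1).
n = 5000
labels = rng.integers(0, 2, size=n)
means = np.where(labels[:, None] == 1, 1.0, -1.0)
x = means + rng.normal(scale=0.8, size=(n, 2))

# A fixed linear classifier: predict class 1 when x1 + x2 > 0.
def accuracy(inputs):
    preds = (inputs.sum(axis=1) > 0).astype(int)
    return (preds == labels).mean()

clean_acc = accuracy(x)  # "validation-style" evaluation on clean inputs
# Simulate train-time augmentation as added input noise.
augmented = x + rng.normal(scale=1.0, size=x.shape)
train_style_acc = accuracy(augmented)  # "training-style" evaluation

print(f"clean accuracy:     {clean_acc:.3f}")
print(f"augmented accuracy: {train_style_acc:.3f}")
```

The same model scores noticeably lower on the perturbed inputs, even though nothing about the model changed; the gap is an artifact of where the metric is measured, not of underfitting.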
Your comment is true, but I trained ResNet with all the parameters from the TAO train config file in the TensorFlow framework, and there the training set accuracy was higher than the validation set accuracy.
Could you tell me whether my train config file is correct? Are all the parameters set properly?
For example, the config below is an example for training on the ImageNet dataset.
model_config {
  # Model architecture can be chosen from:
  # ['resnet', 'vgg', 'googlenet', 'alexnet']
  arch: "cspdarknet"
  # for resnet --> n_layers can be [10, 18, 50]
  # for vgg --> n_layers can be [16, 19]
  n_layers: 53
  use_batch_norm: True
  use_bias: False
  use_imagenet_head: True
  all_projections: False
  use_pooling: True
  # If you want to use the pretrained model,
  # the image size should be "3,224,224";
  # otherwise, it can be "3,X,Y", where X,Y >= 16.
  input_image_size: "3,224,224"
  activation {
    activation_type: "mish"
  }
}
train_config {
  train_dataset_path: "/raid/ImageNet2012/ImageNet2012/train"
  val_dataset_path: "/raid/ImageNet2012/ImageNet2012/val"
  # Only ['sgd', 'adam'] are supported for the optimizer.
  optimizer {
    sgd {
      lr: 0.01
      decay: 0.0
      momentum: 0.9
      nesterov: False
    }
  }
  preprocess_mode: "torch"
  enable_random_crop: True
  enable_center_crop: True
  label_smoothing: 0.0
  batch_size_per_gpu: 64
  n_epochs: 300
  mixup_alpha: 0.2
  # Number of CPU cores for loading data
  n_workers: 40
  # Regularizer
  reg_config {
    # The regularizer type can be "L1", "L2" or "None".
    type: "L2"
    # If the type is not "None", the scope can be
    # "Conv2D", "Dense", or both.
    scope: "Conv2D,Dense"
    # 0 < weight decay < 1
    weight_decay: 0.00003
  }
  # Learning rate
  lr_config {
    cosine {
      learning_rate: 0.05
      soft_start: 0.0
      min_lr_ratio: 0.001
    }
  }
}
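For reference, the lr_config above requests a cosine schedule annealing from learning_rate 0.05 down to min_lr_ratio * learning_rate. A sketch of the standard cosine-annealing formula those fields suggest (the field names mirror the config, but TAO's exact implementation, including how soft_start warmup interacts with the decay, may differ):

```python
import math

def cosine_lr(epoch, total_epochs, base_lr=0.05, min_lr_ratio=0.001, soft_start=0.0):
    """Standard cosine annealing with optional linear warmup.

    Field names mirror the lr_config above; this is an illustrative
    sketch, not TAO's exact schedule.
    """
    min_lr = base_lr * min_lr_ratio
    warmup_epochs = soft_start * total_epochs
    if warmup_epochs > 0 and epoch < warmup_epochs:
        # Linear warmup from min_lr up to base_lr.
        return min_lr + (base_lr - min_lr) * epoch / warmup_epochs
    # Cosine decay over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(total_epochs - warmup_epochs, 1e-9)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# With the values in the config (n_epochs: 300):
print(cosine_lr(0, 300))    # starts at base_lr = 0.05
print(cosine_lr(150, 300))  # mid-training, roughly half of base_lr
print(cosine_lr(300, 300))  # ends at min_lr = 0.00005
```

With soft_start: 0.0 as in the config, there is no warmup phase and the rate decays monotonically from 0.05 to 0.00005 over the 300 epochs.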