Please provide the following information when requesting support.
• Hardware (GTX 1080)
• Network Type (LPRnet)
• TLT Version (dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm'],
format_version: 2.0,
toolkit_version: 3.22.05)
As suggested by @Morganh in a previous post, I have successfully trained an LPRNet model to read the digits on the lower line of Bangladeshi license plates. That model works great when trained on a real-world dataset. As further suggested, I am now trying to train a second LPRNet model to read the upper line. I generated 1M synthetic images and trained on them, but the model is not converging, reaching a validation accuracy of only ~0.02 (2%). Here is my training spec file:
random_seed: 42
lpr_config {
  hidden_units: 512
  max_label_length: 8
  arch: "baseline"
  nlayers: 18  # use the baseline18 model
}
training_config {
  batch_size_per_gpu: 64
  num_epochs: 200
  checkpoint_interval: 1
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-4
      soft_start: 0.001
      annealing: 0.5
    }
  }
  regularizer {
    type: L2
    weight: 5e-4
  }
  visualizer {
    enabled: true
  }
}
eval_config {
  validation_period_during_training: 1
  batch_size: 64
}
augmentation_config {
  output_width: 96
  output_height: 48
  output_channel: 3
  max_rotate_degree: 5
  rotate_prob: 0.5
  gaussian_kernel_size: 5
  gaussian_kernel_size: 7
  gaussian_kernel_size: 15
  blur_prob: 0.5
  reverse_color_prob: 0.5
  keep_original_prob: 0.3
}
dataset_config {
  data_sources: {
    label_directory_path: "/workspace/tao-experiments/data/bdalpr/train/labels"
    image_directory_path: "/workspace/tao-experiments/data/bdalpr/train/images"
  }
  characters_list_file: "/workspace/tao-experiments/lprnet/specs/bd_lp_characters.txt"
  validation_data_sources: {
    label_directory_path: "/workspace/tao-experiments/data/bdalpr/val/labels"
    image_directory_path: "/workspace/tao-experiments/data/bdalpr/val/images"
  }
}
And here is the custom character label file:
bd_lp_characters.txt (131 Bytes)
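One failure mode worth ruling out before anything else: a mismatch between the synthetic labels and the characters list, or labels longer than `max_label_length`. Below is a minimal sketch of such a check (the function name `find_bad_labels` and the toy Bangla-digit charset are my own illustration, assuming the charset file is one character per line as LPRNet expects):

```python
# Sketch: flag labels that exceed max_label_length or contain characters
# missing from the charset. Out-of-vocabulary characters in synthetic
# labels can silently stall LPRNet training.
def find_bad_labels(labels, charset, max_label_length=8):
    """Return labels that are too long or use characters outside charset."""
    return [lbl for lbl in labels
            if len(lbl) > max_label_length
            or any(c not in charset for c in lbl)]

# Toy demo with a hypothetical charset of Bangla digits only:
charset = set("০১২৩৪৫৬৭৮৯")
labels = ["১২৩৪", "১২৩x", "০১২৩৪৫৬৭৮"]  # 2nd has OOV 'x', 3rd is length 9
print(find_bad_labels(labels, charset, max_label_length=8))
# → ['১২৩x', '০১২৩৪৫৬৭৮']
```

In practice you would build `labels` by reading every file under the `label_directory_path` from the spec and `charset` from `bd_lp_characters.txt`.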
For reference, I can provide a sample dataset for reproducing the issue:
sample_bangla_dataset_lprnet.tar.xz (26.5 MB)
I tried to train the model from scratch with the command:
!tao lprnet train --gpus=1 --gpu_index=$GPU_INDEX \
-e $SPECS_DIR/tutorial_spec.txt \
-r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
-k $KEY
After 25 epochs, the validation accuracy is 0.02037037037037 (~2%), with loss = 3.2846608 and lr = 0.0001.
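The reported lr = 0.0001 at epoch 25 is consistent with the spec's `soft_start_annealing_schedule`. The sketch below is my reading of the documented warm-up/plateau/anneal shape of that schedule, not code taken from TAO's implementation:

```python
# Assumed shape of TAO's soft_start_annealing_schedule: exponential ramp
# from min_lr to max_lr during the soft_start fraction, plateau at max_lr
# until the annealing point, then exponential decay back toward min_lr.
def soft_start_annealing_lr(progress, min_lr=1e-6, max_lr=1e-4,
                            soft_start=0.001, annealing=0.5):
    """progress: fraction of total training completed, in [0, 1]."""
    if progress < soft_start:
        # exponential warm-up
        return min_lr * (max_lr / min_lr) ** (progress / soft_start)
    if progress < annealing:
        # plateau at the maximum learning rate
        return max_lr
    # exponential anneal toward min_lr
    return max_lr * (min_lr / max_lr) ** ((progress - annealing) / (1 - annealing))

# Epoch 25 of 200 (progress = 0.125) sits on the plateau:
print(soft_start_annealing_lr(25 / 200))  # 0.0001
```

With `soft_start: 0.001`, warm-up ends after just 0.1% of training, and with `annealing: 0.5` the rate stays pinned at 1e-4 until halfway through; those two knobs may be worth experimenting with for the synthetic dataset.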
Why did the first model work so well while the second fails so badly? Am I missing something crucial? Both the training and validation datasets for the second model were generated synthetically with different combinations of fonts.
Note: The first model was trained with nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3, whereas the second model is being trained on the latest TAO Toolkit.