LPRNet training not converging for 2 line License Plate images

• Hardware (GTX 1080)
• Network Type (LPRNet)
• TLT Version (dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm'], format_version: 2.0, toolkit_version: 3.22.05)

As suggested by @Morganh in a previous post, I have successfully trained an LPRNet model to read the digits in the lower line of Bangladeshi license plates. That model works great when trained on a real-world dataset. As further suggested, I am now trying to train another LPRNet model to read the upper line. I generated a synthetic dataset of 1M images and trained on it, but the model is not converging: the validation accuracy is only about 0.02 (roughly 2%). Here is my training spec file:

random_seed: 42
lpr_config {
  hidden_units: 512
  max_label_length: 8
  arch: "baseline"
  nlayers: 18 # 18 selects the baseline18 model; set nlayers to 10 to use the baseline10 model
}
training_config {
  batch_size_per_gpu: 64
  num_epochs: 200
  checkpoint_interval: 1
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-4
      soft_start: 0.001
      annealing: 0.5
    }
  }
  regularizer {
    type: L2
    weight: 5e-4
  }
  visualizer {
    enabled: true
  }
}
eval_config {
  validation_period_during_training: 1
  batch_size: 64
}
augmentation_config {
    output_width: 96
    output_height: 48
    output_channel: 3
    max_rotate_degree: 5
    rotate_prob: 0.5
    gaussian_kernel_size: 5
    gaussian_kernel_size: 7
    gaussian_kernel_size: 15
    blur_prob: 0.5
    reverse_color_prob: 0.5
    keep_original_prob: 0.3
}
dataset_config {
  data_sources: {
    label_directory_path: "/workspace/tao-experiments/data/bdalpr/train/labels"
    image_directory_path: "/workspace/tao-experiments/data/bdalpr/train/images"
  }
  characters_list_file: "/workspace/tao-experiments/lprnet/specs/bd_lp_characters.txt"
  validation_data_sources: {
    label_directory_path: "/workspace/tao-experiments/data/bdalpr/val/labels"
    image_directory_path: "/workspace/tao-experiments/data/bdalpr/val/images"
  }
}
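
To make the `soft_start_annealing_schedule` values easier to read, here is a rough sketch of how the learning rate could evolve over training. The function name is hypothetical, and the exponential interpolation during warm-up and annealing is an assumption; the exact curve TAO uses may differ. `progress` is the fraction of training completed, from 0 to 1.

```python
def soft_start_annealing_lr(progress, min_lr=1e-6, max_lr=1e-4,
                            soft_start=0.001, annealing=0.5):
    """Approximate learning rate at a given training progress in [0, 1].

    Assumption: exponential (log-linear) interpolation between min_lr and
    max_lr in the warm-up and annealing phases. Defaults mirror the spec.
    """
    if progress < soft_start:
        # Warm-up: ramp from min_lr up to max_lr.
        t = progress / soft_start
        return min_lr * (max_lr / min_lr) ** t
    if progress < annealing:
        # Plateau: hold the maximum learning rate.
        return max_lr
    # Annealing: decay from max_lr back down to min_lr.
    t = (progress - annealing) / (1.0 - annealing)
    return max_lr * (min_lr / max_lr) ** t


if __name__ == "__main__":
    total_epochs = 200
    for epoch in (0, 25, 100, 150, 200):
        print(epoch, soft_start_annealing_lr(epoch / total_epochs))
```

Under these assumptions the warm-up covers only the first 0.1% of training, so 25 epochs into a 200-epoch run the schedule is on its plateau at `max_learning_rate`.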

And here is the custom character list file:
bd_lp_characters.txt (131 Bytes)

For reference, I can provide a sample dataset for reproducing:
sample_bangla_dataset_lprnet.tar.xz (26.5 MB)
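
Before digging into the model, it can help to confirm that the labels are consistent with the spec. The sketch below is a minimal check, assuming each label file holds the plate text on a single line and the character list file has one entry per line; the helper names are hypothetical, and length is counted in Unicode code points, which may not match how the toolkit counts Bangla characters.

```python
from pathlib import Path


def load_characters(path):
    """Read the character list, assuming one entry per line."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def check_label(text, characters, max_label_length=8):
    """Return a list of human-readable problems for one label string."""
    problems = []
    if not text:
        problems.append("empty label")
    # Assumption: label length is counted in Unicode code points.
    if len(text) > max_label_length:
        problems.append(
            f"length {len(text)} exceeds max_label_length {max_label_length}"
        )
    unknown = sorted({ch for ch in text if ch not in characters})
    if unknown:
        problems.append(f"characters not in list: {unknown}")
    return problems


def check_label_dir(label_dir, characters, max_label_length=8):
    """Map each problematic label file name to its list of problems."""
    report = {}
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        text = label_file.read_text(encoding="utf-8").strip()
        problems = check_label(text, characters, max_label_length)
        if problems:
            report[label_file.name] = problems
    return report
```

Calling `check_label_dir` on the train and val label directories with the set returned by `load_characters` for `bd_lp_characters.txt` should give an empty report if every label fits within `max_label_length` and uses only listed characters.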

I tried to train the model from scratch with the command:

!tao lprnet train --gpus=1 --gpu_index=$GPU_INDEX \
                  -e $SPECS_DIR/tutorial_spec.txt \
                  -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                  -k $KEY

After 25 epochs, the validation accuracy is 0.02037037037037, with loss = 3.2846608 and lr = 0.0001.
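
A note on reading that number: it is a fraction, so 0.0204 means about 2% of validation plates. Assuming the reported accuracy counts a plate as correct only when the whole predicted string matches the label, a per-character view can show whether the model is learning anything at all. The sketch below uses hypothetical helper names and is not the toolkit's own evaluation code.

```python
def exact_match_accuracy(predictions, labels):
    """Fraction of plates whose full predicted string equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return correct / len(labels)


def per_position_accuracy(predictions, labels):
    """Fraction of label characters matched at the same position.

    This is a crude diagnostic: a single inserted or dropped character
    shifts everything after it and counts as several misses.
    """
    total = sum(len(label) for label in labels)
    if total == 0:
        return 0.0
    hits = sum(
        p_ch == l_ch
        for pred, label in zip(predictions, labels)
        for p_ch, l_ch in zip(pred, label)
    )
    return hits / total
```

If the per-position figure is far above the exact-match figure, the model is getting most characters right and failing on a few; if both are near zero, it is likely not learning the task.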

I wonder why the first model worked so well while the second one fails so badly. Am I missing something crucial? Both the training and validation datasets are generated synthetically, with different combinations of fonts.

Note: The first model was trained with nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3, whereas the second model is being trained on the latest TAO Toolkit.

In the real-world dataset, there are two lines, right?

Based on your description, I would summarize it as follows:
real data + 2nd line (digit line) + 3.0-dp tao -> works great
synthetic data + upper line + 22.05 tao -> does not work

How about the following?
real data + upper line + 3.0-dp tao

In the real-world dataset, there were two lines in each image, but the label file contained only the label of the second line (digits). That approach worked great, and the inference performance is better than expected.

I can't train the upper-line model the same way because I lack annotated labels for it, so I generated a synthetic dataset. Since the validation dataset is also synthetic (with a slightly different font), the model should in theory perform well. If the model had been trained on synthetic data and performed poorly when validated on real-world data, I could conclude that the synthetic dataset is not good enough.

I also can't try training with 3.0-dp tao, because the docker container has been removed from my PC and the NGC registry endpoint is no longer available.

Thanks for the details.
How about
synthetic data + 2nd line(digit line) + 22.05 tao ?

I haven’t tried that yet, but I will give it a try and let you know.

I have trained with limited synthetic data + 2nd line (digits) + 22.05 tao for 100 epochs, and the validation accuracy reaches 0.3. I guess it could be higher with more training data and more epochs.

There has been no update from you for a while, so we are assuming this is no longer an issue.
We are therefore closing this topic. If you need further support, please open a new one.
Thanks

So, when you ran "synthetic data + 2nd line (digit line) + 22.05 tao", do you mean you trained with synthetic data and then validated with real-world data?

After checking your synthetic data:
[image]

It is quite different from the real data, which means the data distribution is quite different.

I suggest generating synthetic data that is as similar to the real data as possible.
For example, use the same white background and black characters/numbers.
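
One rough way to check the "white background, black characters" point is to compare simple brightness statistics between real and synthetic crops. The sketch below is only an illustration: it assumes each image has already been loaded as rows of 0 to 255 grayscale values (with any image library), and the function names and the threshold of 128 are hypothetical choices.

```python
def brightness_stats(gray_pixels, dark_threshold=128):
    """Return (mean brightness, fraction of dark pixels) for one image.

    gray_pixels is a list of rows, each a list of 0-255 grayscale values.
    """
    values = [value for row in gray_pixels for value in row]
    if not values:
        raise ValueError("image has no pixels")
    mean = sum(values) / len(values)
    dark_fraction = sum(value < dark_threshold for value in values) / len(values)
    return mean, dark_fraction


def looks_dark_on_light(gray_pixels, dark_threshold=128):
    """Heuristic: dark text on a light background has mostly light pixels."""
    _, dark_fraction = brightness_stats(gray_pixels, dark_threshold)
    return dark_fraction < 0.5


if __name__ == "__main__":
    # Tiny stand-in for a plate crop: white background with two dark pixels.
    plate = [
        [255, 255, 255, 255],
        [255, 0, 0, 255],
        [255, 255, 255, 255],
    ]
    print(brightness_stats(plate), looks_dark_on_light(plate))
```

If the synthetic set and the real set give clearly different means or dark-pixel fractions, that is a quick hint that the two distributions differ in the way described above.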
