Improving mAP of License Plate Detection

Could you give us some advice on improving the precision when training with the OpenALPR data?

Here are our results.

Epoch 120/120
=========================

Validation cost: 0.000082
Mean average_precision (in %): 63.5910

class name      average precision (in %)
------------  --------------------------
lpd                               63.591

• Hardware (T4/V100/Xavier/Nano/etc) : A100
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) : detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): v3.0-py3
• Training spec file(If have, please share here)

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/output/lpd/experiments/lpd_tfrecord/*"
    image_directory_path: "/workspace/output/lpd/experiments/lpd/data/"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "lpd"
    value: "lpd"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 640
    output_image_height: 480
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "lpd"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 4
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/output/lpd/experiments/pretrained/usa_unpruned.tlt"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 1
  minimum_detection_ground_truth_overlap {
    key: "lpd"
    value: 0.699999988079
  }
  evaluation_box_config {
    key: "lpd"
    value {
      minimum_height: 10
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "lpd"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  enable_qat: False
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "lpd"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}
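For reference, the learning_rate block above describes TLT's soft-start annealing schedule: the rate ramps up from min_learning_rate to max_learning_rate over the first 10% of training, holds, then decays after 70%. A rough sketch of my understanding of that schedule (the exponential ramp/decay shape is an assumption; the exact TLT implementation may differ):

```python
import math

def soft_start_annealing_lr(progress, min_lr=5e-6, max_lr=5e-4,
                            soft_start=0.1, annealing=0.7):
    """Approximate soft-start annealing LR at a given training progress.

    progress: fraction of training completed, in [0, 1].
    Ramps exponentially from min_lr to max_lr during the soft_start
    phase, holds at max_lr, then decays exponentially back to min_lr
    after the annealing point.
    """
    ratio = math.log(min_lr / max_lr)  # negative log-ratio of the LR range
    if progress < soft_start:
        return max_lr * math.exp(ratio * (1.0 - progress / soft_start))
    if progress < annealing:
        return max_lr
    return max_lr * math.exp(ratio * (progress - annealing) / (1.0 - annealing))
```

With these defaults the LR starts at 5e-06, peaks at 5e-04 from 10% to 70% of training, and decays back to 5e-06 at the end.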

• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

I just followed the steps in the NVIDIA blog post.

$ cat SPECS_tfrecord.txt
kitti_config {
  root_directory_path: "/workspace/output/lpd/experiments/lpd/data"
  image_dir_name: "image"
  label_dir_name: "label"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 20
  num_shards: 4
}
image_directory_path: "/workspace/output/lpd/experiments/lpd/data"
$ detectnet_v2 dataset_convert -d specs/SPECS_tfrecord.txt -o $LOCAL_EXPERIMENT_DIR/lpd_tfrecord/lpd
$ detectnet_v2 train -e specs/SPECS_train.txt -r $LOCAL_EXPERIMENT_DIR/exp_unpruned_epoch -k nvidia_tlt
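For context, with `partition_mode: "random"` and `val_split: 20`, dataset_convert holds out roughly 20% of the images as the validation fold. A minimal sketch of that kind of split (the seed and the exact rounding are assumptions for illustration, not the converter's actual logic):

```python
import random

def partition(image_ids, val_split=20, seed=42):
    """Randomly hold out val_split percent of images for validation."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_split / 100)  # 20% of 222 -> 44 images
    return ids[n_val:], ids[:n_val]          # (train, val)

train, val = partition(range(222))
print(len(train), len(val))  # 178 44
```

With 222 images this yields 178 training and 44 validation images, which matches the split discussed later in this thread.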

Best regards.
Kaka

Hi,
Could you share the training log? You can attach it as a txt file. We can check the loss in it.

Sure. Please see attached log file.
train.log (468.9 KB)
Note: I changed the number of epochs from 120 to 240.

Can you try more epochs? For example, 1200.

Hi

I ran additional training with more epochs, but the mAP did not improve…

Epoch 1400/1400
=========================

Validation cost: 0.000098
Mean average_precision (in %): 62.3547

class name      average precision (in %)
------------  --------------------------
lpd                              62.3547

That is very strange. Can you check your training log to see whether any of the 10-epoch validation runs reached a mAP higher than 62.3547%?

The output was erased… So I re-ran the training and attached the complete output log file. You can review the mAP values over the training timeline.

Untitled.md (2.6 MB)

The best mAP value was 71.8334, but it does not reach your reference result in the blog.

    Epoch 461/1400
    =========================
    
    Validation cost: 0.000096
    Mean average_precision (in %): 71.8334
    
    class name      average precision (in %)
    ------------  --------------------------
    lpd                              71.8334

Could you generate new tfrecord files and try again? With only 222 images, I’m afraid a different split of training and validation images will produce different results.

I regenerated the tfrecord files and tried again… Please see the output log file for each command.

Untitled.md (2.6 MB)

The blog was released about half a year ago and targets the 3.0-dp-py3 docker.
If possible, you can try that docker.
Also, since the OpenALPR dataset has only 222 images (178 training and 44 validation), the mAP may vary. I suggest training on a public dataset that contains more images.
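To illustrate why mAP can swing by several points on only 44 validation images, here is a rough back-of-the-envelope simulation (the 0.7 per-plate detection probability is an assumed figure for illustration, not a measured one):

```python
import random
import statistics

# If each of 44 validation plates is detected with probability 0.7,
# how much does the measured detection rate swing across random draws?
rng = random.Random(0)
rates = [sum(rng.random() < 0.7 for _ in range(44)) / 44
         for _ in range(10_000)]
spread = statistics.stdev(rates) * 100  # standard deviation in % points
print(round(spread, 1))  # roughly 7 percentage points
```

A swing of that size from split randomness alone is comparable to the 62–72% range observed in this thread, which supports using a larger dataset.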

Also, you can run tao inference to get annotated images, to check which license plates are detected correctly and which are not.
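A sketch of that inference step for this thread's setup (the inference spec filename and output directory are placeholders; the inference spec format is described in the TLT DetectNet_v2 docs, and the command must be run inside the TLT container):

```shell
# Run inference with the trained model and write annotated images.
detectnet_v2 inference \
  -e specs/SPECS_inference.txt \
  -i /workspace/output/lpd/experiments/lpd/data/image \
  -o $LOCAL_EXPERIMENT_DIR/inference_out \
  -k nvidia_tlt
# The output directory typically contains annotated images plus
# per-image KITTI-format label files for inspection.
```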