YOLO V4 TLT Training Low mAP

1733208392 · June 18, 2021, 11:34am

I am trying to train YOLO V4 on a custom dataset of 1500+ images crawled from the internet, with varying sizes. I have attached a few as examples.
[Two example images attached]

However, training gives very low AP, around 0.07 after 20-30 epochs. Can someone help look into the issue?

Thanks,
Kai

The config file is as follows:

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  box_matching_iou: 0.25
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 0.8
  loss_neg_obj_weights: 100.0
  loss_class_weights: 0.5
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 5
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-4
      max_learning_rate: 1e-2
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "/workspace/tlt-experiments/yolo_v4/pretrained_resnet18/tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 4
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 608
  output_height: 608
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/training/label_2"
      image_directory_path: "/workspace/tlt-experiments/data/training/image_2"
  }
  include_difficult_in_training: true
  target_class_mapping {
      key: "fire"
      value: "fire"
  }
  validation_data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/val/label"
      image_directory_path: "/workspace/tlt-experiments/data/val/image"
  }
}

Morganh · June 18, 2021, 1:13pm

Please generate new anchor shapes for your dataset. Refer to the NVIDIA TAO documentation or the Jupyter notebooks.

1733208392 · June 19, 2021, 3:19am

Thanks!

I ran the following command and then updated the anchor shapes in the spec.

!tlt yolo_v4 kmeans -l /workspace/tlt-experiments/data/training/label_2 -i /workspace/tlt-experiments/data/training/image_2/ -x 608 -y 608
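
For anyone wondering what this step is doing, here is a rough sketch of anchor clustering in plain Python. It is not the TLT implementation: the distance metric (1 - IoU on width/height), the random initialization, and the names `iou_wh` and `kmeans_anchors` are my assumptions, and the boxes are synthetic stand-ins for label sizes already scaled to the 608x608 input.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) pairs aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=9, iters=100, seed=42):
    """Cluster (w, h) pairs into k anchors, using highest IoU as 'nearest'."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k randomly chosen boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: iou_wh(box, anchors[j]))
            clusters[best].append(box)
        new_anchors = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(m[0] for m in members) / len(members)
                mean_h = sum(m[1] for m in members) / len(members)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[j])  # keep an empty cluster's anchor
        if new_anchors == anchors:  # assignments stopped changing
            break
        anchors = new_anchors
    # Sort by area so the list can be split into small / mid / big groups.
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Synthetic box sizes (hypothetical data, not from the fire dataset).
demo_rng = random.Random(0)
boxes = [(demo_rng.uniform(10, 300), demo_rng.uniform(10, 300)) for _ in range(200)]

anchors = kmeans_anchors(boxes)
small, mid, big = anchors[0:3], anchors[3:6], anchors[6:9]
print("small_anchor_shape:", [(round(w, 2), round(h, 2)) for w, h in small])
print("mid_anchor_shape:  ", [(round(w, 2), round(h, 2)) for w, h in mid])
print("big_anchor_shape:  ", [(round(w, 2), round(h, 2)) for w, h in big])
```

The point is that the nine anchors end up following the box sizes actually present in the labels, which is why anchors copied from another dataset can hurt AP.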

The training result shows an AP of 0.1577 after 5 epochs. Is this 15.77% or 0.1577%? If it is the former, I guess the result is acceptable.

Regards,
Kai

Morganh · June 19, 2021, 3:25am

15.77%

1733208392 · June 30, 2021, 11:43am

Just an update: I managed to get 69% mAP. When I run inference with the model, the results look okay, but it mistakenly picks up many small items. Is there a way to improve this?

Morganh · July 1, 2021, 1:17am

Can you try a different confidence threshold during yolo_v4 inference?
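
To illustrate the idea, here is a minimal sketch of sweeping a confidence threshold over a list of detections. The detections, the tuple layout, and the `filter_by_confidence` helper are made up for illustration and are not TLT output or API; how the threshold is actually passed to the inference tool is not shown in this thread. Note that the spec above uses `confidence_threshold: 0.001` in `nms_config`, which lets almost every box through.

```python
def filter_by_confidence(detections, threshold):
    """Keep only detections whose score is at least `threshold`."""
    return [det for det in detections if det[1] >= threshold]


# Hypothetical detections: (label, score, (x1, y1, x2, y2)).
detections = [
    ("fire", 0.92, (120, 80, 340, 300)),
    ("fire", 0.61, (400, 150, 520, 260)),
    ("fire", 0.12, (30, 40, 42, 55)),    # small, low-confidence box
    ("fire", 0.04, (580, 10, 590, 22)),  # small, low-confidence box
]

# Count how many boxes survive at each candidate threshold.
for threshold in (0.001, 0.1, 0.3, 0.5):
    kept = filter_by_confidence(detections, threshold)
    print(f"threshold={threshold}: kept {len(kept)} of {len(detections)}")
```

If the spurious small boxes mostly have low scores, raising the threshold should remove them at the cost of some recall, so it is worth checking a few values against a validation set.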