YOLOv4: loss converges well to 0, but the inference result mAP is always 0

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) : T4

• Network Type : Yolo_v4

• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)

Configuration of the TLT Instance
dockers: ['nvidia/tlt-streamanalytics', 'nvidia/tlt-pytorch']
format_version: 1.0
tlt_version: 3.0
published_date: 04/16/2021
docker_tag: v3.0-py3

• Training spec file(If have, please share here)

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(247.00, 178.00), (193.00, 235.00), (267.00, 299.00), (385.00, 406.00), (642.00, 614.00)]"
  mid_anchor_shape: "[(108.00, 85.00), (99.00, 118.00), (146.00, 120.00), (118.00, 151.00), (158.00, 175.00)]"
  small_anchor_shape: "[(45.00, 49.00), (54.00, 64.00), (76.00, 66.00), (64.00, 81.00), (80.00, 97.00)]"
  box_matching_iou: 0.3
  arch: "resnet"
  nlayers: 50
  arch_conv_blocks: 2
  loss_loc_weight: 5.0
  loss_neg_obj_weights: 50.0
  loss_class_weights: 0.9
  label_smoothing: 0.1
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 1
  num_epochs: 10
  enable_qat: true
  checkpoint_interval: 1
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "/workspace/tlt-experiments/yolo_v4/pretrained_resnet50/tlt_pretrained_object_detection_vresnet50/resnet_50.hdf5"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 1
  matching_iou_threshold: 0.4
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.4
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1920
  output_height: 1024
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/training/label_2"
      image_directory_path: "/workspace/tlt-experiments/data/training/image_2"
  }
  include_difficult_in_training: true
  target_class_mapping {
      key: "안전벨트 착용"
      value: "Belt on"
  }
  target_class_mapping {
      key: "안전벨트 미착용"
      value: "Belt off"
  }
  target_class_mapping {
      key: "안전화 착용"
      value: "Shoes on"
  }
  target_class_mapping {
      key: "안전모 착용"
      value: "Helmet on"
  }
  validation_data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/val/label"
      image_directory_path: "/workspace/tlt-experiments/data/val/image"
  }
}

• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

training results :

Producing predictions: 100%|████████████████| 1665/1665 [06:25<00:00,  4.32it/s]
Start to calculate AP for each class
*******************************
belt off      AP    0.0
belt on       AP    0.0
helmet on     AP    0.0
shoes on      AP    0.0
              mAP   0.0
*******************************
Validation loss: 0.00021867455646127195

==========

During training, I watched the loss converge close to 0, but evaluation always yields 0 mAP.

Is there anything I need to edit in the spec file?
Also, is it correct to set nlayers in the spec file to 18 for resnet18 and 50 for resnet50, depending on the backbone?

We trained on 4 classes, and all classes have a similar number of instances.

This is related to the label names. Please change the values as below and retry.

value: "Belt on"

to

value: "Belt_on"
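Since KITTI label files are space-separated, a class name containing a space breaks field parsing, so the rename has to be applied to the label files as well as the spec. A minimal sketch of how one might batch-rename the classes across a KITTI label directory (the `RENAMES` mapping and `fix_labels` helper below are assumptions based on the spec file, not part of TLT; adjust to your dataset):

```python
import os

# Assumed mapping derived from the spec's target_class_mapping values;
# edit this to match the class names actually present in your labels.
RENAMES = {
    "Belt on": "Belt_on",
    "Belt off": "Belt_off",
    "Shoes on": "Shoes_on",
    "Helmet on": "Helmet_on",
}

def fix_labels(label_dir: str) -> None:
    """Rewrite every .txt KITTI label in label_dir, replacing multi-word
    class names with their single-token equivalents."""
    for fname in os.listdir(label_dir):
        if not fname.endswith(".txt"):
            continue
        path = os.path.join(label_dir, fname)
        with open(path, encoding="utf-8") as f:
            text = f.read()
        for old, new in RENAMES.items():
            text = text.replace(old, new)
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
```

Run it once against both the training and validation label directories, and update the matching `value:` entries in `dataset_config` to the underscore names.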

Thank you. In my case, I confirmed that while converting the dataset to KITTI format, the spaces were being dropped from the Hangul class names.

Training succeeded after changing "안전벨트 착용" to "안전벨트착용".
Thanks for the always accurate and quick replies :)
