YOLOv4: loss converges well to 0, but the inference result mAP is always 0

moey9201 · August 20, 2021, 3:44am

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) : T4

• Network Type : Yolo_v4

• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)

Configuration of the TLT Instance
dockers: ['nvidia/tlt-streamanalytics', 'nvidia/tlt-pytorch']
format_version: 1.0
tlt_version: 3.0
published_date: 04/16/2021
docker_tag: v3.0-py3

• Training spec file(If have, please share here)

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(247.00, 178.00), (193.00, 235.00), (267.00, 299.00), (385.00, 406.00), (642.00, 614.00)]"
  mid_anchor_shape: "[(108.00, 85.00), (99.00, 118.00), (146.00, 120.00), (118.00, 151.00), (158.00, 175.00)]"
  small_anchor_shape: "[(45.00, 49.00), (54.00, 64.00), (76.00, 66.00), (64.00, 81.00), (80.00, 97.00)]"
  box_matching_iou: 0.3
  arch: "resnet"
  nlayers: 50
  arch_conv_blocks: 2
  loss_loc_weight: 5.0
  loss_neg_obj_weights: 50.0
  loss_class_weights: 0.9
  label_smoothing: 0.1
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 1
  num_epochs: 10
  enable_qat: true
  checkpoint_interval: 1
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "/workspace/tlt-experiments/yolo_v4/pretrained_resnet50/tlt_pretrained_object_detection_vresnet50/resnet_50.hdf5"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 1
  matching_iou_threshold: 0.4
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.4
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1920
  output_height: 1024
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/training/label_2"
      image_directory_path: "/workspace/tlt-experiments/data/training/image_2"
  }
  include_difficult_in_training: true
  target_class_mapping {
      key: "안전벨트 착용"
      value: "Belt on"
  }
  target_class_mapping {
      key: "안전벨트 미착용"
      value: "Belt off"
  }
  target_class_mapping {
      key: "안전화 착용"
      value: "Shoes on"
  }
  target_class_mapping {
      key: "안전모 착용"
      value: "Helmet on"
  }
  validation_data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/val/label"
      image_directory_path: "/workspace/tlt-experiments/data/val/image"
  }
}

• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

Training results:

Producing predictions: 100%|████████████████| 1665/1665 [06:25<00:00,  4.32it/s]
Start to calculate AP for each class
*******************************
belt off      AP    0.0
belt on       AP    0.0
helmet on     AP    0.0
shoes on      AP    0.0
              mAP   0.0
*******************************
Validation loss: 0.00021867455646127195

==========

During training the loss drops close to 0, but evaluation always reports an mAP of 0.

Is there anything I need to change in the spec file?
Also, is it correct to set nlayers in the spec file according to the backbone, i.e. 18 for resnet18 and 50 for resnet50?

We trained on 4 classes, and all classes have a similar number of instances.

Morganh · August 20, 2021, 7:08am

This is related to the label names.

Change the following and retry:

value: "Belt on"

to

value: "Belt_on"
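
One plausible reason spaces in class names cause trouble, assuming the labels are standard whitespace-delimited KITTI text files (a class name followed by 14 numeric fields), is that a space inside the name shifts every field by one position. The sketch below only illustrates that effect. `parse_kitti_line` is a hypothetical helper, not part of TLT, and the box values are made up.

```python
# Illustration only: how a whitespace-delimited parser sees a class name with a space.
# Assumes the standard KITTI layout: class name + 14 numeric fields = 15 fields.
KITTI_FIELD_COUNT = 15


def parse_kitti_line(line):
    """Split a KITTI label line into (class_name, remaining_fields)."""
    fields = line.split()
    return fields[0], fields[1:]


# Made-up box values; only the class name differs between the two lines.
numbers = "0.00 0 0.00 100.00 120.00 180.00 260.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
good_line = "Belt_on " + numbers
bad_line = "Belt on " + numbers

for line in (good_line, bad_line):
    name, rest = parse_kitti_line(line)
    ok = len(rest) + 1 == KITTI_FIELD_COUNT
    print(f"class={name!r} fields={len(rest) + 1} well_formed={ok}")
```

With the underscore the line has 15 fields and the class is read as `Belt_on`. With the space it has 16 fields and the class is read as `Belt`, which matches no entry in the spec file.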

moey9201 · August 20, 2021, 9:42am

Thank you. In my case, I confirmed that the spaces in the Hangul class names were lost while converting the dataset to KITTI format, so the keys in the spec file no longer matched the class names in the label files.

Training succeeded after changing “안전벨트 착용” to “안전벨트착용”.
Thanks for the always accurate and quick replies :)
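
A mismatch like this can be caught before training by comparing the class names that actually appear in the label files with the keys listed under target_class_mapping. The following is a minimal sketch, assuming whitespace-delimited KITTI labels stored as UTF-8 `.txt` files. `count_label_classes` and `find_unmatched_keys` are hypothetical helper names, not TLT functions.

```python
from collections import Counter
from pathlib import Path


def count_label_classes(label_dir):
    """Count the first token (the class name) of every line in every .txt label file."""
    counts = Counter()
    for path in Path(label_dir).glob("*.txt"):
        for line in path.read_text(encoding="utf-8").splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                counts[fields[0]] += 1
    return counts


def find_unmatched_keys(spec_keys, label_counts):
    """Return the spec-file keys that never occur as a class name in the labels."""
    return [key for key in spec_keys if key not in label_counts]
```

Running `count_label_classes` on the training label directory and passing the four mapping keys from the spec file to `find_unmatched_keys` would list any key that has no matching label. A key that contains a space can never match under this parsing, because the first token stops at the space.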
