0.0 average precision during a detectnet_v2 training

I am getting an average precision of 0.0 during detectnet_v2 training.

Command:

!tao model detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti-1Class.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS 

A sample annotation file:

rumex 0.0 0 0 1830 1195 1996 1348 0 0 0 0 0 0 0
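
As a quick sanity check on labels like the one above, here is a minimal Python sketch (not part of TAO) that parses a KITTI-format line and reports the box width and height in pixels. The helper name `parse_kitti_line` and the `KittiBox` type are hypothetical; the field order assumed is the standard KITTI layout (class, truncation, occlusion, alpha, then xmin, ymin, xmax, ymax).

```python
from typing import NamedTuple


class KittiBox(NamedTuple):
    """2D box taken from one KITTI label line (hypothetical helper type)."""
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def width(self) -> float:
        return self.xmax - self.xmin

    @property
    def height(self) -> float:
        return self.ymax - self.ymin


def parse_kitti_line(line: str) -> KittiBox:
    """Parse one KITTI label line. Hypothetical helper, not a TAO API."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    # Assumed layout: fields 4..7 hold xmin, ymin, xmax, ymax.
    xmin, ymin, xmax, ymax = (float(v) for v in fields[4:8])
    return KittiBox(fields[0], xmin, ymin, xmax, ymax)


box = parse_kitti_line("rumex 0.0 0 0 1830 1195 1996 1348 0 0 0 0 0 0 0")
print(box.label, box.width, box.height)
```

For the sample line this gives a box of 166 x 153 pixels, which is useful to compare against the size limits in evaluation_box_config.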

• Hardware: T4
• Network Type: Detectnet_v2
• TAO Version: 5.0.0
• Training spec file

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "png"
  target_class_mapping {
    key: "rumex"
    value: "rumex"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 2048
    output_image_height: 1376
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "rumex"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "rumex"
    value: 0.5
  }
  evaluation_box_config {
    key: "rumex"
    value {
      minimum_height: 20
      maximum_height: 1000
      minimum_width: 10
      maximum_width: 1000
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "rumex"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: false
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 1000
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-07
      max_learning_rate: 5e-05
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  visualizer{
    enabled: true
    num_images: 3
    scalar_logging_frequency: 10
    infrequent_logging_frequency: 5
    target_class_config {
      key: "rumex"
      value: {
        coverage_threshold: 0.005
      }
    }
    clearml_config{
      project: "TAO DetectNet 1 Class"
      task: "detectnet_v2_resnet18_clearml"
      tags: "detectnet_v2"
      tags: "training"
      tags: "resnet18"
      tags: "unpruned"
    }
    wandb_config{
      project: "TAO Toolkit Wandb Demo"
      name: "detectnet_v2_resnet18_wandb"
      tags: "detectnet_v2"
      tags: "training"
      tags: "resnet18"
      tags: "unpruned"
    }
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "rumex"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

Please try to set enable_auto_resize to true. More info can be found in DetectNet_v2 - NVIDIA Docs

I’ve modified the preprocessing config as follows:

augmentation_config {
  preprocessing {
    output_image_width: 2048
    output_image_height: 1376
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
    enable_auto_resize: true
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}

I then reran the same training cell.
Training halted with exit code 1.

I thought this might be because enable_auto_resize was false in the previous checkpoints, so I deleted all of them to start from scratch with the new configuration. Now the command runs, but the average precision is still 0%.

I also have two other questions:

  • Why would the enable_auto_resize parameter affect the average precision?
  • My images are all the same size. Why would enable_auto_resize have any effect at all?

The enable_auto_resize parameter is for training with images of multiple resolutions. Since your training images are all the same size, it is not needed.
It seems that the objects are small. Please refer to Frequently Asked Questions - NVIDIA Docs:

In DetectNet_V2, are there any parameters that can help improve AP (average precision) on training small objects?

Following parameters can help you improve AP on smaller objects:

  • Increase num_layers of resnet
  • class_weight for small objects
  • Increase the coverage_radius_x and coverage_radius_y parameters (cov_radius_x and cov_radius_y in the spec file) of the bbox_rasterizer_config section for the small objects class
  • Decrease minimum_detection_ground_truth_overlap
  • Lower minimum_height to cover more small objects for evaluation.
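
To see how much the last point matters for a given dataset, a rough Python sketch like the one below can estimate what share of ground-truth boxes falls inside the evaluation_box_config size limits. The function names are hypothetical, the bounds are assumed to be inclusive, and every box size in the example other than the 166 x 153 sample is made up for illustration.

```python
def within_eval_bounds(width, height,
                       min_w=10, max_w=1000, min_h=20, max_h=1000):
    """True if a box passes the size limits (defaults copied from the spec).

    Assumption: limits are inclusive; the real evaluator may differ.
    """
    return min_w <= width <= max_w and min_h <= height <= max_h


def evaluated_fraction(box_sizes, **bounds):
    """Fraction of (width, height) pairs that pass the size limits."""
    sizes = list(box_sizes)
    if not sizes:
        return 0.0
    kept = sum(within_eval_bounds(w, h, **bounds) for w, h in sizes)
    return kept / len(sizes)


# Only the first entry comes from the sample label; the rest are hypothetical.
sizes = [(166, 153), (12, 15), (40, 18), (300, 250)]
default_share = evaluated_fraction(sizes)
relaxed_share = evaluated_fraction(sizes, min_h=10)
print(default_share, relaxed_share)
```

If most real boxes already pass the default limits, lowering minimum_height is unlikely to change the reported AP much.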

Hi @Morganh. Thanks for your input.
I did actually tweak these values. Things improved a bit, but not as much as expected. Is there an official paper on DetectNet_v2 that explains the mathematical/algorithmic meaning of these hyperparameters? Working with them without knowing what they mean is a bit like working in the dark.

You can refer to the user guide (DetectNet_v2 - NVIDIA Docs) and the source code.

 minimum_detection_ground_truth_overlap: Minimum IoU between ground truth and predicted box after clustering to call a valid detection. This parameter is a repeatable dictionary and a separate one must be defined for every class.
 minimum_height: Minimum height in pixels for a valid ground truth and prediction bbox.
 cov_radius_x (float): x-radius of the coverage ellipse
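
To make the IoU threshold concrete, here is a minimal Python sketch of intersection over union for two axis-aligned boxes, compared against the 0.5 value used in the spec. The `iou` function is a generic illustration, not TAO's implementation, and the predicted box is hypothetical (the sample ground-truth box shifted by 20 px in x and 15 px in y).

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents are clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (1830, 1195, 1996, 1348)   # from the sample annotation
prediction = (1850, 1210, 2016, 1363)     # hypothetical predicted box
overlap = iou(ground_truth, prediction)
is_valid = overlap >= 0.5                 # threshold from the spec
print(round(overlap, 3), is_valid)
```

Even a shift of a few tens of pixels costs a noticeable amount of IoU on a box of this size, which is why lowering the threshold can raise AP for small objects.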

Also,
coverage_radius_x, https://github.com/NVIDIA/tao_tensorflow1_backend/blob/c7a3926ddddf3911842e057620bceb45bb5303cc/nvidia_tao_tf1/cv/detectnet_v2/evaluation/evaluation_config.py#L79.

Also, please share your latest spec file and training log.
Are your original training images all of the same resolution? What is the resolution?

specs.txt (3.7 KB)
This is the last specs file.

The 330 original images all have a resolution of 2048x1376. I do not change this resolution during training.
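
A claim like this can be checked with a short Python sketch, shown here under some assumptions: `check_resolutions` is a hypothetical helper, the image sizes are passed in as (width, height) pairs (they could be collected with a library such as Pillow via `Image.open(path).size`), and the multiple-of-16 rule for DetectNet_v2 input dimensions is taken as given.

```python
STRIDE = 16  # assumption: DetectNet_v2 input sides should be multiples of 16


def check_resolutions(image_sizes, spec_width, spec_height, stride=STRIDE):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    unique = set(image_sizes)
    if len(unique) > 1:
        problems.append(f"mixed resolutions: {sorted(unique)}")
    if unique and unique != {(spec_width, spec_height)}:
        problems.append("images do not all match the spec output size")
    if spec_width % stride or spec_height % stride:
        problems.append(f"spec size is not a multiple of {stride}")
    return problems


# 330 images at 2048x1376, as described above.
sizes = [(2048, 1376)] * 330
issues = check_resolutions(sizes, spec_width=2048, spec_height=1376)
print(issues)
```

An empty list here is consistent with enable_auto_resize not being needed for this dataset.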

The mAP with the above run looks like this:

I think my current direction is to further tune these two parameters:

  • class_weight: is this related to the class frequency across the whole dataset? Does it have a particular effect when I have only one class plus background? What does making it larger or smaller mean in practice?
  • coverage_foreground_weight: this parameter probably matters in my case because my bounding boxes contain weeds, which means each box also contains a lot of background. Roughly, the weed leaves cover 50% of the bounding box. Is it wise to use 0.5 instead of 0.05 (the default)?

Please share the training log as well.
Also, is it possible to share several training images and their labels? You can send them to me in a private message.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

I received the dataset. Some images are missing object labels. I suggest labeling more images and improving the label quality.
Also, this detection task seems fairly difficult. In some images the rumex object is hard to find even for human eyes, because it looks very similar to the green background.
I suggest training with yolov4 and a deeper backbone. D-DETR and DINO can be considered as well.
