Using detectnet_v2 pretrained models in TLT v3.0

Dustin.Webb · October 12, 2021, 2:56pm

I have started training two detectnet_v2 models, one initialized from a pretrained model and the other trained from scratch, but they appear to perform identically, as described in this old post. Apart from the pretrained model location and the blocks to freeze, the two configuration files are identical, and both runs train on the same dataset. Below are details on the hardware, the TLT version (3.0; upgrading to TAO is not an option right now), and the configuration files for the two experiments.

Hardware: 2 identical machines with 8 A100s, 256 cores, 1TB of RAM
Network type: Detectnet_v2
TLT Version:

Configuration of the TLT Instance

dockers: 		
	nvcr.io/nvidia/tlt-streamanalytics: 			
		docker_tag: v3.0-dp-py3
		tasks: 
			1. augment
			2. classification
			3. detectnet_v2
			4. dssd
			5. emotionnet
			6. faster_rcnn
			7. fpenet
			8. gazenet
			9. gesturenet
			10. heartratenet
			11. lprnet
			12. mask_rcnn
			13. retinanet
			14. ssd
			15. unet
			16. yolo_v3
			17. yolo_v4
			18. tlt-converter
	nvcr.io/nvidia/tlt-pytorch: 			
		docker_tag: v3.0-dp-py3
		tasks: 
			1. speech_to_text
			2. text_classification
			3. question_answering
			4. token_classification
			5. intent_slot_classification
			6. punctuation_and_capitalization
format_version: 1.0
tlt_version: 3.0
published_date: 02/02/2021

Below are the two training specification files. Some details have been modified for privacy.

Training spec file 1:

# Model config
model_config {
  arch: "resnet"
  pretrained_model_file: ""
  all_projections: True
  num_layers: 18
  use_pooling: False
  use_batch_norm: True
  dropout_rate: 0
  training_precision: {
    backend_floatx: FLOAT32
  }
  objective_set: {
    cov {}
    bbox {
      scale: 35.0
      offset: 0.5
    }
  }
}

# Bbox rasterizer
bbox_rasterizer_config {
  target_class_config {
    key: "weed"
    value: {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.4
      cov_radius_y: 0.4
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "carrot"
    value: {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.4
      cov_radius_y: 0.4
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.67
}

postprocessing_config {
  target_class_config {
    key: "weed"
    value: {
      clustering_config {
        coverage_threshold: 0.005
        dbscan_eps: 0.20    
        dbscan_min_samples: 0.05
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "carrot"
    value: {
      clustering_config {
        coverage_threshold: 0.005
        dbscan_eps: 0.20
        dbscan_min_samples: 0.05
        minimum_bounding_box_height: 20
      }
    }
  }
}

cost_function_config {
  target_classes {
    name: "weed"
    class_weight: 1.0
    coverage_foreground_weight: 0.05
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 1.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "carrot"
    class_weight: 1.0
    coverage_foreground_weight: 0.05
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 1.0
      weight_target: 1.0
    }
  }
  enable_autoweighting: True
  max_objective_weight: 0.9999
  min_objective_weight: 0.0001
}

training_config {
  batch_size_per_gpu: 96 
  num_epochs: 10000
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-9
      max_learning_rate: 5e-4
      soft_start: 0.1
      annealing: 0.7
    }
  }
  regularizer {
    type: L1
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 1e-08
      beta1: 0.9
      beta2: 0.999
    }
  }
  cost_scaling {
    enabled: False
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
}

augmentation_config {
  preprocessing {
    output_image_width: 768
    output_image_height: 768
    output_image_channel: 3
    min_bbox_width: 5.0
    min_bbox_height: 5.0
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.5
    zoom_min: 0.9
    zoom_max: 1.0
    translate_max_x: 100.0
    translate_max_y: 100.0
    rotate_rad_max: 0.69
  }
  color_augmentation {
    color_shift_stddev: 0.0
    hue_rotation_max: 25.0
    saturation_shift_max: 0.2
    contrast_scale_max: 0.1
    contrast_center: 0.5
  }
}

evaluation_config {
  average_precision_mode: SAMPLE
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "weed"
    value: 0.3
  }
  minimum_detection_ground_truth_overlap {
    key: "carrot"
    value: 0.3
  }
  evaluation_box_config {
    key: "weed"
    value {
      minimum_height: 4
      maximum_height: 9999
      minimum_width: 4
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "carrot"
    value {
      minimum_height: 4
      maximum_height: 9999
      minimum_width: 4
      maximum_width: 9999
    }
  }
}

dataset_config {
  data_sources: {
    tfrecords_path: "/data/tfrecords/train-*"
    image_directory_path: "/data"
  }
  image_extension: "jpg"
  target_class_mapping {
      key: "weed"
      value: "weed"
  }
  target_class_mapping {
      key: "carrot"
      value: "carrot"
  }
  validation_data_source: {
    tfrecords_path: "/data/tfrecords/validation-*"
    image_directory_path: "/data"
  }
}

Training spec file 2:

# Model config
model_config {
  arch: "resnet"
  pretrained_model_file: "/models/detectnet_v2/2-class/model.tlt"
  freeze_blocks: 0
  freeze_blocks: 1
  freeze_blocks: 2
  all_projections: True
  num_layers: 18
  use_pooling: False
  use_batch_norm: True
  dropout_rate: 0
  training_precision: {
    backend_floatx: FLOAT32
  }
  objective_set: {
    cov {}
    bbox {
      scale: 35.0
      offset: 0.5
    }
  }
}

# Bbox rasterizer
bbox_rasterizer_config {
  target_class_config {
    key: "weed"
    value: {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.4
      cov_radius_y: 0.4
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "carrot"
    value: {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.4
      cov_radius_y: 0.4
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.67
}

postprocessing_config {
  target_class_config {
    key: "weed"
    value: {
      clustering_config {
        coverage_threshold: 0.005
        dbscan_eps: 0.20    
        dbscan_min_samples: 0.05
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "carrot"
    value: {
      clustering_config {
        coverage_threshold: 0.005
        dbscan_eps: 0.20
        dbscan_min_samples: 0.05
        minimum_bounding_box_height: 20
      }
    }
  }
}

cost_function_config {
  target_classes {
    name: "weed"
    class_weight: 1.0
    coverage_foreground_weight: 0.05
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 1.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "carrot"
    class_weight: 1.0
    coverage_foreground_weight: 0.05
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 1.0
      weight_target: 1.0
    }
  }
  enable_autoweighting: True
  max_objective_weight: 0.9999
  min_objective_weight: 0.0001
}

training_config {
  batch_size_per_gpu: 96 
  num_epochs: 10000
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-9
      max_learning_rate: 5e-4
      soft_start: 0.1
      annealing: 0.7
    }
  }
  regularizer {
    type: L1
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 1e-08
      beta1: 0.9
      beta2: 0.999
    }
  }
  cost_scaling {
    enabled: False
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
}

augmentation_config {
  preprocessing {
    output_image_width: 768
    output_image_height: 768
    output_image_channel: 3
    min_bbox_width: 5.0
    min_bbox_height: 5.0
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.5
    zoom_min: 0.9
    zoom_max: 1.0
    translate_max_x: 100.0
    translate_max_y: 100.0
    rotate_rad_max: 0.69
  }
  color_augmentation {
    color_shift_stddev: 0.0
    hue_rotation_max: 25.0
    saturation_shift_max: 0.2
    contrast_scale_max: 0.1
    contrast_center: 0.5
  }
}

evaluation_config {
  average_precision_mode: SAMPLE
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "weed"
    value: 0.3
  }
  minimum_detection_ground_truth_overlap {
    key: "carrot"
    value: 0.3
  }
  evaluation_box_config {
    key: "weed"
    value {
      minimum_height: 4
      maximum_height: 9999
      minimum_width: 4
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "carrot"
    value {
      minimum_height: 4
      maximum_height: 9999
      minimum_width: 4
      maximum_width: 9999
    }
  }
}

dataset_config {
  data_sources: {
    tfrecords_path: "/data/tfrecords/train-*"
    image_directory_path: "/data"
  }
  image_extension: "jpg"
  target_class_mapping {
      key: "weed"
      value: "weed"
  }
  target_class_mapping {
      key: "carrot"
      value: "carrot"
  }
  validation_data_source: {
    tfrecords_path: "/data/tfrecords/validation-*"
    image_directory_path: "/data"
  }
}
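
One way to double-check that the two spec files really differ only in the pretrained model path and the frozen blocks is to diff them programmatically. The sketch below is a minimal example using Python's standard difflib module; the function name spec_diff and the inline excerpts are illustrative and not part of TLT. In practice the two strings would be read from the actual spec files.

```python
import difflib
from typing import List


def spec_diff(spec_a: str, spec_b: str) -> List[str]:
    """Return lines present in only one spec, ignoring indentation and blank lines."""
    a = [line.strip() for line in spec_a.splitlines() if line.strip()]
    b = [line.strip() for line in spec_b.splitlines() if line.strip()]
    # ndiff prefixes lines unique to A with "- " and lines unique to B with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are dropped.
    return [d for d in difflib.ndiff(a, b) if d.startswith(("- ", "+ "))]


# Shortened excerpts of the model_config blocks from the two spec files above.
SPEC_1 = '''
model_config {
  arch: "resnet"
  pretrained_model_file: ""
  all_projections: True
}
'''
SPEC_2 = '''
model_config {
  arch: "resnet"
  pretrained_model_file: "/models/detectnet_v2/2-class/model.tlt"
  freeze_blocks: 0
  freeze_blocks: 1
  freeze_blocks: 2
  all_projections: True
}
'''

for line in spec_diff(SPEC_1, SPEC_2):
    print(line)
```

If anything other than the pretrained_model_file and freeze_blocks lines shows up in the output for the full files, the two experiments differ in more than the initialization.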

Morganh · October 12, 2021, 4:03pm

For detectnet_v2, our findings are as follows.

Pretrained models achieve the same accuracy with less data.
Training cost is therefore lower when using pretrained models.

Dustin.Webb · October 12, 2021, 4:15pm

Your response doesn’t appear to address the problem. Why would two models, one started with a pretrained model and the other not, perform identically?

Morganh · October 12, 2021, 4:21pm

Per our internal experiments, the results are not identical. We ran experiments with PeopleNet as the pretrained model and trained on a public IR dataset. In the early phase of training, the mAP with a pretrained model is higher than the mAP without one.

Dustin.Webb · October 12, 2021, 4:28pm

I understand it should not be the case but it is what I am experiencing given the configuration outlined in my original post. The question is, why would it happen?

Morganh · October 12, 2021, 4:44pm

Can you remove the lines above and retry?

BTW, was “/models/detectnet_v2/2-class/model.tlt” trained on your own training images?

Dustin.Webb · October 12, 2021, 4:46pm

Removing those lines results in the same behavior.

Yes, /models/detectnet_v2/2-class/model.tlt is trained with our own images.

Morganh · October 12, 2021, 4:48pm

Did you plot the mAP curves for your two experiments? If so, please share them with us.

Dustin.Webb · October 12, 2021, 5:41pm

No. The mAP in both cases is nearly identical at each epoch.
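
To make "nearly identical" concrete without plotting, the per-epoch mAP values of the two runs can be compared directly. The sketch below assumes the mAP values have already been extracted from each training log into a dict keyed by epoch; the function name largest_map_gap is hypothetical, and the extraction step is left out because the log format is not shown in this thread.

```python
from typing import Dict, Tuple


def largest_map_gap(run_a: Dict[int, float], run_b: Dict[int, float]) -> Tuple[int, float]:
    """Return (epoch, gap) for the largest absolute mAP difference over shared epochs."""
    shared = sorted(set(run_a) & set(run_b))
    if not shared:
        raise ValueError("the two runs share no validation epochs")
    # Pick the epoch where the two runs disagree the most.
    epoch = max(shared, key=lambda e: abs(run_a[e] - run_b[e]))
    return epoch, abs(run_a[epoch] - run_b[epoch])
```

With validation_period_during_training: 10 and first_validation_epoch: 30, the keys would presumably be epochs 30, 40, 50, and so on. A gap that stays close to zero from the first validation onward would support the observation that the pretrained weights have no visible effect.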

Morganh · October 13, 2021, 12:51am

Can you share the logs?

Morganh · October 13, 2021, 12:53am

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Also, how many images are in your training and validation datasets?

system · October 27, 2021, 12:54am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
