Train PeopleNet ResNet34 with my own data in TAO Toolkit

I’m trying to train the PeopleNet model with ResNet34 on my own dataset, following the TAO Toolkit notebook. I changed the model configuration from ResNet18 to ResNet34. Here’s a detailed overview of what I’ve done:

  1. Dataset Conversion:

    • Converted the Pascal VOC dataset to KITTI format, adding default values for any missing fields.
    • Converted the KITTI dataset to TFRecords format.
  2. Model Configuration:

    • Adjusted the configuration to fit ResNet34 for the “person” class, using the “peoplenet_vtrainable_v2.6” model as the pretrained base.
    • Configuration details:
    random_seed: 42
    dataset_config {
      data_sources {
        tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
        image_directory_path: "/workspace/tao-experiments/data/training"
      }
      image_extension: "jpg"
      target_class_mapping {
        key: "person"
        value: "person"
      }
      validation_fold: 0
    }
    augmentation_config {
      preprocessing {
        output_image_width: 1248
        output_image_height: 384
        min_bbox_width: 1.0
        min_bbox_height: 1.0
        output_image_channel: 3
      }
      spatial_augmentation {
        hflip_probability: 0.5
        zoom_min: 1.0
        zoom_max: 1.0
        translate_max_x: 8.0
        translate_max_y: 8.0
      }
      color_augmentation {
        hue_rotation_max: 25.0
        saturation_shift_max: 0.20000000298
        contrast_scale_max: 0.10000000149
        contrast_center: 0.5
      }
    }
    postprocessing_config {
      target_class_config {
        key: "person"
        value {
          clustering_config {
            clustering_algorithm: DBSCAN
            dbscan_confidence_threshold: 0.9
            coverage_threshold: 0.00749999983236
            dbscan_eps: 0.230000004172
            dbscan_min_samples: 1
            minimum_bounding_box_height: 20
          }
        }
      }
    }
    model_config {
      num_layers: 34
      pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet34/peoplenet_vtrainable_v2.6/model.hdf5"
      use_batch_norm: true
      activation {
        activation_type: "relu"
      }
      objective_set {
        bbox {
          scale: 35.0
          offset: 0.5
        }
        cov {
        }
      }
      training_precision {
        backend_floatx: FLOAT32
      }
      arch: "resnet"
      all_projections: true
    }
    evaluation_config {
      validation_period_during_training: 10
      first_validation_epoch: 30
      minimum_detection_ground_truth_overlap {
        key: "person"
        value: 0.5
      }
      evaluation_box_config {
        key: "person"
        value {
          minimum_height: 20
          maximum_height: 9999
          minimum_width: 10
          maximum_width: 9999
        }
      }
      average_precision_mode: INTEGRATE
    }
    cost_function_config {
      target_classes {
        name: "person"
        class_weight: 4.0
        coverage_foreground_weight: 0.0500000007451
        objectives {
          name: "cov"
          initial_weight: 1.0
          weight_target: 1.0
        }
        objectives {
          name: "bbox"
          initial_weight: 10.0
          weight_target: 10.0
        }
      }
      enable_autoweighting: false
      max_objective_weight: 0.999899983406
      min_objective_weight: 9.99999974738e-05
    }
    training_config {
      batch_size_per_gpu: 4
      num_epochs: 120
      learning_rate {
        soft_start_annealing_schedule {
          min_learning_rate: 0.01
          max_learning_rate: 0.01
          soft_start: 0.10000000149
          annealing: 0.699999988079
        }
      }
      regularizer {
        type: L1
        weight: 3.00000002618e-09
      }
      optimizer {
        adam {
          epsilon: 9.99999993923e-09
          beta1: 0.899999976158
          beta2: 0.999000012875
        }
      }
      cost_scaling {
        initial_exponent: 20.0
        increment: 0.005
        decrement: 1.0
      }
      visualizer {
        enabled: true
        num_images: 3
        scalar_logging_frequency: 50
        infrequent_logging_frequency: 5
        target_class_config {
          key: "person"
          value: {
            coverage_threshold: 0.005
          }
        }
        clearml_config {
          project: "TAO Toolkit ClearML Demo"
          task: "detectnet_v2_resnet18_clearml"
          tags: "detectnet_v2"
          tags: "training"
          tags: "resnet18"
          tags: "unpruned"
        }
        wandb_config {
          project: "TAO Toolkit Wandb Demo"
          name: "detectnet_v2_resnet18_wandb"
          tags: "detectnet_v2"
          tags: "training"
          tags: "resnet18"
          tags: "unpruned"
        }
      }
      checkpoint_interval: 10
    }
    bbox_rasterizer_config {
      target_class_config {
        key: "person"
        value {
          cov_center_x: 0.5
          cov_center_y: 0.5
          cov_radius_x: 1.0
          cov_radius_y: 1.0
          bbox_min_radius: 1.0
        }
      }
      deadzone_radius: 0.400000154972
    }
    
  3. Training Command:

    • Trained the model with the following command:
    detectnet_v2 train -e /workspace/tao-experiments/specs/detectnet_v2_train_resnet34_kitti.txt \
                       -r /workspace/tao-experiments/experiments/experiment_dir_unpruned \
                       -n resnet34_detector \
                       --gpus 1 \
                       -k key
    
  4. Issue Encountered:

    • The training process ran as expected, but when I evaluated the model on the test data, the mAP was consistently 0.0, and the inference results were poor.
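For reference, the conversion in step 1 can be sketched roughly as follows. This is a minimal illustration, not the exact script I used: the function name `voc_to_kitti_lines` is hypothetical, and filling every KITTI field that has no Pascal VOC equivalent (truncation, occlusion, alpha, 3D dimensions, location, rotation) with zero is an assumption about what "default values" means here.

```python
import xml.etree.ElementTree as ET


def voc_to_kitti_lines(xml_text):
    """Convert one Pascal VOC annotation (XML string) to KITTI label lines.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name").strip()
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # A KITTI label line has 15 columns:
        # type, truncated, occluded, alpha, bbox (left, top, right, bottom),
        # dimensions (3), location (3), rotation_y.
        # Columns with no VOC equivalent are set to zero (assumed defaults).
        fields = [
            name, "0.00", "0", "0.00",
            f"{xmin:.2f}", f"{ymin:.2f}", f"{xmax:.2f}", f"{ymax:.2f}",
            "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "0.00",
        ]
        lines.append(" ".join(fields))
    return lines
```

Each returned string would be written as one line of the per-image KITTI `.txt` label file.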

Question:
Do you have any suggestions for improving accuracy? Why do I always get an mAP of 0.0? Is there a mistake in my setup?
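One check that might help narrow this down is to summarize the converted KITTI labels and compare them with the spec. The sketch below is only an illustration: `summarize_kitti_labels` is a hypothetical helper, the `min_height` default of 20 simply mirrors `minimum_bounding_box_height` and `minimum_height` in my spec, and I have not confirmed that either a class-name mismatch or small boxes is the cause of the 0.0 mAP.

```python
from collections import Counter


def summarize_kitti_labels(label_texts, min_height=20.0):
    """Count class names and boxes shorter than min_height.

    label_texts: iterable of strings, each the full content of one KITTI
    label file. Hypothetical helper for illustration only.
    """
    class_counts = Counter()
    short_boxes = 0
    for text in label_texts:
        for line in text.splitlines():
            parts = line.split()
            if len(parts) < 8:
                continue  # skip blank or malformed lines
            class_counts[parts[0]] += 1
            # KITTI bbox columns: left=4, top=5, right=6, bottom=7 (0-indexed)
            top, bottom = float(parts[5]), float(parts[7])
            if bottom - top < min_height:
                short_boxes += 1
    return class_counts, short_boxes
```

If the class names counted here do not exactly match the `key` in `target_class_mapping`, or if most boxes fall under the height threshold, that would be worth ruling out before changing anything else.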

Hi @maha.alhobishi ,
I believe TAO Toolkit will be able to assist you better with this.

Thanks

I haven’t found anyone to help me with this so far.