Evaluate trained models in TAO Toolkit

pallavi1.halarnkar · June 23, 2022, 7:16am

Please provide the following information when requesting support.

• Hardware: Triton Inference Server
• Network Type: Yolo_v4 for people detection
• Training spec file:

random_seed: 42
yolov4_config {
 # big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
 # mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
 # small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  
  big_anchor_shape: "[(56.00, 136.00), (95.00, 167.00), (170.00, 212.00)]"
  mid_anchor_shape: "[(51.00, 64.00), (36.00, 109.00), (53.00, 98.00)]"
  small_anchor_shape: "[(21.00, 39.00), (35.00, 43.00), (30.00, 66.00)]"
  
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 1.0
  loss_neg_obj_weights: 1.0
  loss_class_weights: 1.0
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 2
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "/workspace/tao-experiments/yolo_v4/pretrained_resnet18/pretrained_object_detection_vresnet18/resnet_18.hdf5"
  #resume_model_path: "/workspace/tao-experiments/yolo_v4/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_054.tlt"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 8
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  force_on_cpu: true
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure:1.5
  vertical_flip:0
  horizontal_flip: 0.5
  jitter: 0.3
  #output_width: 1248
  #output_height: 384
  output_width: 960
  output_height: 544
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio:0.2
}
dataset_config {
  data_sources: {
      tfrecords_path: "/workspace/tao-experiments/data/training/tfrecords/train*"
      image_directory_path: "/workspace/tao-experiments/data/training"
  }
  include_difficult_in_training: true
  image_extension: "jpg"
 #target_class_mapping {
   #   key: "car"
    #  value: "car"
  #}
  target_class_mapping {
      key: "person"
      value: "pedestrian"
  }
 # target_class_mapping {
  #    key: "cyclist"
   #   value: "cyclist"
  #}
  #target_class_mapping {
   #   key: "van"
    #  value: "car"
  #}
  #target_class_mapping {
   #   key: "person_sitting"
    #  value: "pedestrian"
  #}

  validation_data_sources: {
      tfrecords_path: "/workspace/tao-experiments/data/val/tfrecords/val*"
      image_directory_path: "/workspace/tao-experiments/data/val"
  }
}

• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)

I am using the TAO Toolkit Jupyter notebook for YOLOv4 to train a custom object detector on our dataset.
The training completed successfully. However, when I try to execute the "Evaluate trained models" section:

!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

Sometimes it does not show any output.

Following is the screenshot of the same.

Can you tell me why it is not evaluating the trained model? Note: sometimes it does show the output predictions, and sometimes the container just stops.

Morganh · June 23, 2022, 7:41am

Could you back up the .ipynb file and then open a terminal to update the wheel below?
$ pip3 install nvidia-tao==0.1.24

Then start the notebook again and retry.
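Before retrying, it may help to confirm which launcher version is actually installed. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`) and that it runs in the same environment where `pip3 install` was executed. The helper names `installed_version` and `needs_update` are hypothetical and not part of TAO.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_update(installed, wanted):
    """True when the installed version is missing or differs from the wanted one."""
    return installed != wanted


WANTED = "0.1.24"  # the version suggested in this thread
current = installed_version("nvidia-tao")
if needs_update(current, WANTED):
    print(f"nvidia-tao is {current!r}; consider: pip3 install nvidia-tao=={WANTED}")
else:
    print(f"nvidia-tao {WANTED} is already installed")
```

If the notebook kernel uses a different Python environment than the terminal, the check should be run from a notebook cell as well.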

pallavi1.halarnkar · June 23, 2022, 8:50am

I tried the above method and the issue has been resolved. Thanks a lot.

pallavi1.halarnkar · June 23, 2022, 8:53am

I wanted to know whether I can evaluate the pretrained model resnet18_peoplenet.tlt (downloaded from the NVIDIA site) in the TAO Toolkit.

I tried this, but it gives me an error.

I want to compare this pretrained model with my newly custom-trained YOLOv4 model.

Morganh · June 23, 2022, 9:06am

Yes. Please set the correct key; you can find it on the website.

pallavi1.halarnkar · June 23, 2022, 9:07am

I have set the key generated in my NVIDIA NGC account. Should I be using a different key?

Morganh · June 23, 2022, 9:13am

Refer to PeopleNet | NVIDIA NGC

Model load key: tlt_encode

Morganh · June 23, 2022, 9:21am

But please note that PeopleNet was trained with the detectnet_v2 network. It can only be evaluated with detectnet_v2, not with yolo_v4.
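Putting the two points above together (the `tlt_encode` load key and the detectnet_v2 network), the evaluation call could be assembled roughly as follows. This is only a sketch: it assumes that `tao detectnet_v2 evaluate` accepts the same `-e`, `-m` and `-k` flags as the `yolo_v4` command shown earlier in this thread, and both paths are hypothetical placeholders.

```python
import shlex


def build_evaluate_command(network, spec_path, model_path, key):
    """Assemble a `tao <network> evaluate` call as an argument list."""
    return ["tao", network, "evaluate", "-e", spec_path, "-m", model_path, "-k", key]


cmd = build_evaluate_command(
    network="detectnet_v2",  # PeopleNet must be evaluated with detectnet_v2
    # Hypothetical spec location inside the container.
    spec_path="/workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt",
    # Hypothetical location of the downloaded PeopleNet model.
    model_path="/workspace/tao-experiments/detectnet_v2/pretrained_peoplenet/resnet18_peoplenet.tlt",
    key="tlt_encode",  # model load key quoted from the PeopleNet page
)

# Print the command so it can be pasted into a notebook cell (prefixed with "!").
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later without worrying about shell quoting.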

pallavi1.halarnkar · June 23, 2022, 9:56am

I used the detectnet_v2 notebook to evaluate the resnet18_peoplenet model. It gives me the following error:

/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
2022-06-23 09:53:15,773 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
2022-06-23 09:53:15,773 [INFO] root: Building dataloader.
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/evaluate.py", line 204, in <module>
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/evaluate.py", line 194, in <module>
  File "<decorator-gen-2>", line 2, in main
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/evaluate.py", line 177, in main
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/evaluation/build_evaluator.py", line 114, in build_evaluator_for_trained_gridbox
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/build_dataloader.py", line 233, in build_dataloader
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/build_dataloader.py", line 100, in build_data_source_lists
  File "/root/.cache/bazel/_bazel_root/b770f990bb7b9e2db5771981fb3a38b4/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/build_dataloader.py", line 65, in _pattern_to_files
AssertionError: No files match pattern /workspace/tao-experiments/data/tfrecords/kitti_trainval/*.
2022-06-23 15:23:18,274 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Morganh · June 23, 2022, 9:57am

Make sure ~/.tao_mounts.json is set correctly so that your local files are mapped into the Docker container.
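One way to check the mapping is to translate the container path from the error message back to a host path and see whether anything matches there. The sketch below assumes that `~/.tao_mounts.json` contains a top-level `"Mounts"` list whose entries have `"source"` (host) and `"destination"` (container) keys. The helper names and the example host path are hypothetical.

```python
import glob
import json
import os


def container_to_host(container_path, mounts):
    """Translate a container path to a host path via the longest matching mount."""
    best = None
    for mount in mounts:
        dest = mount["destination"].rstrip("/")
        if container_path == dest or container_path.startswith(dest + "/"):
            if best is None or len(dest) > len(best["destination"].rstrip("/")):
                best = mount
    if best is None:
        return None  # the path is not covered by any mount
    dest = best["destination"].rstrip("/")
    return best["source"].rstrip("/") + container_path[len(dest):]


def host_matches(pattern, mounts_file="~/.tao_mounts.json"):
    """Return the host files that a container-side glob pattern would resolve to."""
    with open(os.path.expanduser(mounts_file)) as fh:
        mounts = json.load(fh)["Mounts"]
    host_pattern = container_to_host(pattern, mounts)
    return [] if host_pattern is None else glob.glob(host_pattern)


# In-memory example with a hypothetical host directory.
EXAMPLE_MOUNTS = [
    {"source": "/home/user/tao-experiments", "destination": "/workspace/tao-experiments"},
]
pattern = "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
print(container_to_host(pattern, EXAMPLE_MOUNTS))
```

Calling `host_matches(pattern)` on the real machine returns an empty list when the pattern from the `AssertionError` has no counterpart on the host, which would point to either a wrong mount or a wrong `tfrecords_path` in the spec.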

pallavi1.halarnkar · June 23, 2022, 10:37am

Hi,
The issues have been resolved. The code now runs, but it gives me the following output:

Validation cost: 0.000236
Mean average_precision (in %): 0.0000

class name      average precision (in %)
------------  --------------------------
car                                    0
cyclist                                0
pedestrian                             0

pallavi1.halarnkar · June 23, 2022, 11:01am

The image size I used is:

output_width: 960
output_height: 544

Morganh · June 24, 2022, 6:36am

According to the model card, PeopleNet has 3 classes: person, bag and face.
So it cannot be evaluated on “car”, “cyclist” and “pedestrian”.

pallavi1.halarnkar · June 24, 2022, 7:58am

I trained the YOLOv4 model only on the person class, so I want to evaluate the PeopleNet model on the person class only. Please guide me on how to evaluate the pretrained PeopleNet model on my custom dataset.

Morganh · June 26, 2022, 3:36pm

Refer to People Net - - #5 by Morganh

pallavi1.halarnkar · June 27, 2022, 9:44am

Hi,

I trained my YOLOv4 model for only one class, Person, so my KITTI dataset has a single class called Person.
Now I want to evaluate the PeopleNet model with the ResNet18 architecture on this same dataset.
Is it possible to evaluate the pretrained model on my custom dataset, which has only the Person class, or do I need to modify my custom dataset?

Morganh · June 27, 2022, 9:46am

Yes, it is possible.

pallavi1.halarnkar · June 27, 2022, 9:51am

Hi,

I used the same dataset in the detectnet_v2 Jupyter notebook with the key changed to tlt_encode. It still calculates the mAP as 0, but when I run inference on images it shows detections for two classes, bag and person, with good mAP.

Can you help me find where I am going wrong and why the mAP is calculated as 0?

Morganh · June 27, 2022, 9:54am

Please share the spec file.

pallavi1.halarnkar · June 27, 2022, 9:58am

Hi, please find below the spec file used to evaluate the pretrained PeopleNet model:

detectnet_v2_train_resnet18_kitti.txt

random_seed: 42
dataset_config {
  data_sources {
    #tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    tfrecords_path: "/workspace/tao-experiments/data/training/tfrecords/train*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "person"
    value: "person"
  }
  target_class_mapping {
    #key: "person_sitting"
    key: "person"
    value: "person"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    #output_image_width: 1248
    #output_image_height: 384
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "car"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "person"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "car"
    value: 0.699999988079
  }
  minimum_detection_ground_truth_overlap {
    key: "cyclist"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "person"
    value: 0.5
  }
  evaluation_box_config {
    key: "car"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "cyclist"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "person"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "car"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "cyclist"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "person"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "car"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "person"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}
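Since PeopleNet only knows person, bag and face, it may be worth auditing the class mappings in a spec like the one above before evaluating. The sketch below is not a TAO API: it uses a simple regular expression on the spec text (an assumption that holds for flat `target_class_mapping` blocks like these), and the helper names are hypothetical. It reports keys that are mapped more than once and target classes the model was not trained on. Whether either finding explains the mAP of 0 is not established in this thread.

```python
import re
from collections import Counter

MAPPING_RE = re.compile(
    r'target_class_mapping\s*\{\s*key:\s*"([^"]+)"\s*value:\s*"([^"]+)"\s*\}'
)


def strip_comments(spec_text):
    """Drop everything after '#' on each line, as in the spec's comment style."""
    return "\n".join(line.split("#", 1)[0] for line in spec_text.splitlines())


def class_mappings(spec_text):
    """Return (key, value) pairs for every active target_class_mapping block."""
    return MAPPING_RE.findall(strip_comments(spec_text))


def audit(spec_text, model_classes):
    """Report duplicated mapping keys and target classes unknown to the model."""
    pairs = class_mappings(spec_text)
    key_counts = Counter(key for key, _ in pairs)
    targets = {value for _, value in pairs}
    return {
        "duplicate_keys": sorted(k for k, n in key_counts.items() if n > 1),
        "targets_not_in_model": sorted(targets - set(model_classes)),
    }


# Excerpt of the dataset_config posted above.
SPEC_EXCERPT = '''
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "person"
    value: "person"
  }
  target_class_mapping {
    #key: "person_sitting"
    key: "person"
    value: "person"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
'''

PEOPLENET_CLASSES = {"person", "bag", "face"}  # as quoted from the model card
print(audit(SPEC_EXCERPT, PEOPLENET_CLASSES))
```

For the posted spec this flags the repeated "person" key and the "car" and "cyclist" targets, which also appear in the postprocessing, evaluation, cost function and rasterizer sections.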
