Hello,
I have trained a Faster R-CNN ResNet-18 model on COCO with 80 classes (I converted the dataset to KITTI format). I have a custom validation set that contains only the person class. When I run evaluation, the model runs through all the validation images but throws an error when calculating the mAP. (Note: it works perfectly well on the original COCO validation set, since all 80 classes are present.)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512/512 [04:45<00:00, 1.79it/s]
==========================================================================================
Class AP precision recall RPN_recall
------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/evaluate.py", line 167, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/evaluate.py", line 155, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/evaluate.py", line 149, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/utils/utils.py", line 503, in compute_map_list
KeyError: 'airplane'
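My guess at what's happening: the mAP routine seems to iterate over every class in the spec's target_class_mapping and look each one up in a per-class dictionary built from the validation ground truth, so a class with no ground-truth boxes ('airplane') triggers the KeyError. A minimal sketch of that failure mode (the function name and dict layout are my assumptions, not the actual TAO internals):

```python
# Sketch of the suspected failure mode in per-class mAP computation.
# The real TAO code is not public; names here are illustrative only.

def compute_map_list(ap_per_class, configured_classes):
    """Average AP over all configured classes.

    Raises KeyError when a configured class has no entry, e.g. because
    the validation set contains no ground-truth boxes for that class.
    """
    return sum(ap_per_class[c] for c in configured_classes) / len(configured_classes)

# Per-class AP built from a person-only validation set:
ap_per_class = {'person': 0.72}

# Works when the configured classes match the ground truth:
print(compute_map_list(ap_per_class, ['person']))  # 0.72

# Fails the same way my run does when extra COCO classes are configured:
try:
    compute_map_list(ap_per_class, ['person', 'airplane', 'bicycle'])
except KeyError as e:
    print('KeyError:', e)  # first missing class in the loop order
```

If that is indeed how the evaluation works, the spec's class list and the validation set's classes would have to agree.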
Here's my config file:
# Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved.
random_seed: 42
enc_key: 'tlt'
verbose: True
model_config {
input_image_config {
image_type: RGB
image_channel_order: 'bgr'
size_height_width {
height: 384
width: 1248
}
image_channel_mean {
key: 'b'
value: 103.939
}
image_channel_mean {
key: 'g'
value: 116.779
}
image_channel_mean {
key: 'r'
value: 123.68
}
image_scaling_factor: 1.0
max_objects_num_per_image: 100
}
arch: "resnet:18"
anchor_box_config {
scale: 64.0
scale: 128.0
scale: 256.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}
freeze_bn: True
freeze_blocks: 0
freeze_blocks: 1
roi_mini_batch: 256
rpn_stride: 16
use_bias: False
roi_pooling_config {
pool_size: 7
pool_size_2x: False
}
all_projections: True
use_pooling: False
}
dataset_config {
data_sources: {
tfrecords_path: "/workspace/tao-experiments/data/tfrecords/train/*"
image_directory_path: "/workspace/tao-experiments/data/training"
}
validation_data_source {
tfrecords_path: "/workspace/tao-experiments/data/tfrecords/val/*"
image_directory_path: "/workspace/tao-experiments/data/testing"
}
image_extension: 'jpg'
target_class_mapping{
key:'person'
value:'person'}
target_class_mapping{
key:'bicycle'
value:'bicycle'}
target_class_mapping{
key:'car'
value:'car'}
target_class_mapping{
key:'motorcycle'
value:'motorcycle'}
target_class_mapping{
key:'airplane'
value:'airplane'}
target_class_mapping{
key:'bus'
value:'bus'}
target_class_mapping{
key:'train'
value:'train'}
target_class_mapping{
key:'truck'
value:'truck'}
target_class_mapping{
key:'boat'
value:'boat'}
target_class_mapping{
key:'traffic light'
value:'traffic light'}
target_class_mapping{
key:'fire hydrant'
value:'fire hydrant'}
target_class_mapping{
key:'stop sign'
value:'stop sign'}
target_class_mapping{
key:'parking meter'
value:'parking meter'}
target_class_mapping{
key:'bench'
value:'bench'}
target_class_mapping{
key:'bird'
value:'bird'}
target_class_mapping{
key:'cat'
value:'cat'}
target_class_mapping{
key:'dog'
value:'dog'}
target_class_mapping{
key:'horse'
value:'horse'}
target_class_mapping{
key:'sheep'
value:'sheep'}
target_class_mapping{
key:'cow'
value:'cow'}
target_class_mapping{
key:'elephant'
value:'elephant'}
target_class_mapping{
key:'bear'
value:'bear'}
target_class_mapping{
key:'zebra'
value:'zebra'}
target_class_mapping{
key:'giraffe'
value:'giraffe'}
target_class_mapping{
key:'backpack'
value:'backpack'}
target_class_mapping{
key:'umbrella'
value:'umbrella'}
target_class_mapping{
key:'handbag'
value:'handbag'}
target_class_mapping{
key:'tie'
value:'tie'}
target_class_mapping{
key:'suitcase'
value:'suitcase'}
target_class_mapping{
key:'frisbee'
value:'frisbee'}
target_class_mapping{
key:'skis'
value:'skis'}
target_class_mapping{
key:'snowboard'
value:'snowboard'}
target_class_mapping{
key:'sports ball'
value:'sports ball'}
target_class_mapping{
key:'kite'
value:'kite'}
target_class_mapping{
key:'baseball bat'
value:'baseball bat'}
target_class_mapping{
key:'baseball glove'
value:'baseball glove'}
target_class_mapping{
key:'skateboard'
value:'skateboard'}
target_class_mapping{
key:'surfboard'
value:'surfboard'}
target_class_mapping{
key:'tennis racket'
value:'tennis racket'}
target_class_mapping{
key:'bottle'
value:'bottle'}
target_class_mapping{
key:'wine glass'
value:'wine glass'}
target_class_mapping{
key:'cup'
value:'cup'}
target_class_mapping{
key:'fork'
value:'fork'}
target_class_mapping{
key:'knife'
value:'knife'}
target_class_mapping{
key:'spoon'
value:'spoon'}
target_class_mapping{
key:'bowl'
value:'bowl'}
target_class_mapping{
key:'banana'
value:'banana'}
target_class_mapping{
key:'apple'
value:'apple'}
target_class_mapping{
key:'sandwich'
value:'sandwich'}
target_class_mapping{
key:'orange'
value:'orange'}
target_class_mapping{
key:'broccoli'
value:'broccoli'}
target_class_mapping{
key:'carrot'
value:'carrot'}
target_class_mapping{
key:'hot dog'
value:'hot dog'}
target_class_mapping{
key:'pizza'
value:'pizza'}
target_class_mapping{
key:'donut'
value:'donut'}
target_class_mapping{
key:'cake'
value:'cake'}
target_class_mapping{
key:'chair'
value:'chair'}
target_class_mapping{
key:'couch'
value:'couch'}
target_class_mapping{
key:'potted plant'
value:'potted plant'}
target_class_mapping{
key:'bed'
value:'bed'}
target_class_mapping{
key:'dining table'
value:'dining table'}
target_class_mapping{
key:'toilet'
value:'toilet'}
target_class_mapping{
key:'tv'
value:'tv'}
target_class_mapping{
key:'laptop'
value:'laptop'}
target_class_mapping{
key:'mouse'
value:'mouse'}
target_class_mapping{
key:'remote'
value:'remote'}
target_class_mapping{
key:'keyboard'
value:'keyboard'}
target_class_mapping{
key:'cell phone'
value:'cell phone'}
target_class_mapping{
key:'microwave'
value:'microwave'}
target_class_mapping{
key:'oven'
value:'oven'}
target_class_mapping{
key:'toaster'
value:'toaster'}
target_class_mapping{
key:'sink'
value:'sink'}
target_class_mapping{
key:'refrigerator'
value:'refrigerator'}
target_class_mapping{
key:'book'
value:'book'}
target_class_mapping{
key:'clock'
value:'clock'}
target_class_mapping{
key:'vase'
value:'vase'}
target_class_mapping{
key:'scissors'
value:'scissors'}
target_class_mapping{
key:'teddy bear'
value:'teddy bear'}
target_class_mapping{
key:'hair drier'
value:'hair drier'}
target_class_mapping{
key:'toothbrush'
value:'toothbrush'}
}
augmentation_config {
preprocessing {
output_image_width: 1248
output_image_height: 384
output_image_channel: 3
min_bbox_width: 1.0
min_bbox_height: 1.0
enable_auto_resize: True
}
spatial_augmentation {
hflip_probability: 0.5
vflip_probability: 0.0
zoom_min: 1.0
zoom_max: 1.0
translate_max_x: 0
translate_max_y: 0
}
color_augmentation {
hue_rotation_max: 0.0
saturation_shift_max: 0.0
contrast_scale_max: 0.0
contrast_center: 0.5
}
}
training_config {
enable_augmentation: True
enable_qat: False
batch_size_per_gpu: 16
num_epochs: 3
pretrained_weights: "/workspace/tao-experiments/faster_rcnn/resnet_18.hdf5"
#resume_from_model: "/workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18.epoch2.tlt"
output_model: "/workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18.tlt"
rpn_min_overlap: 0.3
rpn_max_overlap: 0.7
classifier_min_overlap: 0.0
classifier_max_overlap: 0.5
gt_as_roi: False
std_scaling: 1.0
classifier_regr_std {
key: 'x'
value: 10.0
}
classifier_regr_std {
key: 'y'
value: 10.0
}
classifier_regr_std {
key: 'w'
value: 5.0
}
classifier_regr_std {
key: 'h'
value: 5.0
}
rpn_mini_batch: 256
rpn_pre_nms_top_N: 12000
rpn_nms_max_boxes: 2000
rpn_nms_overlap_threshold: 0.7
regularizer {
type: L2
weight: 1e-4
}
optimizer {
sgd {
lr: 0.02
momentum: 0.9
decay: 0.0
nesterov: False
}
}
learning_rate {
soft_start {
base_lr: 0.02
start_lr: 0.002
soft_start: 0.1
annealing_points: 0.8
annealing_points: 0.9
annealing_divider: 10.0
}
}
lambda_rpn_regr: 1.0
lambda_rpn_class: 1.0
lambda_cls_regr: 1.0
lambda_cls_class: 1.0
}
inference_config {
images_dir: '/workspace/tao-experiments/data/testing/image_2'
model: '/workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18.epoch3.tlt'
batch_size: 1
detection_image_output_dir: '/workspace/tao-experiments/faster_rcnn/inference_results_imgs_resnet18'
labels_dump_dir: '/workspace/tao-experiments/faster_rcnn/inference_dump_labels_resnet18'
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
object_confidence_thres: 0.0001
bbox_visualize_threshold: 0.6
classifier_nms_max_boxes: 100
classifier_nms_overlap_threshold: 0.3
}
evaluation_config {
model: '/workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18.epoch3.tlt'
batch_size: 16
validation_period_during_training: 1
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
classifier_nms_max_boxes: 100
classifier_nms_overlap_threshold: 0.3
object_confidence_thres: 0.0001
use_voc07_11point_metric: False
gt_matching_iou_threshold: 0.5
}
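If the root cause is the mismatch between the 80 configured classes and the person-only validation labels, I wonder whether an evaluation spec whose dataset_config keeps only the person mapping would avoid the failing lookup. Something like this (an untested guess on my part, not a documented fix):

```
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/train/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  validation_data_source {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/val/*"
    image_directory_path: "/workspace/tao-experiments/data/testing"
  }
  image_extension: 'jpg'
  target_class_mapping {
    key: 'person'
    value: 'person'
  }
}
```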
When converting the validation set to TFRecords, the stdout is as follows:
2021-09-27 14:35:16,681 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
...
...
....
2021-09-27 14:35:19,872 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
b'person': 11498
2021-09-27 14:35:19,872 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
...
...
...
2021-09-27 14:35:23,409 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
b'person': 11328
2021-09-27 14:35:23,410 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2021-09-27 14:35:23,410 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
b'person': 22826
2021-09-27 14:35:23,410 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map.
Label in GT: Label in tfrecords file
b'person': b'person'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
Is this a problem, since it shows only one class rather than the original 80 classes that appear when creating TFRecords for the training set?
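To double-check which classes actually appear in my KITTI-format validation labels, I counted them with a quick script (the label directory path is illustrative; the simple whitespace split assumes single-token class names, which holds for 'person' but not for COCO classes like 'traffic light'):

```python
import os
from collections import Counter

def count_kitti_classes(label_dir):
    """Count object classes across KITTI-format label files.

    Each line of a KITTI label file starts with the class name,
    followed by truncation, occlusion, bbox coordinates, etc.
    """
    counts = Counter()
    for name in os.listdir(label_dir):
        if not name.endswith('.txt'):
            continue
        with open(os.path.join(label_dir, name)) as f:
            for line in f:
                fields = line.split()
                if fields:
                    counts[fields[0]] += 1
    return counts

# e.g. count_kitti_classes('/workspace/tao-experiments/data/testing/label_2')
# should report only 'person' if it matches the converter log above
```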
The system specifications are as follows:
- Hardware (Quadro RTX 6000)
- Network Type (Faster_rcnn)
- TAO Version (docker_tag = v3.21.08-py3 )