Object detection: How to figure out class index

Hello everyone,

I have trained an SSD_RESNET18 model with TLT and am now running inference with TensorRT. Detection seems to work (the accuracy and the bounding box positions are right), but I don’t understand how the order of the label indices is defined. I saw in the “IVA Getting Started Guide” that a classmap.json is created when training a classification model, but there is no such file for a detection model.

For example with the following spec file:

ssd_config {
  aspect_ratios_global: "[1.0, 2.0, 0.5, 3.0, 1.0/3.0]"
  scales: "[0.05, 0.1, 0.25, 0.4, 0.55, 0.7, 0.85]"
  two_boxes_for_ar1: true
  clip_boxes: false
  loss_loc_weight: 0.8
  focal_loss_alpha: 0.25
  focal_loss_gamma: 2.0
  variances: "[0.1, 0.1, 0.2, 0.2]"
  arch: "resnet18"
  freeze_bn: false
  freeze_blocks: 0
}
training_config {
  batch_size_per_gpu: 16
  num_epochs: 500
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-5
      max_learning_rate: 2e-2
      soft_start: 0.1
      annealing: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
}
eval_config {
  validation_period_during_training: 10
  average_precision_mode: SAMPLE
  batch_size: 32
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.01
  clustering_iou_threshold: 0.6
  top_k: 200
} 
augmentation_config {
  preprocessing {
    output_image_width: 800
    output_image_height: 600
    output_image_channel: 3
    crop_right: 800
    crop_bottom: 600
    min_bbox_width: 1.0
    min_bbox_height: 1.0
  }
  spatial_augmentation {
    hflip_probability: 0.0
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 0.0
    translate_max_y: 0.0
  }
  color_augmentation {
    hue_rotation_max: 0.0
    saturation_shift_max: 0.0
    contrast_scale_max: 0.0
    contrast_center: 0.0
  }
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/training/data/tfrecords/kitti_trainval*"
    image_directory_path: "/workspace/training/data/train"
  }
  image_extension: "png"
  target_class_mapping {
      key: "class_A"
      value: "class_A"
  }
  target_class_mapping {
      key: "class_B"
      value: "class_B"
  }
  target_class_mapping {
      key: "class_C"
      value: "class_C"
  }
  validation_fold: 0
}

When I do inference, the index ‘0’ corresponds to the “class_A” label, index ‘1’ to “class_C” and index ‘2’ to “class_B”.

Where can I find how the classmap is defined?

I noticed that in the “ssd_training_log_resnet18.csv” file (created during training) the order of the labels seems to correspond to the index order. Is this a coincidence?

epoch	AP_class_A	AP_class_C	AP_class_B	loss	                mAP
0	nan	        nan	         nan	        229.0712734222412	nan
1	nan	        nan	         nan	        8.410841814676921	nan
2	nan	        nan	         nan	        7.41365286509196	nan
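
One way to make this comparison without reading the file by eye is to pull the class names out of the log header. The Python sketch below is only an illustration: it assumes the log is tab-separated as shown above and that each per-class column is named AP_<class>, and the helper name classes_from_log_header is made up for this example. Whether this column order really is the model's output index order is the open question here, so the result is a hypothesis to check against inference output, not a documented guarantee.

```python
import csv
import io


def classes_from_log_header(log_text, prefix="AP_"):
    """Return class names in the order their AP columns appear in the header."""
    reader = csv.reader(io.StringIO(log_text), delimiter="\t")
    header = next(reader)
    # Keep only the per-class AP columns; strip padding and the "AP_" prefix.
    return [
        col.strip()[len(prefix):]
        for col in header
        if col.strip().startswith(prefix)
    ]


# Header and first row of the log shown above.
sample_log = (
    "epoch\tAP_class_A\tAP_class_C\tAP_class_B\tloss\tmAP\n"
    "0\tnan\tnan\tnan\t229.0712734222412\tnan\n"
)
print(classes_from_log_header(sample_log))
```

For a real log, the text could be read with open("ssd_training_log_resnet18.csv").read() and passed to the same function.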

Thank you

Please see the “Integrating an SSD model” section in the TLT user guide.
The order in which the classes are listed in the label file must match the order in which the model predicts the output. This order is derived from the order in which the objects are instantiated in the dataset_config field of the SSD experiment config file.

Hi Morganh,

Thanks for your answer, but that is not how it behaves on my side. The classes are declared in this order in the config file:

dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/training/data/tfrecords/kitti_trainval*"
    image_directory_path: "/workspace/training/data/train"
  }
  image_extension: "png"
  target_class_mapping {
      key: "class_A"
      value: "class_A"
  }
  target_class_mapping {
      key: "class_B"
      value: "class_B"
  }
  target_class_mapping {
      key: "class_C"
      value: "class_C"
  }
  validation_fold: 0
}

But they have to be declared in this order in the label file:

class_A
class_C
class_B
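
For reference, this is roughly how such a label file is used at inference time: line N of the file is taken as the name of class index N. A minimal Python sketch, assuming one class name per line and integer class indices coming out of the detector; load_labels and the sample indices are hypothetical and not part of TLT or TensorRT.

```python
def load_labels(label_text):
    """Turn the contents of a label file into a list indexed by class id."""
    return [line.strip() for line in label_text.splitlines() if line.strip()]


# Contents of the label file in the order that matches the model output.
labels = load_labels("class_A\nclass_C\nclass_B\n")

# Hypothetical class indices as they might come out of the detector.
for class_id in (0, 1, 2):
    print(class_id, "->", labels[class_id])
```

If the lines of the file are in the wrong order, the boxes are still correct but the names attached to them are swapped.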

Moreover, in the deepstream custom app sample (https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps) the same thing is happening:

SSD config file in the sample:

dataset_config {
  data_sources {
    tfrecords_path: "/home/projects2_metropolis/datasets/maglev_tfrecords/ivalarge_tfrecord_qres/*"
    image_directory_path: "/home/IVAData2/datasets/ivalarge_cyclops-b"
  }
  data_sources {
    tfrecords_path: "/home/projects2_metropolis/datasets/maglev_tfrecords/its_datasets_qres/aicities_highway/*"
    image_directory_path: "/home/projects2_metropolis/exports/IVA-0010-01_181016"
  }
  validation_fold: 0
  image_extension: "jpg"
  target_class_mapping {
    key: "AutoMobile"
    value: "car"
  }
  target_class_mapping {
    key: "Automobile"
    value: "car"
  }
  target_class_mapping {
    key: "Bicycle"
    value: "bicycle"
  }
  target_class_mapping {
    key: "Heavy Truck"
    value: "car"
  }
  target_class_mapping {
    key: "Motorcycle"
    value: "bicycle"
  }
  target_class_mapping {
    key: "Person"
    value: "person"
  }

  ...

  }
  target_class_mapping {
    key: "traffic_light"
    value: "road_sign"
  }
  target_class_mapping {
    key: "twowheeler"
    value: "bicycle"
  }
  target_class_mapping {
    key: "vehicle"
    value: "car"
  }
}

and corresponding label file:

bicycle
car
person
road_sign

I don’t understand how this order is defined.

Thank you

Hi dbrazey,
First, please set all the class names in your spec file to lowercase. See https://devtalk.nvidia.com/default/topic/1069848/transfer-learning-toolkit/mean-average_precision-is-0-for-all-classes-using-detectnet_v2/

Then for the class order, please see https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#deepstream_deployment

The label file is a text file, containing the names of the classes that the SSD model is trained to detect. The order in which the classes are listed here must match the order in which the model predicts the output. This order is derived from the order the objects are instantiated in the dataset_config field of the SSD experiment config file.
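
Read literally, that rule could be sketched in Python as follows: take the value of each target_class_mapping entry in the order it is declared and keep the first occurrence of each name. This is only one reading of the guide's sentence and is not confirmed to be what TLT does internally. The regular expression stands in for a real protobuf parser and only handles simple specs, and declared_target_classes is a made-up helper name.

```python
import re


def declared_target_classes(spec_text):
    """Return target class names in declaration order, first occurrence only."""
    values = re.findall(
        r'target_class_mapping\s*\{\s*key:\s*"[^"]*"\s*value:\s*"([^"]*)"\s*\}',
        spec_text,
    )
    # dict preserves insertion order, so this de-duplicates without reordering.
    return list(dict.fromkeys(values))


# The three mappings from the spec file posted in this thread.
sample_spec = '''
target_class_mapping {
    key: "class_A"
    value: "class_A"
}
target_class_mapping {
    key: "class_B"
    value: "class_B"
}
target_class_mapping {
    key: "class_C"
    value: "class_C"
}
'''
print(declared_target_classes(sample_spec))
```

Comparing this list with the order actually observed at inference shows whether a given model follows the declaration order.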

As for the https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps sample you mentioned, I will check whether there are any typos or errors in the GitHub repository.

Hello,

After more investigation, I think (without proof) that the class order is alphabetical, not the order specified during training.
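
A quick way to test this hypothesis against a spec file is to collect the distinct value entries of the target_class_mapping blocks and sort them. The Python sketch below does that with a regular expression rather than a real protobuf parser, which only works for simple specs like the ones in this thread, and sorted_target_classes is a made-up helper name. For the DeepStream sample mappings quoted earlier it gives bicycle, car, person, road_sign, the same order as the sample's label file. That is consistent with the hypothesis but does not prove it.

```python
import re


def sorted_target_classes(spec_text):
    """Return the distinct target class names from a spec, sorted alphabetically."""
    pattern = re.compile(
        r'target_class_mapping\s*\{\s*key:\s*"[^"]*"\s*value:\s*"([^"]*)"\s*\}'
    )
    return sorted(set(pattern.findall(spec_text)))


# A few of the mappings from the DeepStream sample spec (not the full file).
sample_spec = '''
target_class_mapping { key: "Automobile" value: "car" }
target_class_mapping { key: "Bicycle" value: "bicycle" }
target_class_mapping { key: "Person" value: "person" }
target_class_mapping { key: "traffic_light" value: "road_sign" }
'''
print(sorted_target_classes(sample_spec))
```

Writing the result to a text file, one name per line, would give a candidate label file to try at inference.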