Creating a separate evaluation TFRecord (PeopleNet)

ONGC0091 · December 31, 2020, 5:02am

Hi,

I would like to use a separate folder for the evaluation dataset.

What values should I fill in for the '?' placeholders below?

FOR TRAINING DATA
kitti_config {
  root_directory_path: "/path/to/training/data"
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2???
  val_split: ???
  num_shards: 10
}

FOR EVALUATION DATA
kitti_config {
  root_directory_path: "/path/to/evaluation/data"
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2???
  val_split: ???
  num_shards: 10
}

Is the following dataset_config correct?

FOR TRAINING SPEC FILE
dataset_config {
  data_sources: {
    tfrecords_path: "/path/to/training/TFRecords/"
    image_directory_path: "/path/to/training/data/root"
  }
  validation_data_source: {
    tfrecords_path: "/path/to/validation/TFRecords/"
    image_directory_path: "/path/to/validation/data/root"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "person"
    value: "person"
  }

  target_class_mapping {
    key: "background"
    value: "background"
  }

  target_class_mapping {
    key: "face"
    value: "face"
  }
}

Morganh · December 31, 2020, 6:34am

Please refer to Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation

Normally, you can set num_partitions to 2 and set val_split to the percentage of data to be separated for validation, for example 14 or 20.
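
For illustration, here is a sketch of the training kitti_config with those values filled in. The 20 is just one example percentage, and the path is the placeholder from the question:

```text
kitti_config {
  root_directory_path: "/path/to/training/data"  # placeholder path from the question
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 20  # example only: percentage of the data set aside for validation
  num_shards: 10
}
```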

ONGC0091 · December 31, 2020, 7:51am

I am using a separate dataset for evaluation, so I do not want to split a validation set off from my training dataset.

Morganh · December 31, 2020, 8:03am

A separate evaluation dataset is not related to val_split or num_partitions.
You just need to set validation_data_source as described in https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/text/creating_experiment_spec.html#specification-file-for-detectnet-v2

If you prefer to run evaluation on a different validation dataset as opposed to a split from the training dataset, then please convert this dataset into tfrecords as well using the tlt-dataset-convert tool as mentioned here, and use the validation_data_source field in the dataset_config to define this. In this case, please do not forget to remove the validation_fold field from the spec. When generating the TFRecords for evaluation by using the validation_data_source field, please review the notes here.

validation_data_source: {
  tfrecords_path: "<path to validation tfrecords>/<tfrecords validation pattern>"
  image_directory_path: "<path to validation data source>"
}
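
As a rough sketch, the conversion spec for the evaluation folder could look like the following. The path is the placeholder from the question, and the num_partitions and val_split values shown are arbitrary placeholders, since a separate evaluation dataset is not related to them:

```text
# Sketch of a conversion spec for the separate evaluation folder.
kitti_config {
  root_directory_path: "/path/to/evaluation/data"  # placeholder path from the question
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2  # arbitrary here
  val_split: 20      # arbitrary here
  num_shards: 10
}
```

The TFRecords generated from this spec are the ones that validation_data_source should point to.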

ONGC0091 · December 31, 2020, 8:06am

So when I create the training TFRecords, if val_split is set to 100, will the model still train on all the data?

I have already read through the docs and included validation_data_source in my training spec file; please see my spec above.

Morganh · December 31, 2020, 8:10am

Yes. If you set a separate validation_data_source in your training spec, training loads all of your training TFRecords, no matter what val_split you used when generating them. During validation, all the TFRecords in your validation_data_source are loaded.
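
Putting this together, a minimal sketch of the training spec's dataset_config for this setup might look like the following. The paths are the placeholders from the question, and the trailing * glob on each tfrecords_path is an assumption about how the pattern is written, so adjust it to match your generated file names. Note that there is no validation_fold field:

```text
dataset_config {
  data_sources: {
    # All training TFRecords are loaded, whatever val_split was used to generate them.
    tfrecords_path: "/path/to/training/TFRecords/*"  # "*" glob is an assumption
    image_directory_path: "/path/to/training/data/root"
  }
  validation_data_source: {
    # All TFRecords matching this path are used for validation.
    tfrecords_path: "/path/to/validation/TFRecords/*"  # "*" glob is an assumption
    image_directory_path: "/path/to/validation/data/root"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "person"
    value: "person"
  }
  target_class_mapping {
    key: "face"
    value: "face"
  }
  # No validation_fold field: it is removed when validation_data_source is used.
}
```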