Offline augmentation problems


For my master's project I want to apply the Transfer Learning Toolkit (tlt-streamanalytics:v2.0_dp_py2) to a previously augmented (offline augmentation) image set that has been manually split into train, test, and val sets, in order to test object detection with pre-trained RetinaNet and MobileNetV2. Unfortunately I have encountered two problems that I have not yet been able to solve.

Problem 1 - Augmentation:

In the experiment spec file for RetinaNet I can't get the online augmentation switched off.
Under point 6.10 (Training config) a switch is mentioned (use_augmentation: True), which unfortunately has no effect. As soon as I leave out certain parts of the augmentation config or set them to 0, I get an error message, e.g. that the zoom_min value must be greater than 0.0. Does anyone know a way to switch off the online augmentation completely?

Problem 2 - TFRecords:

I can't find an example of how to adjust the configuration file for the dataset converter so that I don't have to split the data during conversion. partition_mode = random cannot be used, because num_partitions defaults to 2 and cannot be changed to 1. Setting val_split to 0 does not work either, although this value is listed in the supported values.
I have also tried partition_mode = sequence, but unfortunately I did not find an example of how to apply it to images.

I hope someone can help me with one or even both problems; I would be very grateful.


  1. Please do not set spatial_augmentation or color_augmentation.
  2. You can launch the Jupyter notebook; there are examples for each network. partition_mode = random should work.

Thank you for the quick answer.

To point 1: If I omit spatial_augmentation and color_augmentation in the experiment spec file, the following error occurs anyway:

augmentation_config {
  preprocessing {
    output_image_width: 672
    output_image_height: 384
    output_image_channel: 3
    crop_right: 672
    crop_bottom: 384
    min_bbox_width: 1.0
    min_bbox_height: 1.0
  }
}
2020-06-09 09:37:33,719 [INFO] iva.retinanet.scripts.train: Loading pretrained weights. This may take a while...
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 8, in <module>
  File "./common/", line 40, in main
  File "./retinanet/scripts/", line 247, in main
  File "./retinanet/scripts/", line 112, in run_experiment
  File "./retinanet/builders/", line 69, in build
  File "./retinanet/builders/", line 44, in __init__
  File "./detectnet_v2/dataloader/", line 90, in build_dataloader
  File "./detectnet_v2/dataloader/augmentation/", line 89, in build_augmentation_config
  File "./detectnet_v2/dataloader/augmentation/", line 54, in build_spatial_augmentation_config
  File "./detectnet_v2/dataloader/augmentation/", line 107, in __init__
ValueError: zoom_min must be > 0.0

To point 2: When I set num_partitions to 1 in the TFRecords conversion spec file for training, the following error message appears:

2020-06-09 09:49:35,820 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
Traceback (most recent call last):
  File "/usr/local/bin/tlt-dataset-convert", line 8, in <module>
  File "./detectnet_v2/scripts/", line 63, in main
  File "./detectnet_v2/dataio/", line 76, in build_converter
  File "./detectnet_v2/dataio/", line 89, in __init__
AssertionError: Invalid number of partitions (1) for random split mode.

And if I set val_split to 0, the converter ignores the setting and still splits the dataset into train and val:

kitti_config {
  root_directory_path: "/workspace/Git_Repos_Datasets/GDS_V1/"
  image_dir_name: "images/train"
  label_dir_name: "annotations/train"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 0
  num_shards: 1
}
image_directory_path: "/workspace/Git_Repos_Datasets/GDS_V1/"
2020-06-09 09:51:58,936 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-06-09 09:51:58,952 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 3500	Val: 874
2020-06-09 09:51:58,952 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-06-09 09:51:58,954 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
/usr/local/lib/python2.7/dist-packages/iva/detectnet_v2/dataio/ VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
/usr/local/lib/python2.7/dist-packages/iva/detectnet_v2/dataio/ UserWarning: genfromtxt: Empty input file: "/workspace/Git_Repos_Datasets/GDS_V1/annotations/train/0973.txt"
2020-06-09 09:51:59,940 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
cars: 1138

2020-06-09 09:51:59,940 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2020-06-09 09:52:03,795 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
cars: 4476

2020-06-09 09:52:03,795 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-06-09 09:52:03,796 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
cars: 5614

2020-06-09 09:52:03,796 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
cars: cars
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2020-06-09 09:52:03,796 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.
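Incidentally, the genfromtxt UserWarning above points at an empty label file (0973.txt). A short Python sketch (the directory path is only an illustration) can list such files before running the converter:

```python
import os

def find_empty_label_files(label_dir):
    """Return names of KITTI label files that are empty (zero bytes or whitespace only)."""
    empty = []
    for name in sorted(os.listdir(label_dir)):
        if not name.endswith(".txt"):
            continue
        path = os.path.join(label_dir, name)
        with open(path) as f:
            if not f.read().strip():
                empty.append(name)
    return empty

# Example call (path is hypothetical):
# find_empty_label_files("/workspace/Git_Repos_Datasets/GDS_V1/annotations/train")
```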

  1. Please use the following to disable spatial_augmentation:

spatial_augmentation {
  hflip_probability: 0.0
  vflip_probability: 0.0
  zoom_min: 1.0
  zoom_max: 1.0
  translate_max_x: 0.0
  translate_max_y: 0.0
}
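By analogy, the color_augmentation block can presumably be neutralized with zero/identity values as well. This is a sketch only; the field names are taken from the DetectNet_v2-style augmentation_config and should be verified against your spec documentation:

```text
color_augmentation {
  color_shift_stddev: 0.0
  hue_rotation_max: 0.0
  saturation_shift_max: 0.0
  contrast_scale_max: 0.0
  contrast_center: 0.5
}
```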

  1. Please set val_split to a very small value, like 0.001
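Applied to the kitti_config shown earlier, that suggestion would look like the sketch below (all values other than val_split copied unchanged from the posted spec):

```text
kitti_config {
  root_directory_path: "/workspace/Git_Repos_Datasets/GDS_V1/"
  image_dir_name: "images/train"
  label_dir_name: "annotations/train"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 0.001
  num_shards: 1
}
```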

Thank you @Morganh. Your last answer solves both problems.