Training detectnet_v2 Issue

I am new to TLT and received this error while training detectnet_v2. I cannot work out which parameter or configuration setting caused it.

Using TensorFlow backend.
2020-03-09 07:56:03.176802: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-09 07:56:03.246178: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-09 07:56:03.246602: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5bd6a90 executing computations on platform CUDA. Devices:
2020-03-09 07:56:03.246623: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 1050, Compute Capability 6.1
2020-03-09 07:56:03.248490: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299965000 Hz
2020-03-09 07:56:03.248824: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5cf20f0 executing computations on platform Host. Devices:
2020-03-09 07:56:03.248845: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2020-03-09 07:56:03.248984: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.65GiB
2020-03-09 07:56:03.249025: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-03-09 07:56:03.249707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-09 07:56:03.249727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2020-03-09 07:56:03.249739: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2020-03-09 07:56:03.249862: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3439 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-03-09 07:56:03,250 [INFO] iva.detectnet_v2.scripts.train: Loading experiment spec at /workspace/file/tlt/pc/specs/detectnet_v2_train_resnet18_kitti.txt.
2020-03-09 07:56:03,251 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/file/tlt/pc/specs/detectnet_v2_train_resnet18_kitti.txt
WARNING:tensorflow:From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2020-03-09 07:56:03,259 [WARNING] tensorflow: From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2020-03-09 07:56:03,319 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 50 samples with a batch size of 4; each epoch will therefore take one extra step.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-03-09 07:56:03,323 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2020-03-09 07:56:03,378 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 384, 1248) 0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 192, 624) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 192, 624) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 192, 624) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 64, 96, 312)  0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_2[0][0]               
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 96, 312)  4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 96, 312)  256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 96, 312)  0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 64, 96, 312)  0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_3[0][0]               
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 64, 96, 312)  0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_4[0][0]               
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 96, 312)  0           block_1b_bn_2[0][0]              
                                                                 activation_3[0][0]               
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 64, 96, 312)  0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 48, 156) 73856       activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 128, 48, 156) 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_6[0][0]               
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 48, 156) 8320        activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 48, 156) 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 48, 156) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 128, 48, 156) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 48, 156) 147584      activation_7[0][0]               
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 128, 48, 156) 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_8[0][0]               
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 48, 156) 0           block_2b_bn_2[0][0]              
                                                                 activation_7[0][0]               
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 128, 48, 156) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 24, 78)  295168      activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 256, 24, 78)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_10[0][0]              
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 24, 78)  33024       activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 24, 78)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 24, 78)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 256, 24, 78)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 24, 78)  590080      activation_11[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 256, 24, 78)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_12[0][0]              
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 24, 78)  0           block_3b_bn_2[0][0]              
                                                                 activation_11[0][0]              
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 256, 24, 78)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 24, 78)  1180160     activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 512, 24, 78)  0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_14[0][0]              
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 24, 78)  131584      activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 24, 78)  2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 24, 78)  0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 512, 24, 78)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 24, 78)  2359808     activation_15[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 512, 24, 78)  0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_16[0][0]              
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 24, 78)  0           block_4b_bn_2[0][0]              
                                                                 activation_15[0][0]              
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 512, 24, 78)  0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 12, 24, 78)   6156        activation_17[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 3, 24, 78)    1539        activation_17[0][0]              
==================================================================================================
Total params: 11,203,023
Trainable params: 11,193,295
Non-trainable params: 9,728
__________________________________________________________________________________________________

target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
2020-03-09 07:56:26,632 [INFO] iva.detectnet_v2.scripts.train: Found 50 samples in training set
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 8, in <module>
    sys.exit(main())
  File "./common/magnet_train.py", line 37, in main
  File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
  File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
  File "./detectnet_v2/scripts/train.py", line 633, in main
  File "./detectnet_v2/scripts/train.py", line 557, in run_experiment
  File "./detectnet_v2/scripts/train.py", line 480, in train_gridbox
  File "./detectnet_v2/scripts/train.py", line 354, in build_validation_graph
  File "./detectnet_v2/dataloader/default_dataloader.py", line 198, in get_dataset_tensors
  File "./detectnet_v2/dataloader/utilities.py", line 181, in extract_tfrecords_features
StopIteration

This is the config file. I changed every "pedestrian" to "person" because my training dataset is labelled "person".

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/file/Python/annotate_kitti/poc_mrt_frames_Images_KITTI"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "pedestrian"
    value: "person"
  }
  target_class_mapping {
    key: "person_sitting"
    value: "person"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 1248
    output_image_height: 384
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "car"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "person"
    value {
      clustering_config {
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/file/tlt/pc/experiment/pretrained_resnet18/tlt_resnet18_detectnet_v2_v1/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  activation {
    activation_type: "relu"
  }
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 1
  minimum_detection_ground_truth_overlap {
    key: "car"
    value: 0.699999988079
  }
  minimum_detection_ground_truth_overlap {
    key: "cyclist"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "person"
    value: 0.5
  }
  evaluation_box_config {
    key: "car"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "cyclist"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "person"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "car"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "cyclist"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "person"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "car"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "person"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

Hi macro,
Could you please paste the log when you generate tfrecords via tlt-dataset-convert?
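In the meantime, it may be worth confirming that the tfrecords_path pattern in the training spec matches the files you expect, and that none of them is empty. Below is a minimal sketch using only the Python standard library. The helper names summarize_tfrecords and empty_files are hypothetical, and treating a zero-byte file as "contains no records" is an assumption on my side, not something the TLT log states.

```python
import glob
import os


def summarize_tfrecords(pattern):
    """Return (path, size_in_bytes) for every file matching the glob pattern."""
    return [(path, os.path.getsize(path)) for path in sorted(glob.glob(pattern))]


def empty_files(pattern):
    """Return the matching paths whose size is zero bytes."""
    return [path for path, size in summarize_tfrecords(pattern) if size == 0]


if __name__ == "__main__":
    # Pattern copied from tfrecords_path in the training spec above.
    pattern = "/workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*"
    files = summarize_tfrecords(pattern)
    print("matched %d file(s)" % len(files))
    for path, size in files:
        print("%10d  %s" % (size, path))
    print("empty: %s" % empty_files(pattern))
```

If the pattern matches nothing, or some of the matched files are empty, please include that output together with the conversion log.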

This is the log from tlt-dataset-convert:

Converting Tfrecords for kitti trainval dataset
Using TensorFlow backend.
2020-03-09 08:05:48,999 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-03-09 08:05:48,999 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 50	Val: 7
2020-03-09 08:05:48,999 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-03-09 08:05:48,999 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
2020-03-09 08:05:48,999 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2020-03-09 08:05:48,999 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 3
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 4
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 5
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 6
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 7
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 8
2020-03-09 08:05:49,000 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 9
/usr/local/lib/python2.7/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:266: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2020-03-09 08:05:49,181 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 7

2020-03-09 08:05:49,181 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2020-03-09 08:05:49,255 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2020-03-09 08:05:49,321 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 2
2020-03-09 08:05:49,350 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 3
2020-03-09 08:05:49,381 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 4
2020-03-09 08:05:49,393 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 5
2020-03-09 08:05:49,405 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 6
2020-03-09 08:05:49,413 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 7
2020-03-09 08:05:49,428 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 8
2020-03-09 08:05:49,438 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 9
2020-03-09 08:05:49,446 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 66

2020-03-09 08:05:49,446 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-03-09 08:05:49,446 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 73

2020-03-09 08:05:49,446 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
person: person
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2020-03-09 08:05:49,446 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.

And this is the conversion spec file:

# TFrecords conversion spec file for kitti training
kitti_config {
  # Path to the root directory of the dataset
  root_directory_path: "/workspace/file/Python/annotate_kitti"
  image_dir_name: "poc_mrt_frames_Images_KITTI"
  label_dir_name: "poc_mrt_frames_Annotations_KITTI"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 14
  num_shards: 10
}
# For most cases, this will be the same as the root_directory_path. If
# for some reason the images are in a different directory, then 
# the images will be dereferenced as
# image_directory_path/image_dir_name/<xxxx><image_extension>
image_directory_path: "/workspace/file/Python/annotate_kitti"
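
For reference, the Train: 50 / Val: 7 split in the log is consistent with these settings. Here is a minimal sketch of the arithmetic, assuming the converter floors the validation count (my assumption, I have not checked the TLT source). The function names are hypothetical. It also shows that 7 validation images spread over num_shards: 10 can fill at most 7 shards, so at least 3 validation shards hold no image. Whether that is related to the error is not established here.

```python
def split_counts(total_images, val_split_percent):
    """Split a dataset into (train, val) counts, flooring the validation share."""
    val = total_images * val_split_percent // 100
    return total_images - val, val


def max_non_empty_shards(num_images, num_shards):
    """Upper bound on the number of shards that can hold at least one image."""
    return min(num_images, num_shards)


# 57 = 50 + 7 images from the log; val_split: 14 from the spec above.
train, val = split_counts(57, 14)
print(train, val)

# num_shards: 10 from the spec above.
print(max_non_empty_shards(val, 10))
```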

Your tfrecords contain only one class, "person", so please remove the other classes from your training config.
Keep only the mapping below.

target_class_mapping {
    key: "person"
    value: "person"
}
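
To double-check a spec against the class map printed by the converter, something like the following may help. It is a rough sketch, not part of TLT: it pulls the target_class_mapping blocks out of the spec text with a regular expression and lists the keys that are not among the tfrecords labels. The names class_mappings and unknown_keys are hypothetical, and the regex assumes each block has a key line followed by a value line, as in the specs in this thread.

```python
import re

# Matches each target_class_mapping block and captures its key and value.
MAPPING_RE = re.compile(
    r'target_class_mapping\s*\{\s*key:\s*"([^"]+)"\s*value:\s*"([^"]+)"\s*\}'
)


def class_mappings(spec_text):
    """Return the (key, value) pairs of all target_class_mapping blocks."""
    return MAPPING_RE.findall(spec_text)


def unknown_keys(spec_text, tfrecord_labels):
    """Return mapping keys that do not appear among the tfrecords labels."""
    return [key for key, _ in class_mappings(spec_text) if key not in tfrecord_labels]


spec = '''
dataset_config {
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "pedestrian"
    value: "person"
  }
  target_class_mapping {
    key: "person"
    value: "person"
  }
}
'''

# "person" is the only label reported in the class map of the converter log.
print(unknown_keys(spec, {"person"}))  # ['car', 'pedestrian']
```

Any key this prints is a label the tfrecords do not contain and is a candidate for removal from dataset_config.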

This is the config file after I removed the other classes:

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/file/Python/annotate_kitti"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "person"
    value: "person"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 1248
    output_image_height: 384
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "car"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "person"
    value {
      clustering_config {
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/file/tlt/pc/experiment/pretrained_resnet18/tlt_resnet18_detectnet_v2_v1/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  activation {
    activation_type: "relu"
  }
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 1
  minimum_detection_ground_truth_overlap {
    key: "car"
    value: 0.699999988079
  }
  minimum_detection_ground_truth_overlap {
    key: "cyclist"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "person"
    value: 0.5
  }
  evaluation_box_config {
    key: "car"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "cyclist"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "person"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "car"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "cyclist"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "person"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "car"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "person"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

Do I also need to remove the configuration for the other classes in the remaining sections, such as target_class_config?

Yes, it is needed.
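
To find leftover class entries before re-running training, something like the following could help. It is only a rough sketch: the function name stale_classes is made up for illustration, it is not part of TLT, and it assumes the spec uses the same `key: "..."` and `target_classes { name: "..." }` layout as the config pasted above.

```python
import re


def stale_classes(spec_text, valid_classes):
    """Return class names referenced in the spec that are not in valid_classes.

    Hypothetical helper, not a TLT API. It relies on the layout of the spec
    shown in this thread.
    """
    # Per-class map entries (target_class_mapping, target_class_config,
    # evaluation_box_config, ...) all use `key: "<class>"`.
    keys = re.findall(r'key:\s*"([^"]+)"', spec_text)
    # cost_function_config names a class with `name:` directly after
    # `target_classes {`. Objective names such as "cov" and "bbox" follow
    # `objectives {` instead, so this pattern skips them.
    names = re.findall(r'target_classes\s*\{\s*name:\s*"([^"]+)"', spec_text)
    return sorted(set(keys + names) - set(valid_classes))


if __name__ == "__main__":
    example_spec = '''
    dataset_config {
      target_class_mapping {
        key: "person"
        value: "person"
      }
    }
    postprocessing_config {
      target_class_config {
        key: "car"
      }
    }
    cost_function_config {
      target_classes {
        name: "cyclist"
        objectives {
          name: "cov"
        }
      }
    }
    '''
    # Only "person" exists in the tfrecords, so the other two are leftovers.
    print(stale_classes(example_spec, {"person"}))
```

An empty list means every class named in the spec is also in the set you passed in.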

I have already changed the config file to:

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/file/Python/annotate_kitti"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "person"
    value: "person"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 1248
    output_image_height: 384
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "person"
    value {
      clustering_config {
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/file/tlt/pc/experiment/pretrained_resnet18/tlt_resnet18_detectnet_v2_v1/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  activation {
    activation_type: "relu"
  }
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 1
  minimum_detection_ground_truth_overlap {
    key: "person"
    value: 0.5
  }
  evaluation_box_config {
    key: "person"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "person"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "person"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

But when I run transfer learning, it throws the same error:

Using TensorFlow backend.
2020-03-09 09:26:31.718195: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-09 09:26:31.829126: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-09 09:26:31.829719: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x6fcb970 executing computations on platform CUDA. Devices:
2020-03-09 09:26:31.829810: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 1050, Compute Capability 6.1
2020-03-09 09:26:31.832741: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299965000 Hz
2020-03-09 09:26:31.833378: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x70e6fd0 executing computations on platform Host. Devices:
2020-03-09 09:26:31.833428: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2020-03-09 09:26:31.833630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.61GiB
2020-03-09 09:26:31.833684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-03-09 09:26:31.834355: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-09 09:26:31.834374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2020-03-09 09:26:31.834407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2020-03-09 09:26:31.834533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3401 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-03-09 09:26:31,835 [INFO] iva.detectnet_v2.scripts.train: Loading experiment spec at /workspace/file/tlt/pc/specs/detectnet_v2_train_resnet18_kitti.txt.
2020-03-09 09:26:31,837 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/file/tlt/pc/specs/detectnet_v2_train_resnet18_kitti.txt
WARNING:tensorflow:From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2020-03-09 09:26:31,844 [WARNING] tensorflow: From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2020-03-09 09:26:31,898 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 50 samples with a batch size of 4; each epoch will therefore take one extra step.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-03-09 09:26:31,904 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2020-03-09 09:26:31,926 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 384, 1248) 0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 192, 624) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 192, 624) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 192, 624) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 64, 96, 312)  0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_2[0][0]               
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 96, 312)  4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 96, 312)  256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 96, 312)  0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 64, 96, 312)  0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_3[0][0]               
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 64, 96, 312)  0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_4[0][0]               
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 96, 312)  0           block_1b_bn_2[0][0]              
                                                                 activation_3[0][0]               
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 64, 96, 312)  0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 48, 156) 73856       activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 128, 48, 156) 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_6[0][0]               
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 48, 156) 8320        activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 48, 156) 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 48, 156) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 128, 48, 156) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 48, 156) 147584      activation_7[0][0]               
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 128, 48, 156) 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_8[0][0]               
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 48, 156) 0           block_2b_bn_2[0][0]              
                                                                 activation_7[0][0]               
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 128, 48, 156) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 24, 78)  295168      activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 256, 24, 78)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_10[0][0]              
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 24, 78)  33024       activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 24, 78)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 24, 78)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 256, 24, 78)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 24, 78)  590080      activation_11[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 256, 24, 78)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_12[0][0]              
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 24, 78)  0           block_3b_bn_2[0][0]              
                                                                 activation_11[0][0]              
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 256, 24, 78)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 24, 78)  1180160     activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 512, 24, 78)  0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_14[0][0]              
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 24, 78)  131584      activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 24, 78)  2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 24, 78)  0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 512, 24, 78)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 24, 78)  2359808     activation_15[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 512, 24, 78)  0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_16[0][0]              
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 24, 78)  0           block_4b_bn_2[0][0]              
                                                                 activation_15[0][0]              
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 512, 24, 78)  0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 4, 24, 78)    2052        activation_17[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 1, 24, 78)    513         activation_17[0][0]              
==================================================================================================
Total params: 11,197,893
Trainable params: 11,188,165
Non-trainable params: 9,728
__________________________________________________________________________________________________

target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
2020-03-09 09:26:47,888 [INFO] iva.detectnet_v2.scripts.train: Found 50 samples in training set
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 8, in <module>
    sys.exit(main())
  File "./common/magnet_train.py", line 37, in main
  File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
  File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
  File "./detectnet_v2/scripts/train.py", line 633, in main
  File "./detectnet_v2/scripts/train.py", line 557, in run_experiment
  File "./detectnet_v2/scripts/train.py", line 480, in train_gridbox
  File "./detectnet_v2/scripts/train.py", line 354, in build_validation_graph
  File "./detectnet_v2/dataloader/default_dataloader.py", line 198, in get_dataset_tensors
  File "./detectnet_v2/dataloader/utilities.py", line 181, in extract_tfrecords_features
StopIteration

OK, I remember this issue now. Your validation set is too small.
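
As background on the exception itself: in Python, calling next() on an iterator that has no items raises StopIteration. The sketch below shows only that general mechanism with plain lists. The TLT dataloader source is not visible in this thread, so treating an empty record source as the trigger is an assumption, and both function names are made up for illustration.

```python
def first_record_unsafe(records):
    """Return the first item; raises StopIteration if records is empty."""
    return next(iter(records))


def first_record(records):
    """Return the first item, or None if records is empty."""
    try:
        return next(iter(records))
    except StopIteration:
        return None


if __name__ == "__main__":
    print(first_record(["record_0", "record_1"]))
    print(first_record([]))
    try:
        first_record_unsafe([])
    except StopIteration:
        print("StopIteration raised for an empty source")
```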

Please see https://devtalk.nvidia.com/default/topic/1067065/transfer-learning-toolkit/tlt-train-error-when-deploy-mobilenet_v2-by-using-detectnet/post/5405633/#5405633

val_images is (val_split)% of the total images. train_images is (100 - val_split)% of the total images.

Please make sure both of the following conditions hold at the same time:
1) val_images >= num_shards
2) train_images >= num_shards
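
The two conditions above can be checked before running the converter. This is a minimal sketch, not a TLT tool: check_shards is a hypothetical helper, and the counts are passed in directly rather than derived from val_split, because the converter's exact rounding is not documented in this thread. The example counts (52 train, 5 val) are the ones the converter reports later in this thread.

```python
def check_shards(train_images, val_images, num_shards):
    """Return a list of violated conditions; an empty list means both hold."""
    problems = []
    if val_images < num_shards:
        problems.append(
            "val_images (%d) < num_shards (%d)" % (val_images, num_shards))
    if train_images < num_shards:
        problems.append(
            "train_images (%d) < num_shards (%d)" % (train_images, num_shards))
    return problems


if __name__ == "__main__":
    # num_shards: 10, as in the conversion spec posted above.
    print(check_shards(52, 5, 10))  # ['val_images (5) < num_shards (10)']
    # num_shards lowered to 2.
    print(check_shards(52, 5, 2))   # []
```

With only a handful of validation images, lowering num_shards (or raising val_split) is what brings both conditions back in range.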

I set num_shards = 2 and generated new TFRecords:

Converting Tfrecords for kitti trainval dataset
Using TensorFlow backend.
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 52	Val: 5
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
/usr/local/lib/python2.7/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:266: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2020-03-09 11:32:20,881 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2020-03-09 11:32:20,884 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 5

2020-03-09 11:32:20,884 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2020-03-09 11:32:20,908 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 68

2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 73

2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
person: person
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.

But my TLT training still fails with StopIteration:

Using TensorFlow backend.
2020-03-09 11:32:33.784710: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-09 11:32:33.844663: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-09 11:32:33.845015: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5a1f930 executing computations on platform CUDA. Devices:
2020-03-09 11:32:33.845036: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 1050, Compute Capability 6.1
2020-03-09 11:32:33.846940: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299965000 Hz
2020-03-09 11:32:33.847246: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5b3af80 executing computations on platform Host. Devices:
2020-03-09 11:32:33.847277: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2020-03-09 11:32:33.847366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.64GiB
2020-03-09 11:32:33.847384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-03-09 11:32:33.847814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-09 11:32:33.847828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2020-03-09 11:32:33.847835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2020-03-09 11:32:33.847890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3426 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-03-09 11:32:33,848 [INFO] iva.detectnet_v2.scripts.train: Loading experiment spec at /workspace/file/tlt/pc/specs/detectnet_v2_train_resnet18_kitti.txt.
2020-03-09 11:32:33,849 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/file/tlt/pc/specs/detectnet_v2_train_resnet18_kitti.txt
WARNING:tensorflow:From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2020-03-09 11:32:33,854 [WARNING] tensorflow: From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2020-03-09 11:32:33,906 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 157 samples with a batch size of 4; each epoch will therefore take one extra step.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-03-09 11:32:33,909 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2020-03-09 11:32:33,920 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 384, 1248) 0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 192, 624) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 192, 624) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 192, 624) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 64, 96, 312)  0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_2[0][0]               
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 96, 312)  4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 96, 312)  256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 96, 312)  0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 64, 96, 312)  0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_3[0][0]               
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 64, 96, 312)  0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_4[0][0]               
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 96, 312)  0           block_1b_bn_2[0][0]              
                                                                 activation_3[0][0]               
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 64, 96, 312)  0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 48, 156) 73856       activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 128, 48, 156) 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_6[0][0]               
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 48, 156) 8320        activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 48, 156) 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 48, 156) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 128, 48, 156) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 48, 156) 147584      activation_7[0][0]               
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 128, 48, 156) 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_8[0][0]               
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 48, 156) 0           block_2b_bn_2[0][0]              
                                                                 activation_7[0][0]               
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 128, 48, 156) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 24, 78)  295168      activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 256, 24, 78)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_10[0][0]              
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 24, 78)  33024       activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 24, 78)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 24, 78)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 256, 24, 78)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 24, 78)  590080      activation_11[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 256, 24, 78)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_12[0][0]              
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 24, 78)  0           block_3b_bn_2[0][0]              
                                                                 activation_11[0][0]              
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 256, 24, 78)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 24, 78)  1180160     activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 512, 24, 78)  0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_14[0][0]              
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 24, 78)  131584      activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 24, 78)  2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 24, 78)  0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 512, 24, 78)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 24, 78)  2359808     activation_15[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 512, 24, 78)  0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_16[0][0]              
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 24, 78)  0           block_4b_bn_2[0][0]              
                                                                 activation_15[0][0]              
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 512, 24, 78)  0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 4, 24, 78)    2052        activation_17[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 1, 24, 78)    513         activation_17[0][0]              
==================================================================================================
Total params: 11,197,893
Trainable params: 11,188,165
Non-trainable params: 9,728
__________________________________________________________________________________________________

target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
target/truncation is not updated to match the crop areaif the dataset contains target/truncation.
2020-03-09 11:32:44,720 [INFO] iva.detectnet_v2.scripts.train: Found 157 samples in training set
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 8, in <module>
    sys.exit(main())
  File "./common/magnet_train.py", line 37, in main
  File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
  File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
  File "./detectnet_v2/scripts/train.py", line 633, in main
  File "./detectnet_v2/scripts/train.py", line 557, in run_experiment
  File "./detectnet_v2/scripts/train.py", line 480, in train_gridbox
  File "./detectnet_v2/scripts/train.py", line 354, in build_validation_graph
  File "./detectnet_v2/dataloader/default_dataloader.py", line 198, in get_dataset_tensors
  File "./detectnet_v2/dataloader/utilities.py", line 181, in extract_tfrecords_features
StopIteration

Can you paste the output of the command below?
$ ll -sh /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*

Converting Tfrecords for kitti trainval dataset
Using TensorFlow backend.
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 52	Val: 5
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-03-09 11:32:20,875 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
/usr/local/lib/python2.7/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:266: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2020-03-09 11:32:20,881 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2020-03-09 11:32:20,884 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 5

2020-03-09 11:32:20,884 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2020-03-09 11:32:20,908 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 68

2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
person: 73

2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
person: person
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2020-03-09 11:32:20,932 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.

Hi marco,
I meant: could you please paste the output you get when you run the command below?

$ ll -sh /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*

Because the ll command was not found, I ran:

!ls -l /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/*

Here is the result:

-rw-r--r-- 1 root root  1251 Mar  9 11:32 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00000-of-00002
-rw-r--r-- 1 root root     0 Mar  9 11:31 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00000-of-00003
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root  1878 Mar  9 11:32 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00001-of-00002
-rw-r--r-- 1 root root     0 Mar  9 11:31 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00001-of-00003
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root  1252 Mar  9 11:31 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00002-of-00003
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root     0 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root  4380 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-000-of-002-shard-00009-of-00010
-rw-r--r-- 1 root root 16876 Mar  9 11:32 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00000-of-00002
-rw-r--r-- 1 root root 11505 Mar  9 11:31 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00000-of-00003
-rw-r--r-- 1 root root  3249 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root 16632 Mar  9 11:32 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00001-of-00002
-rw-r--r-- 1 root root 11503 Mar  9 11:31 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00001-of-00003
-rw-r--r-- 1 root root  3313 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root 12377 Mar  9 11:31 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00002-of-00003
-rw-r--r-- 1 root root  3127 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root  3188 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root  3190 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root  3251 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root  3311 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root  3128 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root  3127 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root  3373 Mar  9 08:05 /workspace/file/tlt/pc/experiment/tfrecords/kitti_trainval/kitti_trainval-fold-001-of-002-shard-00009-of-00010

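From the listing above, some tfrecords have zero size. That is not expected. They are old files left over from earlier conversions with a different shard count (for example, the ones generated with "num_shards=10"). Training reads the tfrecords in this folder, so these empty leftovers are the likely cause of the StopIteration.
Please remove them, or clear all the tfrecords inside the folder, and then generate the tfrecords again.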
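
If it helps, the check above can be automated. The following is only a rough sketch, not part of TLT: the helper names `find_empty_files` and `shard_totals` are made up for illustration, and the regular expression simply assumes the `-shard-NNNNN-of-NNNNN` naming visible in the listing. It reports zero-byte files and the distinct shard totals found in a folder, and it deletes nothing.

```python
import os
import re
import sys

# Assumed naming, taken from the listing above: "...-shard-00001-of-00010".
SHARD_RE = re.compile(r"-shard-(\d+)-of-(\d+)$")


def find_empty_files(directory):
    """Return the sorted names of zero-byte regular files in `directory`."""
    empty = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(name)
    return sorted(empty)


def shard_totals(directory):
    """Return the set of distinct shard totals (the "of-NNNNN" part) in file names."""
    totals = set()
    for name in os.listdir(directory):
        match = SHARD_RE.search(name)
        if match:
            totals.add(int(match.group(2)))
    return totals


if __name__ == "__main__":
    # Pass the tfrecords folder as the first argument; defaults to the current directory.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    print("Zero-byte files: %s" % find_empty_files(folder))
    print("Shard totals found: %s" % sorted(shard_totals(folder)))
```

More than one shard total in the same folder (here 2, 3 and 10) suggests that several conversion runs wrote into it, so clearing the folder before converting again is the safest option.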

Yes, it ran successfully! Thanks a lot.