TFRecord creation process

I have created TFRecords for my training/validation data but these don’t appear to be sufficient for TLT in that required fields are missing. For example:

File "/usr/local/bin/tlt-train-g1", line 10, in <module>
  sys.exit(main())
File "./common/magnet_train.py", line 37, in main
File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
File "./detectnet_v2/scripts/train.py", line 632, in main
File "./detectnet_v2/scripts/train.py", line 556, in run_experiment
File "./detectnet_v2/scripts/train.py", line 466, in train_gridbox
File "./detectnet_v2/scripts/train.py", line 296, in build_training_graph
File "./detectnet_v2/dataloader/default_dataloader.py", line 203, in get_dataset_tensors
File "./detectnet_v2/dataloader/default_dataloader.py", line 244, in _generate_images_and_ground_truth_labels
File "./detectnet_v2/dataloader/default_dataloader.py", line 384, in _load_input_tensors
KeyError: 'frame/id'

According to this it seems that we’re forced to use the tlt-dataset-convert tool to get valid TFRecords for use with TLT. Is this still the case or can I somehow cook up my own TFRecord creation code in a way that satisfies the TLT TFRecord requirements? If so where are these requirements spelled out?

My data is originally in PASCAL VOC format. Do I need to convert my annotations to KITTI format in order to use the tlt-dataset-convert tool, or can it take PASCAL VOC as input?

BTW I tried to find the Python code that performs the conversion from KITTI to TFRecords, i.e. the module iva.detectnet_v2.scripts.dataset_convert referenced in the tlt-dataset-convert script, but it seems that the Python source for that is somehow obfuscated. Can anyone suggest how I’d go about finding that Python code so I can see what’s being done to create the TFRecord structure required by TLT?

Thanks in advance for any comments or suggestions.

Hi monocongo,
Could you please paste your command or step which resulted in your attached error?

Thanks @Morganh.

Here’s the command with the complete output:

$ tlt-train detectnet_v2 -r output -e detectnet2_resnet18_train.txt -k MY_API_KEY

Using TensorFlow backend.
2019-10-10 14:31:49.783663: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-10-10 14:31:49.911387: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-10-10 14:31:49.911900: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x6d3c2f0 executing computations on platform CUDA. Devices:
2019-10-10 14:31:49.911927: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 1050 Ti with Max-Q Design, Compute Capability 6.1
2019-10-10 14:31:49.913621: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2208000000 Hz
2019-10-10 14:31:49.914307: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x6da5fd0 executing computations on platform Host. Devices:
2019-10-10 14:31:49.914331: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-10-10 14:31:49.914462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1050 Ti with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.4175
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.32GiB
2019-10-10 14:31:49.914481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-10-10 14:31:49.915109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-10 14:31:49.915123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-10-10 14:31:49.915131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-10-10 14:31:49.915202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3099 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-10 14:31:49,916 [INFO] iva.detectnet_v2.scripts.train: Loading experiment spec at detectnet2_resnet18_train.txt.
2019-10-10 14:31:49,916 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from detectnet2_resnet18_train.txt
WARNING:tensorflow:From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2019-10-10 14:31:49,923 [WARNING] tensorflow: From ./detectnet_v2/dataloader/utilities.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
2019-10-10 14:31:50,040 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 5219 samples with a batch size of 4; each epoch will therefore take one extra step.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-10-10 14:31:50,045 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2019-10-10 14:31:50,058 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/horovod/tensorflow/__init__.py:91: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 384, 1248) 0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 192, 624) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 192, 624) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 192, 624) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 64, 96, 312)  0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_2[0][0]               
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 96, 312)  4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 96, 312)  256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 96, 312)  0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 64, 96, 312)  0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 96, 312)  36928       activation_3[0][0]               
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 64, 96, 312)  0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 96, 312)  36928       activation_4[0][0]               
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 96, 312)  256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 96, 312)  0           block_1b_bn_2[0][0]              
                                                                 activation_3[0][0]               
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 64, 96, 312)  0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 48, 156) 73856       activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 128, 48, 156) 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_6[0][0]               
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 48, 156) 8320        activation_5[0][0]               
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 48, 156) 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 48, 156) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 128, 48, 156) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 48, 156) 147584      activation_7[0][0]               
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 128, 48, 156) 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 48, 156) 147584      activation_8[0][0]               
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 48, 156) 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 48, 156) 0           block_2b_bn_2[0][0]              
                                                                 activation_7[0][0]               
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 128, 48, 156) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 24, 78)  295168      activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 256, 24, 78)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_10[0][0]              
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 24, 78)  33024       activation_9[0][0]               
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 24, 78)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 24, 78)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 256, 24, 78)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 24, 78)  590080      activation_11[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 256, 24, 78)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 24, 78)  590080      activation_12[0][0]              
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 24, 78)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 24, 78)  0           block_3b_bn_2[0][0]              
                                                                 activation_11[0][0]              
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 256, 24, 78)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 24, 78)  1180160     activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 512, 24, 78)  0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_14[0][0]              
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 24, 78)  131584      activation_13[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 24, 78)  2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 24, 78)  0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 512, 24, 78)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 24, 78)  2359808     activation_15[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 512, 24, 78)  0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 24, 78)  2359808     activation_16[0][0]              
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 24, 78)  2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 24, 78)  0           block_4b_bn_2[0][0]              
                                                                 activation_15[0][0]              
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 512, 24, 78)  0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 8, 24, 78)    4104        activation_17[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 2, 24, 78)    1026        activation_17[0][0]              
==================================================================================================
Total params: 11,200,458
Trainable params: 11,181,258
Non-trainable params: 19,200
__________________________________________________________________________________________________
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 10, in <module>
    sys.exit(main())
  File "./common/magnet_train.py", line 37, in main
  File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
  File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
  File "./detectnet_v2/scripts/train.py", line 632, in main
  File "./detectnet_v2/scripts/train.py", line 556, in run_experiment
  File "./detectnet_v2/scripts/train.py", line 466, in train_gridbox
  File "./detectnet_v2/scripts/train.py", line 296, in build_training_graph
  File "./detectnet_v2/dataloader/default_dataloader.py", line 203, in get_dataset_tensors
  File "./detectnet_v2/dataloader/default_dataloader.py", line 244, in _generate_images_and_ground_truth_labels
  File "./detectnet_v2/dataloader/default_dataloader.py", line 384, in _load_input_tensors
KeyError: 'frame/id'

From what I can tell, the TFRecords produced by the tlt-dataset-convert tool use significantly different keys from those in my own TFRecord creation script, which converts my data’s original PASCAL VOC annotations. My code is based on this code from the TensorFlow object detection models API. If I could access the Python code driving the tlt-dataset-convert tool I could probably get past this issue, but it appears to be squirreled away somewhere inaccessible; at least, I’ve not managed to find where it lives in the Docker container.

Hi monocongo,
As mentioned in https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#input_gridbox_ssd :
The object detection apps (DetectNet_v2, SSD, Faster_rcnn) in TLT expect data in KITTI file format. For DetectNet_v2 and SSD, this data is converted to TFRecords for training. That means DetectNet_v2 requires KITTI-format data to be converted to TFRecords.

Yes, only TFRecords generated with the converter tool are compatible with TLT training. TLT includes the tlt-dataset-convert tool for this purpose.

So you need to convert your PASCAL VOC annotations to KITTI format.
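
For reference, a minimal sketch of that conversion using only the Python standard library. It assumes the usual 15-field KITTI label line (type, truncated, occluded, alpha, four bbox values, three dimensions, three location values, rotation_y) and that the fields other than the class name and bounding box can be left at zero for 2D detection; please check the TLT documentation before relying on that. The function name voc_to_kitti_lines is hypothetical and is not part of TLT.

```python
import xml.etree.ElementTree as ET


def voc_to_kitti_lines(voc_xml):
    """Convert one PASCAL VOC annotation (as an XML string) to KITTI label lines.

    Hypothetical helper, not part of TLT. Only the class name and the 2D box
    are carried over; all other KITTI fields are written as zeros (assumption).
    """
    root = ET.fromstring(voc_xml)
    lines = []
    for obj in root.findall("object"):
        # KITTI fields are space separated, so the class name must not contain spaces.
        name = obj.findtext("name").strip().replace(" ", "_")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = [
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        ]
        # Field order: type truncated occluded alpha left top right bottom
        #              height width length x y z rotation_y
        lines.append(
            "%s 0.00 0 0.00 %.2f %.2f %.2f %.2f 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
            % (name, xmin, ymin, xmax, ymax)
        )
    return lines


SAMPLE_VOC = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""

kitti_lines = voc_to_kitti_lines(SAMPLE_VOC)
print("\n".join(kitti_lines))
```

Each VOC XML file would produce one KITTI .txt label file with the same base name as its image, one line per object.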

Also, as mentioned in https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#requirements,
the tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.

So you also need to resize the images, and scale the bounding boxes to match, as part of the PASCAL VOC to KITTI conversion step.
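
As a rough illustration of the box scaling, here is a small sketch. The 1248x384 target size is taken from the input_1 layer shape (None, 3, 384, 1248) in the log above, read as channels, height, width; the function name scale_bbox and the example numbers are made up for illustration. The image itself would be resized to the same target size with an image library of your choice.

```python
def scale_bbox(bbox, orig_size, target_size):
    """Scale an (xmin, ymin, xmax, ymax) box from orig_size to target_size.

    Sizes are (width, height) tuples. Hypothetical helper, not part of TLT.
    """
    xmin, ymin, xmax, ymax = bbox
    orig_w, orig_h = orig_size
    target_w, target_h = target_size
    # Independent scale factors per axis; float() keeps this correct on Python 2.
    sx = float(target_w) / orig_w
    sy = float(target_h) / orig_h
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# Example: an image of 624x192 pixels resized to the 1248x384 training size.
scaled = scale_bbox((100, 50, 200, 150), (624, 192), (1248, 384))
print(scaled)
```

Note that if the original and target aspect ratios differ, the two scale factors differ as well and the objects will be stretched along one axis.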

Thanks for your help, @Morganh