Hi,
I am trying to train a ResNet-10 detection model. I have already created a tfrecord with the TensorFlow Object Detection API, instead of using the conversion tool from TLT.
Is it okay for TLT to train a model with this kind of tfrecord?
I ask because an error now occurs when the training starts:
Using TensorFlow backend.
2019-04-10 03:42:50.027390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: Tesla P4 major: 6 minor: 1 memoryClockRate(GHz): 1.1135
pciBusID: 0000:05:00.0
totalMemory: 7.43GiB freeMemory: 7.31GiB
2019-04-10 03:42:50.027470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-04-10 03:42:54.552814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-10 03:42:54.552890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-04-10 03:42:54.552900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-04-10 03:42:54.553193: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7063 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:05:00.0, compute capability: 6.1)
2019-04-10 03:42:54,709 [INFO] /usr/local/lib/python2.7/dist-packages/iva/dashnet/scripts/train.pyc: Loading experiment spec at /workspace/tlt-experiments//nvidia_tlt/spec.txt.
2019-04-10 03:42:54,711 [INFO] /usr/local/lib/python2.7/dist-packages/dlav/drivenet/spec_handling/spec_loader.pyc: Merging specification from /workspace/tlt-experiments//nvidia_tlt/spec.txt
2019-04-10 03:43:47,074 [INFO] /usr/local/lib/python2.7/dist-packages/iva/dashnet/scripts/train.pyc: Cannot iterate over exactly 330258 samples with a batch size of 16; each epoch will therefore take one extra step.
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 3, 544, 960) 0
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 64, 272, 480) 9472 input_1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 64, 272, 480) 256 conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation) (None, 64, 272, 480) 0 bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D) (None, 64, 136, 240) 36928 activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 136, 240) 256 block_1a_conv_1[0][0]
__________________________________________________________________________________________________
activation_2 (Activation) (None, 64, 136, 240) 0 block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D) (None, 64, 136, 240) 36928 activation_2[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 136, 240) 4160 activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 136, 240) 256 block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 136, 240) 256 block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_1 (Add) (None, 64, 136, 240) 0 block_1a_bn_2[0][0]
block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_3 (Activation) (None, 64, 136, 240) 0 add_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D) (None, 128, 68, 120) 73856 activation_3[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 68, 120) 512 block_2a_conv_1[0][0]
__________________________________________________________________________________________________
activation_4 (Activation) (None, 128, 68, 120) 0 block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D) (None, 128, 68, 120) 147584 activation_4[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 68, 120) 8320 activation_3[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 68, 120) 512 block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 68, 120) 512 block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_2 (Add) (None, 128, 68, 120) 0 block_2a_bn_2[0][0]
block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_5 (Activation) (None, 128, 68, 120) 0 add_2[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D) (None, 256, 34, 60) 295168 activation_5[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 34, 60) 1024 block_3a_conv_1[0][0]
__________________________________________________________________________________________________
activation_6 (Activation) (None, 256, 34, 60) 0 block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D) (None, 256, 34, 60) 590080 activation_6[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 34, 60) 33024 activation_5[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 34, 60) 1024 block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 34, 60) 1024 block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_3 (Add) (None, 256, 34, 60) 0 block_3a_bn_2[0][0]
block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_7 (Activation) (None, 256, 34, 60) 0 add_3[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D) (None, 512, 34, 60) 1180160 activation_7[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 34, 60) 2048 block_4a_conv_1[0][0]
__________________________________________________________________________________________________
activation_8 (Activation) (None, 512, 34, 60) 0 block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D) (None, 512, 34, 60) 2359808 activation_8[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 34, 60) 131584 activation_7[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 34, 60) 2048 block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 34, 60) 2048 block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_4 (Add) (None, 512, 34, 60) 0 block_4a_bn_2[0][0]
block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_9 (Activation) (None, 512, 34, 60) 0 add_4[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 512, 34, 60) 0 activation_9[0][0]
__________________________________________________________________________________________________
output_bbox (Conv2D) (None, 12, 34, 60) 6156 dropout_1[0][0]
__________________________________________________________________________________________________
output_cov (Conv2D) (None, 3, 34, 60) 1539 dropout_1[0][0]
==================================================================================================
Total params: 4,926,543
Trainable params: 4,920,655
Non-trainable params: 5,888
__________________________________________________________________________________________________
Traceback (most recent call last):
File "/usr/local/bin/tlt-train-g1", line 10, in <module>
sys.exit(main())
File "./common/magnet_train.py", line 24, in main
File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-106>", line 2, in main
File "./drivenet/common/timer.py", line 46, in wrapped_fn
File "./dashnet/scripts/train.py", line 627, in main
File "./dashnet/scripts/train.py", line 552, in run_experiment
File "./dashnet/scripts/train.py", line 457, in train_dashnet
File "./dashnet/scripts/train.py", line 291, in build_training_graph
File "./drivenet/common/dataloader/default_dataloader.py", line 199, in get_dataset_tensors
File "./drivenet/common/dataloader/default_dataloader.py", line 240, in _generate_images_and_ground_truth_labels
File "./drivenet/common/dataloader/default_dataloader.py", line 375, in _load_input_tensors
KeyError: 'frame/id'
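For context on the KeyError above: it looks like the TLT dataloader expects its own tfrecord schema (including a frame/id feature) that the TF Object Detection API converter does not emit. To see which feature keys my converted tfrecord actually contains, I used the sketch below. It is a hand-rolled walk over the protobuf wire format, so it does not need TensorFlow or protobuf installed; the file path in the usage comment is just a placeholder.

```python
import struct

def tfrecord_examples(path):
    """Iterate serialized tf.train.Example payloads in an uncompressed
    TFRecord file. Each record is: 8-byte little-endian length, 4-byte
    length CRC, payload, 4-byte payload CRC; the CRCs are skipped, not
    verified."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                return
            (length,) = struct.unpack("<Q", header)
            f.seek(4, 1)                  # skip length CRC32C
            payload = f.read(length)
            f.seek(4, 1)                  # skip payload CRC32C
            yield payload

def feature_keys(serialized_example):
    """Return the feature-key names stored in a serialized
    tf.train.Example, by walking the protobuf wire format directly."""
    buf = bytearray(serialized_example)

    def read_varint(pos):
        value, shift = 0, 0
        while True:
            b = buf[pos]
            pos += 1
            value |= (b & 0x7F) << shift
            if not b & 0x80:
                return value, pos
            shift += 7

    def delimited_fields(start, end):
        # Yield (field_number, payload_start, payload_end) for each
        # length-delimited (wire type 2) field; skip other wire types.
        pos = start
        while pos < end:
            tag, pos = read_varint(pos)
            field, wire = tag >> 3, tag & 7
            if wire == 2:                 # length-delimited
                length, pos = read_varint(pos)
                yield field, pos, pos + length
                pos += length
            elif wire == 0:               # varint
                _, pos = read_varint(pos)
            elif wire == 5:               # 32-bit
                pos += 4
            elif wire == 1:               # 64-bit
                pos += 8
            else:
                return                    # unknown wire type; bail out

    keys = []
    # Example.features is field 1; Features.feature map entries are
    # field 1; each map entry stores its key string as field 1.
    for f1, s1, e1 in delimited_fields(0, len(buf)):
        if f1 != 1:
            continue
        for f2, s2, e2 in delimited_fields(s1, e1):
            if f2 != 1:
                continue
            for f3, s3, e3 in delimited_fields(s2, e2):
                if f3 == 1:
                    keys.append(bytes(buf[s3:e3]).decode("utf-8"))
    return keys

# Usage (path is a placeholder for your converted file):
# for example in tfrecord_examples("train-00000-of-00010.tfrecord"):
#     print(feature_keys(example))
```

In my case the printed keys are the Object Detection API names (image/encoded, image/object/bbox/..., etc.), and there is no frame/id key, which matches the traceback.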
Hoping to get a reply.
Thanks in advance.