Hi Morgan,
Here you go, this is the console output from the tlt-train run:
root@7c8a1e37c38d:/workspace# tlt-train faster_rcnn -e /workspace/nvidia_experiment/frcnn.config
Using TensorFlow backend.
2019-10-28 14:41:43.653385: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-10-28 14:41:43.739853: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-10-28 14:41:43.740640: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x62110a0 executing computations on platform CUDA. Devices:
2019-10-28 14:41:43.740658: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): GeForce GTX 1070 with Max-Q Design, Compute Capability 6.1
2019-10-28 14:41:43.742567: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2208000000 Hz
2019-10-28 14:41:43.743261: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x627c3e0 executing computations on platform Host. Devices:
2019-10-28 14:41:43.743279: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2019-10-28 14:41:43.743585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce GTX 1070 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.379
pciBusID: 0000:01:00.0
totalMemory: 7.93GiB freeMemory: 7.47GiB
2019-10-28 14:41:43.743601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-10-28 14:41:43.744450: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-28 14:41:43.744463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-10-28 14:41:43.744470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-10-28 14:41:43.744725: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7266 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-28 14:41:43,749 [INFO] /usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/scripts/train.pyc: valid_class_mapping: {u'customer': 0}
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-10-28 14:41:43,755 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
/usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/models/custom_layers.py:119: RuntimeWarning: divide by zero encountered in long_scalars
/usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/models/custom_layers.py:121: RuntimeWarning: invalid value encountered in long_scalars
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 3, None, None 0
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 64, None, Non 9472 input_1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 64, None, Non 256 conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation) (None, 64, None, Non 0 bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D) (None, 64, None, Non 36864 activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, None, Non 256 block_1a_conv_1[0][0]
__________________________________________________________________________________________________
activation_2 (Activation) (None, 64, None, Non 0 block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D) (None, 64, None, Non 36864 activation_2[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, None, Non 4096 activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, None, Non 256 block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, None, Non 256 block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_1 (Add) (None, 64, None, Non 0 block_1a_bn_2[0][0]
block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_3 (Activation) (None, 64, None, Non 0 add_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D) (None, 128, None, No 73728 activation_3[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, None, No 512 block_2a_conv_1[0][0]
__________________________________________________________________________________________________
activation_4 (Activation) (None, 128, None, No 0 block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D) (None, 128, None, No 147456 activation_4[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, None, No 8192 activation_3[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, None, No 512 block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, None, No 512 block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_2 (Add) (None, 128, None, No 0 block_2a_bn_2[0][0]
block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_5 (Activation) (None, 128, None, No 0 add_2[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D) (None, 256, None, No 294912 activation_5[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, None, No 1024 block_3a_conv_1[0][0]
__________________________________________________________________________________________________
activation_6 (Activation) (None, 256, None, No 0 block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D) (None, 256, None, No 589824 activation_6[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, None, No 32768 activation_5[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, None, No 1024 block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, None, No 1024 block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_3 (Add) (None, 256, None, No 0 block_3a_bn_2[0][0]
block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_7 (Activation) (None, 256, None, No 0 add_3[0][0]
__________________________________________________________________________________________________
input_2 (InputLayer) (None, 256, 4, 1) 0
__________________________________________________________________________________________________
crop_and_resize_1 (CropAndResiz (256, 256, 14, 14) 0 activation_7[0][0]
input_2[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D) (256, 512, 7, 7) 1179648 crop_and_resize_1[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (256, 512, 7, 7) 2048 block_4a_conv_1[0][0]
__________________________________________________________________________________________________
activation_8 (Activation) (256, 512, 7, 7) 0 block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D) (256, 512, 7, 7) 2359296 activation_8[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (256, 512, 7, 7) 131072 crop_and_resize_1[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (256, 512, 7, 7) 2048 block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (256, 512, 7, 7) 2048 block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_4 (Add) (256, 512, 7, 7) 0 block_4a_bn_2[0][0]
block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_9 (Activation) (256, 512, 7, 7) 0 add_4[0][0]
__________________________________________________________________________________________________
avg_pool (AveragePooling2D) (256, 512, 1, 1) 0 activation_9[0][0]
__________________________________________________________________________________________________
classifier_flatten (Flatten) (256, 512) 0 avg_pool[0][0]
__________________________________________________________________________________________________
rpn_conv1 (Conv2D) (None, 512, None, No 1180160 activation_7[0][0]
__________________________________________________________________________________________________
dense_class (Dense) (256, 1) 513 classifier_flatten[0][0]
__________________________________________________________________________________________________
dense_regress (Dense) (256, 0) 0 classifier_flatten[0][0]
__________________________________________________________________________________________________
rpn_out_class (Conv2D) (None, 9, None, None 4617 rpn_conv1[0][0]
__________________________________________________________________________________________________
rpn_out_regress (Conv2D) (None, 36, None, Non 18468 rpn_conv1[0][0]
__________________________________________________________________________________________________
TF_reshape_2_class (TFReshape) (1, 256, 1) 0 dense_class[0][0]
__________________________________________________________________________________________________
TF_reshape_3_regr (TFReshape) (1, 256, 0) 0 dense_regress[0][0]
==================================================================================================
Total params: 6,119,726
Trainable params: 5,806,638
Non-trainable params: 313,088
__________________________________________________________________________________________________
2019-10-28 14:41:44,209 [INFO] /usr/local/lib/python2.7/dist-packages/iva/faster_rcnn/scripts/train.pyc: Loading pretrained weights from /workspace/nvidia_experiment/tlt_resnet10_faster_rcnn_v1/resnet10.h5
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 10, in <module>
    sys.exit(main())
  File "./common/magnet_train.py", line 30, in main
  File "./faster_rcnn/scripts/train.py", line 232, in main
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/network.py", line 1163, in load_weights
    reshape=reshape)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/saving.py", line 1130, in load_weights_from_hdf5_group_by_name
    ' element(s).')
ValueError: Layer #4 (named "block_1a_conv_1") expects 1 weight(s), but the saved weights have 2 element(s).
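
One possible reading of the ValueError, offered as a sketch and not a confirmed diagnosis: the parameter counts in the summary above suggest that block_1a_conv_1 was built without a bias term (kernel only, so 1 weight), while the saved file holds 2 arrays for that layer, which would be consistent with kernel plus bias. The check below recomputes the counts from the summary. The kernel sizes (7x7 for conv1, 3x3 for the block and RPN convolutions, 1x1 for the shortcut) are assumptions based on the usual ResNet layout; they are not printed in the log. The helper names are hypothetical.

```python
def conv2d_param_count(in_channels, out_channels, kernel_h, kernel_w, use_bias):
    """Number of parameters in a 2D convolution: kernel weights plus optional bias."""
    kernel = in_channels * out_channels * kernel_h * kernel_w
    bias = out_channels if use_bias else 0
    return kernel + bias


def infer_bias(in_channels, out_channels, kernel_h, kernel_w, reported):
    """Return True if `reported` matches the count with bias, False if it matches
    the count without bias, and None if it matches neither."""
    if reported == conv2d_param_count(in_channels, out_channels, kernel_h, kernel_w, True):
        return True
    if reported == conv2d_param_count(in_channels, out_channels, kernel_h, kernel_w, False):
        return False
    return None


# (layer name, in channels, out channels, kernel h, kernel w, "Param #" from the summary)
# Channel counts and Param # come from the log; kernel sizes are ASSUMED.
LAYERS = [
    ("conv1", 3, 64, 7, 7, 9472),
    ("block_1a_conv_1", 64, 64, 3, 3, 36864),
    ("block_1a_conv_shortcut", 64, 64, 1, 1, 4096),
    ("block_4a_conv_1", 256, 512, 3, 3, 1179648),
    ("rpn_conv1", 256, 512, 3, 3, 1180160),
]

if __name__ == "__main__":
    for name, cin, cout, kh, kw, reported in LAYERS:
        has_bias = infer_bias(cin, cout, kh, kw, reported)
        # A layer without bias exposes 1 weight array (kernel); with bias it exposes 2.
        expected_arrays = {True: 2, False: 1}.get(has_bias)
        print(name, "bias:", has_bias, "weight arrays expected by the model:", expected_arrays)
```

Under these assumptions, the block convolutions come out as bias-free while conv1 and rpn_conv1 include a bias. If that is right, the resnet10.h5 file was probably saved from a model whose block convolutions do carry a bias, so the architecture settings in frcnn.config and the pretrained file may not match.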