Hi, I’m facing two issues when trying to use the NVIDIA Retail Object Detection model as pre-trained weights:
- I’m not able to train on multiple GPUs.
- With --gpus 1, training completes the first epoch but fails near the beginning of the second epoch. It seems to run out of memory…
Any suggestions would be greatly appreciated. Thanks!
• Machine: GCP Vertex Notebook (Debian 10 + Python 3.7 + Driver Version: 510.47.03 + CUDA Version: 11.6)
• Hardware: V100 x 2
• Network Type: EfficientDet TF2
• TLT Version: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
• Training spec file
data:
  loader:
    prefetch_size: 4
    shuffle_file: True
  max_instances_per_image: 100
  skip_crowd_during_training: True
  image_size: '640x640'
  num_classes: 4
  train_tfrecords:
    - '/workspace/efficientdet/tfrecords/train/train-*'
  val_tfrecords:
    - '/workspace/efficientdet/tfrecords/val/val-*'
  val_json_file: '/workspace/efficientdet/datasets/ap_od_03292023_val/labels.json'
train:
  optimizer:
    name: 'sgd'
    momentum: 0.9
  lr_schedule:
    name: 'cosine'
    warmup_epoch: 5
    warmup_init: 0.0001
    learning_rate: 0.2
  amp: True
  checkpoint: "/workspace/efficientdet/efficientdet-d5_038.tlt"
  num_examples_per_epoch: 26972
  moving_average_decay: 0.999
  batch_size: 1
  checkpoint_interval: 10
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
  clip_gradients_norm: 10.0
  image_preview: True
  qat: False
  random_seed: 42
  pruned_model_path: ''
  num_epochs: 200
model:
  name: 'efficientdet-d5'
  aspect_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'
  anchor_scale: 4
  min_level: 3
  max_level: 7
  num_scales: 3
  freeze_bn: False
  freeze_blocks: []
augment:
  rand_hflip: True
  random_crop_min_scale: 0.1
  random_crop_max_scale: 2
  auto_color_distortion: False
  auto_translate_xy: True
evaluate:
  batch_size: 1
  num_samples: 4391
  max_detections_per_image: 100
  model_path: ''
export:
  max_batch_size: 8
  dynamic_batch_size: True
  min_score_thresh: 0.4
  model_path: ""
  output_path: ""
inference:
  model_path: ""
  image_dir: ""
  output_dir: ""
  dump_label: False
  batch_size: 1
prune:
  model_path: ""
  normalizer: 'max'
  output_path: ""
  equalization_criterion: 'union'
  granularity: 8
  threshold: 0.5
  min_num_filters: 16
  excluded_layers: []
key: 'nvidia-tlt'
results_dir: '/workspace/efficientdet/experiment_dir_unpruned'
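For reference, here is a minimal sketch of how I read the epoch length from this spec. It assumes the number of steps per epoch is num_examples_per_epoch divided by the global batch size (batch_size times the number of GPUs). That formula is my assumption, not something I have confirmed in the TAO source, and the function name is only illustrative. With one GPU it gives 26972, which matches the 26972/26972 progress bar in the single-GPU log further down.

```python
def steps_per_epoch(num_examples: int, batch_size_per_gpu: int, num_gpus: int) -> int:
    """Assumed formula: examples divided by the global batch size, rounded down."""
    global_batch = batch_size_per_gpu * num_gpus
    return num_examples // global_batch


# Values taken from the spec above: num_examples_per_epoch=26972, batch_size=1.
single_gpu_steps = steps_per_epoch(26972, 1, 1)
dual_gpu_steps = steps_per_epoch(26972, 1, 2)  # hypothetical 2-GPU run

print(single_gpu_steps, dual_gpu_steps)
```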
• How to reproduce the issue?
- command:
docker run -it --rm --gpus all -v /home/jupyter:/workspace --shm-size=32gb nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1 efficientdet_tf2 train -e /workspace/efficientdet/specs/train.yaml --gpus 1
- multi-GPU error log:
class-3-bn-4 (BatchNormalizati (None, 40, 40, 288) 1152 ['class-3[1][0]']
on)
class-3-bn-5 (BatchNormalizati (None, 20, 20, 288) 1152 ['class-3[2][0]']
on)
class-3-bn-6 (BatchNormalizati (None, 10, 10, 288) 1152 ['class-3[3][0]']
on)
class-3-bn-7 (BatchNormalizati (None, 5, 5, 288) 1152 ['class-3[4][0]']
on)
box-3-bn-3 (BatchNormalization (None, 80, 80, 288) 1152 ['box-3[0][0]']
)
box-3-bn-4 (BatchNormalization (None, 40, 40, 288) 1152 ['box-3[1][0]']
)
box-3-bn-5 (BatchNormalization (None, 20, 20, 288) 1152 ['box-3[2][0]']
)
box-3-bn-6 (BatchNormalization (None, 10, 10, 288) 1152 ['box-3[3][0]']
)
box-3-bn-7 (BatchNormalization (None, 5, 5, 288) 1152 ['box-3[4][0]']
)
activation_59 (Activation) (None, 80, 80, 288) 0 ['class-3-bn-3[0][0]']
activation_63 (Activation) (None, 40, 40, 288) 0 ['class-3-bn-4[0][0]']
activation_67 (Activation) (None, 20, 20, 288) 0 ['class-3-bn-5[0][0]']
activation_71 (Activation) (None, 10, 10, 288) 0 ['class-3-bn-6[0][0]']
activation_75 (Activation) (None, 5, 5, 288) 0 ['class-3-bn-7[0][0]']
activation_79 (Activation) (None, 80, 80, 288) 0 ['box-3-bn-3[0][0]']
activation_83 (Activation) (None, 40, 40, 288) 0 ['box-3-bn-4[0][0]']
activation_87 (Activation) (None, 20, 20, 288) 0 ['box-3-bn-5[0][0]']
activation_91 (Activation) (None, 10, 10, 288) 0 ['box-3-bn-6[0][0]']
activation_95 (Activation) (None, 5, 5, 288) 0 ['box-3-bn-7[0][0]']
class-predict (SeparableConv2D multiple 12996 ['activation_59[0][0]',
) 'activation_63[0][0]',
'activation_67[0][0]',
'activation_71[0][0]',
'activation_75[0][0]']
box-predict (SeparableConv2D) multiple 12996 ['activation_79[0][0]',
'activation_83[0][0]',
'activation_87[0][0]',
'activation_91[0][0]',
'activation_95[0][0]']
==================================================================================================
Total params: 33,657,021
Trainable params: 33,429,629
Non-trainable params: 227,392
__________________________________________________________________________________________________
LR schedule method: cosine
Use SGD optimizer
/usr/local/lib/python3.8/dist-packages/keras/backend.py:450: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
warnings.warn('`tf.keras.backend.set_learning_phase` is deprecated and '
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
Epoch 1/200
/usr/local/lib/python3.8/dist-packages/keras/backend.py:450: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
warnings.warn('`tf.keras.backend.set_learning_phase` is deprecated and '
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f89f00b0e50> and will run it as-is. Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f89f00b0e50>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f89f00b0e50> and will run it as-is. Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f89f00b0e50>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f89f00b0670> and will run it as-is. Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f89f00b0670>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f89f00b0670> and will run it as-is. Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f89f00b0670>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f6534255430> and will run it as-is. Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f6534255430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f6534255430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f6534255430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655b1dc8b0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655b1dc8b0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655b1dc8b0> and will run it as-is.
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655b1dc8b0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655b1dc8b0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f88be7adaf0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f88be7adaf0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f88be7adaf0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f88be7adaf0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f88be7adee0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f88be7adee0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f88be7adee0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f88be7adee0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f655ac5ce50> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f655ac5ce50>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7f655ac5ce50> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7f655ac5ce50>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655ab83430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655ab83430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655ab83430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7f655ab83430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node 30fb12a8e965 exited on signal 9 (Killed).
--------------------------------------------------------------------------
Sending telemetry data.
Telemetry data couldn't be sent, but the command ran successfully.
[Error]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL
- epoch 2 error log:
activation_58 (Activation) (None, 80, 80, 288) 0 ['class-2-bn-3[0][0]']
activation_62 (Activation) (None, 40, 40, 288) 0 ['class-2-bn-4[0][0]']
activation_66 (Activation) (None, 20, 20, 288) 0 ['class-2-bn-5[0][0]']
activation_70 (Activation) (None, 10, 10, 288) 0 ['class-2-bn-6[0][0]']
activation_74 (Activation) (None, 5, 5, 288) 0 ['class-2-bn-7[0][0]']
activation_78 (Activation) (None, 80, 80, 288) 0 ['box-2-bn-3[0][0]']
activation_82 (Activation) (None, 40, 40, 288) 0 ['box-2-bn-4[0][0]']
activation_86 (Activation) (None, 20, 20, 288) 0 ['box-2-bn-5[0][0]']
activation_90 (Activation) (None, 10, 10, 288) 0 ['box-2-bn-6[0][0]']
activation_94 (Activation) (None, 5, 5, 288) 0 ['box-2-bn-7[0][0]']
class-3 (SeparableConv2D) multiple 85824 ['activation_58[0][0]',
'activation_62[0][0]',
'activation_66[0][0]',
'activation_70[0][0]',
'activation_74[0][0]']
box-3 (SeparableConv2D) multiple 85824 ['activation_78[0][0]',
'activation_82[0][0]',
'activation_86[0][0]',
'activation_90[0][0]',
'activation_94[0][0]']
class-3-bn-3 (BatchNormalizati (None, 80, 80, 288) 1152 ['class-3[0][0]']
on)
class-3-bn-4 (BatchNormalizati (None, 40, 40, 288) 1152 ['class-3[1][0]']
on)
class-3-bn-5 (BatchNormalizati (None, 20, 20, 288) 1152 ['class-3[2][0]']
on)
class-3-bn-6 (BatchNormalizati (None, 10, 10, 288) 1152 ['class-3[3][0]']
on)
class-3-bn-7 (BatchNormalizati (None, 5, 5, 288) 1152 ['class-3[4][0]']
on)
box-3-bn-3 (BatchNormalization (None, 80, 80, 288) 1152 ['box-3[0][0]']
)
box-3-bn-4 (BatchNormalization (None, 40, 40, 288) 1152 ['box-3[1][0]']
)
box-3-bn-5 (BatchNormalization (None, 20, 20, 288) 1152 ['box-3[2][0]']
)
box-3-bn-6 (BatchNormalization (None, 10, 10, 288) 1152 ['box-3[3][0]']
)
box-3-bn-7 (BatchNormalization (None, 5, 5, 288) 1152 ['box-3[4][0]']
)
activation_59 (Activation) (None, 80, 80, 288) 0 ['class-3-bn-3[0][0]']
activation_63 (Activation) (None, 40, 40, 288) 0 ['class-3-bn-4[0][0]']
activation_67 (Activation) (None, 20, 20, 288) 0 ['class-3-bn-5[0][0]']
activation_71 (Activation) (None, 10, 10, 288) 0 ['class-3-bn-6[0][0]']
activation_75 (Activation) (None, 5, 5, 288) 0 ['class-3-bn-7[0][0]']
activation_79 (Activation) (None, 80, 80, 288) 0 ['box-3-bn-3[0][0]']
activation_83 (Activation) (None, 40, 40, 288) 0 ['box-3-bn-4[0][0]']
activation_87 (Activation) (None, 20, 20, 288) 0 ['box-3-bn-5[0][0]']
activation_91 (Activation) (None, 10, 10, 288) 0 ['box-3-bn-6[0][0]']
activation_95 (Activation) (None, 5, 5, 288) 0 ['box-3-bn-7[0][0]']
class-predict (SeparableConv2D multiple 12996 ['activation_59[0][0]',
) 'activation_63[0][0]',
'activation_67[0][0]',
'activation_71[0][0]',
'activation_75[0][0]']
box-predict (SeparableConv2D) multiple 12996 ['activation_79[0][0]',
'activation_83[0][0]',
'activation_87[0][0]',
'activation_91[0][0]',
'activation_95[0][0]']
==================================================================================================
Total params: 33,657,021
Trainable params: 33,429,629
Non-trainable params: 227,392
__________________________________________________________________________________________________
LR schedule method: cosine
Use SGD optimizer
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
Epoch 1/200
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7fac1430e430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7fac1430e430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7fac1430e430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7fac1430e430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4da988b0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4da988b0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4da988b0> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4da988b0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7fac4d557e50> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7fac4d557e50>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._update at 0x7fac4d557e50> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._update at 0x7fac4d557e50>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4d4fc430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4d4fc430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
AutoGraph could not transform <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4d4fc430> and will run it as-is.
Cause: Unable to locate the source code of <function HvdMovingAverage.update_average.<locals>._apply_moving at 0x7fac4d4fc430>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
6/26972 [..............................] - ETA: 3:59:32 - det_loss: 1.3752 - cls_loss: 0.6926 - box_loss: 0.0137 - reg_l2_loss: 0.2557 - reg_l1_loss: 0.0000e+00 - loss: 1.6308 - learning_rate: 1.0371e-04 - gradient_norm: nan
WARNING:tensorflow:Callback method `on_train_batch_end` is slow compared to the batch time (batch time: 0.5248s vs `on_train_batch_end` time: 3.8910s). Check your callbacks.
Callback method `on_train_batch_end` is slow compared to the batch time (batch time: 0.5248s vs `on_train_batch_end` time: 3.8910s). Check your callbacks.
26972/26972 [==============================] - ETA: 0s - det_loss: 0.8992 - cls_loss: 0.4544 - box_loss: 0.0089 - reg_l2_loss: 0.2541 - reg_l1_loss: 0.0000e+00 - loss: 1.1534 - learning_rate: 0.0201 - gradient_norm: nanNone
26972/26972 [==============================] - 11382s 403ms/step - det_loss: 0.8992 - cls_loss: 0.4544 - box_loss: 0.0089 - reg_l2_loss: 0.2541 - reg_l1_loss: 0.0000e+00 - loss: 1.1534 - learning_rate: 0.0201 - gradient_norm: nan - val_det_loss: 0.7016 - val_cls_loss: 0.3337 - val_box_loss: 0.0074 - val_loss: 0.9463
Epoch 2/200
2088/26972 [=>............................] - ETA: 2:40:13 - det_loss: 0.8325 - cls_loss: 0.3928 - box_loss: 0.0088 - reg_l2_loss: 0.2435 - reg_l1_loss: 0.0000e+00 - loss: 1.0760 - learning_rate: 0.0416 - gradient_norm: 1.3853
Killed
Sending telemetry data.
Telemetry data couldn't be sent, but the command ran successfully.
[Error]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL
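Both runs end with a bare Killed (signal 9 in the multi-GPU case) rather than a TensorFlow out-of-memory error, so my guess is that host RAM is running out rather than GPU memory, but I have not confirmed that. Below is a minimal sketch I could run inside the container alongside training to check available host memory. It assumes a Linux /proc/meminfo; the helper names and the SAMPLE text are illustrative and not taken from this machine.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def available_gib(text: str) -> float:
    """Return the MemAvailable field converted from kB to GiB."""
    return parse_meminfo(text)["MemAvailable"] / (1024 ** 2)


# Illustrative sample only; real values come from /proc/meminfo.
SAMPLE = "MemTotal:       61680000 kB\nMemAvailable:    2097152 kB\n"

if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    text = meminfo.read_text() if meminfo.exists() else SAMPLE
    print(f"available host memory: {available_gib(text):.2f} GiB")
```

If the available figure drops steadily during an epoch until the process is killed, that would point to host memory growth rather than the GPU.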