While training mask_rcnn on a custom dataset, I get the following error:
[MaskRCNN] INFO : # ============================================= #
[MaskRCNN] INFO : Start Training
[MaskRCNN] INFO : # %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% #
[GPU 00] Restoring pretrained weights (265 Tensors)
[MaskRCNN] INFO : Pretrained weights loaded with success...
[MaskRCNN] INFO : Saving checkpoints for 0 into /mnt/flare/results/model.step-0.tlt.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: {{function_node __inference_Dataset_map__map_func_set_random_wrapper_15633}} Requested more than 0 entries, but params is empty. Params shape: [0,1920,1080]
[[{{node parser/process_boxes_classes_indices_for_training/GatherNd_2}}]]
[[IteratorGetNext]]
[[MLP/multilevel_propose_rois/level_2/combined_non_max_suppression/CombinedNonMaxSuppression/_3701]]
(1) Invalid argument: {{function_node __inference_Dataset_map__map_func_set_random_wrapper_15633}} Requested more than 0 entries, but params is empty. Params shape: [0,1920,1080]
[[{{node parser/process_boxes_classes_indices_for_training/GatherNd_2}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/train.py", line 222, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/train.py", line 218, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/train.py", line 85, in run_executer
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/executer/distributed_executer.py", line 399, in train_and_eval
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1195, in _train_model_default
saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1494, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
raise six.reraise(*original_exc_info)
File "/usr/local/lib/python3.6/dist-packages/six.py", line 696, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1418, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1176, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Requested more than 0 entries, but params is empty. Params shape: [0,1920,1080]
[[{{node parser/process_boxes_classes_indices_for_training/GatherNd_2}}]]
[[IteratorGetNext]]
[[MLP/multilevel_propose_rois/level_2/combined_non_max_suppression/CombinedNonMaxSuppression/_3701]]
(1) Invalid argument: Requested more than 0 entries, but params is empty. Params shape: [0,1920,1080]
[[{{node parser/process_boxes_classes_indices_for_training/GatherNd_2}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.
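The `Params shape: [0,1920,1080]` in the message looks like an empty stack of instance masks for a 1920x1080 image, which would suggest that at least one training record has no usable mask annotation. That reading is an assumption, not something the log confirms. A minimal sketch for checking the COCO-style annotation JSON before converting to TFRecords, assuming the standard `images` / `annotations` keys; the function name `find_images_without_masks` and the sample data are hypothetical:

```python
from collections import defaultdict


def find_images_without_masks(coco, skip_crowd=True):
    """Return ids of images that have no annotation with a usable segmentation.

    `coco` is a dict in COCO annotation layout ("images" and "annotations").
    `skip_crowd` mirrors skip_crowd_during_training in the spec; treating
    crowd annotations as absent here is an assumption about the parser.
    """
    masks_per_image = defaultdict(int)
    for ann in coco.get("annotations", []):
        if skip_crowd and ann.get("iscrowd", 0):
            continue
        seg = ann.get("segmentation")
        if isinstance(seg, dict):
            # RLE format: a non-empty dict (normally with "counts" and "size").
            has_mask = bool(seg)
        elif isinstance(seg, list):
            # Polygon format: at least one polygon with 3 or more (x, y) points.
            has_mask = any(
                isinstance(p, (list, tuple)) and len(p) >= 6 for p in seg
            )
        else:
            has_mask = False
        if has_mask:
            masks_per_image[ann["image_id"]] += 1
    return [
        img["id"]
        for img in coco.get("images", [])
        if masks_per_image[img["id"]] == 0
    ]


# Tiny in-memory example (made-up data, not from the dataset in question).
sample = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "iscrowd": 0,
         "segmentation": [[0, 0, 10, 0, 10, 10]]},
        {"image_id": 2, "iscrowd": 0, "segmentation": []},
    ],
}
print(find_images_without_masks(sample))
```

To run it against a real file, load the training annotation JSON with `json.load` and pass the resulting dict to the function. Any image ids it reports would be candidates to remove or re-annotate before regenerating the TFRecords.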
Specs used for training:
seed: 123
use_amp: False
warmup_steps: 1000
checkpoint: "/home/pretrained_resnet50/pretrained_instance_segmentation_vresnet50/resnet50.hdf5"
learning_rate_steps: "[10000, 15000, 20000]"
learning_rate_decay_levels: "[0.1, 0.02, 0.01]"
total_steps: 25000
train_batch_size: 2
eval_batch_size: 4
num_steps_per_eval: 5000
momentum: 0.9
l2_weight_decay: 0.0001
warmup_learning_rate: 0.0001
init_learning_rate: 0.01
data_config {
image_size: "(832, 1344)"
augment_input_data: True
eval_samples: 500
training_file_pattern: "/home/tfrecords/coco_train*"
validation_file_pattern: "/home/tfrecords/coco_val*"
val_json_file: "/home/val_seg_annotations.json"
# dataset specific parameters
num_classes: 3
skip_crowd_during_training: True
}
maskrcnn_config {
nlayers: 50
arch: "resnet"
freeze_bn: True
freeze_blocks: "[0,1]"
gt_mask_size: 112
# Region Proposal Network
rpn_positive_overlap: 0.7
rpn_negative_overlap: 0.3
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_min_size: 0.
# Proposal layer.
batch_size_per_im: 512
fg_fraction: 0.25
fg_thresh: 0.5
bg_thresh_hi: 0.5
bg_thresh_lo: 0.
# Faster-RCNN heads.
fast_rcnn_mlp_head_dim: 1024
bbox_reg_weights: "(10., 10., 5., 5.)"
# Mask-RCNN heads.
include_mask: True
mrcnn_resolution: 28
# training
train_rpn_pre_nms_topn: 2000
train_rpn_post_nms_topn: 1000
train_rpn_nms_threshold: 0.7
# evaluation
test_detections_per_image: 100
test_nms: 0.5
test_rpn_pre_nms_topn: 1000
test_rpn_post_nms_topn: 1000
test_rpn_nms_thresh: 0.7
# model architecture
min_level: 2
max_level: 6
num_scales: 1
aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
anchor_scale: 8
# localization loss
rpn_box_loss_weight: 1.0
fast_rcnn_box_loss_weight: 1.0
mrcnn_weight_loss_mask: 1.0
}