Mask RCNN export needs validation dataset tfrecords

Hi,
This follows up on the earlier topic "Mask RCNN export validation_file_pattern assertion failed".
I created a dummy tfrecord file, but I get the following error:

Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/export.py", line 12, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 268, in launch_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 250, in run_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 595, in export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 286, in save_etlt_file
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 239, in _train_ckpt_to_eval_ckpt
StopIteration
Using TensorFlow backend.

Please make sure there are no zero-size tfrecord files.

I checked again. I am using this tfrecord file, which has a size of 97.4 KB:
-00000-of-00001.tfrecord (97.4 KB)
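
For reference, a quick way to confirm the file actually contains parsable records (rather than just having a non-zero size) is to iterate over it. A minimal sketch, assuming the TF 1.15 environment shown in the log below; the path is a placeholder for the actual tfrecord file:

# count_records.py - sketch: count the records in a tfrecord file
import tensorflow as tf

path = "/path/to/-00000-of-00001.tfrecord"  # placeholder: point this at the actual file

count = 0
for _ in tf.compat.v1.python_io.tf_record_iterator(path):
    count += 1
print("records:", count)  # the export step needs at least one record it can parse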

Can you share the full command and full log when you run into the error?

full command:
mask_rcnn export -m /app/results/mm/unpruned_model/model.step-245.tlt -k $KEY -o /app/results/mm/export/model.step-245.etlt -e /app/results/mm/unpruned_model/final_spec.txt --data_type fp32 --gpu_index 1

full log:

Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))
/usr/local/lib/python3.6/dist-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))
2022-09-07 16:36:39,685 [INFO] iva.mask_rcnn.utils.spec_loader: Loading specification from /app/results/mm/unpruned_model/final_spec.txt
2022-09-07 16:36:39,688 [INFO] root: Loading weights from /app/results/mm/unpruned_model/model.step-245.tlt
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpeft_f09q', '_tf_random_seed': 123, '_save_summary_steps': None, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': gpu_options {
allow_growth: true
force_gpu_compatible: true
}
allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: TWO
}
}
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f740f2dc5f8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0,'_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
2022-09-07 16:36:41,372 [INFO] tensorflow: Using config: {'_model_dir': '/tmp/tmpeft_f09q', '_tf_random_seed': 123, '_save_summary_steps': None, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': gpu_options {
allow_growth: true
force_gpu_compatible: true
}
allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: TWO
}
}
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f740f2dc5f8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0,'_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Create CheckpointSaverHook.
2022-09-07 16:36:41,373 [INFO] tensorflow: Create CheckpointSaverHook.
WARNING:tensorflow:Entity <function InputReader.__call__.<locals>._prefetch_dataset at 0x7f740f2dfd08> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function InputReader.__call__.<locals>._prefetch_dataset at 0x7f740f2dfd08>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:41,390 [WARNING] tensorflow: Entity <function InputReader.__call__.<locals>._prefetch_dataset at 0x7f740f2dfd08> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug,set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function InputReader.__call__.<locals>._prefetch_dataset at 0x7f740f2dfd08>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you shouldto define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
[MaskRCNN] INFO    : [*] Limiting the amount of sample to: 20
WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:349: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2022-09-07 16:36:41,403 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:349: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

2022-09-07 16:36:41,417 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

WARNING:tensorflow:Entity <function dataset_parser at 0x7f72d2900840> could not be transformed and will be executed as-is.Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function dataset_parser at 0x7f72d2900840>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:41,447 [WARNING] tensorflow: Entity <function dataset_parser at 0x7f72d2900840> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function dataset_parser at 0x7f72d2900840>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
2022-09-07 16:36:41,460 [WARNING] tensorflow: The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
2022-09-07 16:36:41,463 [WARNING] tensorflow: The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
2022-09-07 16:36:41,467 [WARNING] tensorflow: The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
2022-09-07 16:36:41,471 [WARNING] tensorflow: The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
INFO:tensorflow:Calling model_fn.
2022-09-07 16:36:41,577 [INFO] tensorflow: Calling model_fn.
[MaskRCNN] INFO    : ***********************
[MaskRCNN] INFO    : Loading model graph...
[MaskRCNN] INFO    : ***********************
WARNING:tensorflow:Entity <bound method AnchorLayer.call of <iva.mask_rcnn.layers.anchor_layer.AnchorLayer object at 0x7f740efef550>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method AnchorLayer.call of <iva.mask_rcnn.layers.anchor_layer.AnchorLayer object at 0x7f740efef550>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,002 [WARNING] tensorflow: Entity <bound method AnchorLayer.call of <iva.mask_rcnn.layers.anchor_layer.AnchorLayer object at 0x7f740efef550>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method AnchorLayer.call of <iva.mask_rcnn.layers.anchor_layer.AnchorLayer object at 0x7f740efef550>>. Note that functions defined in certain environments, like the interactive Python shell donot expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method MultilevelProposal.call of <iva.mask_rcnn.layers.multilevel_proposal_layer.MultilevelProposal object at 0x7f740efef748>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MultilevelProposal.call of <iva.mask_rcnn.layers.multilevel_proposal_layer.MultilevelProposal object at 0x7f740efef748>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,020 [WARNING] tensorflow: Entity <bound method MultilevelProposal.call of <iva.mask_rcnn.layers.multilevel_proposal_layer.MultilevelProposal object at 0x7f740efef748>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MultilevelProposal.call of <iva.mask_rcnn.layers.multilevel_proposal_layer.MultilevelProposal object at 0x7f740efef748>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
[MaskRCNN] INFO    : [ROI OPs] Using Batched NMS... Scope: MLP/multilevel_propose_rois/level_2/
[MaskRCNN] INFO    : [ROI OPs] Using Batched NMS... Scope: MLP/multilevel_propose_rois/level_3/
[MaskRCNN] INFO    : [ROI OPs] Using Batched NMS... Scope: MLP/multilevel_propose_rois/level_4/
[MaskRCNN] INFO    : [ROI OPs] Using Batched NMS... Scope: MLP/multilevel_propose_rois/level_5/
[MaskRCNN] INFO    : [ROI OPs] Using Batched NMS... Scope: MLP/multilevel_propose_rois/level_6/
WARNING:tensorflow:Entity <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740efef940>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740efef940>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a.py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,140 [WARNING] tensorflow: Entity <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740efef940>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740efef940>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, youshould to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740efefdd8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740efefdd8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,228 [WARNING] tensorflow: Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740efefdd8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740efefdd8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effab38>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effab38>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,257 [WARNING] tensorflow: Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effab38>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effab38>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effac50>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effac50>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,260 [WARNING] tensorflow: Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effac50>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740effac50>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method GPUDetections.call of <iva.mask_rcnn.layers.gpu_detection_layer.GPUDetections object at 0x7f740effad68>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method GPUDetections.call of <iva.mask_rcnn.layers.gpu_detection_layer.GPUDetectionsobject at 0x7f740effad68>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,263 [WARNING] tensorflow: Entity <bound method GPUDetections.call of <iva.mask_rcnn.layers.gpu_detection_layer.GPUDetections object at 0x7f740effad68>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method GPUDetections.call of <iva.mask_rcnn.layers.gpu_detection_layer.GPUDetections object at 0x7f740effad68>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could notget source code
WARNING:tensorflow:Entity <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740f0040b8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740f0040b8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a.py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,292 [WARNING] tensorflow: Entity <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740f0040b8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MultilevelCropResize.call of <iva.mask_rcnn.layers.multilevel_crop_resize_layer.MultilevelCropResize object at 0x7f740f0040b8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, youshould to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740f0041d0>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740f0041d0>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:42,419 [WARNING] tensorflow: Entity <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740f0041d0>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method ReshapeLayer.call of <iva.mask_rcnn.layers.reshape_layer.ReshapeLayer object at 0x7f740f0041d0>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method MaskPostprocess.call of <iva.mask_rcnn.layers.mask_postprocess_layer.MaskPostprocess object at 0x7f740f00ae80>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MaskPostprocess.call of <iva.mask_rcnn.layers.mask_postprocess_layer.MaskPostprocess object at 0x7f740f00ae80>>. Note that functions defined in certain environments, like the interactive Pythonshell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-09-07 16:36:43,185 [WARNING] tensorflow: Entity <bound method MaskPostprocess.call of <iva.mask_rcnn.layers.mask_postprocess_layer.MaskPostprocess object at 0x7f740f00ae80>> could not be transformed and will be executed as-is. Please reportthis to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method MaskPostprocess.call of <iva.mask_rcnn.layers.mask_postprocess_layer.MaskPostprocess object at 0x7f740f00ae80>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
image_input (ImageInput)        [(2, 3, 640, 640)]   0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (2, 64, 320, 320)    9408        image_input[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (2, 64, 320, 320)    256         conv1[0][0]
__________________________________________________________________________________________________
activation (Activation)         (2, 64, 320, 320)    0           bn_conv1[0][0]
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (2, 64, 160, 160)    0           activation[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (2, 64, 160, 160)    36864       max_pooling2d[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (2, 64, 160, 160)    256         block_1a_conv_1[0][0]
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (2, 64, 160, 160)    0           block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (2, 64, 160, 160)    36864       block_1a_relu_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (2, 64, 160, 160)    4096        max_pooling2d[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (2, 64, 160, 160)    256         block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (2, 64, 160, 160)    256         block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add (Add)                       (2, 64, 160, 160)    0           block_1a_bn_2[0][0]
block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1a_relu (Activation)      (2, 64, 160, 160)    0           add[0][0]
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (2, 64, 160, 160)    36864       block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (2, 64, 160, 160)    256         block_1b_conv_1[0][0]
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (2, 64, 160, 160)    0           block_1b_bn_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (2, 64, 160, 160)    36864       block_1b_relu_1[0][0]
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (2, 64, 160, 160)    256         block_1b_conv_2[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (2, 64, 160, 160)    0           block_1b_bn_2[0][0]
block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_relu (Activation)      (2, 64, 160, 160)    0           add_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (2, 128, 80, 80)     73728       block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (2, 128, 80, 80)     512         block_2a_conv_1[0][0]
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (2, 128, 80, 80)     0           block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (2, 128, 80, 80)     147456      block_2a_relu_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (2, 128, 80, 80)     8192        block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (2, 128, 80, 80)     512         block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (2, 128, 80, 80)     512         block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (2, 128, 80, 80)     0           block_2a_bn_2[0][0]
block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2a_relu (Activation)      (2, 128, 80, 80)     0           add_2[0][0]
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (2, 128, 80, 80)     147456      block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (2, 128, 80, 80)     512         block_2b_conv_1[0][0]
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (2, 128, 80, 80)     0           block_2b_bn_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (2, 128, 80, 80)     147456      block_2b_relu_1[0][0]
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (2, 128, 80, 80)     512         block_2b_conv_2[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (2, 128, 80, 80)     0           block_2b_bn_2[0][0]
block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_relu (Activation)      (2, 128, 80, 80)     0           add_3[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (2, 256, 40, 40)     294912      block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (2, 256, 40, 40)     1024        block_3a_conv_1[0][0]
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (2, 256, 40, 40)     0           block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (2, 256, 40, 40)     589824      block_3a_relu_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (2, 256, 40, 40)     32768       block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (2, 256, 40, 40)     1024        block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (2, 256, 40, 40)     1024        block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (2, 256, 40, 40)     0           block_3a_bn_2[0][0]
block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3a_relu (Activation)      (2, 256, 40, 40)     0           add_4[0][0]
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (2, 256, 40, 40)     589824      block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (2, 256, 40, 40)     1024        block_3b_conv_1[0][0]
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (2, 256, 40, 40)     0           block_3b_bn_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (2, 256, 40, 40)     589824      block_3b_relu_1[0][0]
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (2, 256, 40, 40)     1024        block_3b_conv_2[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (2, 256, 40, 40)     0           block_3b_bn_2[0][0]
block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_relu (Activation)      (2, 256, 40, 40)     0           add_5[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (2, 512, 20, 20)     1179648     block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (2, 512, 20, 20)     2048        block_4a_conv_1[0][0]
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (2, 512, 20, 20)     0           block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (2, 512, 20, 20)     2359296     block_4a_relu_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (2, 512, 20, 20)     131072      block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (2, 512, 20, 20)     2048        block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (2, 512, 20, 20)     2048        block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (2, 512, 20, 20)     0           block_4a_bn_2[0][0]
block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4a_relu (Activation)      (2, 512, 20, 20)     0           add_6[0][0]
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (2, 512, 20, 20)     2359296     block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (2, 512, 20, 20)     2048        block_4b_conv_1[0][0]
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (2, 512, 20, 20)     0           block_4b_bn_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (2, 512, 20, 20)     2359296     block_4b_relu_1[0][0]
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (2, 512, 20, 20)     2048        block_4b_conv_2[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (2, 512, 20, 20)     0           block_4b_bn_2[0][0]
block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_relu (Activation)      (2, 512, 20, 20)     0           add_7[0][0]
__________________________________________________________________________________________________
l5 (Conv2D)                     (2, 256, 20, 20)     131328      block_4b_relu[0][0]
__________________________________________________________________________________________________
l4 (Conv2D)                     (2, 256, 40, 40)     65792       block_3b_relu[0][0]
__________________________________________________________________________________________________
FPN_up_4 (UpSampling2D)         (2, 256, 40, 40)     0           l5[0][0]
__________________________________________________________________________________________________
FPN_add_4 (Add)                 (2, 256, 40, 40)     0           l4[0][0]
FPN_up_4[0][0]
__________________________________________________________________________________________________
l3 (Conv2D)                     (2, 256, 80, 80)     33024       block_2b_relu[0][0]
__________________________________________________________________________________________________
FPN_up_3 (UpSampling2D)         (2, 256, 80, 80)     0           FPN_add_4[0][0]
__________________________________________________________________________________________________
FPN_add_3 (Add)                 (2, 256, 80, 80)     0           l3[0][0]
FPN_up_3[0][0]
__________________________________________________________________________________________________
l2 (Conv2D)                     (2, 256, 160, 160)   16640       block_1b_relu[0][0]
__________________________________________________________________________________________________
FPN_up_2 (UpSampling2D)         (2, 256, 160, 160)   0           FPN_add_3[0][0]
__________________________________________________________________________________________________
FPN_add_2 (Add)                 (2, 256, 160, 160)   0           l2[0][0]
FPN_up_2[0][0]
__________________________________________________________________________________________________
post_hoc_d5 (Conv2D)            (2, 256, 20, 20)     590080      l5[0][0]
__________________________________________________________________________________________________
post_hoc_d2 (Conv2D)            (2, 256, 160, 160)   590080      FPN_add_2[0][0]
__________________________________________________________________________________________________
post_hoc_d3 (Conv2D)            (2, 256, 80, 80)     590080      FPN_add_3[0][0]
__________________________________________________________________________________________________
post_hoc_d4 (Conv2D)            (2, 256, 40, 40)     590080      FPN_add_4[0][0]
__________________________________________________________________________________________________
p6 (MaxPooling2D)               (2, 256, 10, 10)     0           post_hoc_d5[0][0]
__________________________________________________________________________________________________
rpn (Conv2D)                    multiple             590080      post_hoc_d2[0][0]
post_hoc_d3[0][0]
post_hoc_d4[0][0]
post_hoc_d5[0][0]
p6[0][0]
__________________________________________________________________________________________________
rpn-class (Conv2D)              multiple             771         rpn[0][0]
rpn[1][0]
rpn[2][0]
rpn[3][0]
rpn[4][0]
__________________________________________________________________________________________________
rpn-box (Conv2D)                multiple             3084        rpn[0][0]
rpn[1][0]
rpn[2][0]
rpn[3][0]
rpn[4][0]
__________________________________________________________________________________________________
permute (Permute)               (2, 160, 160, 3)     0           rpn-class[0][0]
__________________________________________________________________________________________________
permute_2 (Permute)             (2, 80, 80, 3)       0           rpn-class[1][0]
__________________________________________________________________________________________________
permute_4 (Permute)             (2, 40, 40, 3)       0           rpn-class[2][0]
__________________________________________________________________________________________________
permute_6 (Permute)             (2, 20, 20, 3)       0           rpn-class[3][0]
__________________________________________________________________________________________________
permute_8 (Permute)             (2, 10, 10, 3)       0           rpn-class[4][0]
__________________________________________________________________________________________________
permute_1 (Permute)             (2, 160, 160, 12)    0           rpn-box[0][0]
__________________________________________________________________________________________________
permute_3 (Permute)             (2, 80, 80, 12)      0           rpn-box[1][0]
__________________________________________________________________________________________________
permute_5 (Permute)             (2, 40, 40, 12)      0           rpn-box[2][0]
__________________________________________________________________________________________________
permute_7 (Permute)             (2, 20, 20, 12)      0           rpn-box[3][0]
__________________________________________________________________________________________________
permute_9 (Permute)             (2, 10, 10, 12)      0           rpn-box[4][0]
__________________________________________________________________________________________________
anchor_layer (AnchorLayer)      OrderedDict([(2, (16 0           image_input[0][0]
__________________________________________________________________________________________________
info_input (InfoInput)          [(2, 5)]             0
__________________________________________________________________________________________________
MLP (MultilevelProposal)        ((2, 1000), (2, 1000 0           permute[0][0]
permute_2[0][0]
permute_4[0][0]
permute_6[0][0]
permute_8[0][0]
permute_1[0][0]
permute_3[0][0]
permute_5[0][0]
permute_7[0][0]
permute_9[0][0]
anchor_layer[0][0]
anchor_layer[0][1]
anchor_layer[0][2]
anchor_layer[0][3]
anchor_layer[0][4]
info_input[0][0]
__________________________________________________________________________________________________
multilevel_crop_resize (Multile (2, 1000, 256, 7, 7) 0           post_hoc_d2[0][0]
post_hoc_d3[0][0]
post_hoc_d4[0][0]
post_hoc_d5[0][0]
p6[0][0]
MLP[0][1]
__________________________________________________________________________________________________
box_head_reshape1 (ReshapeLayer (2000, 12544)        0           multilevel_crop_resize[0][0]
INFO:tensorflow:Done calling model_fn.
2022-09-07 16:36:43,231 [INFO] tensorflow: Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2022-09-07 16:36:43,727 [INFO] tensorflow: Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpeft_f09q/model.ckpt-245
2022-09-07 16:36:43,729 [INFO] tensorflow: Restoring parameters from /tmp/tmpeft_f09q/model.ckpt-245
INFO:tensorflow:Running local_init_op.
2022-09-07 16:36:44,089 [INFO] tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2022-09-07 16:36:44,122 [INFO] tensorflow: Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 245 into /tmp/tmpaoeugian/model.ckpt.
2022-09-07 16:36:45,588 [INFO] tensorflow: Saving checkpoints for 245 into /tmp/tmpaoeugian/model.ckpt.

__________________________________________________________________________________________________
fc6 (Dense)                     (2000, 1024)         12846080    box_head_reshape1[0][0]
__________________________________________________________________________________________________
fc7 (Dense)                     (2000, 1024)         1049600     fc6[0][0]
__________________________________________________________________________________________________
class-predict (Dense)           (2000, 2)            2050        fc7[0][0]
__________________________________________________________________________________________________
box-predict (Dense)             (2000, 8)            8200        fc7[0][0]
__________________________________________________________________________________________________
box_head_reshape2 (ReshapeLayer (2, 1000, 2)         0           class-predict[0][0]
__________________________________________________________________________________________________
box_head_reshape3 (ReshapeLayer (2, 1000, 8)         0           box-predict[0][0]
__________________________________________________________________________________________________
gpu_detections (GPUDetections)  ((2,), (2, 100, 4),  0           box_head_reshape2[0][0]
box_head_reshape3[0][0]
MLP[0][1]
info_input[0][0]
__________________________________________________________________________________________________
multilevel_crop_resize_1 (Multi (2, 100, 256, 14, 14 0           post_hoc_d2[0][0]
post_hoc_d3[0][0]
post_hoc_d4[0][0]
post_hoc_d5[0][0]
p6[0][0]
gpu_detections[0][1]
__________________________________________________________________________________________________
mask_head_reshape_1 (ReshapeLay (200, 256, 14, 14)   0           multilevel_crop_resize_1[0][0]
__________________________________________________________________________________________________
mask-conv-l0 (Conv2D)           (200, 256, 14, 14)   590080      mask_head_reshape_1[0][0]
__________________________________________________________________________________________________
mask-conv-l1 (Conv2D)           (200, 256, 14, 14)   590080      mask-conv-l0[0][0]
__________________________________________________________________________________________________
mask-conv-l2 (Conv2D)           (200, 256, 14, 14)   590080      mask-conv-l1[0][0]
__________________________________________________________________________________________________
mask-conv-l3 (Conv2D)           (200, 256, 14, 14)   590080      mask-conv-l2[0][0]
__________________________________________________________________________________________________
conv5-mask (Conv2DTranspose)    (200, 256, 28, 28)   262400      mask-conv-l3[0][0]
__________________________________________________________________________________________________
mask_fcn_logits (Conv2D)        (200, 2, 28, 28)     514         conv5-mask[0][0]
__________________________________________________________________________________________________
mask_postprocess (MaskPostproce (2, 100, 28, 28)     0           mask_fcn_logits[0][0]
gpu_detections[0][2]
__________________________________________________________________________________________________
mask_sigmoid (Activation)       (2, 100, 28, 28)     0           mask_postprocess[0][0]
==================================================================================================
Total params: 30,920,667
Trainable params: 11,180,736
Non-trainable params: 19,739,931
__________________________________________________________________________________________________
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/export.py", line 12, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 268, in launch_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 250, in run_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 595, in export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 286, in save_etlt_file
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 239, in _train_ckpt_to_eval_ckpt
StopIteration
Using TensorFlow backend.

Can you share the full spec file? Thanks.

final_spec.txt (2.0 KB)

Can you change the tfrecords settings in the spec file and retry? That is, change:

training_file_pattern: "/app/results/mrcnn/unpruned_model/mask_temp/tfrecords/train/*.tfrecord"
validation_file_pattern: "/app/results/mrcnn/unpruned_model/mask_temp/tfrecords/val/*.tfrecord"
val_json_file: "/app/results/mrcnn/unpruned_model/mask_temp/val/val_coco.json"

to

training_file_pattern: "/app/results/mrcnn/unpruned_model/mask_temp/tfrecords/train/*.tfrecord"
validation_file_pattern: "/app/results/mrcnn/unpruned_model/mask_temp/tfrecords/train/*.tfrecord"
val_json_file: "yourtraining_data.json"

Hi,
I tried that, but it gives the same error.

Can you follow the official notebook and run it successfully?

I also ran through the official notebook, but it still gives an error:

tensorflow.python.framework.errors_impl.NotFoundError: /app/results/mrcnn/unpruned_model/mask_temp/tfrecords/train; No such file or directory

full log:

INFO:tensorflow:Done calling model_fn.
2022-09-16 15:02:54,178 [INFO] tensorflow: Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2022-09-16 15:02:54,665 [INFO] tensorflow: Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmphi__b1kk/model.ckpt-245
2022-09-16 15:02:54,668 [INFO] tensorflow: Restoring parameters from /tmp/tmphi__b1kk/model.ckpt-245
INFO:tensorflow:Running local_init_op.
2022-09-16 15:02:55,005 [INFO] tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2022-09-16 15:02:55,035 [INFO] tensorflow: Done running local_init_op.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: /app/results/mrcnn/unpruned_model/mask_temp/tfrecords/train; No such file or directory
[[{{node list_files/MatchingFiles}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/export.py", line 12, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 268, in launch_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 250, in run_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 595, in export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 286, in save_etlt_file
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 239, in _train_ckpt_to_eval_ckpt
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 638, in predict
hooks=all_hooks) as mon_sess:
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1014, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 725, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1207, in __init__
_WrappedSession.__init__(self, self._create_session())
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1212, in _create_session
return self._sess_creator.create_session()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 885, in create_session
hook.after_create_session(self.tf_sess, self.coord)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/util.py", line 90, in after_create_session
session.run(self._initializer)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: /app/results/mrcnn/unpruned_model/mask_temp/tfrecords/train; No such file or directory
[[node list_files/MatchingFiles (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'list_files/MatchingFiles':
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/export.py", line 12, in <module>
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 268, in launch_export
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 250, in run_export
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 595, in export
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 286, in save_etlt_file
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 239, in _train_ckpt_to_eval_ckpt
File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 620, in predict
input_fn, ModeKeys.PREDICT)
File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 996, in _get_features_from_input_fn
result = self._call_input_fn(input_fn, mode)
File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1116, in _call_input_fn
return input_fn(**kwargs)
File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/dataloader/dataloader.py", line 89, in __call__
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1955, in list_files
return DatasetV1Adapter(DatasetV2.list_files(file_pattern, shuffle, seed))
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 877, in list_files
matching_files = gen_io_ops.matching_files(file_pattern)
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 464, in matching_files
"MatchingFiles", pattern=pattern, name=name)
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
return func(*args, **kwargs)
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()

I am afraid there is something wrong in the ~/.tao_mounts.json file: the NotFoundError suggests that the tfrecords path in the spec is not visible from where export runs.
Please make sure the mapping is correct.
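
For reference, a minimal mounts file looks roughly like the sketch below (the source path is a placeholder for wherever the results and tfrecords live on the host); the key point is that the host directory must map to the same /app/... destination that the spec file and the export command refer to:

{
    "Mounts": [
        {
            "source": "/home/<user>/results",
            "destination": "/app/results"
        }
    ]
}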

As I said, I no longer have the training and validation datasets, but I want to export my model after training.
You told me to use dummy tfrecords in the validation path, but this solution did not work.

May I know how you generated the tfrecord file -00000-of-00001.tfrecord?

Using this code:
create_coco_tf_record.py (11.8 KB)

So, it is from the official notebook, right?

I revisited the original topic, where you mentioned that "I have already trained the model and now I do not have the dataset."

To narrow down, can you use the official notebook to run a short training (for example, 1000 steps), and then try to run export again with the above tfrecord file set in the spec file?
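
Also, whichever script produced it, it is worth confirming that the dummy tfrecord actually contains serialized examples, since an empty record stream is one way the exporter's internal predict loop can exhaust immediately and raise StopIteration. A minimal sketch, assuming the TensorFlow 1.15 build inside the container and a placeholder path:

import tensorflow as tf

# Placeholder path to the dummy/validation tfrecord; adjust to your own file.
path = "/app/results/mrcnn/unpruned_model/mask_temp/tfrecords/val/-00000-of-00001.tfrecord"

# tf_record_iterator reads the raw serialized examples without building a graph.
count = sum(1 for _ in tf.compat.v1.io.tf_record_iterator(path))
print("records in", path, ":", count)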

Yes, it is.

I tried. But I get the following error again:

Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/export.py", line 12, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 268, in launch_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 250, in run_export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 595, in export
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 286, in save_etlt_file
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/export/exporter.py", line 239, in _train_ckpt_to_eval_ckpt
StopIteration

So, may I conclude that you meet the above error even with the default mask_rcnn Jupyter notebook? Did you try to train a new model for a short period (for example, 1000 steps) and then export?