Here is my configuration file:
seed: 123
use_amp: False
warmup_steps: 5000
checkpoint: "/workspace/tao-experiments/mask_rcnn/pretrained_resnet10/pretrained_instance_segmentation_vresnet10/resnet10.hdf5"
learning_rate_steps: "[50000, 100000, 150000]"
learning_rate_decay_levels: "[0.1, 0.05, 0.01]"
total_steps: 800000
train_batch_size: 4
eval_batch_size: 4
num_steps_per_eval: 5000
momentum: 0.9
l2_weight_decay: 0.00004
warmup_learning_rate: 0.0001
init_learning_rate: 0.001
num_examples_per_epoch: 118288
data_config {
image_size: "(640, 640)"
augment_input_data: True
eval_samples: 5000
training_file_pattern: "/workspace/tao-experiments/data/maskrcnn/train*.tfrecord"
validation_file_pattern: "/workspace/tao-experiments/data/maskrcnn/val*.tfrecord"
val_json_file: "/workspace/tao-experiments/data/raw-data/annotations/coco_annotations_val_fixed_largeset.json"
# dataset specific parameters
num_classes: 2 # Including background
skip_crowd_during_training: True
}
maskrcnn_config {
nlayers: 10
arch: "resnet"
freeze_bn: True
freeze_blocks: "[0,1]"
gt_mask_size: 112
# Region Proposal Network
rpn_positive_overlap: 0.7
rpn_negative_overlap: 0.3
rpn_batch_size_per_im: 128
rpn_fg_fraction: 0.5
rpn_min_size: 0.
# Proposal layer.
batch_size_per_im: 256
fg_fraction: 0.25
fg_thresh: 0.5
bg_thresh_hi: 0.5
bg_thresh_lo: 0.
# Faster-RCNN heads.
fast_rcnn_mlp_head_dim: 1024
bbox_reg_weights: "(10., 10., 5., 5.)"
# Mask-RCNN heads.
include_mask: True
mrcnn_resolution: 28
# training
train_rpn_pre_nms_topn: 2000
train_rpn_post_nms_topn: 1000
train_rpn_nms_threshold: 0.7
# evaluation
test_detections_per_image: 100
test_nms: 0.5
test_rpn_pre_nms_topn: 1000
test_rpn_post_nms_topn: 1000
test_rpn_nms_thresh: 0.7
# model architecture
min_level: 2
max_level: 6
num_scales: 1
aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
anchor_scale: 8
# localization loss
rpn_box_loss_weight: 1.0
fast_rcnn_box_loss_weight: 1.0
mrcnn_weight_loss_mask: 1.0
}
When I run this command in a Jupyter notebook:
!tao model mask_rcnn evaluate -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
    -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.epoch-$NUM_EPOCH.tlt
it gives the following error:
2025-01-07 14:22:22,278 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2025-01-07 14:22:22,378 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 361: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2025-01-07 14:22:22,452 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
2025-01-07 08:52:23.595366: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2025-01-07 08:52:23,683 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2025-01-07 08:52:26.716403: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
Using TensorFlow backend.
2025-01-07 08:52:26,975 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2025-01-07 08:52:27,045 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2025-01-07 08:52:27,049 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-01-07 08:52:27,509 [TAO Toolkit] [WARNING] matplotlib 500: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-5gwxtrr_ because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2025-01-07 08:52:27,934 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
2025-01-07 08:52:29.224203: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libnvinfer.so.8
2025-01-07 08:52:29.237784: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcuda.so.1
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2025-01-07 08:52:31,246 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2025-01-07 08:52:31,283 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-01-07 08:52:31,287 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
[INFO] Starting MaskRCNN evaluation.
[INFO] Loading specification from /workspace/tao-experiments/mask_rcnn/specs/maskrcnn_train_resnet50.txt
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp6yjrqarm', '_tf_random_seed': 123, '_save_summary_steps': None, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': gpu_options {
allow_growth: true
force_gpu_compatible: true
}
allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: TWO
}
}
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x759a6ab61d60>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
[MaskRCNN] INFO : Starting to evaluate.
[MaskRCNN] INFO : Loading weights from /workspace/tao-experiments/mask_rcnn/experiment_dir_unpruned/model.epoch-6.tlt
loading annotations into memory…
Done (t=2.44s)
creating index…
index created!
[MaskRCNN] INFO : [*] Limiting the amount of sample to: 5000
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/third_party/keras/tensorflow_backend.py:361: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
WARNING:tensorflow:The operation tf.image.convert_image_dtype
will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:The operation tf.image.convert_image_dtype
will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:The operation tf.image.convert_image_dtype
will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:The operation tf.image.convert_image_dtype
will be skipped since the input and output dtypes are identical.
INFO:tensorflow:Calling model_fn.
[MaskRCNN] INFO : ***********************
[MaskRCNN] INFO : Loading model graph…
[MaskRCNN] INFO : ***********************
[MaskRCNN] INFO : [ROI OPs] Using Batched NMS… Scope: MLP/multilevel_propose_rois/level_2/
[MaskRCNN] INFO : [ROI OPs] Using Batched NMS… Scope: MLP/multilevel_propose_rois/level_3/
[MaskRCNN] INFO : [ROI OPs] Using Batched NMS… Scope: MLP/multilevel_propose_rois/level_4/
[MaskRCNN] INFO : [ROI OPs] Using Batched NMS… Scope: MLP/multilevel_propose_rois/level_5/
[MaskRCNN] INFO : [ROI OPs] Using Batched NMS… Scope: MLP/multilevel_propose_rois/level_6/
[INFO] in converted code:
relative to /usr/local/lib/python3.8/dist-packages:
nvidia_tao_tf1/cv/mask_rcnn/layers/reshape_layer.py:37 call *
return tf.reshape(inputs, self.shape, name=self.name)
tensorflow_core/python/ops/array_ops.py:131 reshape
result = gen_array_ops.reshape(tensor, shape, name)
tensorflow_core/python/ops/gen_array_ops.py:8114 reshape
_, _, _op = _op_def_lib._apply_op_helper(
tensorflow_core/python/framework/op_def_library.py:792 _apply_op_helper
op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
tensorflow_core/python/util/deprecation.py:513 new_func
return func(*args, **kwargs)
tensorflow_core/python/framework/ops.py:3356 create_op
return self._create_op_internal(op_type, inputs, dtypes, input_types, name,
tensorflow_core/python/framework/ops.py:3418 _create_op_internal
ret = Operation(
tensorflow_core/python/framework/ops.py:1769 __init__
self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
tensorflow_core/python/framework/ops.py:1610 _create_c_op
raise ValueError(str(e))
ValueError: Cannot reshape a tensor with 50176000 elements to shape [8000,12544] (100352000 elements) for 'box_head_reshape1/box_head_reshape1' (op: 'Reshape') with input shapes: [4,1000,256,7,7], [2] and with input tensors computed as partial shapes: input[1] = [8000,12544].
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/scripts/evaluate.py", line 227, in <module>
raise e
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/scripts/evaluate.py", line 215, in <module>
main()
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/scripts/evaluate.py", line 204, in main
eval_results = run_executer(RUN_CONFIG, None, eval_input_fn)
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/scripts/evaluate.py", line 111, in run_executer
eval_results = executer.eval(eval_input_fn=eval_input_fn)
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/executer/distributed_executer.py", line 469, in eval
eval_results, predictions = evaluation.evaluate(
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/utils/evaluation.py", line 349, in evaluate
eval_results, predictions = compute_coco_eval_metric(
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/utils/evaluation.py", line 127, in compute_coco_eval_metric
step_predictions = six.next(predictor)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 621, in predict
estimator_spec = self._call_model_fn(
File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1149, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/models/mask_rcnn_model.py", line 694, in mask_rcnn_model_fn
return _model_fn(
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/models/mask_rcnn_model.py", line 531, in _model_fn
model_outputs = build_model_graph(features, labels,
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/models/mask_rcnn_model.py", line 222, in build_model_graph
pruned_model = model_loader.get_model_with_input(
File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/mask_rcnn/utils/model_loader.py", line 121, in get_model_with_input
loaded_model = tf.keras.models.model_from_json(model_json,
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/keras/saving/model_config.py", line 92, in model_from_json
return deserialize(config, custom_objects=custom_objects)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/keras/layers/serialization.py", line 101, in deserialize
return deserialize_keras_object(
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/keras/utils/generic_utils.py", line 187, in deserialize_keras_object
return cls.from_config(
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/keras/engine/network.py", line 1076, in from_config
process_node(layer, node_data)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/keras/engine/network.py", line 1034, in process_node
layer(input_tensors, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/keras/engine/base_layer.py", line 854, in __call__
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in converted code:
relative to /usr/local/lib/python3.8/dist-packages:
nvidia_tao_tf1/cv/mask_rcnn/layers/reshape_layer.py:37 call *
return tf.reshape(inputs, self.shape, name=self.name)
tensorflow_core/python/ops/array_ops.py:131 reshape
result = gen_array_ops.reshape(tensor, shape, name)
tensorflow_core/python/ops/gen_array_ops.py:8114 reshape
_, _, _op = _op_def_lib._apply_op_helper(
tensorflow_core/python/framework/op_def_library.py:792 _apply_op_helper
op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
tensorflow_core/python/util/deprecation.py:513 new_func
return func(*args, **kwargs)
tensorflow_core/python/framework/ops.py:3356 create_op
return self._create_op_internal(op_type, inputs, dtypes, input_types, name,
tensorflow_core/python/framework/ops.py:3418 _create_op_internal
ret = Operation(
tensorflow_core/python/framework/ops.py:1769 __init__
self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
tensorflow_core/python/framework/ops.py:1610 _create_c_op
raise ValueError(str(e))
ValueError: Cannot reshape a tensor with 50176000 elements to shape [8000,12544] (100352000 elements) for 'box_head_reshape1/box_head_reshape1' (op: 'Reshape') with input shapes: [4,1000,256,7,7], [2] and with input tensors computed as partial shapes: input[1] = [8000,12544].
Execution status: FAIL
2025-01-07 14:22:52,194 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.
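
The element counts in the ValueError can be checked by hand. The short Python sketch below multiplies out the two shapes quoted in the message and works out what batch size the target shape would correspond to. The variable names are only for illustration, and reading the first two input dimensions as (batch size, proposals per image) is an assumption based on the shape [4,1000,256,7,7] together with eval_batch_size: 4 and test_rpn_post_nms_topn: 1000 in the spec; the log itself does not say this.

```python
from math import prod

# Shapes copied from the ValueError in the log above.
input_shape = (4, 1000, 256, 7, 7)   # tensor arriving at box_head_reshape1
target_shape = (8000, 12544)         # shape the reshape layer asks for

input_elems = prod(input_shape)
target_elems = prod(target_shape)

# Assumption: input dims are (batch, proposals per image, channels, h, w).
rois_per_image = input_shape[1]
features_per_roi = prod(input_shape[2:])   # 256 * 7 * 7

# If the target's first dim is batch * proposals, this is the batch it implies.
implied_batch = target_shape[0] // rois_per_image

print("input elements: ", input_elems)
print("target elements:", target_elems)
print("features per RoI matches target width:", features_per_roi == target_shape[1])
print("batch size implied by target shape:", implied_batch)
```

Under that assumption, the target shape corresponds to a batch of 8 images, while the evaluation graph is being built with a batch of 4. That would explain why the target has exactly twice as many elements as the input, but I cannot tell from the log alone whether this is the actual cause.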