Hello. I tried to run inference with three pretrained models from the official PeopleSegNet page on NGC (peoplesegnet_resnet50.etlt, peoplesegnet_resnet50.tlt, and peoplesegnet_resnet50.step-20000.tlt).
However, each model fails with a different error:
- peoplesegnet_resnet50.etlt: ValueError: Model extension needs to be either .engine or .tlt.
- peoplesegnet_resnet50.tlt: AssertionError: The pruned model must be retrained first.
- peoplesegnet_resnet50.step-20000.tlt:
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
Using TensorFlow backend.
2023-03-28 09:38:42,813 [INFO] root: Starting MaskRCNN inference.
Label file does not exist. Skipping...
2023-03-28 09:38:42,813 [INFO] iva.mask_rcnn.utils.spec_loader: Loading specification from /workspace/tao-experiments/maskrcnn_retrain_resnet50.txt
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpuk9wazjk', '_tf_random_seed': 123, '_save_summary_steps': None, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': gpu_options {
allow_growth: true
force_gpu_compatible: true
}
allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: TWO
}
}
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f89c34f7080>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
2023-03-28 09:38:42,816 [INFO] tensorflow: Using config: {'_model_dir': '/tmp/tmpuk9wazjk', '_tf_random_seed': 123, '_save_summary_steps': None, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': gpu_options {
allow_growth: true
force_gpu_compatible: true
}
allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: TWO
}
}
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f89c34f7080>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
[MaskRCNN] INFO : Running inference...
[MaskRCNN] INFO : Loading weights from /workspace/tao-experiments/peoplesegnet_resnet50.step-20000.tlt
2023-03-28 09:38:45,094 [INFO] root: The last checkpoint file is not saved properly. Please delete it and rerun the script.
Traceback (most recent call last):
File "<frozen iva.mask_rcnn.executer.distributed_executer>", line 352, in extract_ckpt
File "/usr/lib/python3.6/zipfile.py", line 1131, in __init__
self._RealGetContents()
File "/usr/lib/python3.6/zipfile.py", line 1198, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "</usr/local/lib/python3.6/dist-packages/iva/mask_rcnn/scripts/inference.py>", line 3, in <module>
File "<frozen iva.mask_rcnn.scripts.inference>", line 390, in <module>
File "<frozen iva.mask_rcnn.scripts.inference>", line 378, in <module>
File "<frozen iva.mask_rcnn.scripts.inference>", line 365, in main
File "<frozen iva.mask_rcnn.scripts.inference>", line 311, in infer
File "<frozen iva.mask_rcnn.executer.distributed_executer>", line 503, in infer
File "<frozen iva.mask_rcnn.executer.distributed_executer>", line 357, in extract_ckpt
OSError: The last checkpoint file is not saved properly. Please delete it and rerun the script.
Telemetry data couldn't be sent, but the command ran successfully.
[WARNING]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL
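The traceback above suggests the loader tries to open the .tlt file as a zip archive and fails. As a quick sanity check, here is a minimal sketch that only uses Python's standard zipfile module to tell whether a file exists and is a zip archive. The helper name describe_checkpoint is hypothetical and not part of TAO, and a "not-zip" result says nothing about which format TAO actually expects.

```python
import sys
import zipfile
from pathlib import Path


def describe_checkpoint(path):
    """Hypothetical helper (not a TAO API): classify a file as
    'missing', 'zip', or 'not-zip' using only the standard library."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    # zipfile.is_zipfile checks for the zip end-of-central-directory record.
    return "zip" if zipfile.is_zipfile(str(p)) else "not-zip"


if __name__ == "__main__":
    # Pass one or more model paths on the command line.
    for arg in sys.argv[1:]:
        print(arg, "->", describe_checkpoint(arg))
```

Running it on each of the three downloaded files would show which of them, if any, is readable as a zip archive.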
This is my command:
docker run -it --rm -v /home/ubuntu/tao_test_2023/tensorflow_train/model_zoo/peoplesegnet:/workspace/tao-experiments \
nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5 \
mask_rcnn inference \
-i /workspace/tao-experiments/data -o /workspace/tao-experiments/ \
-e /workspace/tao-experiments/maskrcnn_retrain_resnet50.txt \
-m /workspace/tao-experiments/peoplesegnet_resnet50.tlt \
-l /workspace/tao-experiments/coco_labels.txt -t 0.5 \
-k nvidia_tao \
--include_mask
How can I generate inference results correctly?