Please provide the following information when requesting support.
Hardware - GPU - Titan RTX
Hardware - CPU
Operating System - Ubuntu
Riva Version - N/A
TLT Version (if relevant) - nvidia-tao==0.1.24
How to reproduce the issue? (This is for errors. Please share the command and the detailed log here)
I am attempting to use just the TAO Toolkit (without Riva) to run inference with a pretrained model on a wav file.
I downloaded this model locally: speechtotext_en_us_conformer.tlt
and created a spec file containing only a single block: a list with one short wav file to run inference on.
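Roughly, the spec looks like the following sketch (written from memory, so the key name file_paths and the path are illustrative rather than verbatim):

    # infer_spec.yaml - minimal inference spec, listing only the audio to transcribe
    file_paths:
      - /path/to/audio/short_test_clip.wav

With that spec in place, I run this command: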
tao speech_to_text_conformer infer -e /path/to/infer_spec.yaml -m /path/to/speechtotext_en_us_conformer.tlt -g 1 -r /path/to/infer_results/
which fails with the following error after a number of log lines that look successful:
[NeMo I 2022-09-07 20:12:40 features:272] STFT using torch
Error executing job with overrides: ['exp_manager.explicit_log_dir=/home/kaleko/tao/conformer/kaleko_test_infer_results/', 'restore_from=/home/kaleko/tao/conformer/speechtotext_en_us_conformer.tlt']
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 368, in <lambda>
lambda: hydra.run(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 110, in run
_ = ret.return_value
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "/home/jenkins/agent/workspace/tlt-pytorch-main-nightly/conv_ai/asr/speech_to_text_ctc/scripts/infer.py", line 63, in main
File "/opt/conda/lib/python3.8/site-packages/nemo/core/classes/modelPT.py", line 305, in restore_from
instance = cls._save_restore_connector.restore_from(
File "/home/jenkins/agent/workspace/tlt-pytorch-main-nightly/core/connectors/save_restore_connector.py", line 79, in restore_from
File "/home/jenkins/agent/workspace/tlt-pytorch-main-nightly/core/cookbooks/nemo_cookbook.py", line 417, in restore_from
File "<frozen src.eff.core.archive>", line 943, in retrieve_file_handle
PermissionError: Cannot access the encrypted file without the passphrase
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jenkins/agent/workspace/tlt-pytorch-main-nightly/conv_ai/asr/speech_to_text_ctc/scripts/infer.py", line 79, in <module>
File "/opt/conda/lib/python3.8/site-packages/nemo/core/config/hydra_runner.py", line 104, in wrapper
_run_hydra(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
run_and_report(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 251, in run_and_report
assert mdl is not None
AssertionError
2022-09-07 15:12:47,767 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
I suspected permission issues, so I opened up permissions on all of the files. All of the files live inside directories that are mounted into the container via ~/.tao_mounts.json.
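For reference, my mounts file follows the usual TAO layout, roughly like this (the source/destination pair below is a placeholder based on the paths in the log, not my literal configuration):

    {
        "Mounts": [
            {
                "source": "/home/kaleko/tao",
                "destination": "/home/kaleko/tao"
            }
        ]
    }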
Any ideas what is causing this?