NVIDIA Riva ASR finetune KeyError: "filename './manifest.yaml' not found", pretrained model QuartzNet15x5NR-En.nemo

Hi,
I am trying to fine-tune an NVIDIA Riva ASR model using TAO, and I am getting the error
KeyError: "filename './manifest.yaml' not found".
Any hints on how to fix this?

I downloaded the QuartzNet15x5NR-En.nemo pretrained model from here
https://ngc.nvidia.com/catalog/models/nvidia:nemospeechmodels/files
and renamed the file extension from .nemo to .tlt.

Error:

Traceback (most recent call last):
  File "", line 730, in extract
  File "/opt/conda/lib/python3.8/tarfile.py", line 2058, in extract
    tarinfo = self.getmember(member)
  File "/opt/conda/lib/python3.8/tarfile.py", line 1780, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'manifest.yaml' not found"
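
The traceback comes from Python's tarfile module failing to find a member named manifest.yaml, so one way to narrow this down is to list what the renamed file actually contains. The following is only a minimal sketch: list_archive_members and has_member are hypothetical helper names, and the idea that the finetune step needs manifest.yaml inside the archive is inferred from the traceback, not from TAO documentation.

```python
import tarfile


def list_archive_members(path):
    """Return the member names of a tar archive (compression is auto-detected)."""
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()


def has_member(path, member="manifest.yaml"):
    """Check whether the archive contains `member`, with or without a './' prefix."""
    names = list_archive_members(path)
    return any(name in (member, "./" + member) for name in names)
```

Running has_member on the host copy of QuartzNet15x5NR-En.tlt would show whether the member the loader asks for is present at all. If it is missing, renaming the extension alone would not be expected to change that, since renaming does not alter the archive contents.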

Finetune Command

!tao speech_to_text finetune \
    -e $SPECS_DIR/speech_to_text/finetune.yaml \
    -g 1 \
    -k $KEY \
    -m $DATA_DIR/pretrained-models/QuartzNet15x5NR-En.tlt \
    -r $RESULTS_DIR/quartznet/3/finetune5 \
    finetuning_ds.manifest_filepath=$DATA_DIR/training-data.json \
    validation_ds.manifest_filepath=$DATA_DIR/validation-data.json \
    trainer.max_epochs=1 \
    finetuning_ds.num_workers=20 \
    finetuning_ds.batch_size=16 \
    validation_ds.num_workers=20 \
    validation_ds.batch_size=16 \
    trainer.gpus=1

Code for initialization

%env HOST_DATA_DIR=/home/ubuntu/training/data/1
%env HOST_SPECS_DIR=/home/ubuntu/training/specs/1
%env HOST_RESULTS_DIR=/home/ubuntu/training/results/1
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR

Mapping up the local directories to the TAO docker.

import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tlt_configs = {
    "Mounts": [
        {
            "source": os.environ["HOST_DATA_DIR"],
            "destination": "/data"
        },
        {
            "source": os.environ["HOST_SPECS_DIR"],
            "destination": "/specs"
        },
        {
            "source": os.environ["HOST_RESULTS_DIR"],
            "destination": "/results"
        },
        {
            "source": os.path.expanduser("~/.cache"),
            "destination": "/root/.cache"
        }
    ],
    "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
        }
    }
}

Writing the mounts file.

with open(mounts_file, "w") as mfile:
    json.dump(tlt_configs, mfile, indent=4)
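
Because the finetune command uses container paths such as /data/pretrained-models/QuartzNet15x5NR-En.tlt, it can help to translate them back to host paths and confirm the file really exists where the mount expects it. This is a rough sketch under the assumption that the mounts list has the same "source"/"destination" shape as above; container_to_host is a hypothetical helper, not part of TAO.

```python
import os


def container_to_host(container_path, mounts):
    """Map a path inside the container to the matching host path.

    `mounts` is a list of {"source": host_dir, "destination": container_dir}
    dicts. Returns None if no mount covers the path.
    """
    # Try longer destinations first so nested mounts resolve to the closest one.
    for mount in sorted(mounts, key=lambda m: len(m["destination"]), reverse=True):
        dest = mount["destination"].rstrip("/")
        if container_path == dest or container_path.startswith(dest + "/"):
            rel = container_path[len(dest):].lstrip("/")
            return os.path.join(mount["source"], rel) if rel else mount["source"]
    return None
```

For example, calling container_to_host("/data/pretrained-models/QuartzNet15x5NR-En.tlt", tlt_configs["Mounts"]) and passing the result to os.path.isfile would check that the model is visible under HOST_DATA_DIR.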

NOTE: The following paths are set from the perspective of the TAO Docker.

The data is saved here

DATA_DIR = "/data"
SPECS_DIR = "/specs"
RESULTS_DIR = "/results"

Set your encryption key, and use the same key for all commands

KEY = 'tlt_encode'



Hi @yogpatri ,
Can you please check if this works for you?

Thanks!