NeMo-trained model not producing transcripts when deployed on Jarvis (both offline and streaming)

Please provide the following information when requesting support.

Hardware - GPU (V100)
Hardware - CPU
Operating System :
Riva Version
TLT Version (if relevant)
How to reproduce the issue? (This is for errors. Please share the command and the detailed log here.)

  1. Trained the ASR Jasper model from pre-trained weights using NeMo 1.1.

  2. Converted the .nemo model to an .ejrvs model using nemo2jarvis:
    nemo2jarvis --out=./model.ejrvs ./model.nemo

  3. Deployed the converted model with Jarvis 1.0.0-b.2.

Offline build: jarvis-build speech_recognition ./custommodel/model/5821/modeloffline.jmir ./custommodel/model/5821/5821/model.ejrvs --offline

Streaming build: jarvis-build speech_recognition ./custommodel/model/5821/model.jmir ./custommodel/model/5821/5821/model.ejrvs

Model deploy: jarvis-deploy ./custommodel/model/5821/modeloffline.jmir ./custommodel/model/offline/

  4. When an audio file is used to check inference (using the speech_to_text example), no transcript is returned in the results. The same setup works fine with a pre-trained .ejrvs model from NGC.

Hi @saurabh.sharma94
Could you please share the script and model file to reproduce the issue so we can help better?

Thanks

Please find Models in following link:

https://drive.google.com/drive/folders/1bcNJRuOerYusrP-vMKMLMcQTA0DdxXPv?usp=sharing

Use the NeMo example script from GitHub to train the model with the following parameters: ./examples/asr/speech_to_text.py model.train_ds.manifest_filepath=/workspace/saurabh/asr/code/valid_with_corrected_labels_manifest_part_0e0f11.json model.validation_ds.manifest_filepath=/workspace/saurabh/asr/code/valid_with_corrected_labels_manifest_part_0e0f11.json trainer.max_epochs=10 trainer.gpus=-1 +init_from_pretrained_model=stt_en_jasper10x5dr --config-path=/workspace/saurabh/asr/code/examples/asr/conf/jasper/ --config-name=jasper_10x5dr
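Since the training command points at manifest files, it may be worth sanity-checking them first: empty `text` fields or characters outside the model's label set are a common cause of a model that trains without errors but emits empty transcripts. Below is a minimal sketch of such a check in plain Python; the field names follow the standard NeMo JSON-lines manifest layout, but the `check_manifest` helper and its arguments are hypothetical, not part of NeMo.

```python
import json

def check_manifest(path, vocabulary=None):
    """Sanity-check a NeMo JSON-lines manifest: each line must be a JSON
    object with audio_filepath, duration, and a non-empty text field.
    Returns a list of (line_number, problem) tuples."""
    problems = []
    with open(path) as f:
        for i, line in enumerate(f, 1):
            entry = json.loads(line)
            for key in ("audio_filepath", "duration", "text"):
                if key not in entry:
                    problems.append((i, f"missing {key}"))
            text = entry.get("text", "")
            if not text.strip():
                problems.append((i, "empty transcript"))
            elif vocabulary is not None:
                # Characters the model's label set cannot represent
                bad = set(text) - set(vocabulary) - {" "}
                if bad:
                    problems.append((i, f"chars outside vocabulary: {sorted(bad)}"))
    return problems
```

If this reports empty transcripts or out-of-vocabulary characters for many lines, the trained model can converge to predicting blanks, which would match the symptom described above.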

Hi @saurabh.sharma94
Sorry for the delayed response.
It seems you are using the old Jarvis 1.0.0-b.2 release; could you please try the latest Riva (renamed from Jarvis) release?
Below are updated steps for custom model deployment in Riva.

https://docs.nvidia.com/deeplearning/riva/user-guide/docs/custom-model-deployment.html#
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/service-asr.html#streaming-offline-configuration

For use cases where supporting additional concurrent audio streams matters more than latency, run:

riva-build speech_recognition \
    /servicemaker-dev/<rmir_filename>:<encryption_key> \
    /servicemaker-dev/<riva_filename>:<encryption_key> \
    --name=<pipeline_name> \
    --decoder_type=greedy \
    --chunk_size=0.8 \
    --padding_size=0.8

Thanks

Hi @saurabh.sharma94 ,
We tried reproducing the issue, and it appears to lie with the NeMo model itself: we loaded your model with the NeMo package and ran inference on 3-4 audio files, but got empty responses.
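For context on why a CTC model like Jasper can return empty strings rather than failing: under greedy decoding, an empty transcript means the blank token had the highest score at every timestep, which is typical of an acoustic model that did not converge or whose vocabulary mismatches the training labels. The sketch below illustrates this with plain Python; the vocabulary and logit values are made up for illustration, not taken from the model in question.

```python
def ctc_greedy_decode(logits, vocabulary, blank_id):
    """Greedy CTC decoding: take the argmax at each timestep, collapse
    consecutive repeats, then drop blanks. If blank wins every frame,
    the decoded string is empty."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    decoded, prev = [], None
    for idx in best:
        if idx != prev and idx != blank_id:
            decoded.append(vocabulary[idx])
        prev = idx
    return "".join(decoded)

vocab = ["a", "b", "c", "_"]  # "_" stands in for the CTC blank token
BLANK = 3

# Healthy model: character scores dominate some frames -> non-empty text
good = [[0.9, 0.0, 0.0, 0.1],
        [0.9, 0.0, 0.0, 0.1],
        [0.1, 0.0, 0.0, 0.9],
        [0.0, 0.8, 0.0, 0.2]]
# Broken model: blank dominates every frame -> empty transcript
bad = [[0.1, 0.0, 0.0, 0.9],
       [0.0, 0.1, 0.0, 0.9],
       [0.0, 0.0, 0.1, 0.9]]

print(ctc_greedy_decode(good, vocab, BLANK))  # -> "ab"
print(ctc_greedy_decode(bad, vocab, BLANK))   # -> "" (empty transcript)
```

So an empty result from both NeMo and Riva is consistent with a training-side problem (labels, vocabulary, or convergence) rather than a deployment-side one.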

I have retrained the model in new pods using the NeMo 1.1 image and deployed it with Riva 1.5, but I am still facing the same issue. Please find the updated model:
https://drive.google.com/drive/folders/1UNLhumBWASkfZ_t95kX62W__5n6l1hjB?usp=sharing

Please suggest a solution, as we are unable to move forward with ASR without it. As per my understanding, the Riva model is similar to an ONNX model and should be platform independent, so there should be no version issue (as suggested earlier). Also, if possible, please train a model on your end and check whether the same issue persists when training the ASR model.