Beam Search Language Model for Conformer CTC (Hindi Model)

Hi,

I’ve trained a language model and I’m trying to use it with the pretrained model, but I get the following error:
ValueError: Decoding strategy must be one of ['greedy']. Given beam
I trained the LM by following this link:
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/asr_language_modeling.html
This is the link I used to implement the beam search:

Does the language model work with stt_hi_conformer_ctc_medium (the Hindi model)? If yes, can you provide the steps required to create the language model, as well as a resource on how to use it with the pre-trained/fine-tuned model?

Hi @iamgarimanarang

Thanks for your interest in Riva.

I will check on this issue with the team and get back to you.

Thanks

Hi @iamgarimanarang

Quick update:

This issue seems to be with NeMo, so I am moving the request internally from Riva to the NeMo team.

Thanks

Hi @iamgarimanarang

I have inputs from the NeMo team:

  1. Please use the main branch. This feature is not available in 1.15, and we have not released 1.16 yet.

  2. Yes. As long as the steps in the docs (ASR Language Modeling — NVIDIA NeMo) are followed, the Hindi ASR model will process the text and build the manifest, and beam search will work. You may need to run a hyperparameter search to get good beam scores.

  3. For a large-scale grid search over hyperparameters, you should still use the eval scripts mentioned in the documentation (ASR Language Modeling — NVIDIA NeMo). Once you have good hyperparameters, the high-level API is meant to be used with the final best beam alpha and beta to run beam search with model.transcribe().

  4. Once 1.17 is released, things will be smoother. We will add docs for the high-level API.
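To illustrate what tuning beam alpha and beta means in points 2 and 3, here is a minimal pure-Python sketch. It does not use NeMo or the eval scripts. It assumes the commonly described shallow-fusion scoring (acoustic score + alpha * LM score + beta * word count), and every function name and number in it is hypothetical. For brevity it ranks settings by exact sentence matches rather than by WER.

```python
from itertools import product

def fused_score(am_logprob, lm_logprob, num_words, alpha, beta):
    # Assumed shallow-fusion formula: alpha weights the LM score,
    # beta rewards (or penalises) the number of words.
    return am_logprob + alpha * lm_logprob + beta * num_words

def pick_best(hypotheses, alpha, beta):
    # Return the text of the highest-scoring hypothesis for one utterance.
    best = max(
        hypotheses,
        key=lambda h: fused_score(
            h["am"], h["lm"], len(h["text"].split()), alpha, beta
        ),
    )
    return best["text"]

def grid_search(dev_set, alphas, betas):
    # Try every (alpha, beta) pair; keep the first pair with the most
    # exact matches against the reference transcripts.
    best = None
    for alpha, beta in product(alphas, betas):
        correct = sum(
            pick_best(hyps, alpha, beta) == ref for ref, hyps in dev_set
        )
        if best is None or correct > best[0]:
            best = (correct, alpha, beta)
    return best

# Made-up toy data: one utterance with two candidate transcripts.
# "am" and "lm" are illustrative log-probabilities, not real model output.
dev_set = [
    (
        "namaste duniya",
        [
            {"text": "namaste duniya", "am": -5.0, "lm": -2.0},
            {"text": "namaste dunia", "am": -4.5, "lm": -6.0},
        ],
    )
]

result = grid_search(dev_set, alphas=[0.0, 1.0], betas=[0.0, 1.0])
print(result)  # (number correct, alpha, beta)
```

With alpha set to 0 the LM is ignored and the acoustically preferred but misspelled candidate wins; a non-zero alpha lets the LM correct it. A real search would do the same thing over a full dev manifest and a finer grid.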

Thanks

Hi @rvinobha

Thanks for the updates. Can you confirm and let us know the tentative date for the new version release?

Kind Regards,
Garima Narang

Hi @rvinobha,

Any updates on the 1.17 release?

Kind Regards,
Garima Narang

Hi @iamgarimanarang

Apologies for the delay.

The release is tentatively expected in the first or second week of April.

Thanks