Beam Search Language Model for Conformer CTC (Hindi Model)

iamgarimanarang · February 17, 2023, 6:23am

Hi,

I’ve trained a language model and I’m trying to use it on top of the pretrained model, but I get the following error:
ValueError: Decoding strategy must be one of ['greedy']. Given beam
I trained the LM by following the guide at the link below:
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/asr_language_modeling.html
This is the PR I followed to implement the beam search:

github.com/NVIDIA/NeMo

Add Beam Search support to ASR transcribe()

NVIDIA:main ← titu1994:asr_ctc_beam

opened 04:22AM - 17 Nov 22 UTC

titu1994

+1064 -214

# What does this PR do ?

Add high level API for beam search using model.transcribe() and beam search strategy. Currently only supports DeepSpeed beam search library, eventually can support more advanced libraries like pyctcdecode.

**Collection**: [ASR]

# Changelog

- Adds AbstractBeamCTCInfer and BeamCTCInfer classes to support beam search strategy
- Updated CTCDecoding and CTCBPEDecoding classes to support beam search strategy
- Adds tests for CTC Decoding classes

# Usage

```python
import nemo.collections.asr as nemo_asr

filepath = "/media/smajumdar/data/Datasets/Librispeech/LibriSpeech/test-other-processed/367-130732-0008.wav"
kenlm_path = "/media/smajumdar/data/Datasets/ASR_SET_LM/ASR_SET_3.0/lm"

model = nemo_asr.models.ASRModel.from_pretrained('stt_en_conformer_ctc_large')  # type: nemo_asr.models.EncDecCTCModelBPE

# Create default decoding config
model.change_decoding_strategy(None)

# Update decoding config for beam search
decoding_cfg = model.cfg.decoding
decoding_cfg.strategy = "beam"
decoding_cfg.beam.beam_size = 128
decoding_cfg.beam.return_best_hypothesis = False
decoding_cfg.beam.beam_alpha = 1.0
decoding_cfg.beam.beam_beta = 0.0
decoding_cfg.beam.kenlm_path = kenlm_path

# Prepare beam search decoding strategy
model.change_decoding_strategy(decoding_cfg)

# Transcribe speech with beam search
hypothesis = model.transcribe([filepath], return_hypotheses=True)

print("Num hypothesis per sample :", len(hypothesis[0]))
print("Result type :", hypothesis[0][0].__class__.__name__)

for candidate in range(len(hypothesis[0])):
    print(f"Beam Search {candidate + 1} :", hypothesis[0][candidate].text, hypothesis[0][candidate].score)
```

# Before your PR is "Ready for review"

**Pre checks**:

- [x] Make sure you read and followed [Contributor guidelines](https://github.com/NVIDIA/NeMo/blob/main/CONTRIBUTING.md)
- [x] Did you write any new necessary tests?
- [x] Did you add or update any necessary documentation?
- [x] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
  - [ ] Reviewer: Does the PR have correct import guards for all optional libraries?

**PR Type**:

- [x] New Feature
- [ ] Bugfix
- [ ] Documentation

Does the language model work with stt_hi_conformer_ctc_medium (the Hindi model)? If yes, can you provide the steps required to create the language model, as well as resources for using it with the pre-trained/fine-tuned model?

rvinobha · February 22, 2023, 12:58pm

Hi @iamgarimanarang

Thanks for your interest in Riva.

I will check on this issue with the team and get back to you.

Thanks

rvinobha · February 24, 2023, 9:36am

Hi @iamgarimanarang

Quick update:

This issue seems to be with NeMo, so I am moving the request internally from Riva to the NeMo team.

Thanks

rvinobha · February 24, 2023, 9:54am

Hi @iamgarimanarang

I have inputs from the NeMo team:

Please use the main branch; this feature is not in 1.15, and we have not released 1.16 yet.
Yes, as long as you follow the steps in the docs (ASR Language Modeling — NVIDIA NeMo), the Hindi ASR model will process the text and build the manifest, and beam search will work. You may need to run a hyperparameter search to get good beam scores.
For a large-scale grid search of hyperparameters, you should still use the eval scripts mentioned in the documentation (ASR Language Modeling — NVIDIA NeMo). Once you have good hyperparameters, this high-level API is meant to be used with the final best beam alpha and beta to run beam search with model.transcribe().
Once we have the 1.17 release, things will be smoother. We will add docs for the high-level API.
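
To make the role of beam alpha and beta concrete, here is a minimal, self-contained sketch of a grid search over the two weights. It does not use NeMo or KenLM. It assumes the common convention that each beam candidate is scored as acoustic log-probability + alpha * LM log-probability + beta * word count, and it picks the (alpha, beta) pair with the lowest word error rate. The function names and the toy candidates and scores are made up for illustration only; for real tuning, use the eval scripts from the documentation.

```python
from itertools import product


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = curr
    return prev[-1] / max(len(ref), 1)


def pick_best(candidates, alpha, beta):
    """Return the text of the highest-scoring candidate.

    candidates: list of (text, acoustic_logprob, lm_logprob) tuples.
    Assumed score: acoustic + alpha * lm + beta * number_of_words.
    """
    def score(candidate):
        text, acoustic, lm = candidate
        return acoustic + alpha * lm + beta * len(text.split())
    return max(candidates, key=score)[0]


def grid_search(samples, alphas, betas):
    """Return (mean_wer, alpha, beta) for the pair with the lowest mean WER.

    samples: list of (reference_text, candidates) pairs.
    """
    best = None
    for alpha, beta in product(alphas, betas):
        mean_wer = sum(
            word_error_rate(ref, pick_best(cands, alpha, beta))
            for ref, cands in samples
        ) / len(samples)
        if best is None or mean_wer < best[0]:
            best = (mean_wer, alpha, beta)
    return best


# Toy data (hypothetical): the acoustically preferred candidate is wrong,
# and the LM score is what favors the correct transcript.
samples = [
    ("the cat sat", [("the cat sat", -4.0, -3.0), ("the cat sad", -3.5, -6.0)]),
    ("a dog ran", [("a dog ran", -5.0, -2.5), ("a dock ran", -4.6, -7.0)]),
]

print(grid_search(samples, alphas=[0.0, 0.5, 1.0], betas=[0.0, 1.0]))
```

With alpha set to 0 the LM is ignored and the acoustically preferred but wrong candidates win; a non-zero alpha lets the LM correct them. Once a good pair is found, those are the values that would go into decoding_cfg.beam.beam_alpha and decoding_cfg.beam.beam_beta in the usage snippet from the PR.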

Thanks

iamgarimanarang · March 1, 2023, 11:29am

Hi @rvinobha

Thanks for the updates. Can you confirm and let us know the tentative date for the new version release?

Kind Regards,
Garima Narang

iamgarimanarang · March 10, 2023, 7:50am

Hi @rvinobha,

Any updates on the 1.17 release?

Kind Regards,
Garima Narang

rvinobha · March 10, 2023, 8:07am

Hi @iamgarimanarang

Apologies for the delay.

The release is tentatively expected in the first or second week of April.

Thanks
