Migration from OS2S decoder to Flashlight

Hello,

I am currently using Riva 2.10.0 and attempting to migrate my ASR model from the OpenSeq2Seq decoder to Flashlight, because the OpenSeq2Seq decoder is slated for deprecation. My acoustic model is NeMo Conformer Large, fine-tuned on my own data. My language model is KenLM, trained using this script provided by NeMo.

However, the OpenSeq2Seq decoder does not support BPE tokens, so each subword is mapped to a single character and KenLM is trained on those characters. I also want to use a lexicon-free decoder, because my word-level language models perform worse than my subword models.
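For readers unfamiliar with this workaround, the subword-to-character mapping can be sketched roughly as below. This is a minimal illustration and not the NeMo script itself: the offset value of 100 and the function names `encode_ids` and `decode_chars` are assumptions made for the example, so check the script you actually used for the real constant. The token IDs are assumed to come from the BPE tokenizer of the acoustic model.

```python
# Assumed offset for this sketch; verify the value used by your training script.
TOKEN_OFFSET = 100


def encode_ids(token_ids, offset=TOKEN_OFFSET):
    """Map each subword token ID to one Unicode character, space-separated."""
    return " ".join(chr(token_id + offset) for token_id in token_ids)


def decode_chars(encoded, offset=TOKEN_OFFSET):
    """Invert encode_ids: turn the space-separated characters back into IDs."""
    if not encoded:
        return []
    # Split on a literal space only: some mapped characters (for example
    # U+0085) count as whitespace for a bare str.split() and would be lost.
    return [ord(char) - offset for char in encoded.split(" ")]


if __name__ == "__main__":
    ids = [5, 33, 42]          # hypothetical subword IDs for one sentence
    line = encode_ids(ids)     # one line of the KenLM training text
    assert decode_chars(line) == ids
```

The point of the sketch is that the language model only ever sees these stand-in characters, so any decoder that consumes the model has to apply exactly the same mapping to its vocabulary.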

I attempted to migrate directly by running riva-build with the parameters --decoder_type=flashlight and --flashlight_decoder.use_lexicon_free_decoding=True. I also replaced riva_decoder_vocabulary.txt in the model repository with the encoded version. However, regardless of the decoder settings, the transcriptions bear no resemblance to what is spoken in the audio.

My question is: is there a correct way to perform this migration? Alternatively, is it possible to train the LM and set up Riva to work with subwords directly, without the character encoding, using a lexicon-free decoder?

Hi @A.Abugaliev

Thanks for your interest in Riva

I will check with the team regarding this query and get back to you.

Thanks

Hello @rvinobha, do you have any news about my question?

Hi @A.Abugaliev

My sincere apologies for the delay on my part.

I will check soon and provide updates

Thanks

Hi @A.Abugaliev

Sincere apologies for the delay.

Can you provide the riva_decoder_vocabulary.txt used at your end?

Thanks

Sure. Here are the original
riva_decoder_vocabulary.txt (7.8 KB)
and encoded
riva_decoder_vocabulary_encoded.txt (3.0 KB)
versions.