I am using RIVA 2.10.0 and trying to migrate my ASR model from the OpenSeq2Seq decoder to the Flashlight decoder, because the OpenSeq2Seq decoder is due to be deprecated. My acoustic model is NeMo Conformer large, fine-tuned on my own data. My language model is KenLM, trained using this script provided by NeMo.
However, the OpenSeq2Seq decoder does not support BPE tokens, so each subword is mapped to a single character and KenLM is trained on those characters. I also want to use a lexicon-free decoder, because my word-level language models perform worse than my subword models.
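To make the encoding concrete, here is a minimal sketch of what I mean by mapping subwords to characters. It is not the NeMo script itself: the offset of 100 is an assumption (I believe it matches NeMo's default, but please check your version), the token IDs are made up, and the function names are hypothetical.

```python
# Assumed offset added to every subword ID before converting it to a character.
# Verify this against the NeMo version that produced your KenLM model.
TOKEN_OFFSET = 100


def encode_ids(token_ids, offset=TOKEN_OFFSET):
    """Map each subword ID to exactly one Unicode character."""
    return "".join(chr(token_id + offset) for token_id in token_ids)


def decode_chars(text, offset=TOKEN_OFFSET):
    """Invert encode_ids: recover the subword IDs from the characters."""
    return [ord(char) - offset for char in text]


# Hypothetical subword IDs for one sentence, as a BPE tokenizer might return them.
ids = [5, 17, 42, 3]
encoded = encode_ids(ids)

# One line of LM training text: the encoded characters separated by spaces,
# so that the n-gram model treats each encoded subword as one "word".
kenlm_line = " ".join(encoded)

# The mapping is lossless as long as the same offset is used on both sides.
assert decode_chars(encoded) == ids
```

The point is that the language model and the decoder vocabulary must agree on this mapping: if one side uses raw subwords and the other uses encoded characters, or the offsets differ, the LM scores are meaningless.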
I tried migrating directly by running the riva-build script with the parameters --decoder_type=flashlight and --flashlight_decoder.use_lexicon_free_decoding=True. I also replaced the riva_decoder_vocabulary.txt in the model repository with the encoded version. However, regardless of the decoder settings, the transcriptions bear no resemblance to what is actually said in the audio.
My question is: is there a correct way to do this migration? Alternatively, is it possible to train the LM and set up RIVA to work with subwords directly, without the character encoding, using a lexicon-free decoder?