[RIVA][Jasper][Citrinet] Build and deploy ASR models with custom KenLM language model


• Hardware: T4
• Network type: speech_to_text
• TLT version: tao 3.21.08 | docker_tag v3.21.08-py3

I am following this guide to build and deploy Jasper with a KenLM language model, but I cannot find the recommended configurations for best latency and best throughput, as are given for Citrinet. Also, for Citrinet, what does the --vocab_filename parameter imply? Training a KenLM model on a custom corpus generates only the kenlm_model binary; no vocab file is produced. Please help. Thanks!

Can you share the exact link for --vocab_filename?

Speech Recognition — NVIDIA Riva Speech Skills v1.5.0-beta documentation. Under the Citrinet acoustic model section, the parameter --decoding_vocab=<vocabulary_filename> is required, and I am not sure what to pass to it.
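For reference, a decoding vocabulary is typically just a plain text file with one word per line. A minimal sketch of deriving such a list from a training corpus (the `build_vocab` helper name is an assumption for illustration, not part of TAO or Riva):

```python
def build_vocab(corpus_lines):
    """Collect the unique lowercased words seen in the corpus."""
    words = set()
    for line in corpus_lines:
        words.update(line.lower().split())
    return sorted(words)

# Write one word per line, the usual flat-file vocabulary format.
corpus = ["the quick brown fox", "the lazy dog"]
with open("decoding_vocab.txt", "w") as f:
    for word in build_vocab(corpus):
        f.write(word + "\n")
```

The resulting file could then be passed via --decoding_vocab=decoding_vocab.txt, assuming the corpus text matches the acoustic model's expected normalization.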

Please refer to the n-gram notebook and try to run it.


The train command produces three files, train_n_gram.arpa, train_n_gram.vocab, and train_n_gram.kenlm_intermediate, saved at $RESULTS_DIR/train/checkpoints.

See N-Gram Language Model — TAO Toolkit 3.0 documentation:
vocab_file: string — optional path to a vocab file to limit the vocabulary learned by the model.
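To illustrate what "limit the vocabulary learned by the model" means in practice: words outside the vocab file are commonly treated as an unknown token (often <unk>) during language-model training. A hypothetical sketch of that idea, not the TAO implementation:

```python
def restrict_to_vocab(sentence, vocab):
    """Replace out-of-vocabulary words with <unk> before LM training."""
    return " ".join(w if w in vocab else "<unk>" for w in sentence.split())

vocab = {"the", "cat", "sat"}
print(restrict_to_vocab("the cat sat on the mat", vocab))
# → "the cat sat <unk> the <unk>"
```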


Thanks, I will check this out.