[RIVA][Jasper][Citrinet] Build and deploy ASR models with custom KenLM language model

shiva4 · September 2, 2021, 10:02am

• Hardware (T4)
• Network Type (speech_to_text)
• TLT Version ( tao: 3.21.08 | docker_tag: v3.21.08-py3)

I am following this to build and deploy Jasper with a KenLM model, but I cannot find the configurations for best latency and best throughput that are documented for Citrinet. Also, for Citrinet, what does the --vocab_filename parameter refer to? Training a KenLM model on a custom corpus generates only the kenlm_model binary; no vocab file is generated. Please help. Thanks!

Morganh · September 3, 2021, 3:44am

Can you share the exact link for “--vocab_filename”?

shiva4 · September 3, 2021, 5:06am

Riva (NVIDIA Riva documentation): under the Citrinet acoustic model section, the parameter --decoding_vocab=<vocabulary_filename> is required, and I am not sure what to pass to it.

Morganh · September 3, 2021, 5:31pm

Please refer to the n-gram notebook and try running it.

NGram Language Model Notebook | NVIDIA NGC

The train command produces 3 files called train_n_gram.arpa, train_n_gram.vocab and train_n_gram.kenlm_intermediate, saved at $RESULTS_DIR/train/checkpoints.

See N-Gram Language Model - NVIDIA Docs:
vocab_file: string. Optional path to a vocab file to limit the vocabulary learned by the model.
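
If you trained KenLM directly and only have the binary, a vocabulary file can also be derived from the training corpus. The following is a minimal sketch, not the notebook's method: it assumes the decoder expects a flat text file with one word per line, and that the corpus should be lowercased and split on whitespace. The function name build_vocab and the min_count parameter are hypothetical and exist only for this illustration.

```python
from collections import Counter


def build_vocab(corpus_path, vocab_path, min_count=1):
    """Write the unique words of a text corpus to vocab_path, one per line.

    Assumptions (not confirmed in this thread): words are lowercased,
    whitespace-tokenized, and the output is a flat one-word-per-line file.
    """
    counts = Counter()
    with open(corpus_path, encoding="utf-8") as corpus:
        for line in corpus:
            counts.update(line.lower().split())

    # Keep words seen at least min_count times, sorted for reproducibility.
    words = sorted(word for word, count in counts.items() if count >= min_count)

    with open(vocab_path, "w", encoding="utf-8") as vocab:
        vocab.writelines(word + "\n" for word in words)
    return words
```

Calling build_vocab("corpus.txt", "corpus.vocab") on the same corpus used to train the language model would then give a file that can be tried as the --decoding_vocab value. Check the normalization (casing, punctuation) against what your acoustic model emits before relying on it.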

shiva4 · September 6, 2021, 10:52am

Thanks, I will check this out.
