No output in streaming mode when building a model with a custom vocabulary

Please provide the following information when requesting support.

Hardware - GPU (A100/A30/T4/V100) : GeForce RTX 3080
Hardware - CPU : Intel i9
Operating System : WSL 2
Riva Version : 2.11.0
TLT Version (if relevant)
How to reproduce the issue ? (This is for errors. Please share the command and the detailed log here)

I tried to add a custom vocabulary to transcribe audio in streaming mode, but it seems that the model is not built correctly: transcription returns no output at all.

The model used is an English citrinet_1024. I also tried citrinet_512 and conformer, with the same result.
The offline model, by contrast, seems to run perfectly.

Here is the command I used to build the model:

riva-build speech_recognition stt_en_citrinet_1024_VOC_stream_v2.rmir stt_en_citrinet_1024.riva --streaming=True --name=citrinet_1024_ADDED_VOC_stream_v2 --decoder_type=flashlight --decoding_language_model_binary=speechtotext_en_us_LM/mixed-lower.binary --decoding_vocab=vocab_lm.txt --language=en-US --nn.use_trt_fp32 -f

Language model used: Riva ASR English LM, version deployable_v4.1.
The vocabulary file “vocab_lm.txt” contains a list of words, one word per line.
I also tried building without the “nn.use_trt_fp32” flag, with the same result.
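
In case it helps anyone reproducing this, here is a minimal sketch for sanity-checking a one-word-per-line vocabulary file before passing it to --decoding_vocab. The function name check_vocab_lines and the specific checks (blank lines, stray whitespace, several words on one line, duplicates, Windows line endings) are my own assumptions about what could go wrong; nothing here is taken from the Riva documentation, and a clean result does not mean the file is accepted by the flashlight decoder.

```python
from collections import Counter


def check_vocab_lines(lines):
    """Return (line_number, problem) pairs for a one-word-per-line vocabulary.

    Hypothetical helper: the checks below are assumptions, not Riva requirements.
    """
    problems = []
    seen = Counter()
    for number, raw in enumerate(lines, start=1):
        word = raw.rstrip("\n")
        # A leftover carriage return means the file has Windows line endings.
        if "\r" in word:
            problems.append((number, "carriage return (Windows line ending)"))
            word = word.replace("\r", "")
        if not word.strip():
            problems.append((number, "blank line"))
            continue
        if word != word.strip():
            problems.append((number, "leading or trailing whitespace"))
        if len(word.split()) > 1:
            problems.append((number, "more than one word"))
        # Count the cleaned entry so duplicates are reported on their second use.
        seen[word.strip()] += 1
        if seen[word.strip()] > 1:
            problems.append((number, "duplicate entry"))
    return problems
```

To run it on the real file, open it with newline="" so that Python does not silently translate line endings, for example check_vocab_lines(open("vocab_lm.txt", encoding="utf-8", newline="")). An empty list means none of these particular problems were found.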

config.sh (12.7 KB)

Hi @mel.adl

Thanks for your interest in Riva.

Apologies for the delay.

Could you please share the custom vocabulary you used so that we can reproduce the issue?

Thanks

Hi @rvinobha,
Here is the custom vocabulary:
vocab_lm.txt (213 Bytes)

Hello,

I’m still facing the issue when I try to build with decoder_type=flashlight.
Are there any updates?

Thank you