No output streaming mode when building model with custom vocabulary

mel.adl · July 7, 2023, 11:24am

Please provide the following information when requesting support.

Hardware - GPU (A100/A30/T4/V100) : GeForce RTX 3080
Hardware - CPU : Intel i9
Operating System : WSL 2
Riva Version : 2.11.0
TLT Version (if relevant)
How to reproduce the issue ? (This is for errors. Please share the command and the detailed log here)

I tried to add a custom vocabulary to transcribe audio in streaming mode, but it seems that the model is not built correctly: I get no output after transcription.

The model used is the English citrinet_1024. I also tried citrinet_512 and conformer.
The offline model seems to run perfectly.

Here is the command I used to build the model :

riva-build speech_recognition stt_en_citrinet_1024_VOC_stream_v2.rmir stt_en_citrinet_1024.riva --streaming=True --name=citrinet_1024_ADDED_VOC_stream_v2 --decoder_type=flashlight --decoding_language_model_binary=speechtotext_en_us_LM/mixed-lower.binary --decoding_vocab=vocab_lm.txt  --language=en-US --nn.use_trt_fp32 -f

Language model used: Riva ASR English LM, version deployable_v4.1.
The vocabulary file "vocab_lm.txt" contains a list of words, one word per line.
I also tried without the "--nn.use_trt_fp32" flag.
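
For anyone reproducing this, a minimal sketch of a check that the vocabulary file really is one word per line, as described above. The function name check_vocab_lines is hypothetical and not part of Riva, and the rules it applies (no empty lines, a single token per line, no padding, no duplicates) are assumptions about what a clean word list looks like, not documented riva-build requirements.

```python
def check_vocab_lines(lines):
    """Return (line_number, problem) pairs for a one-word-per-line vocabulary.

    The rules checked here are assumptions, not documented Riva requirements.
    """
    problems = []
    seen = set()
    for number, raw in enumerate(lines, start=1):
        word = raw.rstrip("\r\n")  # drop the line terminator only
        if not word.strip():
            problems.append((number, "empty line"))
            continue
        if len(word.split()) != 1:
            problems.append((number, "more than one token"))
            continue
        if word != word.strip():
            problems.append((number, "leading or trailing whitespace"))
        token = word.strip()
        if token in seen:
            problems.append((number, "duplicate word"))
        seen.add(token)
    return problems


if __name__ == "__main__":
    import sys

    # Defaults to the file name used in the build command above.
    path = sys.argv[1] if len(sys.argv) > 1 else "vocab_lm.txt"
    with open(path, encoding="utf-8") as handle:
        for number, problem in check_vocab_lines(handle):
            print(f"{path}:{number}: {problem}")
```

Running it against vocab_lm.txt prints one line per problem found and nothing if the file is clean. A clean file does not by itself explain the empty streaming output, but it rules out one possible cause.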

config.sh (12.7 KB)

rvinobha · July 17, 2023, 1:52pm

Hi @mel.adl,

Thanks for your interest in Riva.

Apologies for the delay.

Could you kindly share the custom vocabulary you used so that we can reproduce the issue?

Thanks

mel.adl · July 18, 2023, 7:23am

Hi @rvinobha,
Here is the custom vocabulary:
vocab_lm.txt (213 Bytes)

mel.adl · August 22, 2023, 1:52pm

Hello,

I'm still facing the issue when I try to build with decoder_type=flashlight.
Are there any updates?

Thank you
