ASR with hotwords

tomas.lysek · October 2, 2021, 7:57am

Hi,
I am trying to figure out how to create a language model that supports hot words. Hot words (also called speech context or speech boost) are a great feature for dynamically tuning ASR to a given context in a voice assistant. For instance, suppose you want to authorize a person by surname and you already know with some probability (based on an image or a phone number) that the surname is “Lysek”. You pass it as a speech boost (as in Google ASR) and it will very likely be transcribed correctly, whereas without the boost it comes out as “Lesek”. A custom LM is one solution, but it is not suitable for every use case.
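
To make the idea concrete, here is a toy sketch of hot-word boosting as a rescoring step over an n-best list. It is not how Riva, Google ASR, or pyctcdecode implement it; the function name `rescore_with_hotwords`, the `boost` value, and the hypothesis scores are all made up for illustration.

```python
def rescore_with_hotwords(hypotheses, hotwords, boost=2.0):
    """Re-rank (text, log_score) pairs, adding `boost` per matched hot word.

    Toy illustration only: real decoders typically apply the boost
    inside beam search, not as a post-processing step.
    """
    hot = {word.lower() for word in hotwords}
    rescored = []
    for text, log_score in hypotheses:
        # Count how many words of this hypothesis are hot words.
        hits = sum(1 for word in text.lower().split() if word in hot)
        rescored.append((text, log_score + boost * hits))
    # Highest score first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Hypothetical n-best list with invented log scores.
n_best = [
    ("my surname is lesek", -4.1),
    ("my surname is lysek", -4.9),
]

best_plain = max(n_best, key=lambda item: item[1])[0]
best_boosted = rescore_with_hotwords(n_best, ["Lysek"])[0][0]
```

Without the boost the higher-scoring “lesek” hypothesis wins; with “Lysek” supplied as a hot word, the second hypothesis gains enough score to be ranked first.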

I have two questions:

1. Is this hotword feature (with BPE and a KenLM model) on your roadmap? If yes, but only in the far future, will it be based on something open source like NeMo, where we can contribute?
2. Is it possible to get the model's logits from the Riva pipeline, so they can be used with pyctcdecode (kensho-technologies/pyctcdecode on GitHub, a fast and lightweight Python-based CTC beam search decoder for speech recognition)?

Thanks for the great work!
Tomas

AakankshaS · October 4, 2021, 4:44pm

Hi @tomas.lysek ,
Please allow us some time to check on this.
Thanks!

ShantanuNair · December 21, 2021, 12:39am

Hey, did anyone get back to you about your query regarding obtaining logits? I think you could hit Triton directly and maybe have it output the logits, but I haven’t fleshed it out yet myself.

Regarding hotwords, in case you haven’t seen it yet: support was added in the 1.8 release. See the SpeechContext message type in riva/proto/riva_asr.proto.

rleary · December 22, 2021, 3:21pm

Thanks for updating here @ShantanuNair and sorry for the delay in getting back to you @Tomas.lysek. Welcome to the Riva forum.

In Riva 1.8 we added limited support for boosting the likelihood of words that are already in the vocabulary. An upcoming release will allow boosting of arbitrary words at request time.

We currently do not support returning the logits directly.

ShantanuNair · February 10, 2022, 12:06pm

Hi @rleary, this hasn’t been updated for 1.9b, has it? Also, I ran streamingRecognize() with a SpeechContext containing a single two-word phrase, with both words in the vocabulary (something like “Rocket ship”), and it reported the phrase as out of vocabulary.

So does it only work for individual words and not yet for phrases, or is this unexpected behavior?

rleary · February 11, 2022, 5:05pm

22.02 will add support for out-of-vocabulary terms and should post by the end of this month. Phrase support is currently on the backlog and will be considered based on feedback on the word boosting feature. I’ve noted your request!
