ASR with hotwords

I am trying to figure out how to create a language model with support for hot words. Hot words (also called speech context or speech boost) are a great feature for dynamically tuning ASR to a given context in a voice assistant. For instance, suppose you want to authorize a person by their surname, and you know with some probability (based on an image or a phone number) that the surname is “Lysek”. You pass it to speech boost (Google ASR) and it will very likely be transcribed correctly, whereas without the boost it is transcribed as “Lesek”. A custom LM is one solution, but it is not suitable for every use case.

I have two questions:

Thanks for the great work!

Hi @tomas.lysek ,
Please allow us some time to check on this.

Hey, did anyone get back to you about your query regarding obtaining logits? I think you could hit Triton directly and maybe have it output the logits, but I haven’t fleshed it out yet myself.

Regarding hotwords, in case you haven’t seen the update yet: it has now been added in the 1.8 release. See the SpeechContext message type in riva/proto/riva_asr.proto.

Thanks for updating here @ShantanuNair and sorry for the delay in getting back to you @Tomas.lysek. Welcome to the Riva forum.

In Riva 1.8 we added limited support to boost likelihood of words already in the vocabulary. An upcoming release will allow boosting of any arbitrary words at request time.
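
To make the idea of boosting an in-vocabulary word concrete, here is a toy sketch in plain Python. It is not how Riva implements the feature (a real decoder would normally apply the boost during beam search, not afterwards), and the function name, scores, and boost value are all made up for illustration. It only shows how adding a bonus to hypotheses that contain a boosted word can change which transcript wins.

```python
from typing import List, Set, Tuple


def rescore_with_boost(
    hypotheses: List[Tuple[str, float]],
    boosted_words: Set[str],
    boost: float,
) -> List[Tuple[str, float]]:
    """Add `boost` to a hypothesis score once per boosted word it contains.

    `hypotheses` is a list of (transcript, log-score) pairs; higher is better.
    `boosted_words` must be lowercase. Returns the list sorted best-first.
    """
    rescored = []
    for text, score in hypotheses:
        hits = sum(1 for word in text.lower().split() if word in boosted_words)
        rescored.append((text, score + boost * hits))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical n-best list: the acoustically preferred transcript is wrong.
hypotheses = [
    ("my surname is lesek", -4.0),
    ("my surname is lysek", -5.5),
]

best_plain = max(hypotheses, key=lambda pair: pair[1])[0]
best_boosted = rescore_with_boost(hypotheses, {"lysek"}, boost=3.0)[0][0]

print(best_plain)
print(best_boosted)
```

Note that this only helps when the boosted word can be produced by the decoder at all, which matches the in-vocabulary restriction described above.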

We currently do not support returning the logits directly.

Hi @rleary, this hasn’t been updated for 1.9b, has it? Also, I ran streamingRecognize() with a SpeechContext containing a single phrase of two words, both in the vocabulary. It was something like “Rocket ship”, and it said that it is out of vocabulary.

So does it only work for individual words and not yet for phrases, or is this unexpected behavior?

22.02 will add support for out-of-vocabulary terms and should post by the end of this month. Phrase support is currently on the backlog and will be considered based on feedback on the word boosting feature. I’ve noted your request!
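
Until phrases are supported, one possible stopgap is to split each phrase into single words on the client side and only send the words that are in the vocabulary. The sketch below is plain Python and does not call Riva. The helper name `split_boost_phrases` and the `vocabulary` set are hypothetical, and it assumes you have a copy of the model's word list available locally. Whether boosting the individual words gives the effect you want for the full phrase is something you would need to test.

```python
from typing import Iterable, List, Set, Tuple


def split_boost_phrases(
    phrases: Iterable[str],
    vocabulary: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split phrases into unique lowercase words and sort them into
    (in-vocabulary, out-of-vocabulary) lists, preserving first-seen order."""
    in_vocab: List[str] = []
    out_of_vocab: List[str] = []
    seen: Set[str] = set()
    for phrase in phrases:
        for word in phrase.lower().split():
            if word in seen:
                continue  # each word is only reported once
            seen.add(word)
            if word in vocabulary:
                in_vocab.append(word)
            else:
                out_of_vocab.append(word)
    return in_vocab, out_of_vocab


# Hypothetical vocabulary; in practice this would come from the deployed model.
vocabulary = {"rocket", "ship", "launch"}

boostable, rejected = split_boost_phrases(["Rocket ship", "Lysek"], vocabulary)
print(boostable)
print(rejected)
```

The words in `boostable` could then be placed one per entry in the phrases of a SpeechContext, while the words in `rejected` would have to wait for out-of-vocabulary support.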
