Please provide the following information when requesting support.
Hardware - GPU: T4
Operating System: Ubuntu 20.04
Riva Version: 2.10
I'm using the Conformer-CTC models in Spanish (es-US) for streaming recognition over a telephone line. However, when there is background noise, spurious words appear in the ASR transcriptions. The release notes mention an option to use the neural-based voice activity detector to avoid this problem. How can I use it? Is there any other way to suppress the noise without fine-tuning?
Thanks.