Can NVIDIA Riva's text-to-speech feature be enhanced to include speech disfluencies such as "hmm" and "uh", so that the generated speech sounds more natural, particularly while the system is processing calculations?
Hi @ro.goab
Thanks for your interest in Riva.
Apologies, this is not currently supported.
For speech recognition, we have a word boosting capability:
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/asr-improve-recognition-for-specific-words.html
Thanks