Getting a Real Time Factor Over 60 for Text-To-Speech Services Using NVIDIA Jarvis

Originally published at: https://developer.nvidia.com/blog/getting-real-time-factor-over-60-for-text-to-speech-using-jarvis/

Figure 1. The Jarvis Server and the TTS pipeline.

NVIDIA Jarvis is an application framework that provides several pipelines for accomplishing conversational AI tasks. Text-to-speech (TTS), which means generating high-quality, natural-sounding speech from text with low latency, can be one of the most computationally challenging of those tasks. In this post, we focus on…