Originally published at: Speeding Up Text-To-Speech Diffusion Models by Distillation | NVIDIA Technical Blog
Every year, as part of their coursework, students from the University of Warsaw, Poland, work under the supervision of engineers from the NVIDIA Warsaw office on challenging problems in deep learning and accelerated computing. We present the work of three M.Sc. students—Alicja Ziarko, Paweł Pawlik, and Michał Siennicki—who managed to significantly reduce the…
Reducing the number of distillation steps for a 5x latency speedup without compromising speech quality is impressive, but will it be enough for real-time diffusion TTS applications? Can we reach that level? If so, how?
Is there a code repository available?