GTC 2020: Scaling Deep Learning for Automatic Speech Recognition

GTC 2020 S21838
Presenters: Jacob Kahn, Facebook; Vineel Pratap, Facebook; Vitaliy Liptchinsky, Facebook AI Research
Abstract
We’ll discuss the challenges of scaling automatic speech recognition (ASR) workloads with wav2letter++, a fast C++ toolkit for ASR. We’ll introduce the distributed training techniques used to achieve near-linear scalability and compare wav2letter++ with other popular ASR toolkits. The constant increase in model and dataset sizes, along with current trends toward unsupervised and semi-supervised learning, requires squeezing out every bit of performance. Beyond distributed training, we’ll cover other approaches for faster training and for training large models.
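
The near-linear scalability the session highlights typically comes from synchronous data-parallel training: each worker computes gradients on its own shard of a batch, and the gradients are averaged across workers with an allreduce before every update. The sketch below only illustrates that pattern with plain MPI; it is not wav2letter++ code, and the model, gradients, and learning rate are toy placeholders.

```cpp
// Minimal sketch of synchronous data-parallel SGD via gradient allreduce.
// Illustrative only -- not wav2letter++'s implementation. Assumes an MPI
// installation (e.g. mpic++ / mpirun); model and gradients are toy values.
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank = 0, worldSize = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &worldSize);

  const int nParams = 4;                    // toy model size
  std::vector<double> params(nParams, 0.0); // identical initial weights on all ranks
  const double lr = 0.1;

  for (int step = 0; step < 3; ++step) {
    // Each worker computes gradients on its own shard of the batch
    // (here: a fake gradient that just depends on the rank).
    std::vector<double> grad(nParams, 1.0 + rank);

    // One allreduce per step: sum gradients across workers, then average.
    std::vector<double> avgGrad(nParams, 0.0);
    MPI_Allreduce(grad.data(), avgGrad.data(), nParams, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);
    for (int i = 0; i < nParams; ++i) {
      avgGrad[i] /= worldSize;
      params[i] -= lr * avgGrad[i];         // same SGD update on every rank
    }
  }

  if (rank == 0) {
    std::printf("params[0] after 3 steps: %f\n", params[0]);
  }
  MPI_Finalize();
  return 0;
}
```

Under those assumptions it can be built and launched with something like `mpic++ allreduce_sgd.cpp -o allreduce_sgd && mpirun -np 4 ./allreduce_sgd`. Because the per-step communication is a single allreduce, throughput grows close to linearly with the number of workers as long as that collective stays cheap relative to the compute, which is the scaling regime the talk addresses.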
