GTC 2020: High Performance Distributed Deep Learning: A Beginner's Guide

GTC 2020 S21546
Presenters: Ammar Ahmad Awan, Ohio State University; Dhabaleswar K. (DK) Panda, Ohio State University; Hari Subramoni, Ohio State University
Abstract
Learn about the current wave of advances in AI and HPC technologies that improve the performance of deep neural network (DNN) training on NVIDIA GPUs. We’ll discuss many exciting challenges and opportunities for HPC and AI researchers. Several modern DL frameworks (Caffe, TensorFlow, CNTK, PyTorch, and others) have emerged that offer ease of use and flexibility to describe, train, and deploy various types of DNN architectures. We’ll provide an overview of interesting trends in DL frameworks from an architectural/performance standpoint. Most DL frameworks have relied on a single GPU to accelerate DNN training and inference, but approaches to parallelize training are being actively explored. We’ll highlight new challenges for message-passing interface (MPI) runtimes to efficiently support DNN training, and show how efficient communication primitives in MVAPICH2-GDR can support scalable DNN training. Finally, we’ll discuss how we scaled TensorFlow-based training of ResNet-50 to 1,536 GPUs using MVAPICH2-GDR.
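To make the data-parallel setup described in the abstract concrete, below is a minimal sketch of distributed ResNet-50 training in TensorFlow using Horovod, whose gradient Allreduce can be backed by an MPI library such as MVAPICH2-GDR. This is an illustrative assumption of the general approach, not the presenters' exact training recipe; the synthetic dataset, batch size, and learning-rate scaling are placeholders.

```python
# Minimal sketch: data-parallel ResNet-50 training with Horovod over MPI.
# An MPI library such as MVAPICH2-GDR would provide the underlying Allreduce.
# Launch with, e.g.: mpirun -np <num_gpus> python train_resnet50.py
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()

# Pin each MPI rank to one local GPU.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_visible_devices(gpus[hvd.local_rank()], 'GPU')

model = tf.keras.applications.ResNet50(weights=None, classes=1000)

# Scale the learning rate by the number of ranks and wrap the optimizer so
# gradients are averaged across GPUs each step (illustrative values).
opt = tf.keras.optimizers.SGD(learning_rate=0.0125 * hvd.size(), momentum=0.9)
opt = hvd.DistributedOptimizer(opt)

model.compile(loss='sparse_categorical_crossentropy', optimizer=opt)

# Synthetic data for illustration only; a real run would use ImageNet.
images = tf.random.uniform([32, 224, 224, 3])
labels = tf.random.uniform([32], maxval=1000, dtype=tf.int64)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).repeat().batch(32)

callbacks = [
    # Broadcast initial weights from rank 0 so all ranks start identically.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

model.fit(dataset, steps_per_epoch=100, epochs=1, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```

In this style of data-parallel training, each GPU holds a full model replica and processes its own shard of the batch; the communication runtime (here, MPI via Horovod) determines how well the per-step gradient Allreduce scales as the GPU count grows toward the 1,536 GPUs mentioned above.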

Watch this session