GTC 2020: Distributed Training and Fast Inter-GPU communication with NCCL

GTC 2020 S21107
Presenter: Sylvain Jeaugey, NVIDIA
Abstract
NCCL, the NVIDIA Collective Communication Library, is used by all major deep learning frameworks to distribute computation across multiple GPUs, allowing users to train very large networks in minutes instead of weeks. In this session, we will present how NCCL combines hardware technologies such as NVLink, PCIe, Ethernet, and InfiniBand to achieve maximum inter-GPU communication speed. We will detail how these technologies compare and how much of a difference they make for users. We will also describe how we continue to innovate to accelerate distributed GPU computing and support new models.
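To make the library's role concrete, below is a minimal sketch (not taken from the session) of NCCL's single-process, multi-GPU path: ncclCommInitAll creates one communicator per device, and ncclAllReduce sums a buffer across all GPUs. The device count, buffer size, and omission of error checking are assumptions made for brevity.

/* Minimal NCCL all-reduce sketch: one process driving several GPUs. */
#include <cuda_runtime.h>
#include <nccl.h>

int main(void) {
  const int nDev = 2;            /* assumed: two visible GPUs */
  const size_t count = 1 << 20;  /* assumed: 1M floats per GPU */
  int devs[2] = {0, 1};
  ncclComm_t comms[2];
  float* buf[2];
  cudaStream_t streams[2];

  /* Allocate a buffer and a stream on each device. */
  for (int i = 0; i < nDev; ++i) {
    cudaSetDevice(devs[i]);
    cudaMalloc((void**)&buf[i], count * sizeof(float));
    cudaStreamCreate(&streams[i]);
  }

  /* Create one communicator per GPU within this process. */
  ncclCommInitAll(comms, nDev, devs);

  /* In-place sum across all GPUs; per-device calls for one collective
     are grouped so NCCL launches them together without deadlocking. */
  ncclGroupStart();
  for (int i = 0; i < nDev; ++i)
    ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();

  /* Wait for completion, then clean up. */
  for (int i = 0; i < nDev; ++i) {
    cudaSetDevice(devs[i]);
    cudaStreamSynchronize(streams[i]);
    cudaFree(buf[i]);
    ncclCommDestroy(comms[i]);
  }
  return 0;
}

In multi-node training, each rank would instead call ncclCommInitRank with a shared ncclUniqueId, but the collective call itself is unchanged; NCCL picks the fastest available transport (NVLink, PCIe, Ethernet, or InfiniBand) automatically.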
