GTC 2020: Overcoming Latency Barriers: Strong Scaling HPC Applications with NVSHMEM

GTC 2020 S21673
Presenters: Mathias Wagner, NVIDIA
Abstract
For scientific advancement through HPC, ever-increasing simulation capabilities are not the only key to success. Obtaining timely results is often even more important. Reducing the time-to-solution generally requires good strong scaling from the application. However, scaling up improved single-GPU performance faces many obstacles. We’ll show you how to improve strong scaling on systems equipped with NVIDIA GPUs by avoiding or hiding latencies through GPU-centric communication with NVSHMEM, an implementation of OpenSHMEM for GPUs. After introducing NVSHMEM, we’ll share best practices gathered from using NVSHMEM in QUDA, a library for Lattice QCD on GPUs used by codes such as MILC and Chroma. We’ll show results obtained on fat-GPU nodes like DGX-1/2, as well as results from scaling to 1,000 GPUs on InfiniBand-connected systems, including Summit.
