SC20 Demo: Maximizing Performance for Distributed Machine Learning and Deep Learning with SHARP

Originally published at: https://developer.nvidia.com/blog/sc20-demo-maximizing-performance-for-distributed-machine-learning-and-deep-learning-with-sharp/

Modern machine learning data centers require complex computations and fast, efficient data delivery. The NVIDIA Mellanox Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) takes advantage of the in-network computing capabilities in the NVIDIA Mellanox Quantum switch, dramatically improving the performance of distributed machine learning workloads. SHARP technology improves upon the performance of MPI and…