Originally published at: Accelerating TensorFlow on NVIDIA A100 GPUs | NVIDIA Technical Blog
The NVIDIA A100, based on the NVIDIA Ampere GPU architecture, offers a suite of exciting new features: third-generation Tensor Cores, Multi-Instance GPU (MIG) and third-generation NVLink. Ampere Tensor Cores introduce a novel math mode dedicated for AI training: the TensorFloat-32 (TF32). TF32 is designed to accelerate the processing of FP32 data types, commonly used in…