Accelerating TensorFlow on NVIDIA A100 GPUs

jwitsoe · August 25, 2020, 11:53pm

Originally published at: Accelerating TensorFlow on NVIDIA A100 GPUs | NVIDIA Technical Blog

The NVIDIA A100, based on the NVIDIA Ampere GPU architecture, offers a suite of new features: third-generation Tensor Cores, Multi-Instance GPU (MIG), and third-generation NVLink. Ampere Tensor Cores introduce a new math mode dedicated to AI training: TensorFloat-32 (TF32). TF32 is designed to accelerate the processing of FP32 data types, commonly used in…
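To make the TF32 idea a little more concrete, here is a minimal sketch in plain Python of what reduced-precision inputs look like. It relies on the published TF32 layout (1 sign bit, 8 exponent bits as in FP32, 10 mantissa bits instead of 23). The helper name `round_to_tf32` is hypothetical, and the round-to-nearest-even step is an assumption made for illustration; it is not claimed to be bit-exact with what the A100 hardware does.

```python
import struct

FP32_MANTISSA_BITS = 23
TF32_MANTISSA_BITS = 10
DROPPED_BITS = FP32_MANTISSA_BITS - TF32_MANTISSA_BITS  # 13 low bits are discarded


def round_to_tf32(x: float) -> float:
    """Emulate TF32 input precision on an FP32 value (illustrative only).

    Hypothetical helper: keeps the FP32 sign and 8-bit exponent, and rounds
    the mantissa to 10 bits using round-to-nearest-even (an assumption).
    """
    # Reinterpret the value as its 32-bit FP32 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # Leave infinities and NaNs untouched (exponent field all ones).
    if (bits & 0x7F800000) == 0x7F800000:
        return struct.unpack("<f", struct.pack("<I", bits))[0]

    # Round to nearest, ties to even, at the 10-bit mantissa boundary.
    half = 1 << (DROPPED_BITS - 1)
    lsb = (bits >> DROPPED_BITS) & 1
    bits = (bits + half - 1 + lsb) & 0xFFFFFFFF

    # Clear the 13 mantissa bits that TF32 does not keep.
    bits &= ~((1 << DROPPED_BITS) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-12, 3.14159265):
        print(value, "->", round_to_tf32(value))
```

Values that need more than 10 mantissa bits lose their low-order detail, while the exponent range stays the same as FP32. This is only a way to get a feel for the precision of the inputs, not a model of the full Tensor Core computation.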
