Getting the Most Out of NVIDIA T4 on AWS G4 Instances

Originally published at:

With the continued growth of AI models and data sets and the rise of real-time applications, getting optimal inference performance has never been more important. In this post, you learn how to get the best natural language inference performance from AWS G4dn instances powered by NVIDIA T4 GPUs, and how to deploy BERT networks easily using NVIDIA Triton Inference Server.