NVIDIA AI Inference Performance Milestones: Delivering Leading Throughput, Latency and Efficiency

Originally published at: https://developer.nvidia.com/blog/nvidia-ai-inference-performance-milestones-delivering-leading-throughput-latency-and-efficiency/

Inference is where AI-based applications really go to work. Object recognition, image classification, natural language processing, and recommendation engines are but a few of the growing number of applications made smarter by AI. NVIDIA recently released TensorRT 5, the latest version of its inference optimizer and runtime. This version brings new features including support for…