Simplifying and Scaling Inference Serving with NVIDIA Triton 2.3

Originally published at: https://developer.nvidia.com/blog/simplifying-and-scaling-inference-serving-with-triton-2-3/

AI, machine learning (ML), and deep learning (DL) are effective tools for solving diverse computing problems such as product recommendations, customer interactions, financial risk assessment, and manufacturing defect detection. Deploying an AI model in production, known as inference serving, is the most complex part of incorporating AI into applications. Triton Inference Server takes care of…