Simplifying AI Inference in Production with NVIDIA Triton

Originally published at: Simplifying AI Inference in Production with NVIDIA Triton | NVIDIA Developer Blog

AI and machine learning are unlocking breakthrough applications in fields such as online product recommendations, image classification, chatbots, forecasting, and manufacturing quality inspection. There are two parts to AI: training and inference. Inference is the production phase of AI. The trained model and associated code are deployed in the data center or public cloud, or at…