Develop ML and AI with Metaflow and Deploy with NVIDIA Triton Inference Server

Originally published at:

There are many ways to deploy ML models to production. Sometimes, a model is run once per day to refresh forecasts in a database. Sometimes, it powers a small-scale but critical decision-making dashboard or speech-to-text on a mobile device. These days, the model can also be a custom large language model (LLM) backing a novel…

Thanks to the Outerbounds and NVIDIA teams for making it easier for developers to build and deploy ML/AI solutions! Thanks to Outerbounds for writing this blog post, and to the numerous NVIDIA team members who gave their valuable input and helped it go live. If you have any questions or comments, let us know.