Deploying NVIDIA Triton at Scale with MIG and Kubernetes

Originally published at: Deploying NVIDIA Triton at Scale with MIG and Kubernetes | NVIDIA Developer Blog

NVIDIA Triton can serve any number and mix of models, limited only by the system's available disk and memory. It also supports multiple deep-learning frameworks, including TensorFlow, PyTorch, and NVIDIA TensorRT, so developers and data scientists are no longer tied to a single framework. NVIDIA Triton is designed to integrate easily with Kubernetes for large-scale deployment in the data center.
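To illustrate how Triton manages a mix of models, each model in its repository is described by a small `config.pbtxt` file that names the backend framework and the input/output tensors. The sketch below is a minimal, hypothetical example (the model name `resnet50_trt`, tensor names, and shapes are illustrative and not from the original article):

```
# config.pbtxt for a hypothetical TensorRT model named "resnet50_trt"
name: "resnet50_trt"
platform: "tensorrt_plan"     # a PyTorch model would use "pytorch_libtorch" instead
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

This file would sit at `model_repository/resnet50_trt/config.pbtxt`, with the serialized model in a numbered version subdirectory (for example, `model_repository/resnet50_trt/1/model.plan`). A single repository can hold models from different frameworks side by side, which is what lets one Triton instance serve a heterogeneous mix.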