Optimizing and Serving Models with NVIDIA TensorRT and NVIDIA Triton

Originally published at: https://developer.nvidia.com/blog/optimizing-and-serving-models-with-nvidia-tensorrt-and-nvidia-triton/

Learn how to optimize models from TensorFlow, PyTorch, or any other framework with NVIDIA TensorRT, and then deploy and serve them at scale with NVIDIA Triton. A minimal sketch of that pipeline follows below.
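To give a rough feel for the end of that pipeline, here is a minimal client-side sketch, not the blog's exact setup: it assumes an ONNX model has already been converted to a TensorRT engine (for example with `trtexec --onnx=model.onnx --saveEngine=model.plan --fp16`), placed in a Triton model repository under the hypothetical name `my_model` with an input tensor named `input` and an output tensor named `output`, and that a Triton server is running locally with HTTP on port 8000.

```python
# Sketch of querying a TensorRT-optimized model served by Triton over HTTP.
# Assumptions (hypothetical, adjust to your deployment): model name "my_model",
# input tensor "input" of shape [1, 3, 224, 224] in FP32, output tensor "output".
import numpy as np
import tritonclient.http as httpclient

# Connect to the locally running Triton server (default HTTP port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Prepare a dummy input batch matching the model's expected shape and dtype.
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("input", batch.shape, "FP32")
infer_input.set_data_from_numpy(batch)

# Request the named output tensor and run inference.
requested_output = httpclient.InferRequestedOutput("output")
result = client.infer(
    model_name="my_model",
    inputs=[infer_input],
    outputs=[requested_output],
)

# Retrieve the result as a NumPy array.
print(result.as_numpy("output").shape)
```

The same request could also be sent over gRPC via `tritonclient.grpc`; the HTTP client is used here only to keep the sketch short.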

I hope this blog helps readers build an intuition for how to use TensorRT and Triton together in a pipeline. Feel free to reach out with any follow-up questions!