Deploying Models from TensorFlow Model Zoo Using NVIDIA DeepStream and NVIDIA Triton Inference Server

Originally published at:

If you’re building a unique AI/DL application, you are constantly looking to train and deploy AI models from various frameworks like TensorFlow, PyTorch, TensorRT, and others quickly and effectively. Whether it’s deployment using the cloud, datacenters, or the edge, NVIDIA Triton Inference Server enables developers to deploy trained models from any major framework such as TensorFlow,…

Hi, this is Dhruv. I hope the blog was instructive. I use Triton Inference Server very often to deploy models, both for simple tests and in production. Because it is framework agnostic, it’s also really useful for testing off-the-shelf models for latency/performance and accuracy to make sure they’ll meet my needs. With the integration of Triton with DeepStream, these abilities are now available on NVIDIA dGPU and NVIDIA Jetson, along with streaming video and edge-to-cloud features. While this blog focuses on deepstream-app as a turnkey solution for IVA, the nvinferserver GStreamer plugin can be used for most models. Furthermore, TF-TRT allows for easy performance optimization with minimal time spent creating a TensorRT plan, so you can prototype quickly and see what kind of low-hanging fruit is available to improve performance. Good luck with your IVA projects!