Originally published at: Production Deep Learning Inference with TensorRT Inference Server | NVIDIA Technical Blog
In the video below, watch how TensorRT Inference Server can improve deep learning inference performance and production data center utilization. TensorRT Inference Server:

- Simplifies deploying AI inference
- Maximizes GPU utilization with concurrent execution of AI models (a configuration sketch follows below)
- Increases inference throughput and scales to peak loads

Whether it's performing object detection in images or video, recommending restaurants, or…
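To make the concurrent-execution point concrete, here is a minimal sketch of a model configuration in the server's `config.pbtxt` format. The model name, tensor names, shapes, and instance count are illustrative assumptions, not details from the post or video; the `instance_group` setting is what tells the server it may run multiple copies of a model side by side on a GPU.

```protobuf
# config.pbtxt -- a minimal sketch; "resnet50_plan" and the tensor
# names/shapes below are hypothetical, chosen only for illustration.
name: "resnet50_plan"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "probabilities"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
# Two instances of this model on GPU 0, so the server can execute
# two inference requests for it concurrently instead of serializing them.
instance_group [
  {
    count: 2
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
```

Raising `count` trades GPU memory for concurrency: each instance holds its own copy of the model, but requests no longer queue behind a single execution stream, which is how the server keeps the GPU busy under bursty load.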