Originally published at: TensorRT 3: Faster TensorFlow Inference and Volta Support | NVIDIA Technical Blog
NVIDIA TensorRT™ is a high-performance deep learning inference optimizer and runtime that delivers low-latency, high-throughput inference for deep learning applications. NVIDIA released TensorRT last year with the goal of accelerating deep learning inference for production deployment. A new NVIDIA Developer Blog post introduces TensorRT 3, which improves performance over previous versions and adds new features that make it easier…
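As a point of reference for how the optimizer and runtime fit together, the typical workflow is to parse a trained model into a TensorRT network, build an optimized engine, and then run inference through that engine. Below is a minimal, hypothetical sketch of that flow using a recent TensorRT Python API and its ONNX parser; note that the TensorRT 3 release this post announces used a different, UFF-based TensorFlow importer, and the file names here are placeholders.

```python
import tensorrt as trt

# Minimal sketch (assumes a recent TensorRT Python API, not the TensorRT 3 API):
# parse a trained model, build an optimized engine, and serialize it for deployment.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:              # placeholder model file
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)            # reduced precision, if the GPU supports it

serialized_engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:            # engine file consumed by the runtime
    f.write(serialized_engine)
```

At deployment time, the serialized engine is loaded by the TensorRT runtime and executed, which is where the low-latency, high-throughput inference described above comes from.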