Maximum Performance and Minimum Footprint for AI Apps with NVIDIA TensorRT Weight-Stripped Engines

Originally published at: https://developer.nvidia.com/blog/maximum-performance-and-minimum-footprint-for-ai-apps-with-nvidia-tensorrt-weight-stripped-engines/

NVIDIA TensorRT, an established inference library for data centers, has rapidly emerged as a desirable inference backend for NVIDIA GeForce RTX and NVIDIA RTX GPUs. Now, deploying TensorRT into apps has become even easier with prebuilt TensorRT engines. The newly released TensorRT 10.0 with weight-stripped engines offers a unique solution for minimizing the engine shipment…