NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support

Originally published at: https://developer.nvidia.com/blog/nvidia-tensorrt-model-optimizer-v0-15-boosts-inference-performance-and-expands-model-support/

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a library of state-of-the-art model optimization techniques including quantization, sparsity, and pruning. These techniques reduce model complexity, enabling downstream inference frameworks such as NVIDIA TensorRT-LLM and NVIDIA TensorRT to more efficiently optimize the inference speed of generative AI models. This post outlines…