Originally published at: NVIDIA Blackwell Delivers Massive Performance Leaps in MLPerf Inference v5.0 | NVIDIA Technical Blog
The compute demands of large language model (LLM) inference are growing rapidly, fueled by larger model sizes, real-time latency requirements, and, most recently, AI reasoning. At the same time, as AI adoption grows, the ability of an AI factory to serve as many users as possible, all while maintaining good per-user experiences,…