NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1

Originally published at: https://developer.nvidia.com/blog/nvidia-blackwell-platform-sets-new-llm-inference-records-in-mlperf-inference-v4-1/

Large language model (LLM) inference is a full-stack challenge. High-throughput, low-latency inference requires powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a highly optimized inference engine.

MLPerf Inference v4.1 is the latest version of the popular and widely recognized MLPerf Inference benchmark suite, developed by the MLCommons consortium. The benchmark includes many popular…