Benchmarking LLMs on AI-Generated CUDA Code with ComputeEval 2025.2

Originally published at: Benchmarking LLMs on AI-Generated CUDA Code with ComputeEval 2025.2 | NVIDIA Technical Blog

Can AI coding assistants write efficient CUDA code? To help measure and improve their capabilities, we created ComputeEval, a robust, open source benchmark for evaluating AI models and agents on CUDA programming tasks. A few months ago, we announced the first release of ComputeEval, and today we're introducing its first major expansion by adding more…