Boosting Matrix Multiplication Speed and Flexibility with NVIDIA cuBLAS 12.9

jwitsoe · May 1, 2025, 8:00pm

Originally published at: https://developer.nvidia.com/blog/boosting-matrix-multiplication-speed-and-flexibility-with-nvidia-cublas-12-9/

The NVIDIA CUDA-X math libraries empower developers to build accelerated applications for AI, scientific computing, data processing, and more. Two of the most important applications of CUDA-X libraries are LLM training and inference, whether for use in everyday consumer applications or highly specialized scientific domains like drug discovery. Multiple CUDA-X libraries are indispensable for efficiently…

