cuBLAS matrix multiplication of a batch

jul1a.berger · May 2, 2020, 11:56am

Hi,
I would like to process a batch of small matrices in CUDA. So far, I have used the gemmStridedBatched routines from cuBLAS. Now I have an array containing a batch of many “concatenated” small matrices in column-major format, as is typical in cuBLAS. I would like to calculate the matrix product of all these matrices in one array, i.e. reduce all M N×N matrices to a single N×N matrix by calculating A1 * A2 * … * AM. Is there a way to do something like that in cuBLAS? I also looked at cuBLASLt, but I couldn’t see how to use the operation descriptor formalism to achieve this reduction. Do you have any hints for me?

Julia
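
For illustration, one possible way to express the reduction described above is a pairwise (tree) reduction: since matrix multiplication is associative, neighbouring matrices can be multiplied in batches, halving the batch each round while keeping the left-to-right order. The sketch below is a CPU stand-in in plain C++, not cuBLAS code; `gemmStridedBatchedCpu` and `chainProduct` are hypothetical names, and it assumes square N×N column-major matrices stored back to back. Each call to the helper would presumably correspond to one `cublasDgemmStridedBatched` call with `lda = ldb = ldc = N`, input strides of `2*N*N`, an output stride of `N*N`, and a batch count of half the current number of matrices.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// CPU stand-in (hypothetical helper) for a strided batched GEMM on square
// n x n column-major matrices: C_i = A_i * B_i for i in [0, batchCount).
void gemmStridedBatchedCpu(int n,
                           const double* A, std::size_t strideA,
                           const double* B, std::size_t strideB,
                           double* C, std::size_t strideC,
                           int batchCount) {
    for (int b = 0; b < batchCount; ++b) {
        const double* a  = A + b * strideA;
        const double* bm = B + b * strideB;
        double* c        = C + b * strideC;
        for (int col = 0; col < n; ++col) {
            for (int row = 0; row < n; ++row) {
                double sum = 0.0;
                for (int k = 0; k < n; ++k) {
                    sum += a[row + k * n] * bm[k + col * n];
                }
                c[row + col * n] = sum;
            }
        }
    }
}

// Returns A1 * A2 * ... * AM for `count` >= 1 matrices stored back to back
// in `cur`. Each round multiplies neighbours (A1*A2, A3*A4, ...) in one
// batched call, so about log2(M) batched calls are needed in total.
std::vector<double> chainProduct(std::vector<double> cur, int n, int count) {
    const std::size_t nn = static_cast<std::size_t>(n) * n;
    std::vector<double> next(cur.size());  // ping-pong buffer
    while (count > 1) {
        const int pairs = count / 2;
        // Left operands start at matrix 0, right operands at matrix 1;
        // an input stride of 2*nn steps from one pair to the next.
        gemmStridedBatchedCpu(n,
                              cur.data(),      2 * nn,
                              cur.data() + nn, 2 * nn,
                              next.data(),     nn,
                              pairs);
        if (count % 2 == 1) {
            // Odd count: carry the unpaired last matrix over unchanged.
            std::copy(cur.data() + (count - 1) * nn,
                      cur.data() + count * nn,
                      next.data() + pairs * nn);
        }
        count = pairs + count % 2;
        std::swap(cur, next);
    }
    cur.resize(nn);  // only the first matrix holds the result
    return cur;
}
```

Calling `chainProduct(batch, N, M)` on the concatenated array yields the single N×N product in column-major order. On the GPU the two buffers would be device allocations and the odd-count copy a device-to-device copy; whether this beats a custom kernel for very small N is not something this sketch can tell.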
