Inverse of a 3x3 matrix

dscerutti · February 20, 2024, 9:05pm

Interesting! I can see the usefulness of what you are doing with a highly batched problem. In my case, what I need is a single 3x3 matrix inversion for the entire thread block to then use and do some work. The total number of inversions could, in theory, be in the thousands, but is much more likely to be in the tens to hundreds. The other critical piece of information is that the matrix data is already on the GPU, resident in its DRAM, and the result will be needed by the GPU afterwards. That’s the real reason why I’m not just performing the calculation on the CPU: offloading and then loading back is the same cost (though you might correct me on the technical details) as downloading the data, solving the inverse, and then putting the result back.

I’m curious about the cuBLAS setup: doesn’t it take a minimum matrix size of 32? One could pad or place ten 3x3 matrices as blocks along the diagonal of a 32x32 matrix, but at least in my case the point is to get one step out of the way rather than to compute a huge batch of results.
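
For the single-inverse case described in this post, one option that avoids a library call altogether is the closed-form adjugate (cofactor) inverse. The sketch below is only an illustration, not something proposed in the thread: the function name invert3x3, the row-major layout, double precision, and the absolute tolerance on the determinant are all assumptions. The HOST_DEVICE macro lets the same function compile as plain C++ or, under nvcc, as a __host__ __device__ function.

```cpp
// Hypothetical helper, not part of cuBLAS or any other library.
// Under nvcc the function is callable from both host and device code.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Invert a row-major 3x3 matrix `a` into `inv` with the adjugate formula:
//   inverse = transpose(cofactor matrix) / determinant.
// `a` and `inv` must not alias. Returns false, leaving `inv` untouched,
// when |det| <= tol (the absolute tolerance is an assumed choice).
HOST_DEVICE inline bool invert3x3(const double a[9], double inv[9],
                                  double tol = 1e-12) {
    // Cofactors of the first row, reused for the determinant.
    const double c00 = a[4] * a[8] - a[5] * a[7];
    const double c01 = a[5] * a[6] - a[3] * a[8];
    const double c02 = a[3] * a[7] - a[4] * a[6];
    const double det = a[0] * c00 + a[1] * c01 + a[2] * c02;
    if (det >= -tol && det <= tol) return false;  // treat as singular

    const double r = 1.0 / det;
    inv[0] = c00 * r;
    inv[1] = (a[2] * a[7] - a[1] * a[8]) * r;
    inv[2] = (a[1] * a[5] - a[2] * a[4]) * r;
    inv[3] = c01 * r;
    inv[4] = (a[0] * a[8] - a[2] * a[6]) * r;
    inv[5] = (a[2] * a[3] - a[0] * a[5]) * r;
    inv[6] = c02 * r;
    inv[7] = (a[1] * a[6] - a[0] * a[7]) * r;
    inv[8] = (a[0] * a[4] - a[1] * a[3]) * r;
    return true;
}
```

Inside a kernel, one plausible pattern is to have a single thread (for example threadIdx.x == 0) call this on the matrix and write the result to a __shared__ array, then call __syncthreads() before the rest of the block reads it. For poorly conditioned matrices a pivoted LU factorization would be the more robust choice.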

Robert_Crovella · February 20, 2024, 9:13pm

The maximum matrix size is 32. The documentation states:

This function is a short cut of cublas<t>getrfBatched plus cublas<t>getriBatched. However it doesn’t work if n is greater than 32. If not, the user has to go through cublas<t>getrfBatched and cublas<t>getriBatched.

If n (the matrix side dimension) is larger than 32, you can still do a batched inverse, but the methodology is to use getrfBatched and getriBatched together, like this.

dscerutti · February 20, 2024, 9:21pm

Thanks. Sorry, I had read that but thought that n referred to the batch size, not the matrix size itself. The perils of working on too little sleep…
