Matrix Inversion source


I want to share my matrix inversion code. It is based on a blocked Gauss elimination algorithm, which I found in a presentation ("Gausselimination") by Christian Heinrich. I built the rest to invert a matrix.

This is one of my first CUDA projects; maybe there are already better implementations, but I didn't find one.

I used the VS Project Wizard 1.2 with a custom cuda.rules file (included in the zip). A test environment for MATLAB is also included, so you can easily load it as a library.

The results were as expected: I got a performance boost once the matrix size is larger than 128x128.

I need this algorithm to solve a matrix equation, so if you have a faster one, please tell me =)
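For readers who want a starting point: below is a minimal unblocked CPU sketch of matrix inversion via Gauss-Jordan elimination with partial pivoting. This is not the author's blocked CUDA implementation, just an illustration of the underlying idea (all names here are my own, not from the zip).

```c
#include <math.h>
#include <stdlib.h>

/* Sketch: invert an n x n row-major matrix in place using Gauss-Jordan
 * elimination with partial pivoting. Returns 0 on success, -1 if the
 * matrix is numerically singular. Hypothetical reference code, not the
 * blocked GPU version discussed in this thread. */
static int gauss_jordan_invert(float *a, int n)
{
    /* inv starts as the identity matrix and ends up holding A^(-1). */
    float *inv = calloc((size_t)n * n, sizeof *inv);
    if (!inv) return -1;
    for (int i = 0; i < n; i++) inv[i * n + i] = 1.0f;

    for (int col = 0; col < n; col++) {
        /* Partial pivoting: pick the row with the largest |entry| in this column. */
        int piv = col;
        for (int r = col + 1; r < n; r++)
            if (fabsf(a[r * n + col]) > fabsf(a[piv * n + col])) piv = r;
        if (fabsf(a[piv * n + col]) < 1e-6f) { free(inv); return -1; }
        if (piv != col) {
            for (int c = 0; c < n; c++) {
                float t = a[col * n + c]; a[col * n + c] = a[piv * n + c]; a[piv * n + c] = t;
                t = inv[col * n + c]; inv[col * n + c] = inv[piv * n + c]; inv[piv * n + c] = t;
            }
        }
        /* Scale the pivot row so the pivot becomes 1. */
        float d = a[col * n + col];
        for (int c = 0; c < n; c++) { a[col * n + c] /= d; inv[col * n + c] /= d; }
        /* Eliminate this column from all other rows. */
        for (int r = 0; r < n; r++) {
            if (r == col) continue;
            float f = a[r * n + col];
            for (int c = 0; c < n; c++) {
                a[r * n + c]   -= f * a[col * n + c];
                inv[r * n + c] -= f * inv[col * n + c];
            }
        }
    }
    for (int i = 0; i < n * n; i++) a[i] = inv[i];
    free(inv);
    return 0;
}
```

A blocked variant (as in the posted project) applies the same elimination steps to sub-blocks of the matrix so the working set fits fast memory, which is what makes the GPU version pay off above 128x128.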

I ran my tests with an
Asus GeForce GTX 260 (192 shaders @ 1.24 GHz)
AMD Athlon 64 X2 3800+ (Socket AM2, 2.0 GHz)

[Performance charts: small matrix sizes / larger matrix sizes]

Ch. Wagner

update: documentation (545 KB)


Nice work! Just wanted to know if you have ever written anything for blocking of 3D matrices? Any reference code you have available?



I’m guessing the performance charts are using timings with a straight-forward (not highly optimized) CPU implementation? Otherwise I don’t follow why Matlab would be that much faster. Do you happen to have timing results with some optimized math library (along the lines of MKL, etc.)?


Sorry, I don't have anything about solving 3D matrices. I am happy that I finished the work on the 2D stuff :))

Yes, the CPU implementation isn't optimized; I used the standard MS C compiler and there are no SIMD optimizations in the code. You can take a look at the reference C code in the zip archive. I was also surprised that the MATLAB version is so fast. The MATLAB documentation mentions that they use the LAPACK library to compute the inverse matrix, and I thought LAPACK was a fully optimized library for linear algebra problems. I will complete my documentation and add some measurements with other optimized libraries and other GPUs.

I tried compiling your code but got the error "cutil32D.dll missing", even though the file is present.
Do you have a version in which the CPU and GPU code are in the same file? I'm not comfortable with the project you have posted; I'm looking for code that computes the inversion on both CPU and GPU in a single file.

I got your code to compile fine, but your test example is flawed: the memory addressing wasn't correct. When fixed, the code no longer validates on the inverse of the identity matrix. I changed your matrix generation code to:

tmp = dataInput;
for (int i = 0; i < size; i++) {
    for (int j = 0; j < size; j++) {
        if (i == j) {
            tmp[i * size + j] = 1.0f;   // ones on the diagonal
        } else {
            tmp[i * size + j] = 0.0f;   // zeros everywhere else
        }
    }
}
The error gets smaller as the matrix size increases, but at this time it is not working correctly. I will look into your code to compare and contrast.
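A quick host-side check along these lines can quantify how far a computed inverse is from correct: multiply A by its claimed inverse and measure the largest deviation from the identity. (A hypothetical helper, not part of the posted project.)

```c
#include <math.h>

/* Sketch: given A and its claimed inverse (both row-major, n x n),
 * return the largest absolute deviation of A * Ainv from the identity
 * matrix. A correct inverse should give a value near machine epsilon. */
static float max_identity_error(const float *a, const float *ainv, int n)
{
    float worst = 0.0f;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * ainv[k * n + j];
            float expect = (i == j) ? 1.0f : 0.0f;   /* identity entry */
            float err = fabsf(sum - expect);
            if (err > worst) worst = err;
        }
    }
    return worst;
}
```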


I found a way to "trick" CULA into creating an inverse matrix in the free edition (normally that function is only available in the standard edition) for dense matrices. Everything is pretty simple. There is a function called GESV that solves linear systems of the form A * X = B (1), where B is not a vector but a matrix. Therefore the function solves multiple systems with different right-hand sides, given as the columns of matrix B.

So let's get down to business. It is known that A * A^(-1) = E (2), where E is the identity matrix (ones on the diagonal). Assume that X in formula (1) is A^(-1) and that B in formula (1) is E. Voila!

Of course this is an obvious thing, but maybe certain folks don't know about it :)
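As a toy illustration of the same idea in plain C (no CULA): solving A * x = e_j for each column e_j of the identity yields the columns of A^(-1). Here via Cramer's rule for a 2x2 system; the function name is my own, purely for illustration.

```c
/* Toy illustration of the GESV trick on a 2x2 matrix: solve A * x = e_j
 * for each identity column e_j; the solutions are the columns of A^(-1).
 * Row-major storage; assumes A is non-singular. */
static void invert2x2_by_columns(const float a[4], float inv[4])
{
    float det = a[0] * a[3] - a[1] * a[2];
    for (int j = 0; j < 2; j++) {
        float b0 = (j == 0) ? 1.0f : 0.0f;   /* right-hand side e_j */
        float b1 = (j == 1) ? 1.0f : 0.0f;
        /* Cramer's rule for the 2x2 system A * x = b */
        inv[0 * 2 + j] = (b0 * a[3] - a[1] * b1) / det;
        inv[1 * 2 + j] = (a[0] * b1 - b0 * a[2]) / det;
    }
}
```

GESV does the same thing at scale: one LU factorization of A, then one triangular solve per column of E.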

Source code:

#include <cula.h>

#define THREADS 512

__global__ void generateEye(float *E, int size);

// Assumes culaInitialize() has been called beforehand.
int matrixInvert(float *devInput, float *devOutput, int size){
    int *ipiv;
    float *cache;
    int size2 = size * size * sizeof(float);
    int blocks = (size * size) / THREADS + 1;

    cudaMalloc((void**)&ipiv, size * sizeof(int));   // one pivot index per row
    cudaMalloc((void**)&cache, size2);

    generateEye<<<blocks, THREADS>>>(devOutput, size);   // devOutput = E
    cudaMemcpy(cache, devInput, size2, cudaMemcpyDeviceToDevice);

    // Solves A * X = E; devOutput is overwritten with A^(-1).
    culaDeviceSgesv(size, size, cache, size, ipiv, devOutput, size);

    cudaFree(ipiv);
    cudaFree(cache);
    return 0;
}

__global__ void generateEye(float *E, int size){
    int offset = blockIdx.x * blockDim.x + threadIdx.x;
    int y = offset / size;
    int x = offset - y * size;

    if(offset >= size * size) return;   // guard threads past the last element

    if(x == y) E[offset] = 1;
    else       E[offset] = 0;
}


CULA is a fine tool, but I prefer not to pay for software. Greetings from Ukraine :)

Hi, I downloaded your method, but I'd like to implement one myself. The doc files are written in German; is there a way I could get a copy of the documentation in English? I would also like to contact you by e-mail.