Inverse of a square matrix using Gauss-Jordan elimination

Hello there,
I am trying to calculate the inverse of a 3x3 matrix on the GPU using Gauss-Jordan elimination, as required by another kernel already running on the GPU. However, it seems I am not getting the right answer:

My original matrix is:
 204.00000401   -204.00000001     -2e-06
-204.00000001    204.00000401     -2e-06
  -2e-06          -2e-06           4e-06

If I try my algorithm on the CPU using simple C code or in Matlab, it gives me the right answer, as follows:

250000.001138622   249999.998687641   249999.999913132
249999.998687641   250000.001138622   249999.999913132
249999.999913132   249999.999913132   499999.999913132

However, if I try this on the GPU, I get the following:
nan   nan   nan
nan   nan   nan
nan   nan   nan

I am aware that this issue is probably related to the IEEE 754 standard for floating-point operations on the GPU. Is there any way to work around it? I am pretty sure someone on this forum has run into a similar issue. It would be highly appreciated if someone could share their thoughts here. Thanks in advance.
-Jayanta

I doubt this has anything to do with the IEEE 754 standard. More likely it is a bug in your code. Here is a rudimentary GJE example:

http://stackoverflow.com/questions/22041477/cuda-matrix-inverse-gaussian-jordan

Thank you for the comment. I am thinking about IEEE 754 because the matrix is only 3x3, and the code works when I try a different 3x3 matrix, as shown below:

Matrix before inversion:
 6.000000   -2.500000   -1.000000
-2.500000   78.320000   -5.247000
-1.000000   -5.247000   11.259000

Matrix after inversion:
0.172555506     0.00674538361   0.0184695385
0.00674538361   0.0134432986    0.0068640532
0.0184695385    0.0068640532    0.0936570987

These answers are from the GPU version of the code and match the CPU-based algorithm perfectly. I tried my code on other, bigger matrices as well, and the results match perfectly. So a bug is probably not the issue here.

Is your kernel using float or double? (Matlab uses double.)

Thank you Txbob. I am using the float type on both the CPU and GPU. Sorry for the delay in replying.

Try using double. GJE involves division, and some of your divisors for the badly performing matrix are quite small, relatively speaking. The exact details probably matter here, i.e. the exact code probably matters. I don’t think this is actually specific to the GPU: in my experience, properly constrained float behavior on the CPU is basically identical to float behavior on the GPU. The idea that there are differences is largely a misconception.

Thank you Txbob for your effort. No luck for me yet!
However, I found an interesting thought in the following post, which says:

"For single precision…

If you want an accuracy of +/-0.5 (or 2^-1), the maximum size that the number can be is 2^23. Any larger than this and the distance between floating point numbers is greater than 0.5.

If you want an accuracy of +/-0.0005 (about 2^-11), the maximum size that the number can be is 2^13. Any larger than this and the distance between floating point numbers is greater than 0.0005." – During my calculations, some of my numbers go beyond 10^6 (approaching 2^23). Could that be why I am getting wrong values? What do you think? Thanks in advance.

https://stackoverflow.com/questions/872544/what-range-of-numbers-can-be-represented-in-a-16-32-and-64-bit-ieee-754-syste