I have been working on code that performs matrix inversion on the GPU. The code is in the attachments (it is written for complex matrices, but the conversion to real matrices is quite straightforward). For now, the code works correctly only when the matrix dimension is exactly a multiple of the BLOCK_SIZE variable.

I have spent several days trying to adapt this code so that it works no matter what the matrix dimension is, for a given BLOCK_SIZE.

I would be very grateful for some help with it. From previous posts, I know this is a function that many people would be interested in.

Perhaps you can fix the problem with the input matrix size (which currently must be a multiple of BLOCK_SIZE) by padding the input matrix with zeros and placing ones on the diagonal elements of the padded region.

You are not adding BLOCK_SIZE - 1 to the size when computing the execution configuration. It should be (size + BLOCK_SIZE - 1) / BLOCK_SIZE.
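The rounded-up grid dimension is just an integer ceiling division. A standalone sketch (`gridDim1D` is an illustrative name, not from the posted code):

```c
/* Ceiling division for a 1D grid dimension: rounds up so that a partial
   block is still launched when size is not a multiple of blockSize. */
static int gridDim1D(int size, int blockSize) {
    return (size + blockSize - 1) / blockSize;
}
```

With plain `size / blockSize`, a 39x39 matrix and BLOCK_SIZE = 16 would launch only 2 blocks per dimension and leave the last 7 rows and columns unprocessed; the rounded-up form launches 3.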

That said, I’m confused about how to make your code work. Is the input matrix square, or do I append the identity matrix to the right before passing it in? Basically, I can’t make the code work.

The input matrix has to be square and, currently, sized as a multiple of BLOCK_SIZE (16x16, 32x32, 48x48). The identity matrix is generated by the method “GPUsetIdentity”. So the execution parameter (size + BLOCK_SIZE - 1) / BLOCK_SIZE equals size / BLOCK_SIZE. Here you will find an example.

If you need to invert a matrix of another size, such as 39x39, you have to make some changes.