Hi there,

I'm looking for a way to implement an algorithm in CUDA that can compute the inverse of a matrix and multiply two rectangular matrices. The problem is that the matrices are too big to fit in GPU memory, though we can assume they fit in CPU memory. So I need a block algorithm that copies blocks back and forth between host and device, but I don't know how to do this.
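
To make the multiplication part concrete, the kind of blocking meant here could look roughly like the host-side sketch below. It is plain C++ rather than CUDA, and only a sketch under assumptions: `multiply_tile` is a hypothetical stand-in for the GPU step (in a real version the tiles would presumably be copied over with `cudaMemcpy` and multiplied on the device), and `pack_tile`, `blocked_matmul` and the tile edge `T` are names made up for illustration. Matrices are assumed to be row-major `float` arrays in host memory. The inverse is not covered.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the GPU step: computes Ct += At * Bt on
// row-major tiles (At is m x k, Bt is k x n, Ct is m x n).
// In a CUDA version this is where the tiles would be copied to the
// device, multiplied there, and the result kept on the device.
void multiply_tile(const std::vector<float>& At, const std::vector<float>& Bt,
                   std::vector<float>& Ct,
                   std::size_t m, std::size_t n, std::size_t k) {
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < n; ++j)
                Ct[i * n + j] += At[i * k + p] * Bt[p * n + j];
}

// Copy the rows x cols sub-block starting at (r0, c0) out of a row-major
// matrix with ld columns into a contiguous buffer (what would be uploaded).
std::vector<float> pack_tile(const std::vector<float>& src, std::size_t ld,
                             std::size_t r0, std::size_t c0,
                             std::size_t rows, std::size_t cols) {
    std::vector<float> tile(rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        std::copy_n(src.begin() + (r0 + r) * ld + c0, cols,
                    tile.begin() + r * cols);
    return tile;
}

// C (M x N) = A (M x K) * B (K x N), processed in tiles of edge at most T,
// so only three tiles need to be resident at any time.
std::vector<float> blocked_matmul(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t M, std::size_t N,
                                  std::size_t K, std::size_t T) {
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t i0 = 0; i0 < M; i0 += T) {
        const std::size_t mb = std::min(T, M - i0);
        for (std::size_t j0 = 0; j0 < N; j0 += T) {
            const std::size_t nb = std::min(T, N - j0);
            // Accumulator for one output tile; it would stay on the device
            // while the loop over k0 streams tiles of A and B through.
            std::vector<float> Ct(mb * nb, 0.0f);
            for (std::size_t k0 = 0; k0 < K; k0 += T) {
                const std::size_t kb = std::min(T, K - k0);
                const std::vector<float> At = pack_tile(A, K, i0, k0, mb, kb);
                const std::vector<float> Bt = pack_tile(B, N, k0, j0, kb, nb);
                multiply_tile(At, Bt, Ct, mb, nb, kb);
            }
            // "Download": write the finished tile back into the host matrix.
            for (std::size_t r = 0; r < mb; ++r)
                std::copy_n(Ct.begin() + r * nb, nb,
                            C.begin() + (i0 + r) * N + j0);
        }
    }
    return C;
}
```

Calling `blocked_matmul(A, B, M, N, K, T)` with any `T >= 1` should give the same product as an unblocked multiply; `T` would be chosen so that three `T x T` tiles fit in GPU memory. What the sketch leaves open is how to overlap the copies with the computation, and how to block the inverse.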

Could anyone help me?