Is there any linear algebra library available for coding CUDA kernels?

In developing CUDA code, I have found it quite common to need some sort of matrix operation: matrix inverse, and so on. Is there any open-source code available for this?

It depends on whether you want to process one large matrix (say 10,000 x 10,000) or many small matrices (say 10 x 10, one per GPU thread).

There is CUBLAS from Nvidia, and then there are MAGMA and CULA.

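For the single-large-matrix case, CUBLAS is called from host code, not from inside a kernel. A minimal sketch, assuming the CUBLAS v2 API (cublas_v2.h) and square single-precision matrices already resident in device memory (the function name, dimensions, and variable names here are illustrative):

```c
#include <cublas_v2.h>   /* CUBLAS v2 host-side API */

/* Multiply two n x n single-precision matrices already in device memory:
 * C = alpha * A * B + beta * C, column-major as CUBLAS expects. */
void gemm_on_device(int n, const float *dA, const float *dB, float *dC)
{
    cublasHandle_t handle;
    cublasCreate(&handle);

    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n,
                &alpha, dA, n,
                        dB, n,
                &beta,  dC, n);

    cublasDestroy(handle);
}
```

In real code the return status of each CUBLAS call should be checked against CUBLAS_STATUS_SUCCESS.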

Can CUBLAS APIs be called directly from a CUDA kernel?

For example:

__device__ __forceinline__ void func(MatrixType M, MatrixType& N)
{
    N = M.inverse();
}

It is probably not possible to call a CUBLAS function from inside a kernel. Check this thread:
http://forums.nvidia.com/index.php?showtopic=168283#entry1053222
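Since CUBLAS cannot be called from device code, per-thread inverses of small matrices are usually written by hand. Below is a sketch of an unpivoted Gauss-Jordan inverse for a fixed 3x3 matrix in plain C; in a kernel the same function would simply be marked `__device__ __forceinline__`. The function name and fixed size are illustrative, and production code should use pivoting for numerical safety:

```c
#include <math.h>
#include <string.h>

#define N 3  /* illustrative fixed size; per-thread matrices are small */

/* Gauss-Jordan inversion without pivoting (assumes usable diagonals).
 * In CUDA this would be a __device__ __forceinline__ function.
 * Returns 0 on success, -1 if a pivot is (near) zero. */
static int invert3x3(const float in[N][N], float out[N][N])
{
    float a[N][N];
    memcpy(a, in, sizeof a);

    /* out starts as the identity matrix */
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            out[i][j] = (i == j) ? 1.0f : 0.0f;

    for (int col = 0; col < N; ++col) {
        float pivot = a[col][col];
        if (fabsf(pivot) < 1e-12f)
            return -1;                  /* singular, or needs pivoting */

        /* scale the pivot row so the pivot becomes 1 */
        for (int j = 0; j < N; ++j) {
            a[col][j]   /= pivot;
            out[col][j] /= pivot;
        }
        /* eliminate this column from every other row */
        for (int row = 0; row < N; ++row) {
            if (row == col) continue;
            float f = a[row][col];
            for (int j = 0; j < N; ++j) {
                a[row][j]   -= f * a[col][j];
                out[row][j] -= f * out[col][j];
            }
        }
    }
    return 0;
}
```

A quick sanity check is to multiply the result back against the input and confirm the product is (approximately) the identity.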

I think this is now possible with dynamic parallelism on the Kepler series.

That feature is restricted to the GK110 chips, which will not be released until the end of the year. Current Kepler chips (GK104 and GK107) do not support dynamic parallelism.
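For reference, dynamic parallelism (compute capability 3.5, compiled with -rdc=true) lets a kernel launch other kernels from device code, which is the mechanism that would make this pattern possible on GK110. A minimal sketch, with illustrative kernel names:

```cuda
__global__ void child_kernel(float *data)
{
    data[threadIdx.x] *= 2.0f;        /* some per-element work */
}

__global__ void parent_kernel(float *data)
{
    if (threadIdx.x == 0) {
        /* device-side launch: requires sm_35 and -rdc=true */
        child_kernel<<<1, 32>>>(data);
        cudaDeviceSynchronize();      /* wait for the child grid */
    }
}
```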