Hi, first, I apologize for my newbie question. I am just diving into this for the first time. I am reading everything relevant I can find about GPU computing for solving a large dense matrix system with a direct solver on Tesla boards. Perhaps you can answer a question and point me in the right direction.
I have an existing Fortran code that links to the LAPACK and BLAS libraries to solve a large dense matrix system with a direct solver. As you may know, many hardware platforms have these math libraries optimized for their specific hardware to get the best performance. Creating an executable for a specific machine usually requires little more than adding the optimized libraries to the link line when compiling.
With cuBLAS or CULA, is it pretty much the same thing, or will I have to write some kind of driver to get this to work on a Tesla K20 card?