Building/linking R against cuBLAS?

I can upgrade the default Netlib BLAS that comes with an existing R install by compiling a new BLAS (OpenBLAS in this case) and renaming or symlinking the new library to the old BLAS filename. I don’t have to recompile the R binary.

Is it possible to do this with cuBLAS? Or do I have to build R against CUDA, and if so, how? Apologies if this has been asked before, but I’ve searched the forums and Google and haven’t found an answer.

Something similar should be possible with nvblas:

http://docs.nvidia.com/cuda/nvblas/index.html#abstract

It works by intercepting BLAS calls instead of re-linking. The advantage is that the NVBLAS library aims to accelerate only those calls where there will be a performance benefit.
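As a rough sketch of what the interception setup looks like on Linux: NVBLAS reads a small config file (located via the NVBLAS_CONFIG_FILE environment variable) and is loaded ahead of the regular BLAS with LD_PRELOAD. The helper names below (write_nvblas_conf, nvblas_env) and the OpenBLAS path are my own placeholders, not anything from the NVBLAS docs, so adjust them for your system.

```python
import os
from pathlib import Path


def write_nvblas_conf(path, cpu_blas_lib, gpu_list="ALL", logfile="nvblas.log"):
    """Write a minimal nvblas.conf and return its path.

    cpu_blas_lib is the CPU BLAS that NVBLAS falls back to for calls it
    does not offload (the path you pass in is an assumption about your system).
    """
    lines = [
        f"NVBLAS_LOGFILE {logfile}",
        f"NVBLAS_CPU_BLAS_LIB {cpu_blas_lib}",
        f"NVBLAS_GPU_LIST {gpu_list}",
    ]
    path = Path(path)
    path.write_text("\n".join(lines) + "\n")
    return path


def nvblas_env(conf_path, nvblas_lib="libnvblas.so", base_env=None):
    """Return a copy of the environment with NVBLAS preloaded."""
    env = dict(os.environ if base_env is None else base_env)
    env["NVBLAS_CONFIG_FILE"] = str(conf_path)
    existing = env.get("LD_PRELOAD")
    # Put NVBLAS first so its symbols are found before the regular BLAS.
    env["LD_PRELOAD"] = nvblas_lib if not existing else f"{nvblas_lib} {existing}"
    return env


# Hypothetical usage (bench.R is a placeholder script name):
#   import subprocess
#   conf = write_nvblas_conf("nvblas.conf", "/usr/lib/libopenblas.so")
#   subprocess.run(["Rscript", "bench.R"], env=nvblas_env(conf))
```

The same thing can be done by hand: write the three config lines to a file, export NVBLAS_CONFIG_FILE, and start R with LD_PRELOAD set to the NVBLAS library.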

CUBLAS works best when you have a sequence of linear algebra operations, where you can retain intermediate results on the GPU. You do not want to be transferring data back and forth on each library call. Since CUBLAS generally requires the programmer to manage all this, the programmer can decide when and which results need to be moved. A “dumb” BLAS drop-in replacement using CUBLAS doesn’t know this, and so has to transfer data to and from the GPU on every library call. This limits the number and sizes of problems for which it is still beneficial to use the GPU.
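To put rough numbers on the transfer argument, here is a back-of-the-envelope cost model for a chain of k square matrix multiplies (X <- X * B, repeated). The throughput and bandwidth figures are made-up round numbers for illustration, not measurements of any real hardware, and the model ignores launch latency and any overlap of transfer with compute.

```python
# Illustrative assumptions only, not benchmarks:
CPU_FLOPS = 1e11   # sustained CPU double-precision FLOP/s
GPU_FLOPS = 1e12   # sustained GPU double-precision FLOP/s
PCIE_BW = 1e10     # host <-> device bandwidth in bytes/s


def gemm_flops(n):
    """Floating-point operations for one n x n by n x n multiply."""
    return 2.0 * n ** 3


def matrix_bytes(n, bytes_per_elem=8):
    """Size of one n x n double-precision matrix."""
    return n * n * bytes_per_elem


def cpu_time(n, k):
    """k multiplies on the CPU, no transfers."""
    return k * gemm_flops(n) / CPU_FLOPS


def dropin_gpu_time(n, k):
    """Drop-in model: every call uploads two inputs and downloads one result."""
    transfer = 3 * matrix_bytes(n) / PCIE_BW
    return k * (transfer + gemm_flops(n) / GPU_FLOPS)


def resident_gpu_time(n, k):
    """Managed model: upload inputs once, keep intermediates on the GPU,
    download only the final result."""
    transfer = 3 * matrix_bytes(n) / PCIE_BW
    return transfer + k * gemm_flops(n) / GPU_FLOPS


if __name__ == "__main__":
    for n in (100, 1000, 4000):
        print(n, cpu_time(n, 10), dropin_gpu_time(n, 10), resident_gpu_time(n, 10))
```

Under these assumed numbers the drop-in model loses to the CPU for small matrices, because the transfer time dominates, and wins for large ones. Keeping data resident on the GPU is never slower than the drop-in model, and the gap grows with the length of the chain.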

NVBLAS works within this “dumb” model, but only tries to accelerate the routines and data sizes that will actually show a performance benefit under it. Other intercepted calls are just “reflected” back to your existing BLAS implementation.
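The intercept-and-reflect idea can be sketched as a small dispatcher. This is only a toy model of the concept: the routine set lists the Level-3 BLAS families that NVBLAS covers, but the size threshold MIN_DIM and the function names are hypothetical, and the real library's heuristic is not reproduced here.

```python
# Level-3 BLAS routine families that a GPU backend would consider offloading.
GPU_ROUTINES = {"gemm", "syrk", "herk", "syr2k", "her2k",
                "trsm", "trmm", "symm", "hemm"}

# Hypothetical cutoff: below this, transfer overhead is assumed to outweigh
# any GPU speedup. The real heuristic and its values are not known here.
MIN_DIM = 512


def choose_backend(routine, dims, min_dim=MIN_DIM):
    """Return "gpu" only for supported routines on large enough problems."""
    if routine not in GPU_ROUTINES:
        return "cpu"   # e.g. Level-1/2 calls such as dot or gemv
    if min(dims) < min_dim:
        return "cpu"   # too small to pay for the transfers
    return "gpu"


def dispatch(routine, dims, cpu_impl, gpu_impl, *args):
    """Intercept a BLAS-style call and forward it to the chosen backend."""
    if choose_backend(routine, dims) == "gpu":
        return gpu_impl(*args)
    return cpu_impl(*args)   # "reflected" back to the existing BLAS


if __name__ == "__main__":
    print(choose_backend("gemm", (4096, 4096, 4096)))
    print(choose_backend("gemm", (64, 64, 64)))
    print(choose_backend("gemv", (4096, 4096)))
```

The caller never sees which backend ran, which is what makes the approach a drop-in, and also why it cannot avoid the per-call transfers.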

Getting more benefit from GPU BLAS acceleration than this requires something other than a dumb relink or intercept model.