I’m the author of https://github.com/fommil/netlib-java/ and I’m very impressed by the performance results of cuBLAS!
I’d really like to give my users instructions for using cuBLAS, but I don’t have an NVIDIA graphics card to test those instructions on.
To use the system optimised BLAS, my users just have to make a libblas.so.3 available on their Linux machines; on OS X, the vecLib framework is used. On Windows, things are always a lot trickier, as there is no easy way to set this up that works well (although I’m keen to do something similar to Linux).
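To make "available" concrete, here is a minimal sketch of the check involved: ask the dynamic loader for a library by name and see whether it exports a standard BLAS routine. It is written in Python with ctypes purely for illustration and is not netlib-java code. The helper name probe_blas and the choice of the Fortran-style symbol ddot_ as the routine to look for are assumptions made for this example.

```python
import ctypes


def probe_blas(name, symbol="ddot_"):
    """Check whether the dynamic loader can resolve a BLAS library.

    Returns None if the library cannot be loaded at all, otherwise
    True/False depending on whether it exports `symbol`.
    The default symbol, ddot_, assumes Fortran-style BLAS naming.
    """
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        # The loader could not find or open a library with this name.
        return None
    # ctypes raises AttributeError for a missing symbol, so hasattr works here.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Library names taken from the post; adjust for the platform under test.
    for candidate in ("libblas.so.3", "libblas3.dll"):
        print(candidate, "->", probe_blas(candidate))
```

On a Linux machine with a system BLAS installed, I would expect probe_blas("libblas.so.3") to return True. Anything that replaces the system BLAS would need to pass the same check under the same file name.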
On Linux, can cuBLAS be set up to provide the system libblas.so.3?
On OS X, can cuBLAS be set up to provide the BLAS part of vecLib?
On Windows, can cuBLAS be set up to provide the system libblas3.dll?