I’ve gotten my feet wet with CUDA, and am working on wetting my ankles now. Recall the MEX-file routine for calling CUDA's general matrix multiplication, SGEMM, from MATLAB.
I am now searching for a way to calculate a matrix division in a MEX file: “C=A/B”. Searching Google turns up only hints and promises of such code. This is of course implemented in LAPACK, but that’s not available for CUDA yet.
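To pin down what is being asked for: in MATLAB, A/B with a square, nonsingular B is the solution X of X*B = A. The sketch below is a plain C, CPU-only reference for that definition, not CUDA code and not MATLAB's actual implementation. It assumes single precision and column-major storage (the layout MATLAB uses), and the name mrdivide_square is made up for illustration. It uses Gaussian elimination with partial pivoting, and could serve as a baseline to check a GPU version against.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name is an assumption, not a library routine):
 * computes X = A / B, i.e. solves X*B = A.
 * A is m x n, B is n x n, X is m x n, all column-major floats.
 * Returns 0 on success, -1 if allocation fails, -2 if a pivot is exactly zero. */
int mrdivide_square(int m, int n, const float *A, const float *B, float *X)
{
    assert(m > 0 && n > 0 && A && B && X);

    float *M = malloc((size_t)n * (size_t)n * sizeof *M);
    if (M == NULL) return -1;

    /* X*B = A is equivalent to B^T * X^T = A^T. A column-major buffer read
     * as row-major is the transpose, so no explicit transposes are needed:
     * M is B^T (n x n, row-major), X holds A^T (n x m, row-major). */
    memcpy(M, B, (size_t)n * (size_t)n * sizeof *M);
    memmove(X, A, (size_t)m * (size_t)n * sizeof *X);

    /* Forward elimination with partial pivoting. */
    for (int k = 0; k < n; ++k) {
        int p = k;                       /* row with the largest |entry| in column k */
        for (int r = k + 1; r < n; ++r)
            if (fabsf(M[r * n + k]) > fabsf(M[p * n + k])) p = r;
        if (M[p * n + k] == 0.0f) { free(M); return -2; }

        if (p != k) {                    /* swap rows p and k in both M and X */
            for (int c = 0; c < n; ++c) {
                float t = M[k * n + c]; M[k * n + c] = M[p * n + c]; M[p * n + c] = t;
            }
            for (int c = 0; c < m; ++c) {
                float t = X[k * m + c]; X[k * m + c] = X[p * m + c]; X[p * m + c] = t;
            }
        }

        for (int r = k + 1; r < n; ++r) {
            float f = M[r * n + k] / M[k * n + k];
            for (int c = k; c < n; ++c) M[r * n + c] -= f * M[k * n + c];
            for (int c = 0; c < m; ++c) X[r * m + c] -= f * X[k * m + c];
        }
    }

    /* Back substitution on the upper-triangular system. */
    for (int k = n - 1; k >= 0; --k) {
        for (int c = 0; c < m; ++c) {
            float s = X[k * m + c];
            for (int j = k + 1; j < n; ++j) s -= M[k * n + j] * X[j * m + c];
            X[k * m + c] = s / M[k * n + k];
        }
    }

    free(M);
    return 0;   /* X now holds the m x n result in column-major order */
}
```

For example, with A = [2 5] and B = [0 1; 2 3], the call mrdivide_square(1, 2, A, B, X) fills X with [2 1], since [2 1]*B = [2 5]. Non-square or rank-deficient B would need a least-squares solve instead, which this sketch does not attempt.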
Right now I am using the mexCallMATLAB function to call MATLAB’s own division. My division itself is not that intensive, but having to copy matrices to and from the GPU just to do the division is a bit annoying and time-consuming.
Is code for calculating this division available now, or is it still in the works?