I have a CUDA MEX file that I call from MATLAB. It works fine on a K20, but when I run the same code on a laptop (compiled for compute capability 3.0 instead of 3.5), I get the infamous INTERNAL_ERROR from a cuBLAS Sgemv() call.
I have already searched the forums and understand in general terms when this error shows up, but in this case the code works fine on the 3.5 machine and crashes on the 3.0 one.
The code gets through all the memory allocations and at least 10-15 Sgemm() and Strsm() calls before it fails in a Sgemv() call.
Every GPU call in the code is checked for errors. It works fine on the K20, but not on the 680 in the laptop.
This error seems very general. What are some possible causes that would show up only on the 680 and not on the K20?
The 680 is the only GPU in the laptop, which runs Windows 7, Visual Studio 2010, and MATLAB 2011b.