CUBLAS floating point precision

Hi,
Does anyone know of references on the precision of floating-point matrix multiplication in CUBLAS?
A simple 500x500 by 500x500 matrix multiplication gave errors as large as 1x10^(-4) compared to MATLAB's results (Intel MKL).

Attached are the MEX code that uses CUBLAS and the Makefile. I'm using a GTX 280 device.
The MATLAB code I ran is:

a = single(randn(500));
b = single(randn(500));
c = mexcuMatMul(a,b);
d = a*b;
max(max(abs(c-d)))

Note: the difference between MATLAB's own double- and single-precision matrix products is much smaller than the difference between c and d.
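For a sense of scale, here is a rough sketch (not taken from the attached MEX file) of how much rounding error a single 500-term single-precision dot product can accumulate. Each entry of the product matrix is one such dot product. The sketch emulates float32 in plain Python by rounding every product and partial sum, then compares the observed error with the standard worst-case bound n*u/(1 - n*u) * sum(|x_i*y_i|), where u = 2^(-24). The helper names (`to_f32`, `dot_f32`, `dot_error`) and the seed are made up for illustration, and the plain left-to-right summation order is an assumption: CUBLAS and MKL may each accumulate in a different order.

```python
import math
import random
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_f32(x, y):
    """Left-to-right dot product with every product and partial sum rounded to float32."""
    s = 0.0
    for xi, yi in zip(x, y):
        s = to_f32(s + to_f32(xi * yi))
    return s


def dot_error(n=500, seed=0):
    """Return (observed error, worst-case bound) for one n-term float32 dot product."""
    rng = random.Random(seed)
    # Inputs are normally distributed and stored as float32, like single(randn(500)).
    x = [to_f32(rng.gauss(0.0, 1.0)) for _ in range(n)]
    y = [to_f32(rng.gauss(0.0, 1.0)) for _ in range(n)]
    # The product of two float32 values is exact in double; fsum adds them accurately.
    reference = math.fsum(xi * yi for xi, yi in zip(x, y))
    err = abs(dot_f32(x, y) - reference)
    u = 2.0 ** -24  # unit roundoff for single precision
    bound = (n * u / (1 - n * u)) * math.fsum(abs(xi * yi) for xi, yi in zip(x, y))
    return err, bound


err, bound = dot_error()
print(f"observed error {err:.3e}, worst-case bound {bound:.3e}")
```

For inputs like these the worst-case bound comes out well above 1x10^(-4), so two correct single-precision implementations that merely sum in different orders can disagree at that level without either one being wrong. The observed error of any one ordering is usually far below the bound.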
thanks

Makefile:

CUDAHOME = /usr/local/cuda
CUDASDKHOME = ~/NVIDIA_CUDA_SDK
INCLUDEDIR = -I$(CUDAHOME)/include -I$(CUDASDKHOME)/common/inc
INCLUDELIB = -L$(CUDAHOME)/lib -lcublas -lcufft -Wl,-rpath,$(CUDAHOME)/lib
CFLAGS = -fPIC -D_GNU_SOURCE -pthread -fexceptions
COPTIMFLAGS = -O3 -funroll-loops -msse2

# Define installation location for MATLAB.

export MATLAB = /usr/local/matlab
MEX = $(MATLAB)/bin/mex
MEXEXT = .$(shell $(MATLAB)/bin/mexext)

# List the mex files to be built. The .mex extension will be replaced with the
# appropriate extension for this installation of MATLAB, e.g. .mexglx or
# .mexa64.

MEXFILES = mexcuMatMul.mex

all: $(MEXFILES:.mex=$(MEXEXT))

clean:
	rm -f $(MEXFILES:.mex=$(MEXEXT))

.SUFFIXES: .cu .cu_o .mexglx .mexa64 .mexmaci
.cpp.mexa64:
	$(MEX) -v CFLAGS='$(CFLAGS)' COPTIMFLAGS='$(COPTIMFLAGS)' $< \
		$(INCLUDEDIR) $(INCLUDELIB)
mexcuMatMul.cpp (2.33 KB)