Hi,
Does anyone know of references on the precision of floating-point matrix multiplication in CUBLAS?
A simple 500x500 by 500x500 matrix multiplication gave differences as large as 1x10^(-4) compared with MATLAB's result (Intel MKL).
Attached are a mex file that uses CUBLAS and its Makefile. I'm using a GTX 280.
The MATLAB code I ran is:
a = single(randn(500));
b = single(randn(500));
c = mexcuMatMul(a,b);
d = a*b;
max(max(c-d))
Note: the difference between MATLAB's double-precision and single-precision matrix products is much smaller than sum(sum(d-c)).
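To put the size of the discrepancy in context, here is a minimal C++ sketch (it is not the attached mex code and does not call CUBLAS or MKL). It assumes that two single-precision implementations differ mainly in the order in which they accumulate each dot product, which is an assumption and not something verified for either library. The names SummationReport and compare_summation_orders are hypothetical. It multiplies two random n x n float matrices twice, once summing k upward and once downward, and also against a double-precision reference.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper type: worst-case differences over all entries of the product.
struct SummationReport {
    double order_diff;  // max |float sum (k ascending) - float sum (k descending)|
    double vs_double;   // max |float sum (k ascending) - double reference|
    double max_entry;   // max |entry| of the double reference, for scale
};

// Multiply two n x n matrices of standard-normal floats (similar in spirit to
// single(randn(n))) and report how much the result depends on summation order.
SummationReport compare_summation_orders(int n, unsigned seed) {
    assert(n > 0);
    const std::size_t N = static_cast<std::size_t>(n);
    std::mt19937 gen(seed);
    std::normal_distribution<float> randn(0.0f, 1.0f);
    std::vector<float> a(N * N), b(N * N);
    for (float& x : a) x = randn(gen);
    for (float& x : b) x = randn(gen);

    SummationReport r{0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float fwd = 0.0f;  // float accumulation, k = 0 .. N-1
            float rev = 0.0f;  // float accumulation, k = N-1 .. 0
            double ref = 0.0;  // double accumulation of the same float inputs
            for (std::size_t k = 0; k < N; ++k) {
                fwd += a[i * N + k] * b[k * N + j];
                ref += static_cast<double>(a[i * N + k]) *
                       static_cast<double>(b[k * N + j]);
            }
            for (std::size_t k = N; k-- > 0;) {
                rev += a[i * N + k] * b[k * N + j];
            }
            const double f = static_cast<double>(fwd);
            r.order_diff = std::max(r.order_diff, std::fabs(f - static_cast<double>(rev)));
            r.vs_double = std::max(r.vs_double, std::fabs(f - ref));
            r.max_entry = std::max(r.max_entry, std::fabs(ref));
        }
    }
    return r;
}
```

Calling compare_summation_orders(500, 1) and comparing order_diff with max_entry gives a rough idea of how much of a 1x10^(-4) absolute difference could come from rounding alone, given that single precision has a unit roundoff of about 6x10^(-8). I have not included measured numbers here.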
Thanks.
Makefile:
CUDAHOME = /usr/local/cuda
CUDASDKHOME = ~/NVIDIA_CUDA_SDK
INCLUDEDIR = -I$(CUDAHOME)/include -I$(CUDASDKHOME)/common/inc
INCLUDELIB = -L$(CUDAHOME)/lib -lcublas -lcufft -Wl,-rpath,$(CUDAHOME)/lib
CFLAGS = -fPIC -D_GNU_SOURCE -pthread -fexceptions
COPTIMFLAGS = -O3 -funroll-loops -msse2
# Define installation location for MATLAB.
export MATLAB = /usr/local/matlab
MEX = $(MATLAB)/bin/mex
MEXEXT = .$(shell $(MATLAB)/bin/mexext)
# List the mex files to be built. The .mex extension will be replaced with the
# appropriate extension for this installation of MATLAB, e.g. .mexglx or
# .mexa64.
MEXFILES = mexcuMatMul.mex
all: $(MEXFILES:.mex=$(MEXEXT))
clean:
	rm -f $(MEXFILES:.mex=$(MEXEXT))
.SUFFIXES: .cu .cu_o .mexglx .mexa64 .mexmaci
.cpp.mexa64:
	$(MEX) -v CFLAGS='$(CFLAGS)' COPTIMFLAGS='$(COPTIMFLAGS)' $< \
	$(INCLUDEDIR) $(INCLUDELIB)
mexcuMatMul.cpp (2.33 KB)