Hello,
I'm trying to use CUBLAS in a sparse linear solver built on StarPU.
My program executes several GEMM / AXPY operations on the GPU using CUBLAS.
My problem is that I sometimes get a CUBLAS_STATUS_EXECUTION_FAILED status after running cublasSgemm.
I looked at my parameters and they look OK:
transa = 'n', transb = 't',
M = 12, N = 4, K = 4,
alpha = 1.0, A = 2f109e34, lda = 25,
B = 2f10a034, ldb = 25,
beta = 0.0, C = 20fc0000, ldc = 12
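For reference, the usual column-major GEMM constraints on these dimensions can be checked on the host. The sketch below is plain C; the helper name gemm_dims_ok is hypothetical and not part of CUBLAS, and it only covers the dimension and leading-dimension rules, not the device pointers.

```c
#include <stddef.h>

/* Hypothetical helper (not a CUBLAS function): returns 1 when the sizes
   satisfy the standard column-major GEMM constraints, 0 otherwise.
   op(A) is m x k, op(B) is k x n, C is m x n. */
static int gemm_dims_ok(char transa, char transb, int m, int n, int k,
                        int lda, int ldb, int ldc)
{
    /* Rows of A and B as stored in memory depend on the transpose flags. */
    int a_rows = (transa == 'n' || transa == 'N') ? m : k;
    int b_rows = (transb == 'n' || transb == 'N') ? k : n;

    if (m < 0 || n < 0 || k < 0) return 0;
    /* Each leading dimension must be at least max(1, stored rows). */
    if (lda < (a_rows > 1 ? a_rows : 1)) return 0;
    if (ldb < (b_rows > 1 ? b_rows : 1)) return 0;
    if (ldc < (m > 1 ? m : 1)) return 0;
    return 1;
}
```

Calling gemm_dims_ok('n', 't', 12, 4, 4, 25, 25, 12) with the values above returns 1, so under these rules the dimensions themselves are consistent.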
What is stranger is that if, after getting this status, I copy the data back to the host and print everything to files, my product is correct.
I ran my whole application ignoring CUBLAS_STATUS_EXECUTION_FAILED, and my system is solved correctly…
So everything seems correct except for this CUBLAS_STATUS_EXECUTION_FAILED, and I would like to know why I get this error and how I can fix it.
I call cublasSgemm through this code:
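The host-side comparison described above can be sketched as a plain C reference for the 'n', 't' case, C = alpha * A * B^T + beta * C, in column-major storage. The function name ref_sgemm_nt is hypothetical; this is only a minimal sketch of such a check, not the code used in my application.

```c
#include <stddef.h>

/* Hypothetical host reference for transa = 'n', transb = 't':
   A is m x k (leading dimension lda), B is stored as n x k (ldb),
   C is m x n (ldc), all column-major. */
static void ref_sgemm_nt(int m, int n, int k, float alpha,
                         const float *A, int lda,
                         const float *B, int ldb,
                         float beta, float *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                /* A(i,p) * B(j,p): B is read transposed. */
                acc += A[i + (size_t)p * lda] * B[j + (size_t)p * ldb];
            }
            size_t idx = i + (size_t)j * ldc;
            /* As in BLAS, C is not read when beta is zero. */
            C[idx] = alpha * acc + (beta == 0.0f ? 0.0f : beta * C[idx]);
        }
    }
}
```

The result copied back from the GPU can then be compared element by element against the output of this function, within a small floating-point tolerance.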
#define CUBLAS(func) cublasS ## func
#define CUBLAS_GEMM(i,j,m,n,k,x,a,u,b,v,y,c,w)                      \
  {                                                                 \
    BLAS_INT varim = (BLAS_INT)(m);                                 \
    BLAS_INT varin = (BLAS_INT)(n);                                 \
    BLAS_INT varik = (BLAS_INT)(k);                                 \
    BLAS_INT variu = (BLAS_INT)(u);                                 \
    BLAS_INT variv = (BLAS_INT)(v);                                 \
    BLAS_INT variw = (BLAS_INT)(w);                                 \
    FLOAT varix = (FLOAT)(x);                                       \
    FLOAT variy = (FLOAT)(y);                                       \
    CUBLAS(gemm)(*(i), *(j), varim, varin, varik, varix, (a),       \
                 variu, (b), variv, variy, (c), variw);             \
    CUBLAS_CHECK_GEMM(*(i),*(j),a,b,c);                             \
    cudaStreamSynchronize(starpu_cuda_get_local_stream());          \
  }
Here the check only reads the status; if there is an error, it prints the status and the parameters, copies the matrices back from the GPU, and writes them to a file.
Thanks,
XL
Edit: my configuration:
DELL Precision T7400
Linux Debian 6.0.3 - 2.6.32-5-amd64
icc (ICC) 12.0.3 20110309
CUDA 4.0.17
GeForce GTX 295
2 quad-core Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
RAM: 32 GB