I am trying to compute a simple matrix product, which should normally be easy with the CUBLAS library.
However, I have a small problem: I cannot find the correspondence between the documentation and the implementation.
To compute the matrix product, I use the following function:

void cublasSgemm (char transa, char transb, int m, int n, int k, float alpha,
                  const float *A, int lda, const float *B, int ldb, float beta,
                  float *C, int ldc)

It computes C = alpha * op(A) * op(B) + beta * C,
and I call it with alpha = 1.0 and beta = 0.0.
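For reference, here are the dimensions this formula implies, assuming no transposition (transa = transb = 'n') and the column-major storage that CUBLAS uses. In that case op(A) is m x k, op(B) is k x n and C is m x n, so each element would be:

```latex
C_{ij} = \alpha \sum_{p=1}^{k} A_{ip} B_{pj} + \beta C_{ij},
\qquad 1 \le i \le m, \quad 1 \le j \le n
```

With column-major storage, element (i, j) of A sits at offset (i-1) + (j-1)*lda, which requires lda >= m; likewise ldb >= k and ldc >= m.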
But I have two problems:
First, with square matrices, I need to swap the order of the matrices to obtain the correct result. I use this line:
cublasSgemm('n','n',tx,ty,tx,1.0,d_B,tx,d_A,tx,0.0,d_C,tx);
instead of
cublasSgemm('n','n',tx,ty,tx,1.0,d_A,tx,d_B,tx,0.0,d_C,tx);
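One possible explanation for the swap, assuming the host arrays are filled row by row (the usual C convention) while CUBLAS reads them column by column: a row-major r x c buffer read as column-major with leading dimension c is the transpose of the intended matrix. Since (A*B)^T = B^T * A^T, passing d_B first would yield the transpose of A*B in column-major, which is A*B when read back row-major. The helper names below are hypothetical and only illustrate the indexing.

```c
#include <assert.h>

/* Element (i, j) of a buffer read row-major with `cols` columns. */
static float at_row_major(const float *buf, int cols, int i, int j) {
    return buf[i * cols + j];
}

/* Element (i, j) of a buffer read column-major with leading dimension ld. */
static float at_col_major(const float *buf, int ld, int i, int j) {
    return buf[i + j * ld];
}

/* Returns 1 if the row-major rows x cols buffer, read as a column-major
   cols x rows matrix with ld = cols, is the transpose of the original. */
static int reads_as_transpose(const float *buf, int rows, int cols) {
    assert(rows > 0 && cols > 0);
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            if (at_row_major(buf, cols, i, j) != at_col_major(buf, cols, j, i))
                return 0;
    return 1;
}
```

For example, calling reads_as_transpose on a 2 x 3 array written row by row returns 1: the same memory holds the matrix in one convention and its transpose in the other.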
Second, I cannot obtain the correct result when I try to compute the product of non-square matrices.
I am using the CUDA 2.0 beta on a Windows XP system.
What is wrong?
Thanks
Beleys
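/* Allocate device memory for the two input matrices and the result */
status = cublasAlloc(m1X*m1Y, sizeof(float), (void**)&d_A);
status = cublasAlloc(m2X*m2Y, sizeof(float), (void**)&d_B);
status = cublasAlloc(m2Y*m1X, sizeof(float), (void**)&d_C);
/* Copy the host matrices fat1 and fat2 to the device */
cublasSetMatrix(m1Y, m1X, sizeof(float), fat1, m1X, d_A, m1X);
cublasSetMatrix(m2X, m2Y, sizeof(float), fat2, m2X, d_B, m2X);
/* Multiply on the device, with d_B passed before d_A */
cublasSgemm('N', 'N', m1Y, m2X, m1X, 1.0f, d_B, m1X, d_A, m2X, 0.0f, d_C, m2X);
/* Check for errors, then copy the result back to the host array fat3 */
status = cublasGetError();
status = cublasGetMatrix(m1Y, m2X, sizeof(float), d_C, m1Y, fat3, m1Y);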
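A CPU-only sketch of how the dimensions might be laid out for the non-square case, assuming fat1 is a row-major matrix with m1Y rows and m1X columns, fat2 is a row-major matrix with m2Y = m1X rows and m2X columns, and the result is wanted row-major as well. col_major_sgemm is a hypothetical stand-in for cublasSgemm('N', 'N', ...), written only so the argument order can be checked without a GPU; it is not the CUBLAS implementation.

```c
#include <assert.h>

/* Hypothetical CPU stand-in for cublasSgemm('N','N',...): column-major
   C (m x n) = alpha * A (m x k) * B (k x n) + beta * C. */
static void col_major_sgemm(int m, int n, int k, float alpha,
                            const float *A, int lda,
                            const float *B, int ldb,
                            float beta, float *C, int ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);  /* leading dims cover the rows */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i + p * lda] * B[p + j * ldb];
            /* With beta == 0 the old contents of C are never read. */
            C[i + j * ldc] = (beta == 0.0f) ? alpha * acc
                                            : alpha * acc + beta * C[i + j * ldc];
        }
}

/* Row-major product fat3 (m1Y x m2X) = fat1 (m1Y x m1X) * fat2 (m1X x m2X),
   obtained by asking the column-major routine for B^T * A^T. */
static void row_major_product(const float *fat1, const float *fat2, float *fat3,
                              int m1Y, int m1X, int m2X) {
    col_major_sgemm(m2X, m1Y, m1X, 1.0f,
                    fat2, m2X,   /* first operand: fat2 read as m2X x m1X */
                    fat1, m1X,   /* second operand: fat1 read as m1X x m1Y */
                    0.0f, fat3, m2X);
}
```

Under these assumptions, the device calls would use the same sizes: cublasSetMatrix(m1X, m1Y, ...) with leading dimension m1X for fat1, cublasSetMatrix(m2X, m2Y, ...) with leading dimension m2X for fat2, m1Y*m2X elements for d_C, and cublasGetMatrix(m2X, m1Y, ...) with leading dimension m2X for the result.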