# CUBLAS: simple matrix product

I am trying to compute a simple matrix product, which should normally be easy with the CUBLAS library.

But I have a small problem: I cannot reconcile the documentation with what the implementation actually does.

To compute the matrix product, I use the following function:

``````
void cublasSgemm (char transa, char transb, int m, int n, int k, float alpha,
                  const float *A, int lda, const float *B, int ldb, float beta,
                  float *C, int ldc)
``````

which computes C = alpha * op(A) * op(B) + beta * C,

with alpha = 1.0 and beta = 0.0 …
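To make the meaning of `m`, `n`, `k` and the leading dimensions concrete, here is a minimal CPU sketch of the same formula. `ref_sgemm` and the `IDX` macro are hypothetical names introduced for illustration, not part of CUBLAS; the sketch assumes column-major storage, which is the BLAS convention, and makes no claim about how the library is implemented.

```c
#include <stddef.h>

/* Element (row, col) of a column-major matrix with leading dimension ld. */
#define IDX(row, col, ld) ((size_t)(col) * (size_t)(ld) + (size_t)(row))

/* Hypothetical CPU reference, NOT a CUBLAS function.
   Computes C = alpha * op(A) * op(B) + beta * C on column-major data,
   where op(X) = X for 'n'/'N' and X^T for 't'/'T'.
   op(A) is m x k, op(B) is k x n, C is m x n. */
static void ref_sgemm(char transa, char transb, int m, int n, int k,
                      float alpha, const float *A, int lda,
                      const float *B, int ldb,
                      float beta, float *C, int ldc)
{
    int ta = (transa == 't' || transa == 'T');
    int tb = (transb == 't' || transb == 'T');

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                /* op(A)(i, p) and op(B)(p, j) */
                float a = ta ? A[IDX(p, i, lda)] : A[IDX(i, p, lda)];
                float b = tb ? B[IDX(j, p, ldb)] : B[IDX(p, j, ldb)];
                acc += a * b;
            }
            /* With beta == 0 the old contents of C are never read. */
            C[IDX(i, j, ldc)] = (beta == 0.0f)
                ? alpha * acc
                : alpha * acc + beta * C[IDX(i, j, ldc)];
        }
    }
}
```

Note that every array here is read column by column: the matrix [[1, 2], [3, 4]] must be passed as the buffer `{1, 3, 2, 4}`.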

But I have two problems.

First, for square matrices I have to swap the order of the operands to get the correct result. I use this call:

cublasSgemm('n','n',tx,ty,tx,1.0,d_B,tx,d_A,tx,0.0,d_C,tx);

instead of

cublasSgemm('n','n',tx,ty,tx,1.0,d_A,tx,d_B,tx,0.0,d_C,tx);

Second, I cannot get the correct result when I multiply matrices that are not square.

I am using the CUDA 2.0 beta on Windows XP.

What is wrong?

Thanks

Beleys

``````
status = cublasAlloc(m1X*m1Y, sizeof(float), (void**)&d_A);
status = cublasAlloc(m2X*m2Y, sizeof(float), (void**)&d_B);
status = cublasAlloc(m2Y*m1X, sizeof(float), (void**)&d_C);

cublasSetMatrix (m1Y, m1X, sizeof(float), fat1, m1X, d_A,m1X);
cublasSetMatrix (m2X, m2Y, sizeof(float), fat2, m2X, d_B,m2X);

cublasSgemm('N', 'N', m1Y, m2X, m1X, 1, d_B, m1X, d_A, m2X, 0.0, d_C, m2X);
status = cublasGetError();

status =  cublasGetMatrix (m1Y, m2X, sizeof(float),d_C, m1Y, fat3, m1Y);
``````

CUBLAS uses Fortran ordering (column-major). If you are calling it from C, your matrices are stored in row-major order.

Yep. Instead of using 'N' 'N' as the first two parameters you pass in, use 't' 't':

``````
cublasSgemm('t', 't', m1Y, m2X, m1X, 1, d_B, m1X, d_A, m2X, 0.0, d_C, m2X);
``````

It should be like that. Also, are you sure the square-matrix case is working? It should be returning the transpose of the result you want, not the actual result.
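Since the library reads every buffer column by column, another way to handle row-major C arrays is to leave both flags at 'n' and swap the operands, adjusting the dimensions to match. The sketch below checks that idea on the CPU; `colmajor_gemm_nn` and `rowmajor_matmul` are hypothetical helpers written for this example (a stand-in for a column-major GEMM with alpha = 1 and beta = 0), not CUBLAS functions.

```c
#include <stddef.h>

/* Hypothetical CPU stand-in for a column-major 'n','n' GEMM with
   alpha = 1 and beta = 0 (not a CUBLAS function): C = A * B, where
   A is m x k, B is k x n and C is m x n, all stored column-major. */
static void colmajor_gemm_nn(int m, int n, int k,
                             const float *A, int lda,
                             const float *B, int ldb,
                             float *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[(size_t)p * lda + i] * B[(size_t)j * ldb + p];
            C[(size_t)j * ldc + i] = acc;
        }
    }
}

/* Row-major C (rows x cols) = A (rows x inner) * B (inner x cols).
   A row-major buffer read as column-major is the transpose, so the
   column-major routine is asked for C^T = B^T * A^T: operands swapped,
   m = cols, n = rows, and each leading dimension equal to the row
   length of the corresponding row-major matrix. */
static void rowmajor_matmul(int rows, int cols, int inner,
                            const float *A, const float *B, float *C)
{
    colmajor_gemm_nn(cols, rows, inner, B, cols, A, inner, C, cols);
}
```

With the signature quoted in the question, the analogous call would presumably be `cublasSgemm('n', 'n', cols, rows, inner, 1.0f, d_B, cols, d_A, inner, 0.0f, d_C, cols)`, assuming `d_A` and `d_B` hold the row-major host data copied unchanged. For square matrices all three sizes coincide, which is why swapping only the operands was enough in that case.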

Instead of C = A * B … That is why I obtained the correct result for square matrices …
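The square-matrix observation can be written out as a short derivation. The tilde notation is introduced here for illustration only: $\tilde{X}$ is the matrix a column-major routine sees when it is handed the row-major buffer of $X$.

```latex
\begin{align*}
\tilde{A} &= A^{T}, \qquad \tilde{B} = B^{T}, \qquad \tilde{C} = C^{T} \\
\tilde{B}\,\tilde{A} &= B^{T} A^{T} = (A B)^{T} = \tilde{C}
\quad \text{when } C = A B
\end{align*}
```

So passing the operands in swapped order fills the output buffer with exactly the row-major layout of $AB$. For non-square matrices the same identity holds, but the `m`, `n` and leading-dimension arguments have to describe the transposed shapes as well.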