Multiplying a matrix by its transpose with CUBLAS (cublasSgemm)

Hello everyone,

I have a small problem with the CUBLAS function cublasSgemm.

I want to multiply the matrix A, of dimension (N x M), by its transpose. I currently use this call:
cublasSgemm('n', 't', N, N, M, 1.0f, A, N, A, N, 0.0f, Out, N);
But the result is wrong, maybe because the function call is incorrect.

Any help is appreciated. Thanks in advance.

The transpose flag only says that an input array is in transposed order. It will not actually perform the transpose for you. Also keep in mind that CUBLAS is a Fortran-ordered (column-major) BLAS. If your array is in C order (row-major), it won't give you the results you might expect, even if you pass the correctly transposed array to it.

Oops… 'T' does not transpose and then multiply? I thought it would implement "C +/-= (A_or_A_Transpose * B)", no?

If it is not going to transpose and multiply, what's the point in telling it that the matrix is transposed?
The transpose operation will not be reflected in the passed arrays… that's understandable…

Thanks for bringing this up, Avid…

No. It is modelled after the BLAS and BLAS has never performed “silent” transposition or in-situ operations.

The point is that the routine can multiply two matrices together without needing to compute a transpose. If you have A=[MxK] and B=[NxK], you can compute C=AxB^T=[MxK][KxN]=[MxN] without needing to explicitly perform the intermediate transposition of B, which would only waste time and storage.
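To make the previous post concrete, here is a minimal sketch in plain C of how a column-major GEMM can honour the transpose flags purely through indexing. This is not the CUBLAS or netlib source; `ref_sgemm` and `op_elem` are hypothetical names, and the argument order is only assumed to mirror the legacy cublasSgemm call used in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of op(X), where X is stored column-major with leading
   dimension ld. A 't' flag swaps the roles of i and j in the index
   computation; no data is moved or copied. */
static float op_elem(char trans, const float *x, int ld, int i, int j)
{
    return (trans == 't' || trans == 'T') ? x[(size_t)i * ld + j]
                                          : x[(size_t)j * ld + i];
}

/* Hypothetical reference routine: C = alpha * op(A) * op(B) + beta * C,
   where op(A) is m x k, op(B) is k x n and C is m x n, all column-major. */
void ref_sgemm(char transa, char transb, int m, int n, int k,
               float alpha, const float *a, int lda,
               const float *b, int ldb,
               float beta, float *c, int ldc)
{
    int ta = (transa == 't' || transa == 'T');
    int tb = (transb == 't' || transb == 'T');

    /* The stored A has m rows if not transposed, k rows if transposed. */
    assert(lda >= (ta ? k : m));
    /* The stored B has k rows if not transposed, n rows if transposed. */
    assert(ldb >= (tb ? n : k));
    assert(ldc >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += op_elem(transa, a, lda, i, p) *
                       op_elem(transb, b, ldb, p, j);
            float *cij = &c[(size_t)j * ldc + i];
            /* When beta is zero, C is written without being read. */
            *cij = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * *cij;
        }
    }
}
```

With A stored as 2 x 3 and B stored as 2 x 3, calling ref_sgemm('n', 't', 2, 2, 3, 1.0f, a, 2, b, 2, 0.0f, c, 2) fills a 2 x 2 result equal to A times the transpose of B, and neither input buffer is modified.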

Yeah. It won't perform an in-situ transpose… But won't it transpose and multiply? (The transpose will not affect A or B… but they are transposed as needed before multiplication, possibly into a temporary array.)

An excerpt from

Look @ the phrase – “TRANSA and TRANSB determines if the matrices A and B are to be transposed…”

So it seems like the SGEMM operation does do the transpose and multiply… The only array that will be affected will be “C”.

A and B are “not” changed – that does not mean that the transpose is “not” performed…

Here is an excerpt from SGEMM.f from “netlib”

No. It is purely a question of storage order. Fortran arrays are simple linear blocks of memory, and the transpose flag only controls how the memory is indexed. There is no transposition performed. Go and have a look at the code of any faithful FORTRAN BLAS implementation if you don't believe me.

If you are very lucky, then a combination of reversing the indexing scheme and selecting the right leading dimension value will give you the equivalent of a transposition for “free”, but that isn’t guaranteed depending on zero padding or alignment, or whether the gemm call is using the whole matrix or not.
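The "free" transposition mentioned above can be illustrated with a short sketch in plain C. The function names are hypothetical, and the sketch assumes a contiguous buffer with no padding, which is exactly the lucky case: a rows x cols matrix written in C (row-major) order, when read as a column-major matrix with leading dimension cols, appears as its own cols x rows transpose.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of a matrix stored in C (row-major) order with the
   given number of columns. */
float row_major_at(const float *buf, int cols, int i, int j)
{
    return buf[(size_t)i * cols + j];
}

/* Element (i, j) of a matrix stored in Fortran (column-major) order
   with leading dimension ld. */
float col_major_at(const float *buf, int ld, int i, int j)
{
    return buf[(size_t)j * ld + i];
}

/* Returns 1 if reading buf as a column-major (cols x rows) matrix with
   ld = cols yields the transpose of the row-major (rows x cols) matrix
   stored in the same buffer. Assumes buf is contiguous and unpadded. */
int reinterpretation_is_transpose(const float *buf, int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            if (row_major_at(buf, cols, i, j) != col_major_at(buf, cols, j, i))
                return 0;
    return 1;
}
```

As soon as rows are padded or only a sub-block of a larger matrix is passed, the leading dimension no longer equals the logical column count and the equivalence has to be re-checked against the actual layout.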

My problem is not the storage order, because I initialize my array in column-major order.

Well, let me make my point of contention explicit…

I am contending on the "semantics" of the "transA" and "transB" arguments to XGEMM.

You had said that they are just an "indication" to the API that A or B is in transposed order.

My contention is that:

The API will compute A*B, A^T*B, A*B^T, or A^T*B^T, as requested by the API user.

That's the reason why I cited Wikipedia and netlib.

Which they are. Even the snippets of netlib you posted agree. They tell the routine(s) how to interpret the M, N, K, lda, and ldb dimensions that are provided to the call, and how to use them to index into the memory provided for the A and B matrices.

Sorry, but that doesn’t happen.

I think your lda is wrong; I think it has to be M instead. Check the cublasSgemm documentation in the CUBLAS guide.

Also cross-check ldb and ldc.
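For that cross-check, the standard BLAS rule is that each leading dimension must be at least the number of rows of the matrix as it is stored, which depends on the transpose flag. The helpers below are a small sketch of that rule in plain C; the names `min_lda`, `min_ldb`, and `min_ldc` are hypothetical and not part of CUBLAS. They only give lower bounds: a larger value is accepted by the routine, but it changes which memory locations are read, so the right value still depends on how the buffer was actually laid out.

```c
#include <assert.h>
#include <ctype.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/* For real-valued GEMM, both 't' and 'c' select the transposed operand. */
static int is_trans(char flag)
{
    int f = tolower((unsigned char)flag);
    return f == 't' || f == 'c';
}

/* op(A) is m x k. Stored A has m rows if not transposed, k rows otherwise. */
int min_lda(char transa, int m, int k)
{
    assert(m >= 0 && k >= 0);
    return max_int(1, is_trans(transa) ? k : m);
}

/* op(B) is k x n. Stored B has k rows if not transposed, n rows otherwise. */
int min_ldb(char transb, int n, int k)
{
    assert(n >= 0 && k >= 0);
    return max_int(1, is_trans(transb) ? n : k);
}

/* C is always stored as m x n, so it has m rows. */
int min_ldc(int m)
{
    assert(m >= 0);
    return max_int(1, m);
}
```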

In fact, my lda and ldb were wrong. The correct function call is:

cublasSgemm('n', 't', N, N, M, 1.0f, A, M, A, M, 0.0f, Out, N);

Thanks a lot for your help, Aviday and Sarnath.