Simple cublasCgemm: what's wrong with it?

Hi All,

I have a very (probably stupid) question about cublasCgemm.
I'm trying to perform a 2D FFT by means of matrix-matrix multiplication, and I want to use cublasCgemm to improve performance.

Hence, I have 3 matrices: Ker_u (N2xM), Ker_v (N1xM) and data (MxM). The code segment is the following:

cublasCgemm('n', 'n', N1, M, M, alpha, ker_v, N1, DATA_d, M, beta, temp, N1);
cublasCgemm('n', 't', N1, N2, M, alpha, temp, N1, ker_u, N2, beta, FFT_DATA_d, N1);

Please note that temp is an N1xM matrix and FFT_DATA_d is an N1xN2 matrix, both of type cuFloatComplex and allocated empty, while alpha and beta are cuFloatComplex scalars with values 1 and 0, respectively. Column-major storage has been used and checked.

If I compare the temp values after the first multiplication with the result of the same calculation in Matlab, they are correct. What is not correct is the result of the second multiplication, temp*Transpose(Ker_u). I cannot find the error, and it seems to me that I am calling the function correctly, with the right parameters.
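One way to check the argument lists independently of the GPU is a plain CPU reference that follows the same GEMM convention. What follows is only a minimal Python sketch, assuming flat column-major lists and small made-up sizes (N1=2, N2=3, M=2). The helper name `gemm_ref` and all test values are hypothetical and not part of cuBLAS.

```python
def gemm_ref(transa, transb, m, n, k, alpha, A, lda, B, ldb, beta, C, ldc):
    """CPU reference for C = alpha*op(A)*op(B) + beta*C.

    A, B and C are flat lists in column-major order; op is selected by
    'n' (none), 't' (transpose) or 'c' (conjugate transpose).
    """
    def elem(mat, ld, trans, i, j):
        # Element (i, j) of op(mat) for a column-major mat with leading dimension ld.
        if trans == 'n':
            return mat[i + j * ld]
        v = mat[j + i * ld]
        return v.conjugate() if trans == 'c' else v

    for j in range(n):
        for i in range(m):
            acc = sum(elem(A, lda, transa, i, p) * elem(B, ldb, transb, p, j)
                      for p in range(k))
            C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc]


# Hypothetical small sizes and integer-valued complex test data.
N1, N2, M = 2, 3, 2
ker_v = [1 + 1j, 2 + 0j, 0 - 1j, 3 + 0j]                  # N1 x M
data = [1 + 0j, 0 + 2j, -1 + 0j, 1 + 1j]                  # M x M
ker_u = [1 + 0j, 0 + 1j, 2 + 0j, 1 - 1j, 0 + 0j, 0 - 2j]  # N2 x M

temp = [0j] * (N1 * M)
out = [0j] * (N1 * N2)
# Same argument order as the two calls in the question.
gemm_ref('n', 'n', N1, M, M, 1, ker_v, N1, data, M, 0, temp, N1)
gemm_ref('n', 't', N1, N2, M, 1, temp, N1, ker_u, N2, 0, out, N1)

# Independent check: out(i, j) = sum_p sum_q ker_v(i, p) * data(p, q) * ker_u(j, q)
expected = [sum(ker_v[i + p * N1] * data[p + q * M] * ker_u[j + q * N2]
                for p in range(M) for q in range(M))
            for j in range(N2) for i in range(N1)]
```

If the GPU output disagrees with such a reference for identical inputs, the call itself is suspect. If the two agree, the mismatch with Matlab more likely comes from how the Matlab reference was computed.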

Can anyone help me? Please, I'm wasting a lot of time on something that is probably trivial for somebody else.



I believe your arguments in cublasCgemm are consistent. You might be able to narrow your bug down if you replace one of the matrix arguments in your second cublasCgemm with an identity matrix (don't forget to modify the corresponding m, n, or k value).
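The identity test can be tried on the CPU first to see what result to expect. This is a rough Python sketch, assuming flat column-major storage and that ker_u is the matrix being replaced. The helper `gemm_nt` and the sample values are hypothetical.

```python
def gemm_nt(m, n, k, A, lda, B, ldb, ldc):
    """Return C = A * B^T on flat column-major lists (alpha = 1, beta = 0)."""
    C = [0j] * (ldc * n)
    for j in range(n):
        for i in range(m):
            # Element (p, j) of B^T is element (j, p) of B.
            C[i + j * ldc] = sum(A[i + p * lda] * B[j + p * ldb] for p in range(k))
    return C


N1, M = 2, 3
temp = [1 + 1j, 2 + 0j, 0 - 1j, 3 + 0j, -1 + 0j, 0 + 2j]            # N1 x M
identity = [1 if r == c else 0 for c in range(M) for r in range(M)]  # M x M

# With ker_u replaced by the identity, N2 becomes M and ldb becomes M.
result = gemm_nt(N1, M, M, temp, N1, identity, M, N1)
```

With the identity in place of ker_u, the output should reproduce temp element by element. Any difference on the GPU side would point at the second call or its buffers rather than at the kernel values.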

Are you sure about the dimensions of your matrices?

If you want to compute the DFT of a matrix X(m,n), let's call Dx(m,m) and Dy(n,n) the one-dimensional DFT matrices.
Dx(m,m)*X(m,n) = X1(m,n) will transform the columns.
To transform the rows, you need to transpose X1 before applying Dy, giving Dy(n,n)*X1'(n,m) = X2(n,m), and then transpose the result again to go back to the original order.

So FFT2(X) = Dx*X*Dy' (where ' means the plain transpose, not the conjugate transpose)

This is the Matlab code:

% Build Dx

% Build Dy

%random matrix

%2D fft

%2D DFT (we really want the transpose, not the hermitian)
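The Matlab statements themselves did not survive in this thread, only the comments. As a stand-in, here is a small Python sketch of the same idea, not the poster's code: it builds the two DFT matrices and compares Dx*X*transpose(Dy) with a direct double-sum 2D DFT. The helper names and the 2x3 test matrix are made up for illustration.

```python
import cmath


def dft_matrix(n):
    """n x n DFT matrix with entries exp(-2*pi*i*j*k/n)."""
    return [[cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n)]
            for j in range(n)]


def matmul(A, B):
    """Product of two matrices stored as nested row lists."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Plain transpose, with no conjugation."""
    return [list(col) for col in zip(*A)]


def dft2_direct(X):
    """2D DFT evaluated straight from its double-sum definition."""
    m, n = len(X), len(X[0])
    return [[sum(X[r][c] * cmath.exp(-2j * cmath.pi * (u * r / m + v * c / n))
                 for r in range(m) for c in range(n))
             for v in range(n)] for u in range(m)]


X = [[1, 2, 0], [0, -1, 3]]            # arbitrary 2 x 3 test matrix
Dx, Dy = dft_matrix(2), dft_matrix(3)
F = matmul(matmul(Dx, X), transpose(Dy))
G = dft2_direct(X)
max_err = max(abs(F[u][v] - G[u][v]) for u in range(2) for v in range(3))
```

Here `max_err` should sit at rounding-error level, which is the property the matrix formulation relies on.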


Many thanks for your reply.

Yes, I'm sure about the dimensions: I have an MxM input data matrix that corresponds to the values of a function F(x,y) evaluated on a Cartesian grid with M points along the x-axis and M points along the y-axis.

I want N1 points in the y-conjugate domain and N2 points in the x-conjugate domain.

In your code you are implicitly assuming that you have input data of size (n,m) and that you want exactly (n,m) samples in the conjugate domain. If you need a different number of samples in the spectrum, say k and h for the x-conjugate and y-conjugate variables, respectively, and assume that n=m, the matrices involved in your Matlab code will have dimensions Dx (k x m) and Dy (h x m). So they agree with the dimensions of the matrices in my code.
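The dimension argument can be illustrated with rectangular kernels. The thread never gives the exact kernel definition, so the sampling rule exp(-2*pi*i*k*m/K) used below is an assumption, as are the helper names and the sizes M=3, N1=4, N2=5. This is only a Python sketch of how the shapes chain together.

```python
import cmath


def rect_kernel(K, M):
    """K x M kernel with entries exp(-2*pi*i*k*m/K) (assumed sampling rule)."""
    return [[cmath.exp(-2j * cmath.pi * k * m / K) for m in range(M)]
            for k in range(K)]


def matmul(A, B):
    """Product of two matrices stored as nested row lists."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Plain transpose, with no conjugation."""
    return [list(col) for col in zip(*A)]


M, N1, N2 = 3, 4, 5
data = [[1, 2, 0], [0, -1, 3], [2, 1, 1]]      # M x M samples of F(x, y)
ker_v = rect_kernel(N1, M)                      # N1 x M
ker_u = rect_kernel(N2, M)                      # N2 x M

temp = matmul(ker_v, data)                      # N1 x M
spectrum = matmul(temp, transpose(ker_u))       # N1 x N2
shape = (len(spectrum), len(spectrum[0]))
```

The shapes compose as (N1 x M)(M x M)(M x N2), and the zero-frequency entry `spectrum[0][0]` equals the sum of all data samples, which gives a quick sanity check on any implementation.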

I think that I'm not using cublasCgemm properly… but I can't see how! :-(



In Matlab, if you use a “prime” instead of explicitly typing out the word “transpose”, it gives the conjugate transpose. Your cublasCgemm is using the transpose option, not the conjugate transpose option. Could this be the problem?
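To make the difference concrete: Matlab's `A'` is the conjugate transpose and `A.'` is the plain transpose, which correspond to the 'c' and 't' options of the GEMM call. A tiny Python sketch with a made-up 2x2 complex matrix shows that the two disagree as soon as an entry has a nonzero imaginary part.

```python
def transpose(A):
    """Plain transpose, like Matlab's A.' or the 't' option."""
    return [list(col) for col in zip(*A)]


def conj_transpose(A):
    """Conjugate transpose, like Matlab's A' or the 'c' option."""
    return [[z.conjugate() for z in row] for row in transpose(A)]


A = [[1 + 2j, 3 - 1j],
     [0 + 1j, 2 + 0j]]          # made-up complex test matrix

T = transpose(A)
H = conj_transpose(A)
differ = T != H                  # True when any entry has an imaginary part

R = [[1.0, 2.0], [3.0, 4.0]]     # for a purely real matrix the two coincide
same_for_real = transpose(R) == conj_transpose(R)
```

So a Matlab reference written as `temp*Ker_u'` would correspond to 'c' in the second call, while `temp*Ker_u.'` would correspond to the 't' that the question uses.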