Hi all,
I have a (probably stupid) question about cublasCgemm.
I'm trying to perform a 2D FFT by means of matrix-matrix multiplication, and I want to use cublasCgemm to improve the performance.
I have 3 matrices: Ker_u (N2xM), Ker_v (N1xM) and data (MxM). The code segment is the following:
cublasCgemm('n', 'n', N1, M, M, alpha, ker_v, N1, DATA_d, M, beta, temp, N1);
cublasCgemm('n', 't', N1, N2, M, alpha, temp, N1, ker_u, N2, beta, FFT_DATA_d, N1);
Please note that temp is an N1xM matrix and FFT_DATA is an N1xN2 matrix, both of type cuFloatComplex and allocated empty, while alpha and beta are cuFloatComplex scalars with values 1 and 0, respectively. Column-major storage is used throughout and has been checked.
If I check the values of temp after the first multiplication against the same calculation in MATLAB, the result is correct. What is not correct is the result of the second multiplication, temp*Transpose(Ker_u). I cannot find the error, and as far as I can tell I am calling the function correctly with the right parameters.
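For reference, here is a minimal CPU sketch of what the two calls are meant to compute, which can be compared element by element with the GPU output. The names ref_elem and ref_cgemm are hypothetical helpers, not part of CUBLAS, and alpha = 1 and beta = 0 are hard-coded as an assumption matching the values above. It also makes the difference between 't' (plain transpose) and 'c' (conjugate transpose) explicit. One thing that may be worth checking on the MATLAB side is that A' is the conjugate transpose while A.' is the plain transpose, so a reference computed with ' would correspond to 'c' rather than 't'.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Offset of element (r, c) in a column-major matrix with leading dimension ld. */
#define IDX(r, c, ld) ((size_t)(c) * (size_t)(ld) + (size_t)(r))

/* Element (r, c) of op(X): 'n' = as stored, 't' = plain transpose,
 * anything else is treated as 'c' = conjugate transpose. */
float complex ref_elem(char trans, const float complex *X, int ld, int r, int c)
{
    if (trans == 'n') return X[IDX(r, c, ld)];
    if (trans == 't') return X[IDX(c, r, ld)];
    return conjf(X[IDX(c, r, ld)]);
}

/* Hypothetical CPU reference: C = op(A) * op(B), column-major,
 * with alpha = 1 and beta = 0 (assumption taken from the post).
 * op(A) is m x k, op(B) is k x n, C is m x n. */
void ref_cgemm(char transa, char transb, int m, int n, int k,
               const float complex *A, int lda,
               const float complex *B, int ldb,
               float complex *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float complex acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += ref_elem(transa, A, lda, i, p) *
                       ref_elem(transb, B, ldb, p, j);
            C[IDX(i, j, ldc)] = acc;
        }
    }
}
```

With host copies of the device buffers (say temp_h and ker_u_h, again hypothetical names), the second multiplication would be ref_cgemm('n', 't', N1, N2, M, temp_h, N1, ker_u_h, N2, ref, N1), and ref can then be compared against FFT_DATA_d copied back to the host.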
Can anyone help me? I'm wasting a lot of time on something that is probably trivial for somebody else.
Thanks,
P.