Hi, I’m new to CUDA programming (CUDA 4.2) and I’m trying a simple matrix multiplication using cuBLAS.
I have two matrices, A[HA x WA] and B[HB x WB], and I store the result in C[HC x WC]. I run the cuBLAS routine and verify the result against a plain matrix multiplication on the host.
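For reference, the host-side check is roughly the following. This is a minimal sketch, assuming row-major storage (element (i, j) of A is A[i * WA + j]) and that WA equals HB; the name `host_matmul` is hypothetical and only used for illustration.

```c
/* Hypothetical helper: C = A * B computed on the host.
 * Assumes row-major storage: element (i, j) of A is A[i * WA + j].
 * A is HA x WA, B is WA x WB, C is HA x WB. */
void host_matmul(const float *A, const float *B, float *C,
                 int HA, int WA, int WB)
{
    for (int i = 0; i < HA; ++i) {
        for (int j = 0; j < WB; ++j) {
            float sum = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (int k = 0; k < WA; ++k) {
                sum += A[i * WA + k] * B[k * WB + j];
            }
            C[i * WB + j] = sum;
        }
    }
}
```

The cuBLAS result copied back from d_C is then compared element by element against the output of this loop.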
The call is:
cublasSgemm_v2(handle, CUBLAS_OP_T, CUBLAS_OP_T, HA, WB, WA, &alpha, d_A, HA, d_B, HB, &beta, d_C, WB);
Case 1: Square matrices, A[N x N] x B[N x N]
The output is transposed, but otherwise the result is correct. How do I get the output without the transpose?
Case 2: Rectangular matrices, A[N x M] x B[M x N]
Error: ** On entry to SGEMM parameter number 8 had an illegal value
cublasSgemm returned error code 7, line(377)
Why does this error occur?
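A possible lead, assuming the message uses the reference BLAS parameter numbering (where parameter 8 of SGEMM is lda): the cuBLAS documentation requires lda >= max(1, m) when transa is CUBLAS_OP_N, and lda >= max(1, k) otherwise. The sketch below only mirrors that rule in plain C; `min_lda`, `lda_is_legal` and the `transposed` flag are hypothetical names, not part of the cuBLAS API.

```c
/* Hypothetical helper: smallest lda accepted by the documented SGEMM rule.
 * 'transposed' is nonzero when transa is CUBLAS_OP_T (or CUBLAS_OP_C).
 * m and k are the values passed to the call (rows of op(A), shared dim). */
int min_lda(int transposed, int m, int k)
{
    /* Number of rows of A as it is actually stored in memory. */
    int rows = transposed ? k : m;
    return rows > 1 ? rows : 1;
}

/* Nonzero when the lda passed to the call satisfies the rule above. */
int lda_is_legal(int transposed, int m, int k, int lda)
{
    return lda >= min_lda(transposed, m, k);
}
```

In the call shown earlier, m = HA, k = WA and lda = HA with CUBLAS_OP_T, so the rule needs HA >= WA. That always holds for square matrices, which would explain why Case 1 runs, and it fails for A[N x M] whenever N < M.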
Case 3: Square matrices, A[N x N] x B[N x N], with CUBLAS_OP_N
The results are incorrect.
Why can’t I use the CUBLAS_OP_N option?
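Possibly relevant to all three cases: cuBLAS assumes column-major storage, while C arrays are usually filled row-major. A row-major buffer read as column-major is the transpose of the intended matrix, and since (A B)^T = B^T A^T, a commonly used approach is to keep CUBLAS_OP_N and swap the two operands. The sketch below demonstrates that identity on the host only. `gemm_colmajor_nn` is a hypothetical stand-in for an untransposed column-major GEMM, not a cuBLAS function, and row-major host arrays are an assumption about the setup.

```c
/* Hypothetical stand-in for a column-major GEMM without transposes:
 * C (m x n) = A (m x k) * B (k x n),
 * where element (i, j) of a matrix X is stored at X[i + j * ldx]. */
void gemm_colmajor_nn(int m, int n, int k,
                      const float *A, int lda,
                      const float *B, int ldb,
                      float *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float sum = 0.0f;
            for (int p = 0; p < k; ++p) {
                sum += A[i + p * lda] * B[p + j * ldb];
            }
            C[i + j * ldc] = sum;
        }
    }
}

/* Row-major C (HA x WB) = A (HA x WA) * B (WA x WB).
 * The column-major routine sees each row-major buffer as its transpose,
 * so it is asked for C^T = B^T * A^T: operands swapped, m = WB, n = HA. */
void matmul_rowmajor_via_colmajor(const float *A, const float *B, float *C,
                                  int HA, int WA, int WB)
{
    gemm_colmajor_nn(WB, HA, WA, B, WB, A, WA, C, WB);
}
```

Under the same assumptions, the corresponding cuBLAS call would presumably take the form cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, WB, HA, WA, &alpha, d_B, WB, d_A, WA, &beta, d_C, WB), with d_C then read back as a row-major HA x WB matrix.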
Thanks in advance.