Hi all,
I’ve started working with CUBLAS recently. I checked the CUDA SDK example, tried it and got “PASSED”. Soon after, I made one alteration: instead of filling the matrices with random elements, I filled them with their indexes. However, there is something very wrong and very strange in the output:
Matrix A:
0.000000 4.000000 8.000000 12.000000
0.000000 5.000000 9.000000 13.000000
2.000000 6.000000 10.000000 14.000000
3.000000 7.000000 11.000000 15.000000
Matrix B:
0.000000 4.000000 8.000000 12.000000
0.000000 5.000000 9.000000 13.000000
2.000000 6.000000 10.000000 14.000000
3.000000 7.000000 11.000000 15.000000
Matrix C:
0.000000 152.000000 248.000000 344.000000
0.000000 174.000000 286.000000 398.000000
68.000000 196.000000 324.000000 452.000000
74.000000 218.000000 362.000000 506.000000
How can this be? Element 1 is always printed as 0. Nevertheless, the resulting matrix C is printed as if element 1 of the other two matrices were 1, except that C's own element 1 is also printed as 0. In other words, if you multiply the printed matrices A and B by hand, the result is not the printed matrix C. This does not make sense, and I am certain it has something to do with the code that fills the matrices. Here are the alterations I made to the CUDA SDK example:
Alteration 1:
#define index(i,j,ld) (((j)*(ld))+(i))
Alteration 2:
float *h_A, *h_B, *h_C, *h_C_ref;
int i, j;
float *d_A;
float *d_B;
float *d_C;
Alteration 3:
for (i = 0; i < HA; i++)
    for (j = 0; j < WA; j++)
        h_A[index(i, j, HA)] = (float) index(i, j, HA);
for (i = 0; i < HB; i++)
    for (j = 0; j < WB; j++)
        h_B[index(i, j, HB)] = (float) index(i, j, HB);
My print function:
void printMatrix(float *C, int uWC, int uHC) {
    int i, j;
    for (i = 0; i < uHC; i++) {
        printf("\n");
        /* column-major storage: the leading dimension is the row count uHC */
        for (j = 0; j < uWC; j++)
            printf("%f ", C[index(i, j, uHC)]);
    }
}
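To make the mismatch concrete, here is a minimal CPU-only sketch of what the product should be when both 4x4 matrices are filled with their own column-major indexes. The names IDX, fillWithIndex and multiplyRef are made up for this illustration and are not part of the SDK example; IDX is the same formula as the index macro above.

```c
#include <assert.h>

/* Column-major index, same formula as the index macro in the post. */
#define IDX(i, j, ld) (((j) * (ld)) + (i))

/* Fill an n x n column-major matrix so each element holds its own linear index. */
void fillWithIndex(float *M, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            M[IDX(i, j, n)] = (float) IDX(i, j, n);
}

/* Plain CPU reference: C = A * B for n x n column-major matrices. */
void multiplyRef(const float *A, const float *B, float *C, int n) {
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int k = 0; k < n; k++)
                sum += A[IDX(i, k, n)] * B[IDX(k, j, n)];
            C[IDX(i, j, n)] = sum;
        }
}
```

For n = 4 this reference gives 56, 62, 68, 74 in the first column of C, and every other entry equals the printed C above (152, 174, 196, 218, and so on). So the printed C differs from the expected product only in its first two elements, and the printed A and B differ from the fill only in element 1. That is an inference from the numbers, not a confirmed diagnosis.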
Any thoughts? Thanks in advance.
P.S.: Does anyone have a CUBLAS implementation of matrix multiplication that works for all types of matrices? The CUDA SDK example only works for square matrices.
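For the non-square case, a CPU sketch of the GEMM calling convention may help show which dimensions go where. The function name sgemmRef is hypothetical; its arguments follow the m, n, k, lda, ldb, ldc order that, as far as I know, cublasSgemm also uses after its two transpose flags. It assumes column-major storage and no transposes, with A being m x k, B being k x n and C being m x n.

```c
#include <assert.h>

/* Column-major index: element (i, j) of a matrix with leading dimension ld. */
#define IDX(i, j, ld) (((j) * (ld)) + (i))

/* CPU sketch of C = alpha * A * B + beta * C (no transposes, column-major).
   A is m x k, B is k x n, C is m x n; lda, ldb, ldc are the leading
   dimensions, i.e. the allocated row counts of A, B and C. */
void sgemmRef(int m, int n, int k, float alpha,
              const float *A, int lda,
              const float *B, int ldb,
              float beta, float *C, int ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            float sum = 0.0f;
            for (int p = 0; p < k; p++)
                sum += A[IDX(i, p, lda)] * B[IDX(p, j, ldb)];
            C[IDX(i, j, ldc)] = alpha * sum + beta * C[IDX(i, j, ldc)];
        }
}
```

With tightly packed matrices the leading dimensions are simply lda = m, ldb = k and ldc = m, so a 2x3 times 3x2 product would be called as sgemmRef(2, 2, 3, 1.0f, A, 2, B, 3, 0.0f, C, 2).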
EDIT: If I print matrix A or B right after filling them, they are printed correctly, so it must be one of the functions in the SDK example that is corrupting them.