CUBLAS: cublasDgemm() is not doing what the documentation says

Hi everyone,

I’ve been stuck for 2 hours debugging my code, only to discover that the problem was not in my code (I think…) but in the cublasDgemm() function from the CUBLAS library.

It’s supposed to compute:

C = alpha * op( A ) * op( B ) + beta * C

but it’s actually computing:

C = alpha * op( B ) * op( A ) + beta * C

Tell me if it’s the same for you, or if it’s a mistake in the PDF document that describes the CUBLAS library.

Thanks.

Bye :)

It isn’t the same for me and it isn’t a problem with the documentation. In every version I have used (from 2.1 through to 3.1beta), it works correctly and exactly as described in the documentation (and as you would expect any BLAS to).

So I am guessing you are doing something incorrectly.

I looked around, but I don’t think I’ve missed anything…

My version of CUBLAS: 3.0.14 (64-bit)

Here’s my code:

[codebox]

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "cublas.h"

typedef struct {
	int colonnes;
	int lignes;
	int singuliere; // 0=invertible, 1=singular, 2=unknown
	double* elements;
} Matrice;

// Fill a matrix with random numbers
void rempliMatrixRandom(Matrice* M) {
	int i,j;
	time_t t;
	srand((unsigned)time(&t));

	for (i=0; i<M->lignes; ++i) {
		for (j=0; j<M->colonnes; ++j) {
			setElement(M,i,j,((double)rand()/(double)RAND_MAX));
		}
	}
}

// Allocate a lignes x colonnes matrix on the host
Matrice* initMatrix(int lignes, int colonnes) {
	Matrice* M=(Matrice*)malloc(sizeof(Matrice));
	M->colonnes=colonnes;
	M->lignes=lignes;
	M->singuliere=2;
	if (!(M->elements=(double*)malloc(sizeof(double)*M->lignes*M->colonnes))) {
		printf("Erreur d'allocation memoire\n");
		system("PAUSE");
		exit(EXIT_FAILURE);
	}
	return M;
}

// Multiply A by B on the GPU with cublasDgemm
Matrice* multiplieMatrixGPU(Matrice* A,Matrice* B) {
	Matrice* C=initMatrix(A->lignes,B->colonnes);
	int lda,ldb,ldc;
	lda=A->lignes;
	ldb=B->lignes;
	ldc=C->lignes;
	double* ptrA;
	double* ptrB;
	double* ptrC;

	cublasInit();
	cublasAlloc(A->lignes*A->colonnes,sizeof(double),(void**) &ptrA);
	cublasAlloc(B->lignes*B->colonnes,sizeof(double),(void**) &ptrB);
	cublasAlloc(C->lignes*C->colonnes,sizeof(double),(void**) &ptrC);
	cublasSetMatrix(A->lignes,A->colonnes,sizeof(double),A->elements,lda,ptrA,lda);
	cublasSetMatrix(B->lignes,B->colonnes,sizeof(double),B->elements,ldb,ptrB,ldb);
	cublasSetMatrix(C->lignes,C->colonnes,sizeof(double),C->elements,ldc,ptrC,ldc);
	cublasDgemm('n','n',A->lignes,B->colonnes,A->colonnes,1,ptrA,lda,ptrB,ldb,0,ptrC,ldc);
	cublasGetMatrix(C->lignes,C->colonnes,sizeof(double),ptrC,ldc,C->elements,ldc);
	cublasFree(ptrA);
	cublasFree(ptrB);
	cublasFree(ptrC);
	cublasGetError();
	cublasShutdown();
	return C;
}

int main(int argc, char** argv) {
	Matrice* A=initMatrix(3,3);
	Matrice* B=initMatrix(3,3);

	rempliMatrixRandom(A);
	rempliMatrixRandom(B);
	//multiplieMatrixGPU(A,B); doesn't work
	multiplieMatrixGPU(B,A); // ok!
	system("PAUSE");
	//runMenu(argc,argv);
}

[/codebox]

You do realise that CUBLAS is a Fortran ordered BLAS, so your arrays need to be in column major order?

Your code is incomplete so it is impossible to say with certainty, but it looks suspiciously like your storage is row major ordered.
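For anyone landing here with the same symptom, here is a rough sketch of why row major data can make a column major GEMM look like it computes B * A. It does not call CUBLAS at all: gemm_colmajor is a hypothetical, naive CPU loop that only follows the BLAS column major convention, and rowmajor_through_colmajor is an illustrative wrapper, so the names and the 2x2 size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Naive reference GEMM following the BLAS convention (hypothetical helper,
   not a CUBLAS call): every matrix is column major, so element (i,j) of a
   matrix with leading dimension ld lives at index i + j*ld.
   Computes C = A * B for A (m x k), B (k x n), C (m x n). */
void gemm_colmajor(int m, int n, int k,
                   const double *A, int lda,
                   const double *B, int ldb,
                   double *C, int ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            C[i + (size_t)j * ldc] = sum;
        }
    }
}

/* Hands two 2x2 ROW major buffers to the column major routine unchanged.
   The routine sees transpose(A) and transpose(B), so it stores
   transpose(A) * transpose(B) = transpose(B * A) in column major order,
   which is exactly B * A when the caller reads C back as row major. */
void rowmajor_through_colmajor(const double *A, const double *B, double *C)
{
    gemm_colmajor(2, 2, 2, A, 2, B, 2, C, 2);
}
```

With A = {1,2,3,4} and B = {5,6,7,8} filled row by row, C comes back as {23,34,31,46}. Read row by row that is B * A, whereas A * B would be {19,22,43,50}, which matches the behaviour described in the first post.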

I didn’t quite understand how that works. Do you mean the leading dimension? Do I have to change the values from rows to columns?

What is missing?

Thanks for your help.

As discussed here, Fortran ordered arrays are stored in column order, while C arrays are stored in row order. If you look at the first two pages of the CUBLAS manual you will find discussion of indexing and C preprocessor macros to write Fortran ordered data in C.

What is missing is a definition of setElement.
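As an illustration of the indexing macro idea, here is a minimal sketch of column major accessors. IDX2C uses the same 0-based formula as the macro of that name in the CUBLAS manual; the Matrix struct and the set_element / get_element functions are hypothetical stand-ins, since the original poster's Matrice accessors were never shown.

```c
#include <assert.h>

/* 0-based column major index: row i, column j, leading dimension ld.
   Same formula as the IDX2C macro described in the CUBLAS manual. */
#define IDX2C(i, j, ld) (((j) * (ld)) + (i))

/* Hypothetical stand-in for the Matrice type used in this thread. */
typedef struct {
    int rows;
    int cols;
    double *elements; /* rows * cols values, stored column by column */
} Matrix;

/* Store v at row i, column j, using column major layout. */
void set_element(Matrix *M, int i, int j, double v)
{
    assert(i >= 0 && i < M->rows && j >= 0 && j < M->cols);
    M->elements[IDX2C(i, j, M->rows)] = v;
}

/* Read the value at row i, column j. */
double get_element(const Matrix *M, int i, int j)
{
    assert(i >= 0 && i < M->rows && j >= 0 && j < M->cols);
    return M->elements[IDX2C(i, j, M->rows)];
}
```

With accessors written this way the leading dimension of an unpadded matrix is its row count, which is what the posted code already passes as lda, ldb and ldc.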

Ok I’ve reordered my arrays in column major order and my problem is solved now :)

Thanks all
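For reference, reordering an existing row major buffer into column major order could look roughly like the sketch below. The thread does not show how the arrays were actually reordered, so rowmajor_to_colmajor is a hypothetical helper, and it assumes densely packed buffers with no padding and separate source and destination arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a rows x cols matrix from row major storage (src) into column major
   storage (dst). Both buffers hold rows * cols doubles and must not overlap. */
void rowmajor_to_colmajor(int rows, int cols, const double *src, double *dst)
{
    assert(rows > 0 && cols > 0 && src != dst);
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            /* (i,j) is at i*cols + j in row major, i + j*rows in column major */
            dst[i + (size_t)j * rows] = src[(size_t)i * cols + j];
        }
    }
}
```

For a 2x3 matrix stored row by row as {1,2,3,4,5,6}, the column major copy is {1,4,2,5,3,6}, and its leading dimension is 2.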