# A cuBLAS problem

Hi, I have a small problem with cuBLAS SGEMM.

I would like to write a loop like this:

```fortran
      real (fp_kind), dimension(:,:), allocatable :: A, B, C
      real :: time_start,time_end
      real (fp_kind):: alpha=1._fp_kind,beta=1._fp_kind, c_right
      integer:: i,j,m1,m2
      integer :: stat,stat2,stat3
      integer:: size_of_real=16
      integer*8:: devPtrA, devPtrB, devPtrC
c CUBLAS
      external cublas_init, cublas_set_matrix, cublas_get_matrix
      external cublas_shutdown, cublas_alloc
      integer cublas_alloc
C CUBLAS
      call cublas_init()
      do m1=128,2560,32
        print *,m1
        allocate(A(m1,m1))
        allocate(B(m1,m1))
        allocate(C(m1,m1))
        stat=cublas_Alloc(m1*m1,size_of_real, devPtrA)
        stat2=cublas_Alloc(m1*m1,size_of_real, devPtrB)
        stat3=cublas_Alloc(m1*m1,size_of_real, devPtrC)
        ! Initialize the matrices A,B and C
        A=1._fp_kind
        B=2._fp_kind
        C=3._fp_kind
        call cublas_Set_Matrix(m1,m1,size_of_real,A,m1,
     .           devPtrA,m1)
        call cublas_Set_Matrix(m1,m1,size_of_real,B,m1,
     .           devPtrB,m1)
        call cublas_Set_Matrix(m1,m1,size_of_real,C,m1,
     .           devPtrC,m1)
        call cublas_SGEMM ('n','n',m1,m1,m1,alpha,devPtrA,m1,
     .              devPtrB,m1,beta,devPtrC,m1)
        call cublas_Free(devPtrA)
        call cublas_Free(devPtrB)
        call cublas_Free(devPtrC)
        deallocate(A,B,C)
      end do
      call cublas_shutdown()
```

But when I run it, only one iteration executes. Why?

Thanks for the help!

For one thing, your size_of_real is wrong.
A single-precision floating-point value is 4 bytes; you are using 16.
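One way to avoid hard-coding the element size is to derive it from the kind itself. The following is a minimal free-form sketch, assuming `fp_kind` is the default (single-precision) real kind; the thread does not show how `fp_kind` is defined, so that definition is hypothetical. `storage_size` is a Fortran 2008 intrinsic that returns the size in bits.

```fortran
program element_size
  implicit none
  ! Assumption: fp_kind selects single precision, matching the SGEMM call.
  integer, parameter :: fp_kind = kind(1.0)
  integer :: size_of_real

  ! storage_size returns a size in bits, so divide by 8 to get bytes.
  size_of_real = storage_size(1.0_fp_kind) / 8

  print *, 'bytes per element:', size_of_real
end program element_size
```

With common compilers and no real-size promotion flags, this should report 4 bytes per element. If `fp_kind` were switched to double precision, the same expression would follow along without any further edit.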

Thank you.
Now the loop runs through, but when I measure the CPU time the values are 0.
So I suppose that dgemm is not being called correctly.
Right?

You never copy back the results. The GEMM call returns to the host before the GPU has finished, so unless you add a cudaDeviceSynchronize call or copy back C, your timing will be incorrect.
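A rough sketch of the copy-back approach is shown here for a single matrix size. It assumes the same non-thunking Fortran wrappers the question links against, called with the same argument order as in the question, and it uses `system_clock` for wall-clock time rather than CPU time. The matrix size `m1 = 1024` and the program name are illustrative only, and it needs a CUDA-capable GPU to run.

```fortran
program time_sgemm
  implicit none
  ! Assumption: single precision, to match the SGEMM call in the question.
  integer, parameter :: fp_kind = kind(1.0)
  integer, parameter :: size_of_real = 4      ! bytes per element
  integer, parameter :: m1 = 1024             ! illustrative matrix size
  real(fp_kind), allocatable :: A(:,:), B(:,:), C(:,:)
  real(fp_kind) :: alpha = 1._fp_kind, beta = 1._fp_kind
  integer(8) :: devPtrA, devPtrB, devPtrC
  integer(8) :: t0, t1, rate
  integer :: stat
  double precision :: seconds
  ! Assumption: the same Fortran wrappers the question uses are linked in.
  external cublas_init, cublas_shutdown, cublas_free
  external cublas_set_matrix, cublas_get_matrix, cublas_sgemm
  integer, external :: cublas_alloc

  allocate(A(m1,m1), B(m1,m1), C(m1,m1))
  A = 1._fp_kind
  B = 2._fp_kind
  C = 3._fp_kind

  call cublas_init()
  stat = cublas_alloc(m1*m1, size_of_real, devPtrA)
  if (stat /= 0) stop 'cublas_alloc failed for A'
  stat = cublas_alloc(m1*m1, size_of_real, devPtrB)
  if (stat /= 0) stop 'cublas_alloc failed for B'
  stat = cublas_alloc(m1*m1, size_of_real, devPtrC)
  if (stat /= 0) stop 'cublas_alloc failed for C'

  call cublas_set_matrix(m1, m1, size_of_real, A, m1, devPtrA, m1)
  call cublas_set_matrix(m1, m1, size_of_real, B, m1, devPtrB, m1)
  call cublas_set_matrix(m1, m1, size_of_real, C, m1, devPtrC, m1)

  call system_clock(t0, rate)
  call cublas_sgemm('n', 'n', m1, m1, m1, alpha, devPtrA, m1, &
                    devPtrB, m1, beta, devPtrC, m1)
  ! Copying C back blocks until the GEMM has finished, so the clock is
  ! read only after the GPU work is done (transfer time is included).
  call cublas_get_matrix(m1, m1, size_of_real, devPtrC, m1, C, m1)
  call system_clock(t1)

  seconds = dble(t1 - t0) / dble(rate)
  print *, 'n =', m1, ' seconds =', seconds
  ! With A=1, B=2, C=3 and alpha=beta=1, every entry should be 2*m1 + 3.
  print *, 'C(1,1) =', C(1,1), ' expected', 2*m1 + 3

  call cublas_free(devPtrA)
  call cublas_free(devPtrB)
  call cublas_free(devPtrC)
  deallocate(A, B, C)
  call cublas_shutdown()
end program time_sgemm
```

Note that the measured interval here covers the GEMM plus the device-to-host transfer of C, and the first call may also include one-time start-up cost, so a warm-up call before timing is probably worthwhile.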

Thank you. I copy back C now and get what I want.

I have another question. I would like to compare BLAS dgemm and cuBLAS dgemm. I found that cuBLAS dgemm is faster than MKL dgemm.

But in my application, MKL dgemm performs better than cuBLAS dgemm.

So I would like to know if there is somewhere I can read more about this.