A cuBLAS problem

Hi, I have a small problem with cuBLAS sgemm.

I would like to write a loop like this:

      real (fp_kind), dimension(:,:), allocatable :: A, B, C
      real :: time_start, time_end
      real (fp_kind) :: alpha=1._fp_kind, beta=1._fp_kind, c_right
      integer :: i, j, m1, m2
      integer :: stat, stat2, stat3
      integer :: size_of_real=16
      integer*8 :: devPtrA, devPtrB, devPtrC

      external cublas_init, cublas_set_matrix, cublas_get_matrix
      external cublas_shutdown, cublas_alloc
      integer cublas_alloc

      call cublas_init()

      do m1=128,2560,32
       print *,m1

       stat=cublas_Alloc(m1*m1,size_of_real, devPtrA)
       stat2=cublas_Alloc(m1*m1,size_of_real, devPtrB)
       stat3=cublas_Alloc(m1*m1,size_of_real, devPtrC)

! Initialize the matrices A,B and C

       call cublas_Set_Matrix(m1,m1,size_of_real,A,m1,
     .           devPtrA,m1)
       call cublas_Set_Matrix(m1,m1,size_of_real,B,m1,
     .           devPtrB,m1)
       call cublas_Set_Matrix(m1,m1,size_of_real,C,m1,
     .           devPtrC,m1)

       call cublas_SGEMM ('n','n',m1,m1,m1,alpha,devPtrA,m1,
     .              devPtrB,m1,beta,devPtrC,m1)

       call cublas_Free(devPtrA)
       call cublas_Free(devPtrB)
       call cublas_Free(devPtrC)

      end do

      call cublas_shutdown()

But when I run it, only one iteration executes. Why?

Thanks for your help!

For one thing, your size_of_real is wrong.
A single-precision floating-point value is 4 bytes; you are using 16.
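
If you would rather not hard-code the element size, one option is to ask the compiler for it. This is only a minimal sketch, assuming a Fortran 2008 compiler (storage_size is a standard intrinsic there). The names sp, dp, size_sp and size_dp are made up for this example and do not come from the code above.

```fortran
program element_size
  implicit none
  ! Hypothetical kind names, used only in this sketch
  integer, parameter :: sp = kind(1.0)      ! default (single) precision
  integer, parameter :: dp = kind(1.0d0)    ! double precision
  real(sp) :: xs
  real(dp) :: xd
  integer :: size_sp, size_dp

  ! storage_size returns the size in bits, so divide by 8 to get bytes
  size_sp = storage_size(xs) / 8
  size_dp = storage_size(xd) / 8

  print *, 'bytes per single-precision real:', size_sp
  print *, 'bytes per double-precision real:', size_dp
end program element_size
```

Whatever you pass as size_of_real should match the kind of A, B and C: on common platforms that is 4 for single precision and 8 for double precision.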

Thank you.
Now the loop runs through, but when I measure the CPU time the values are 0.
So I suppose dgemm is not being called correctly.
Right?

You never copy back the results. The GEMM call returns to the host before the GPU has finished the work, so unless you add a cudaDeviceSynchronize or copy back C, your timing will be incorrect.
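
Something along these lines should give a meaningful number. It is a rough sketch, not tested on your setup, and it assumes the same Fortran cuBLAS wrappers as in your post (cublas_alloc, cublas_set_matrix, cublas_get_matrix, cublas_sgemm, cublas_free) are linked in. The names n, t0, t1 and rate are made up for the example, the matrix size is arbitrary, and using system_clock for wall-clock time is my choice rather than something from your code.

```fortran
program time_sgemm
  implicit none
  integer, parameter :: n = 1024              ! arbitrary size for this sketch
  integer, parameter :: size_of_real = 4      ! bytes per single-precision value
  real, allocatable :: A(:,:), B(:,:), C(:,:)
  real :: alpha = 1.0, beta = 0.0
  integer(8) :: devPtrA, devPtrB, devPtrC     ! device pointers, as in the post
  integer(8) :: t0, t1, rate                  ! hypothetical timer variables
  integer :: stat, stat2, stat3
  integer, external :: cublas_alloc
  external cublas_init, cublas_shutdown, cublas_free
  external cublas_set_matrix, cublas_get_matrix, cublas_sgemm

  allocate(A(n,n), B(n,n), C(n,n))
  A = 1.0
  B = 1.0
  C = 0.0

  call cublas_init()
  stat  = cublas_alloc(n*n, size_of_real, devPtrA)
  stat2 = cublas_alloc(n*n, size_of_real, devPtrB)
  stat3 = cublas_alloc(n*n, size_of_real, devPtrC)

  ! Copy the host matrices to the device before starting the timer
  call cublas_set_matrix(n, n, size_of_real, A, n, devPtrA, n)
  call cublas_set_matrix(n, n, size_of_real, B, n, devPtrB, n)
  call cublas_set_matrix(n, n, size_of_real, C, n, devPtrC, n)

  call system_clock(t0, rate)
  call cublas_sgemm('n', 'n', n, n, n, alpha, devPtrA, n, devPtrB, n, beta, devPtrC, n)
  ! Copying C back blocks until the GEMM has finished, so the timer
  ! stops only after the GPU work is really done
  call cublas_get_matrix(n, n, size_of_real, devPtrC, n, C, n)
  call system_clock(t1)

  print *, 'elapsed seconds (GEMM + copy of C):', real(t1 - t0) / real(rate)
  print *, 'C(1,1) =', C(1,1)

  call cublas_free(devPtrA)
  call cublas_free(devPtrB)
  call cublas_free(devPtrC)
  call cublas_shutdown()
end program time_sgemm
```

Note that the time measured this way includes the device-to-host copy of C, not just the multiplication itself.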

Thank you. I copied C back and now I get what I want.

I have another question. I would like to compare BLAS dgemm and cuBLAS dgemm. In my test, cuBLAS dgemm was faster than MKL dgemm.

But in my application, MKL dgemm performs better than cuBLAS dgemm.

So I would like to know if there is somewhere I can read about this.