I am trying to speed up some Fortran code with cuBLAS. Matrix operations such as cublassgemm work in my code, and the GPU speedup is very good. However, I cannot call cublassgeam, which I want to use to transpose a matrix. The error message at link time is “undefined reference to `cublassgeam_'”. It seems that none of the routines in the “BLAS-like Extension” section can be found from Fortran code.
Does anyone know how to call cublassgeam in PGI Fortran? Looking forward to your help!
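! "transpose_gpu" is a placeholder name; the original snippet omitted it
subroutine transpose_gpu(matrixa, matrixb, nva, nvb)
use cublas
use cudafor
integer nva, nvb
! matrixb is meant to receive the transpose of matrixa
real matrixa(nva,nvb), matrixb(nvb, nva)
! this is the call that fails with the undefined reference
call cublassgeam(……)
return
end subroutine transpose_gpu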
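
One possible way to write the transpose is sketched below. It rests on an assumption that may not hold for every PGI release: that the `cublas` module exposes `cublasSgeam` only in its handle-based form, as an integer function that takes a `type(cublasHandle)` and device arrays, rather than as a legacy-style subroutine. If that is the case, `call cublassgeam(...)` would be treated as an unknown external routine, which would explain the `cublassgeam_` link error. The subroutine name `transpose_gpu` and the device array names `a_d` and `b_d` are hypothetical.

```fortran
! Sketch only: assumes the cublas module provides the handle-based
! cublasSgeam function and the CUBLAS_OP_* constants.
subroutine transpose_gpu(matrixa, matrixb, nva, nvb)
  use cublas
  use cudafor
  implicit none
  integer :: nva, nvb
  real :: matrixa(nva,nvb), matrixb(nvb,nva)
  real, device, allocatable :: a_d(:,:), b_d(:,:)  ! device copies (hypothetical names)
  type(cublasHandle) :: h
  real :: alpha, beta
  integer :: istat

  allocate(a_d(nva,nvb), b_d(nvb,nva))
  a_d = matrixa                    ! host-to-device copy

  alpha = 1.0
  beta  = 0.0
  istat = cublasCreate(h)

  ! C = alpha*op(A) + beta*op(B), with C of size nvb x nva.
  ! op(A) = A**T; with beta = 0 the B operand is not used, so b_d
  ! is passed as both B and C (in-place form, transb = N, ldb = ldc).
  istat = cublasSgeam(h, CUBLAS_OP_T, CUBLAS_OP_N, nvb, nva, &
                      alpha, a_d, nva, beta, b_d, nvb, b_d, nvb)
  if (istat /= CUBLAS_STATUS_SUCCESS) print *, 'cublasSgeam failed: ', istat

  matrixb = b_d                    ! device-to-host copy
  istat = cublasDestroy(h)
  deallocate(a_d, b_d)
end subroutine transpose_gpu
```

This would need to be linked against cuBLAS, for example with `-Mcudalib=cublas` or `-lcublas`, depending on the compiler version. Creating and destroying the handle on every call is done here only to keep the sketch self-contained; in real code the handle would normally be created once and reused.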