In the document CUBLAS Library, PG-00000-002_V1.1, September, 2007,
there is a discussion on p. 75 regarding Fortran Bindings:
“…CUBLAS uses 1-based indexing and Fortran-style column-major
storage for multidimensional data to simplify interfacing to Fortran
applications. Unfortunately, Fortran-to-C calling conventions are not
standardized and differ by platform and toolchain.”…
This statement is no longer correct. Fortran 2003 provides the concept “Interoperability with C.” Many Fortran compilers have implemented this part of the new standard, which allows a programmer to write portable code that interfaces to C. This is done with use association through the intrinsic module ISO_C_BINDING. Several vendors have provided at least partial support for this interoperability module:
Intel Fortran compiler 10.1, the IBM XL Fortran Enterprise Edition for AIX, V11.1 (5724-S72) Version 11.01.0000.0001D, (September 19, 2007), Sun Fortran 95 8.3 (July 18, 2007), g95 version 0.92 (March 14, 2009), and gfortran version 4.4.0 (February 19, 2009) have this intrinsic module available.
Using an efficient version of the routine SGEMM() is a crucial factor for achieving high performance in numerical linear algebra. The CUBLAS library provides a C version of matrix multiply, cublasSgemm(), with the functionality of the Level 3 BLAS routine SGEMM(). We include a standard and portable interface to that CUBLAS library code using a wrapper subprogram. This has not been tested with the genuine CUBLAS library, but it has been emulated using a stub C function that has the same calling protocol. Note the use of the attribute VALUE for the scalar arguments, which C passes by value, and the copy from Fortran CHARACTER data to C char data.
Some forum posters have commented that the CUBLAS library is best used for larger problem sizes. The code listed below could be modified to call the CUBLAS code for one range of matrix dimensions and one or more other routines for smaller sizes.
SUBROUTINE SGEMM(TRANSA,TRANSB,M,N,K,ALPHA,&
     A,LDA,B,LDB,BETA,C,LDC)
  USE ISO_C_BINDING
  IMPLICIT NONE
  ! .. Scalar Arguments ..
  REAL ALPHA,BETA
  INTEGER K,LDA,LDB,LDC,M,N
  CHARACTER(LEN=*) TRANSA,TRANSB
  ! ..
  ! .. Array Arguments ..
  REAL A(LDA,*),B(LDB,*),C(LDC,*)
  ! Define the INTERFACE to the NVIDIA C code cublasSgemm.
  ! This version of SGEMM is used in a user application
  ! that calls LAPACK single precision routines, or makes
  ! other uses of that code.
  character(1,c_char) cta, ctb
  INTERFACE
     ! This is what the NVIDIA code expects for its inputs:
     ! void cublasSgemm (char transa, char transb, int m, int n,
     !                   int k, float alpha, const float *A, int lda,
     !                   const float *B, int ldb, float beta,
     !                   float *C, int ldc)
     subroutine c_sgemm(cta, ctb, m, n, k,&
          alpha, A, lda, B, ldb, beta, c, ldc) bind(C,name='cublasSgemm')
       USE ISO_C_BINDING
       character(1,c_char),value :: cta, ctb
       integer(c_int),value :: m,n,k,lda,ldb,ldc
       real(c_float),value :: alpha,beta
       real(c_float) :: A(lda,*),B(ldb,*),C(ldc,*)
     end subroutine c_sgemm
  END INTERFACE
  ! The calculation, excepting initialization and finalization,
  ! is done with the NVIDIA C routine 'cublasSgemm.'
  ! A local name c_sgemm is used in Fortran.
  ! The name c_sgemm could be replaced by NVIDIA's name if one chose.
  cta=transa(1:1); ctb=transb(1:1) ! Pass only first character.
  call c_sgemm(cta, ctb,&
       m, n, k, alpha, A, lda, B, ldb, beta, c, ldc)
  return
end
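The stub C function used for the emulation is not listed in this post. A minimal sketch of what such a stand-in could look like follows. It assumes only the argument list quoted in the interface comments; the body is a plain host-side triple loop written for illustration, not NVIDIA's code, and the helper names is_trans and op_elem are hypothetical. It works on ordinary host memory, so it exercises the calling sequence of the wrapper and says nothing about GPU behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero when the BLAS flag asks for a transpose.
   For real data 'C' (conjugate transpose) is the same as 'T'. */
static int is_trans(char t)
{
    return t == 'T' || t == 't' || t == 'C' || t == 'c';
}

/* Hypothetical helper: element (i,j), 0-based, of op(X), where X is stored
   column-major with leading dimension ld. */
static float op_elem(const float *x, int ld, int trans, int i, int j)
{
    return trans ? x[(size_t)j + (size_t)i * (size_t)ld]
                 : x[(size_t)i + (size_t)j * (size_t)ld];
}

/* Stand-in with the same argument list as the quoted cublasSgemm prototype.
   Computes C = alpha*op(A)*op(B) + beta*C on the host (an emulation only). */
void cublasSgemm(char transa, char transb, int m, int n, int k,
                 float alpha, const float *A, int lda,
                 const float *B, int ldb, float beta,
                 float *C, int ldc)
{
    int ta = is_trans(transa);
    int tb = is_trans(transb);

    /* Sanity checks on sizes and leading dimensions, as in reference BLAS. */
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= 1 && lda >= (ta ? k : m));
    assert(ldb >= 1 && ldb >= (tb ? n : k));
    assert(ldc >= 1 && ldc >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float sum = 0.0f;
            for (int l = 0; l < k; ++l)
                sum += op_elem(A, lda, ta, i, l) * op_elem(B, ldb, tb, l, j);

            float *cij = &C[(size_t)i + (size_t)j * (size_t)ldc];
            /* When beta is zero, C is not read, following BLAS convention. */
            *cij = alpha * sum + (beta == 0.0f ? 0.0f : beta * *cij);
        }
    }
}
```

Compiling this file with a C compiler and linking the resulting object with the Fortran wrapper should resolve the bind(C) name cublasSgemm, so that a Fortran call to SGEMM ends up in the stub.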