How do I call the CUBLAS library from my CUDA Fortran code?

Luzhang · December 4, 2011, 4:07am

Hi,
I am a new user of CUDA Fortran.
I am using the same CUDA Fortran code as in this forum post: How to use (or call) cublas library

I compile with “pgfortran -Mcuda -o test_cublasSgemm_gpu test_cublasSgemm.F90 C:\cuda\lib\cublas.lib”.
Then I get this linker error:
“test_cublasSgemm.obj : error LNK2019: unresolved external symbol cublasSgemm referenced in function MAIN_
test_cublasSgemm_gpu.exe : fatal error LNK1120: 1 unresolved externals”

Based on Mat’s recommendation:

“Because of Win32 calling conventions, you need to add an ‘@’ followed by the size in bytes of the argument list to the symbol decoration. For sgemm this is ‘@52’.”
“Also, Win64 uses a different calling convention and does not need the ‘@’.”

But I am using 64-bit Windows 7. What is the Win64 calling convention?

Please help.

Thanks,
Luzhang

MatColgrove · December 5, 2011, 7:47pm

Hi Luzhang,

One possibility is that the entry symbol name is not “cublasSgemm” in the library you’re using. If this is the case, then you’ll need to adjust the name in the ISO C Binding clause.

Though, more likely you have a version conflict. By default, the PGI 2011 compilers generate CUDA 3.2 objects. If you are using a CUDA 4.0 cublas library, then this could cause issues. To fix, use the “-Mcuda=4.0” flag.

Finally, you can try using the cublas libraries that we ship with the compilers. They seem to work for me.

Note that I assume you modified the code to remove the conditional compilation (i.e. the #ifdef), since I get undefined external references to “sgemm_” without it.

Mat

PGI$ pgf90 -D_CUDAFOR -Mcuda=4.0 -V11.9 cublasSgemm.f90 -Mpreprocess C:\\Program\ Files\\PGI\\win64\\2011\\cuda\\4.0\\lib64\\cublas.lib
cublasSgemm.f90:
PGI$
PGI$
PGI$ pgf90 -D_CUDAFOR -Mcuda=4.0 -V11.9 cublasSgemm.f90 -Mpreprocess C:\\Program\ Files\\PGI\\win64\\2011\\cuda\\4.0\\lib64\\cublas.lib -o test40.exe
cublasSgemm.f90:
PGI$ pgf90 -D_CUDAFOR -Mcuda -V11.9 cublasSgemm.f90 -Mpreprocess C:\\Program\ Files\\PGI\\win64\\2011\\cuda\\3.2\\lib64\\cublas.lib -o test32.exe
cublasSgemm.f90:
PGI$ test40.exe
 Enter N:
1024
 Checking results....
 Total Time:    1.7000001E-02
 Total SGEMM gflops:     126.3226
 Done....
PGI$ test32.exe
 Enter N:
1024
 Checking results....
 Total Time:    1.5000000E-02
 Total SGEMM gflops:     143.1656
 Done....

Luzhang · December 6, 2011, 3:55am

Hi Mat,
Thank you for your reply.

Based on your recommendation, I compiled with

PGI$ pgf90 -D_CUDAFOR -Mcuda=3.2 -V11.7 cublasSgemm.F90  C:\PGI\win64\2011\cuda\3.2\lib64\cublas.lib -o test32.exe 
PGI$ test32.exe 
 Enter N: 
1024 
 Checking results.... 
 Total Time:    2.7000000E-02 
 Total SGEMM gflops:     79.53643 
 Done....

PGI$ pgf90 -D_CUDAFOR -Mcuda=4.0 -V11.7 cublasSgemm.F90  C:\PGI\win64\2011\cuda\4.0\lib64\cublas.lib -o test40.exe 
PGI$ test40.exe 
 Enter N: 
1024 
0: ALLOCATE: 4194304 bytes requested; status = 35 (CUDA driver version is insufficient for CUDA runtime version)

Would you please tell me how I can solve this problem?

Then I tried to use the cublas libraries that ship with the compilers. I modified the code by removing the interface and the conditional compilation (i.e. the #ifdef).

program cublasSgemm
use cudafor
use cublas
real, device, allocatable, dimension(:,:) :: dA, dB, dC
......
call sgemm('n','n', n, n, n, alpha, dA, n, dB, n, beta, dC, n)
......
end

Is it right?

thanks,
Luzhang

MatColgrove · December 6, 2011, 3:46pm

Hi Luzhang,

The error message says:

CUDA driver version is insufficient for CUDA runtime version

Hence, you need to update your CUDA device driver in order to use CUDA 4.0. Go to http://developer.nvidia.com/cuda-toolkit-40 and look for the “Developer Driver” links.
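
To confirm this kind of mismatch before and after a driver update, a small check along the following lines may help. It is only a sketch: it assumes the `cudaDriverGetVersion` and `cudaRuntimeGetVersion` functions from the `cudafor` module are available in your compiler release, and the program name is made up for illustration.

```fortran
! Sketch: compare the installed driver's CUDA version with the runtime
! version the executable was built against. Build with: pgfortran -Mcuda
program check_cuda_versions   ! hypothetical program name
  use cudafor
  implicit none
  integer :: istat, driver_ver, runtime_ver

  istat = cudaDriverGetVersion(driver_ver)
  if (istat /= cudaSuccess) print *, 'cudaDriverGetVersion failed, status =', istat

  istat = cudaRuntimeGetVersion(runtime_ver)
  if (istat /= cudaSuccess) print *, 'cudaRuntimeGetVersion failed, status =', istat

  ! Versions are encoded as 1000*major + 10*minor, e.g. 4000 for CUDA 4.0.
  print *, 'Driver supports CUDA version: ', driver_ver
  print *, 'Runtime CUDA version:         ', runtime_ver

  if (driver_ver < runtime_ver) then
    print *, 'Driver is older than the runtime: update the driver or target an older CUDA version.'
  end if
end program check_cuda_versions
```

If the driver number is lower than the runtime number, either update the driver or build against the older toolkit (for example with -Mcuda=3.2, as in the commands above).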

“Then I tried to use the cublas libraries that ship with the compilers. I modified the code by removing the interface and the conditional compilation (i.e. the #ifdef).
Is it right?”

No. You must have an explicit interface when calling CUDA routines. Without an interface, F77 calling conventions are used and will result in errors.
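
An explicit interface can come from a module that supplies one or be written by hand. As a rough sketch of the hand-written form, the following declares the legacy C entry point through ISO C binding. It assumes the library exports the symbol as `cublasSgemm` (this can differ between library versions, as noted earlier in the thread); the module and program names are invented for illustration.

```fortran
! Sketch: hand-written explicit interface to the legacy CUBLAS SGEMM entry point.
! Assumed C prototype:
!   void cublasSgemm(char transa, char transb, int m, int n, int k,
!                    float alpha, const float *A, int lda,
!                    const float *B, int ldb, float beta, float *C, int ldc)
module sgemm_iface   ! hypothetical module name
  interface
    subroutine cublasSgemm(transa, transb, m, n, k, alpha, a, lda, &
                           b, ldb, beta, c, ldc) bind(C, name='cublasSgemm')
      use iso_c_binding
      character(kind=c_char), value :: transa, transb
      integer(c_int), value :: m, n, k, lda, ldb, ldc
      real(c_float), value :: alpha, beta
      ! The matrices must live in device memory.
      real(c_float), device :: a(lda,*), b(ldb,*), c(ldc,*)
    end subroutine cublasSgemm
  end interface
end module sgemm_iface

program call_sgemm   ! hypothetical driver program
  use cudafor
  use sgemm_iface
  implicit none
  integer, parameter :: n = 64
  real, device, allocatable :: dA(:,:), dB(:,:), dC(:,:)
  real, allocatable :: c(:,:)

  allocate(dA(n,n), dB(n,n), dC(n,n), c(n,n))
  dA = 1.0   ! scalar assignment fills the device arrays
  dB = 2.0
  dC = 0.0

  ! Scalars are passed by value, as the interface above tells the compiler.
  call cublasSgemm('n', 'n', n, n, n, 1.0, dA, n, dB, n, 0.0, dC, n)

  c = dC     ! copy the result back to the host
  print *, 'c(1,1) =', c(1,1)
end program call_sgemm
```

Without such an interface the compiler would pass every argument by reference, F77 style, which does not match the by-value scalars the C routine expects. Link against the cublas library as in the compile lines shown earlier.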

Hope this helps,
Mat

Luzhang · December 8, 2011, 2:36am

Hi Mat,
Thank you for your reply.

It works after I updated my CUDA device driver.

I still have a question. I removed the interface and the conditional compilation (i.e. the #ifdef) and placed a “use cublas” statement in the host code instead.

program cublasSgemm 
use cudafor 
use cublas 
real, device, allocatable, dimension(:,:) :: dA, dB, dC 
...... 
call sgemm('n','n', n, n, n, alpha, dA, n, dB, n, beta, dC, n) 
...... 
end

It works!

Then I use

call cublasSgemm('n','n', n, n, n, alpha, dA, n, dB, n, beta, dC, n)

It gets the same result as

call sgemm('n','n', n, n, n, alpha, dA, n, dB, n, beta, dC, n)

Would you please tell me the reason?
If I need to use other CUBLAS routines, do you have any information or suggestions?

thanks,
Luzhang

MatColgrove · December 8, 2011, 5:11pm

“Would you please tell me the reason?”

When using the cublas module, “sgemm” is a generic interface which maps to cublasSgemm. However, unlike cublasSgemm, which must be called with device arrays, “sgemm” can be called with either host or device arrays.
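
As a minimal sketch of the module-based approach, the program below calls the generic “sgemm” with device arrays and copies the result back. The program name, matrix size, and fill values are chosen only for illustration, and the link step is an assumption: depending on the compiler release you may need to add -lcublas or name cublas.lib explicitly.

```fortran
! Sketch: SGEMM through the generic interface provided by the cublas module.
! Possible build line (assumption): pgfortran -Mcuda sgemm_module.cuf -lcublas
program sgemm_via_module   ! hypothetical program name
  use cudafor
  use cublas
  implicit none
  integer, parameter :: n = 256
  real, allocatable :: a(:,:), b(:,:), c(:,:)
  real, device, allocatable :: dA(:,:), dB(:,:), dC(:,:)
  real :: alpha, beta

  allocate(a(n,n), b(n,n), c(n,n))
  allocate(dA(n,n), dB(n,n), dC(n,n))

  a = 1.0
  b = 2.0
  c = 0.0
  alpha = 1.0
  beta = 0.0

  ! Host-to-device copies by assignment.
  dA = a
  dB = b
  dC = c

  ! The module supplies the explicit interface, so none is written here.
  call sgemm('n', 'n', n, n, n, alpha, dA, n, dB, n, beta, dC, n)

  ! Device-to-host copy of the result.
  c = dC

  ! Every element of the product should equal 2*n for these inputs.
  print *, 'Max deviation from 2*n:', maxval(abs(c - 2.0*real(n)))
end program sgemm_via_module
```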

“If I need to use other CUBLAS routines, do you have any information or suggestions?”

The complete list of CUBLAS routines can be found at http://developer.download.nvidia.com/compute/DevZone/docs/html/CUDALibraries/doc/CUBLAS_Library.pdf. It is my understanding that our cublas module has interfaces for all currently available cublas routines. We also have an example of how to use cublas in chapter 5 of the CUDA Fortran User's Guide (see the PGI Documentation Archive for Versions Prior to 17.7).

Hope this helps,
Mat
