Undefined reference to cublas_ -- but library is there!

I am getting a compilation error when trying to compile some code for testing cublas.

This is the error message I am getting:

# pgf90 -Mcuda -o cublas_test cublas_test.cuf -L/opt/cuda/cuda/lib64 -lcublas
/tmp/pgf90tb-5Dhbc8WG.o: In function `.C1_293':
cublas_test.cuf:(.data+0x98): undefined reference to `cublas_'

The contents of the linked library at /opt/cuda/cuda/lib64 are:

# ls /opt/cuda/cuda/lib64/
libcublasemu.so         libcublas.so         libcudartemu.so         libcudart.so         libcufftemu.so         libcufft.so
libcublasemu.so.3       libcublas.so.3       libcudartemu.so.3       libcudart.so.3       libcufftemu.so.3       libcufft.so.3
libcublasemu.so.3.0.14  libcublas.so.3.0.14  libcudartemu.so.3.0.14  libcudart.so.3.0.14  libcufftemu.so.3.0.14  libcufft.so.3.0.14

Libcublas is definitely present.

I am using the sample code available at http://cudamusing.blogspot.com/2010/05/calling-cublas-from-cuda-fortran.html as it is given there, without any changes.

Am I missing something here?


EDIT: I am attaching the verbose output of the compilation; maybe it helps?

# pgfortran -Mcuda -o cublas_test cublas_test.cuf -L/opt/cuda/cuda/lib64 -lcublas -v

/opt/pgi/linux86-64/10.5/bin/pgf901 cublas_test.cuf -opt 1 -nohpf -nostatic -x 19 0x400000 -quad -x 59 4 -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c -x 58 0x10000 -x 124 0x1000 -x 57 0xfb0000 -x 58 0x78031040 -x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /opt/pgi/linux86-64/10.5/include:/usr/local/include:/usr/lib/gcc/x86_64-linux-gnu/4.4.4/include:/usr/lib/gcc/x86_64-linux-gnu/4.4.4/include:/usr/include -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW= -def __extension__= -def __amd64__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSSE3__ -freeform -x 137 1 -x 176 1 -vect 48 -x 137 1 -modexport /tmp/pgfortranrWhbZgWmWeBq.cmod -modindex /tmp/pgfortranzWhbl9q-fN5t.cmdx -output /tmp/pgfortranjWhbBR6Ctt7o.ilm
  0 inform,   0 warnings,   0 severes, 0 fatal for gemm_test
PGF90/x86-64 Linux 10.5-0: compilation successful

/opt/pgi/linux86-64/10.5/bin/pgf902 /tmp/pgfortranjWhbBR6Ctt7o.ilm -fn cublas_test.cuf -opt 1 -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -quad -x 59 4 -x 59 4 -tp nehalem-64 -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -x 120 0x200 -astype 0 -x 124 1 -x 137 1 -x 176 1 -x 137 1 -x 176 1 -cmdline '+pgfortran cublas_test.cuf -Mcuda -o cublas_test -L/opt/cuda/cuda/lib64 -lcublas -v' -asm /tmp/pgfortranHWhbJM4PIPcL.s
  0 inform,   0 warnings,   0 severes, 0 fatal for gemm_test
PGF90/x86-64 Linux 10.5-0: compilation successful

/usr/bin/as /tmp/pgfortranHWhbJM4PIPcL.s -o /tmp/pgfortranPWhb7ADTNit0.o

/opt/pgi/linux86-64/10.5/bin/pgappend -noerror /tmp/pgfortranPWhb7ADTNit0.o -name .IPDINFO /tmp/pgfortranrWhbZgWmWeBq.cmod -name .IPEINFO /tmp/pgfortranzWhbl9q-fN5t.cmdx

/usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o /opt/pgi/linux86-64/10.5/lib/trace_init.o /usr/lib/gcc/x86_64-linux-gnu/4.4.4/crtbegin.o /opt/pgi/linux86-64/10.5/lib/f90main.o -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /opt/pgi/linux86-64/10.5/lib/pgi.ld -L/opt/cuda/cuda/lib64 -L/opt/pgi/linux86-64/10.5/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-linux-gnu/4.4.4 /tmp/pgfortranPWhb7ADTNit0.o -lcublas -rpath /opt/pgi/linux86-64/10.5/lib -rpath /opt/pgi/linux86-64/2010/cuda/2.3/lib -o cublas_test -lcudafor -L/opt/pgi/linux86-64/2010/cuda/2.3/lib -lcudart -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lnspgc -lpgc -lrt -lpthread -lm -lgcc -lc -lgcc /usr/lib/gcc/x86_64-linux-gnu/4.4.4/crtend.o /usr/lib64/crtn.o
/tmp/pgfortranPWhb7ADTNit0.o: In function `.C1_293':
cublas_test.cuf:(.data+0x98): undefined reference to `cublas_'
pgfortran-Fatal-linker completed with exit code 1

Unlinking /tmp/pgfortranjWhbBR6Ctt7o.ilm
Unlinking /tmp/pgfortranrWhbZgWmWeBq.cmod
Unlinking /tmp/pgfortranzWhbl9q-fN5t.cmdx
Unlinking /tmp/pgfortranHWhbJM4PIPcL.s
Unlinking /tmp/pgfortranPWhb7ADTNit0.o

Hi,

The undefined symbol cublas_ actually corresponds to the name of a module in your source file. Before compiling, remove any stale .mod files left over from earlier builds, and check that the module names used in your source files are consistent.

Let me know if that still does not work.

Hongyon

I have the following module in cublas.cuf:

module cublas
 !
 ! Define the INTERFACE to the NVIDIA C code cublasSgemm and cublasDgemm
 !
 interface cuda_gemm
 !
 ! void cublasSgemm (char transa, char transb, int m, int n,
 ! int k, float alpha, const float *A, int lda,
 ! const float *B, int ldb, float beta, float *C, int ldc)
 !
  subroutine cuda_sgemm(cta, ctb, m, n, k,&
   alpha, A, lda, B, ldb, beta, c, ldc) bind(C,name='cublasSgemm')
   use iso_c_binding
   character(1,c_char),value :: cta, ctb
   integer(c_int),value :: m,n,k,lda,ldb,ldc
   real(c_float),value :: alpha,beta
   real(c_float), device, dimension(lda,*) :: A
   real(c_float), device, dimension(ldb,*) :: B
   real(c_float), device, dimension(ldc,*) :: C
  end subroutine cuda_sgemm

  !
  ! void cublasDgemm (char transa, char transb, int m, int n,
  ! int k, double alpha, const double *A, int lda,
  ! const double *B, int ldb, double beta, double *C, int ldc)
  !
  subroutine cuda_dgemm(cta, ctb, m, n, k,&
   alpha, A, lda, B, ldb, beta, c, ldc) bind(C,name='cublasDgemm')
   use iso_c_binding
   character(1,c_char),value :: cta, ctb
   integer(c_int),value :: m,n,k,lda,ldb,ldc
   real(c_double),value :: alpha,beta
   real(c_double), device, dimension(lda,*) :: A
   real(c_double), device, dimension(ldb,*) :: B
   real(c_double), device, dimension(ldc,*) :: C
  end subroutine cuda_dgemm

 end interface

end module cublas

I generated the .mod file using:

pgfortran -Mcuda -c cublas.cuf -L/opt/cuda/cuda/lib64 -lcublas

I then load the module and call the routine from within my main program with:

use cublas
! a lot of lines skipped
call cuda_gemm ('N','N',m,n,k,alpha,a_d,m,b_d,k,beta,c_d,m)

I hope consistency is therefore maintained?

What I don't understand is how I could do without .mod files. I need them for modules, don't I?
Having removed the .mod file generated from cublas.cuf, I get the following compilation error when trying to compile cublas_test.cuf:

# pgfortran -Mcuda -o cublas_test cublas_test.cuf -L/opt/cuda/cuda/lib64 -lcublas
PGF90-F-0004-Unable to open MODULE file cublas.mod (cublas_test.cuf: 3)
PGF90/x86-64 Linux 10.5-0: compilation aborted

I can post the remaining source code, too - but it’s pretty much the same source as available in the link I posted in the initial post.

Hi,

If you put the module in a separate file, you need to link its object file too. Sorry, there is no way around that.

For example,

% pgfortran -Mcuda cublas.cuf cublas_test.cuf -I…

OR

% pgfortran -Mcuda cublas.o cublas_test.cuf …

Hope this works.

Regarding removing .mod files: I meant that if .mod files with the same name are left over in the same directory from an older compilation, you should remove them. I did not mean the ones you have just compiled for this test.
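One way to spot such leftovers is to list the .mod files that are not newer than the module source. This is only a sketch: stale_mods is a hypothetical helper name, and it assumes a find that supports -maxdepth (GNU and BSD find do).

```shell
# stale_mods SOURCE
# Prints the .mod files in the current directory that are not newer than
# SOURCE, i.e. candidates left over from an earlier compilation.
# (Hypothetical helper for illustration.)
stale_mods() {
    find . -maxdepth 1 -name '*.mod' ! -newer "$1"
}

# Usage, assuming cublas.cuf is in the current directory:
#   stale_mods cublas.cuf
```

Anything it prints was written before the last change to the source and is worth regenerating before the next build.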

Hongyon

I do not have CUDA hardware or software, so this is just a guess based on how PGI behaves on the host. My observations are in agreement with what Hongyon wrote, but may add to your understanding.

When you compile the source file cublas.cuf with the interfaces to create the module file, a dummy entry point cublas_ is created in the object file cublas.o. The only executable instruction in the object file is an immediate C3 (ret) instruction. One may guess that when the CUDA hardware is targeted there may be more code than this to do the handshaking.

Similar comments apply to precision.o.

Why such stub routines need to be linked in when compiling for the co-processor, but not for the host, is a curiosity that perhaps the compiler authors could shed light on.
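The dummy entry point described above can be checked with nm from GNU binutils, which lists the symbols in an object file. A minimal sketch, assuming the usual nm output layout of address, type letter, and name; has_definition is a hypothetical helper name, not part of any toolchain.

```shell
# has_definition SYMBOL
# Reads `nm` output on stdin and succeeds if SYMBOL is listed there with
# any type letter other than U (undefined).
# (Hypothetical helper for illustration.)
has_definition() {
    awk -v sym="$1" '
        $NF == sym && $(NF - 1) != "U" { found = 1 }
        END { exit !found }
    '
}

# Usage, assuming cublas.o exists in the current directory:
#   nm cublas.o | has_definition cublas_ && echo "cublas.o defines cublas_"
```

If the check succeeds for cublas.o but the link still reports an undefined reference to cublas_, the likely explanation is that the object file is missing from the link command.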

Try including cublas.o and precision.o in your link command. That is, use

# pgf90 -Mcuda -o cublas_test cublas_test.cuf cublas.o precision.o -L/opt/cuda/cuda/lib64 -lcublas

instead of

# pgf90 -Mcuda -o cublas_test cublas_test.cuf -L/opt/cuda/cuda/lib64 -lcublas

Let me stress once again that my guesses could be wrong. However, considering that the additional object files contain routine entry points with immediate return instructions, there could be no harm in trying my suggestion.
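Put together, the suggestions in this thread amount to a two-step build: compile the module sources to object files, then link those objects with the main program. The following is a sketch under stated assumptions: precision.cuf is a guessed name for the source behind precision.o, and the FC and CUDA_LIB variables and the build_cublas_test function are hypothetical names introduced here. The compiler flags are the ones already used in this thread.

```shell
# Compiler driver and CUDA library directory; both can be overridden
# from the environment. (Variable names are assumptions for this sketch.)
FC=${FC:-pgfortran}
CUDA_LIB=${CUDA_LIB:-/opt/cuda/cuda/lib64}

build_cublas_test() {
    # Step 1: compile each module source to an object file.
    # This also writes the corresponding .mod file.
    # precision.cuf is an assumed file name.
    "$FC" -Mcuda -c precision.cuf || return 1
    "$FC" -Mcuda -c cublas.cuf || return 1

    # Step 2: compile the main program and link it together with the
    # module objects and the CUBLAS library.
    "$FC" -Mcuda -o cublas_test cublas_test.cuf cublas.o precision.o \
        -L"$CUDA_LIB" -lcublas
}

# Usage:
#   build_cublas_test
```

Compiling the modules first also guarantees that the .mod files exist before cublas_test.cuf is compiled, which avoids the "Unable to open MODULE file" error quoted earlier.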

Now, that actually does make sense! I should have thought of that…

Thank you guys a lot!