Undefined Symbol calling attributes(global) subroutine

I am posting this for L. Mauricio Merino:

Now the problem is that when I started to write my own code, a small fracture mechanics program, the only idea was to do some simple operations on the GPU. It doesn’t matter that there is no speed-up, because it is sequential code. It compiled without errors. I used:

UTSA-I:Project root# pgfortran -Mcuda -c -g fracturegpu.CUF
NOTE: your trial license will expire in 7 days, 11.8 hours.
NOTE: your trial license will expire in 7 days, 11.8 hours.
UTSA-I:Project root# pgfortran -Mcuda -o fracturegpu fracturegpu.o
Undefined symbols:
  "_fracturekernel_", referenced from:
      _MAIN_ in fracturegpu.o
     (maybe you meant: __fracturekernel___entry, _fracturekernel___entry )
ld: symbol(s) not found for inferred architecture i386
UTSA-I:Project root#

So, as you can see, I have a linking problem and I don’t understand where my mistake is. I already tried putting the CPU and GPU code in different files and the result was the same. Interestingly, when I put the operations I need from the fracturegpu program inside a copy of matmul, it works and gives a correct result. Some other quick questions:

  • Is it not necessary to specify the variables as INTENT(IN) or INTENT(OUT) inside a device subroutine?
  • If I need to do a simple sequential operation, is it OK to use just one block and one thread? (I think using more is a waste of GPU resources.)
  • Regarding the last question, it is not necessary to use a statement like i=(blockidx%x-1)*bockdi…, right?
  • When I need to use variables on the GPU (not arrays), is it OK not to give them the allocatable attribute, and not to allocate and deallocate them?


L. Mauricio Merino
Research Scholar
Computational Reliability and Visualization Laboratory, UTSA
GIBUP Biomedical Engineering Group, University of Pamplona

The solution to the linker problem is to either use an interface block in the main routine:

    interface
        ATTRIBUTES(GLOBAL) SUBROUTINE fracturekernel (C,M,temp,temp1,temp2)
        REAL, DEVICE, INTENT(OUT) :: temp,temp1,temp2    !my device variables
        REAL, VALUE, INTENT(IN) :: C,M                   !my device constants
        end subroutine
    end interface

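or to put the subroutine in a Fortran module and USE the module in the main routine. Either way, the interface to the kernel subroutine must be explicit to the caller. Without it, the compiler appears to treat the call as a call to an ordinary external host routine, which is why the linker looks for "_fracturekernel_" even though only the kernel entry symbols exist.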
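The module approach could look like the following minimal sketch. The module name fracture_mod, the host variable names with the _d suffix, and the arithmetic inside the kernel are placeholders chosen for illustration; they are not taken from the original program. It assumes the file is compiled as CUDA Fortran (for example with a .cuf/.CUF extension and -Mcuda, as in the commands above).

```fortran
! Sketch only: the kernel lives in a module, so its interface is
! explicit wherever the module is USEd. Names and arithmetic are
! hypothetical placeholders.
module fracture_mod
  implicit none
contains
  attributes(global) subroutine fracturekernel(C, M, temp, temp1, temp2)
    real, value  :: C, M                ! scalar inputs passed by value
    real, device :: temp, temp1, temp2  ! results stored in device memory
    temp  = C + M
    temp1 = C * M
    temp2 = C - M
  end subroutine fracturekernel
end module fracture_mod

program fracturegpu
  use fracture_mod                      ! makes the kernel interface explicit
  implicit none
  real, device :: temp_d, temp1_d, temp2_d   ! device scalars
  real         :: temp, temp1, temp2         ! host copies

  ! One block, one thread: enough for purely sequential work.
  call fracturekernel<<<1, 1>>>(2.0, 3.0, temp_d, temp1_d, temp2_d)

  ! Assigning a device variable to a host variable copies device -> host.
  temp  = temp_d
  temp1 = temp1_d
  temp2 = temp2_d
  print *, temp, temp1, temp2
end program fracturegpu
```

Because the main program gets the kernel's interface from the module, no separate interface block is needed, and the compiler knows that the call is a kernel launch.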

For the other questions, intent() specifications are (as in any Fortran routine) optional.

You can launch a kernel with one block and one thread. The hardware still schedules a full warp (32 threads) at a minimum, so the remaining lanes of that warp simply sit idle. If you launch more than one thread, your kernel must include a conditional to make sure that only one thread executes the sequential code.
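If the kernel is ever launched with more than one thread, a guard of the following kind keeps the sequential work on a single thread. This is a sketch under assumptions: the names guard_mod, sequential_kernel, x, and result are hypothetical, and the body is a stand-in for the real computation. Note that threadIdx and blockIdx are 1-based in CUDA Fortran.

```fortran
! Sketch only: hypothetical names, placeholder arithmetic.
module guard_mod
  implicit none
contains
  attributes(global) subroutine sequential_kernel(x, result)
    real, value  :: x        ! scalar input passed by value
    real, device :: result   ! output stored in device memory

    ! Only the first thread of the first block does the work;
    ! every other launched thread falls through and exits.
    if (threadIdx%x == 1 .and. blockIdx%x == 1) then
      result = 2.0 * x
    end if
  end subroutine sequential_kernel
end module guard_mod

program guard_demo
  use guard_mod
  implicit none
  real, device :: result_d
  real         :: result_h

  ! Launch one block of 32 threads; the guard lets one thread write.
  call sequential_kernel<<<1, 32>>>(4.0, result_d)

  result_h = result_d        ! device -> host copy by assignment
  print *, result_h
end program guard_demo
```

With a <<<1, 1>>> launch the guard is harmless but redundant, since only one thread exists.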

I’m not sure what your third question is.

Global variables on the GPU can be declared in a Fortran module as:

module foo
  real, device :: globala
contains
  attributes(global) subroutine kernel(...)
    ...here use globala...
  end subroutine
end module
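A complete version of this pattern might look like the sketch below. The module name foo_demo, the argument scale, and the program name are hypothetical, and the kernel body is a placeholder. It relies on the CUDA Fortran feature that a module device variable can be read and written from host code by plain assignment, so a scalar like this needs no allocatable attribute and no allocate or deallocate.

```fortran
! Sketch only: hypothetical names, placeholder arithmetic.
module foo_demo
  implicit none
  real, device :: globala      ! scalar in GPU global memory, no allocate needed
contains
  attributes(global) subroutine kernel(scale)
    real, value :: scale       ! scalar input passed by value
    ! The kernel sees globala directly because it is in the same module.
    if (threadIdx%x == 1 .and. blockIdx%x == 1) then
      globala = globala * scale
    end if
  end subroutine kernel
end module foo_demo

program use_foo_demo
  use foo_demo
  implicit none
  real :: a_host

  globala = 1.5                ! host -> device copy by assignment
  call kernel<<<1, 1>>>(2.0)
  a_host = globala             ! device -> host copy by assignment
  print *, a_host
end program use_foo_demo
```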