Compilation of device function with CUDA Fortran

EricMan · March 25, 2016, 5:12am

Hello,
the compiler is frustrating me with

PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unsupported procedure (em_fw.cuf: 1)
PGF90/x86-64 Linux 15.7-0: compilation aborted
pgf90-Fatal-f902 completed with exit code 1

I don’t know what the problem could be. I followed the directives from the compiler manual, and the kernel itself compiles out of the box.
I invoke the compiler with

pgf90 -c -v -Mcuda -Minfo=all em_fw.cuf

and I obtain the output

Export PGI=/share/apps/pgi

/share/apps/pgi/linux86-64/15.7/bin/pgf901 em_fw.cuf -opt 1 -nohpf -nostatic -x 19 0x400000 -quad -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c -x 58 0x10000 -x 124 0x1000 -tp haswell -x 57 0xfb0000 -x 58 0x78031040 -x 47 0x08 -x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /share/apps/pgi/linux86-64/15.7/include-gcc44:/share/apps/pgi/linux86-64/15.7/include:/usr/local/include:/usr/lib/gcc/x86_64-redhat-linux/4.4.7/include:/usr/include -cmdline '+pgf90 em_fw.cuf -c -v -Mcuda -Minfo=all' -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __x86_64 -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW= -def __extension__= -def __amd_64__amd64__ -def __k8 -def __k8__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSSE3__ -idir /share/apps/pgi/linux86-64/2015/cuda/6.5/include -def _CUDA -ccff -freeform -x 137 1 -x 180 0x4000000 -cudaver 6.5 -vect 48 -y 54 1 -def __CUDA_API_VERSION=6050 -x 70 0x40000000 -x 189 0x8000 -y 163 0xc0000000 -x 189 0x10 -x 137 1 -modexport /tmp/pgf90VGFvn6SZp5q9.cmod -modindex /tmp/pgf90NGFv1l0KW0QO.cmdx -output /tmp/pgf90-GFv9Dx35pqg.ilm
  0 inform,   0 warnings,   0 severes, 0 fatal for cuda_em_fw
  0 inform,   0 warnings,   0 severes, 0 fatal for computeref
  0 inform,   0 warnings,   0 severes, 0 fatal for ref
PGF90/x86-64 Linux 15.7-0: compilation successful

/share/apps/pgi/linux86-64/15.7/bin/pgf902 /tmp/pgf90-GFv9Dx35pqg.ilm -fn em_fw.cuf -opt 1 -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -x 117 0x1000 -quad -x 59 4 -tp haswell -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -x 120 0x200 -astype 0 -x 137 1 -x 180 0x4000000 -cudaver 6.5 -x 176 0x100 -cudacap 20 -cudacap 30 -cudacap 35 -cudacap 50 -cudaver 6.5 -x 70 0x40000000 -x 124 1 -x 189 0x8000 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -x 137 1 -x 180 0x4000000 -x 176 0x100 -cudacap 20 -cudacap 30 -cudacap 35 -cudacap 50 -cudaver 6.5 -x 0 0x1000000 -x 2 0x100000 -x 0 0x2000000 -x 161 53239 -x 162 53239 -cmdline '+pgf90 em_fw.cuf -c -v -Mcuda -Minfo=all' -asm /tmp/pgf90FGFvDQ16Kx2u.s
  0 inform,   0 warnings,   0 severes, 0 fatal for cuda_em_fw
  0 inform,   0 warnings,   0 severes, 0 fatal for computeref
PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unsupported procedure (em_fw.cuf: 1)
PGF90/x86-64 Linux 15.7-0: compilation aborted
pgf90-Fatal-f902 completed with exit code 1

Unlinking /tmp/pgf90-GFv9Dx35pqg.ilm
Unlinking /tmp/pgf903GFvLnZT9_VH.stb
Unlinking /tmp/pgf90VGFvn6SZp5q9.cmod
Unlinking /tmp/pgf90NGFv1l0KW0QO.cmdx
Unlinking /tmp/pgf90FGFvDQ16Kx2u.s
Unlinking /tmp/pgf90xGFvf_c90sEB.ll

The function ref is declared as

 attributes(device) real function ref(n,d,c,rho,alf,theta,freq)
    implicit none
    integer, device, intent(in)              :: n
    real, device, dimension(2:n), intent(in) :: d
    real, device, dimension(n+1), intent(in) :: c, rho, alf
    real, device, intent(in)                 :: theta, freq
    complex, device, dimension(2:n)          :: zin
    complex, device, dimension(n+1)          :: z
    complex, device, dimension(n+1)          :: th
    complex, device, dimension(2:n)          :: s, phi
    complex, device, dimension(n+1)          :: v, k
    integer, device                          :: i
[...]
end function ref

and I really don’t do anything fancy within the function (unless vector assignment is fancy and I never realised it).

Thanks in advance for any help!

MatColgrove · March 28, 2016, 5:47pm

Hi EricMan,

It’s probably a compiler generated procedure call that creates a temp array when calling routines where an array section is being passed in as one of the arguments. How is “ref” being called?

If you can, please send a reproducible example to PGI Customer Service (trs@pgroup.com) and ask them to forward the example to me. I can then confirm whether this is the problem and might be able to suggest workarounds.

Thanks,
Mat
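
To illustrate the kind of call Mat is asking about, here is a minimal host-side sketch in standard Fortran. It is not taken from the thread, and the program and function names are hypothetical. A strided section such as `a(1,:)` passed to an explicit-shape dummy argument usually makes the compiler build a contiguous temporary, while a column section or an assumed-shape dummy usually does not. Whether a particular compiler version really creates a temporary in each case is an assumption, and `is_contiguous` needs Fortran 2008 support.

```fortran
program section_demo
  implicit none
  real    :: a(3,4)
  integer :: i, j

  ! Fill the array so that each element encodes its own indices.
  do j = 1, 4
     do i = 1, 3
        a(i,j) = 10*i + j
     end do
  end do

  ! Column section: elements are adjacent in memory (column-major order),
  ! so it can normally be passed to an explicit-shape dummy without a copy.
  print *, 'a(:,1) contiguous? ', is_contiguous(a(:,1))
  print *, 'sum_explicit(a(:,1)) = ', sum_explicit(3, a(:,1))

  ! Row section: consecutive elements are 3 reals apart. Passing it to an
  ! explicit-shape dummy typically triggers a compiler-generated temporary.
  print *, 'a(1,:) contiguous? ', is_contiguous(a(1,:))
  print *, 'sum_explicit(a(1,:)) = ', sum_explicit(4, a(1,:))

  ! An assumed-shape dummy receives a descriptor with the stride, so the
  ! strided section can usually be used in place.
  print *, 'sum_assumed(a(1,:))  = ', sum_assumed(a(1,:))

contains

  ! Explicit-shape dummy: expects n contiguous elements.
  real function sum_explicit(n, x)
    integer, intent(in)            :: n
    real, dimension(n), intent(in) :: x
    sum_explicit = sum(x)
  end function sum_explicit

  ! Assumed-shape dummy: accepts strided sections directly.
  real function sum_assumed(x)
    real, dimension(:), intent(in) :: x
    sum_assumed = sum(x)
  end function sum_assumed

end program section_demo
```

In device code the same distinction matters more, because a hidden temporary may require a runtime call that the device compiler cannot translate.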

EricMan · March 31, 2016, 9:18pm

Hi mkcolg,

I do call ref from the kernel as

 attributes(global) subroutine computeRef(n,d,c,rho,alf,thetaArray,freqArray,r)
    implicit none
    integer(ib), value, intent(in) :: n
    real(rp), dimension(2:n), intent(in) :: d
    real(rp), dimension(n+1), intent(in) :: c ,rho, alf
    real(rp), dimension(10), intent(in) :: thetaArray
    real(rp), dimension(4) , intent(in) :: freqArray
    real(rp), dimension(4,10) :: r
    
    integer :: i,j

    i = (blockIdx%x-1)*blockDim%x + threadIdx%x    
    j = (blockIdx%y-1)*blockDim%y + threadIdx%y

    if ( ( i <= 4) .and. (j <= 10 ) ) then
       r(i,j) = ref(n,d,c,rho,alf,thetaArray(j),freqArray(i))
       !write(*,*) i, j, thetaArray(j), freqArray(i), r(i,j)
    end if
  end subroutine computeRef
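
The host code that launches this kernel is not shown in the thread. As a rough sketch of how a kernel with this 2D indexing and bounds guard can be launched, the following self-contained CUDA Fortran program uses a stand-in kernel `fill` in place of `computeRef`. The module name, the kernel `fill`, the 4x4 block shape, and all host variable names are assumptions made for illustration.

```fortran
module launch_kernels
  use cudafor
  implicit none
contains
  ! Stand-in kernel with the same indexing and guard as computeRef.
  attributes(global) subroutine fill(r)
    real, dimension(4,10) :: r
    integer :: i, j

    i = (blockIdx%x-1)*blockDim%x + threadIdx%x
    j = (blockIdx%y-1)*blockDim%y + threadIdx%y

    ! Threads outside the 4x10 result array do nothing.
    if ((i <= 4) .and. (j <= 10)) then
       r(i,j) = real(10*i + j)
    end if
  end subroutine fill
end module launch_kernels

program launch_demo
  use cudafor
  use launch_kernels
  implicit none
  real, device :: r_d(4,10)   ! result array in device memory
  real         :: r(4,10)     ! host copy of the result
  type(dim3)   :: grid, tBlock
  integer      :: istat

  ! Hypothetical 4x4 thread block; round the grid up so that it
  ! covers all 4x10 elements (the guard masks the excess threads).
  tBlock = dim3(4, 4, 1)
  grid   = dim3((4  + tBlock%x - 1)/tBlock%x, &
                (10 + tBlock%y - 1)/tBlock%y, 1)

  call fill<<<grid, tBlock>>>(r_d)

  ! Wait for the kernel and report any launch or execution error.
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, cudaGetErrorString(istat)

  r = r_d                     ! copy device -> host
  print *, r(4,10)
end program launch_demo
```

Saved as a `.cuf` file, this should build with the same `pgf90 -Mcuda` invocation used earlier in the thread. The grid has more threads than array elements along the second dimension, which is why the kernel needs the `i <= 4` and `j <= 10` guard.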

I am sending an email to customer service with the test program I am not able to compile attached.

(Note that the code compiles and works properly if I use -Mcuda=emu)

Thanks,
Eric

MatColgrove · March 31, 2016, 10:31pm

Hi Eric,

Thanks for sending in the example. The problem was a known issue where we weren’t handling complex return types properly in CUDA Fortran device functions. The error was fixed in the 16.1 release and I was able to successfully build and run your example.

Thanks!
Mat

% pgf90 em_ref_nlay3.cuf -fast -V15.7
PGF90-F-0000-Internal compiler error. Unhandled return type for function       4 (em_ref_nlay3.cuf: 120)
PGF90/x86-64 Linux 15.7-0: compilation aborted
% pgf90 em_ref_nlay3.cuf -fast -V16.3
% a.out
   0.8678  0.8695  0.6196  0.3704  0.4103  0.2182  0.3084  0.1129  0.2002  0.0822
   0.8587  0.8557  0.6013  0.5529  0.1219  0.2724  0.2374  0.0698  0.3867  0.2231
   0.8688  0.8652  0.3527  0.2064  0.1539  0.0528  0.1628  0.2037  0.2551  0.0981
   0.8729  0.8648  0.3843  0.5624  0.2415  0.3341  0.2845  0.0959  0.2797  0.1342

EricMan · April 9, 2016, 7:35am

I’ve been able to compile and run the example, with the same numerical results as the CPU code. Thanks a million.
