Dear all,
We use procedure pointers to handle a set of different schemes implemented in a CFD solver. However, we have run into problems compiling our code; see the snippets below. We understand that offloading procedure pointers may not currently be supported (at least with nvfortran; gfortran seems to have a workaround), so we have a couple of questions:
- Is it true that procedure pointers cannot be offloaded to the device, or is our approach wrong (see the example below)? If our approach is wrong, could you point us to the correct way to offload procedure pointers?
- Assuming that offloading procedure pointers is not currently supported, do you have any suggestions for handling multiple scheme implementations other than the (ab)use of preprocessor flags?
Thank you in advance for your help; it is much appreciated.
Kind regards,
Stefano
Minimal Working Example
program test_pointer_procedure
   use openacc
   implicit none
   integer, parameter :: ni = 1000
   integer :: i
   real :: a(ni)
   procedure(subroutine_template), pointer :: do_work
   interface
      subroutine subroutine_template(i, x)
         integer, intent(in) :: i
         real, intent(out) :: x
      endsubroutine subroutine_template
   endinterface
   do_work => do_work_ok
   !$acc enter data create(a, do_work)
   !$acc parallel loop present(a)
   do i=1, ni
      call do_work(i=i, x=a(i))
   enddo
   !$acc exit data copyout(a) delete(do_work)
   print *, ' work ok', maxval(a)
   do_work => do_work_ko
   !$acc enter data create(a)
   !$acc parallel loop present(a, do_work)
   do i=1, ni
      call do_work(i=i, x=a(i))
   enddo
   !$acc exit data copyout(a) delete(do_work)
   print *, ' work ko', maxval(a)
contains
   subroutine do_work_ok(i, x)
      integer, intent(in) :: i
      real, intent(out) :: x
      !$acc routine(do_work_ok)
      x = real(i)
   endsubroutine do_work_ok
   subroutine do_work_ko(i, x)
      integer, intent(in) :: i
      real, intent(out) :: x
      !$acc routine(do_work_ko)
      x = -real(i)
   endsubroutine do_work_ko
endprogram test_pointer_procedure
Compiling with gfortran (v14.02) and running, we get the correct output:
└──────╼ ./test_pointer_procedure
work ok 1000.00000
work ko -1.00000000
Compiling with nvfortran (v24.11-0), we get an internal compiler error (ICE):
└──────╼ nvfortran -acc -gpu=cc89 -fast -Minfo=all test_pointer_procedure.f90 -o test_pointer_procedure
NVFORTRAN-S-0000-Internal compiler error. size_of: bad dtype 39 (test_pointer_procedure.f90: 36)
NVFORTRAN-W-0155-Data clause needed for exposed use of pointer do_work$sd (test_pointer_procedure.f90: 21)
NVFORTRAN-S-0155-Accelerator region ignored; see -Minfo messages (test_pointer_procedure.f90: 21)
NVFORTRAN-S-0000-Internal compiler error. size_of: bad dtype 39 (test_pointer_procedure.f90: 20)
NVFORTRAN-S-0000-Internal compiler error. size_of: bad dtype 39 (test_pointer_procedure.f90: 28)
NVFORTRAN-S-0000-Internal compiler error. size_of: bad dtype 39 (test_pointer_procedure.f90: 33)
test_pointer_procedure:
20, Generating enter data create(do_work,a(:))
21, Accelerator restriction: size of the GPU copy of do_work$sd is unknown
Accelerator region ignored
22, Loop not vectorized/parallelized: contains call
25, Generating exit data delete(do_work)
Generating exit data copyout(a(:))
26, maxval reduction inlined
Loop not fused: function call before adjacent loop
Generated vector simd code for the loop containing reductions
28, Generating enter data create(a(:))
29, Generating present(do_work,a(:))
Generating NVIDIA GPU code
30, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
30, Loop not vectorized/parallelized: contains call
33, Generating exit data delete(do_work)
Generating exit data copyout(a(:))
34, maxval reduction inlined
Loop not fused: function call before adjacent loop
Generated vector simd code for the loop containing reductions
0 inform, 1 warnings, 5 severes, 0 fatal for test_pointer_procedure
do_work_ok:
37, Generating acc routine seq
Generating NVIDIA GPU code
do_work_ko:
44, Generating acc routine seq
Generating NVIDIA GPU code
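Regarding the second question, one possible fallback that avoids both procedure pointers and preprocessor flags is to select the scheme with a plain integer and dispatch through a single device routine using select case. The following is only a minimal, untested sketch of that idea: the module schemes, the selector scheme_id, the constants SCHEME_OK and SCHEME_KO, and the wrapper do_work_dispatch are hypothetical names that do not appear in the example above, and whether the extra branch is acceptable performance-wise would have to be measured in the real solver.

```fortran
module schemes
   implicit none
   ! Hypothetical selector values (not part of the original example).
   integer, parameter :: SCHEME_OK = 1, SCHEME_KO = 2
contains
   subroutine do_work_ok(i, x)
      !$acc routine seq
      integer, intent(in)  :: i
      real,    intent(out) :: x
      x = real(i)
   endsubroutine do_work_ok

   subroutine do_work_ko(i, x)
      !$acc routine seq
      integer, intent(in)  :: i
      real,    intent(out) :: x
      x = -real(i)
   endsubroutine do_work_ko

   ! Hypothetical wrapper: the scheme is chosen by an integer value,
   ! so no procedure pointer has to exist on the device.
   subroutine do_work_dispatch(scheme_id, i, x)
      !$acc routine seq
      integer, intent(in)  :: scheme_id, i
      real,    intent(out) :: x
      select case (scheme_id)
      case (SCHEME_OK)
         call do_work_ok(i, x)
      case (SCHEME_KO)
         call do_work_ko(i, x)
      case default
         x = 0.0   ! unknown selector: return a neutral value
      endselect
   endsubroutine do_work_dispatch
endmodule schemes

program test_select_dispatch
   use schemes
   implicit none
   integer, parameter :: ni = 1000
   integer :: i, scheme_id
   real :: a(ni)

   ! The selector is an ordinary run-time value (it could come from an input file).
   scheme_id = SCHEME_KO
   !$acc parallel loop copyout(a)
   do i = 1, ni
      call do_work_dispatch(scheme_id, i, a(i))
   enddo
   print *, ' work ko', maxval(a)
endprogram test_select_dispatch
```

The host code still chooses the scheme at run time, as with the pointer version, but the device only ever sees direct calls to routines marked with acc routine seq. The trade-off is that every new scheme has to be added to the select case by hand.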