OpenMP 4.5 and Fortran Type bound procedure

fcouderc · July 10, 2023, 1:22pm

Dear,

Please find this modern Fortran program below with GPU offload using OpenMP

Using type bound procedure for my unk derived type generates a Segmentation fault (core dumped)

nvfortran --version
nvfortran 23.5-0 linuxarm64 target on aarch64 Linux -tp neoverse-n1

nvfortran -o exe -Minfo=mp -mp=gpu test.f90

If I break the type bound procedure and use classical SUBROUTINE with type(unk) as argument, it works
But I need to define map(to: sw, sw%h, sw%u ... and I don’t understand why sw is needed
If not, there is this error message:

Failing in Thread:1
Accelerator Fatal Error: call to cuMemcpyDtoHAsync returned error 700: Illegal address during kernel execution

Thanks very much for your answers :)

MODULE m_sw

   USE iso_fortran_env, only: ip => INT32, rp => REAL64

   implicit none

   TYPE :: unk

      real(rp), allocatable :: h(:)
      real(rp), allocatable :: u(:)
      real(rp), allocatable :: v(:)

   CONTAINS

      procedure, pass(sw) :: init => init_sw
      procedure, pass(sw) :: run => run_sw

   END TYPE unk

CONTAINS

   SUBROUTINE init_sw(sw, n)

      class(unk), intent(inout) :: sw
      integer(ip), intent(in) :: n

      allocate (sw%h(n), source=1._rp)
      allocate (sw%u(n), source=0._rp)
      allocate (sw%v(n), source=0._rp)

      !$omp target enter data map(to: sw, sw%h, sw%u, sw%v)

   END SUBROUTINE

   SUBROUTINE run_sw(sw, n, nt)

      class(unk), intent(inout) :: sw
      integer(ip), intent(in) :: n , nt

      integer(ip) :: ite , ic

      do ite = 1,nt
         !$omp target teams distribute parallel do
         do ic = 2,n-1
            sw%h(ic) = sw%h(ic) - ( sw%h(ic+1) * sw%u(ic+1) )
         end do
      end do
      !$omp exit target data map(delete: sw, sw%h, sw%u, sw%v)

   END SUBROUTINE

END MODULE m_sw

PROGRAM test

   USE m_sw

   implicit none

   type(unk) :: sw

   integer(ip), parameter :: n = 100
   integer(ip), parameter :: nt = 1000

   call sw%init( sw , n )

   call sw%run( sw , n , nt )

END PROGRAM test

MatColgrove · July 10, 2023, 6:39pm

Hi fcouderc, thanks for the report!

I was able to reproduce the error so filed TPR #33957 and sent it to engineering for review.

Note that the code doesn’t compile as is and needed a few fixes. Also, the loop has a forward dependency so not parallelizable, but that’s not the cause of the segv. The segv still occurs after I fix this.

One possible work around is to use OpenACC, which seems to work correctly. Here’s my modified code:

% cat test.F90
MODULE m_sw

   USE iso_fortran_env, only: ip => INT32, rp => REAL64

   implicit none

   TYPE :: unk

      real(rp), allocatable :: h(:)
      real(rp), allocatable :: hold(:)
      real(rp), allocatable :: u(:)
      real(rp), allocatable :: v(:)

   CONTAINS

      procedure, pass(sw) :: init => init_sw
      procedure, pass(sw) :: run => run_sw

   END TYPE unk

CONTAINS

   SUBROUTINE init_sw(sw, n)

      class(unk), intent(inout) :: sw
      integer(ip), intent(in) :: n

      allocate (sw%h(n), source=1._rp)
      allocate (sw%hold(n), source=1._rp)
      allocate (sw%u(n), source=4._rp)
      allocate (sw%v(n), source=3._rp)

   END SUBROUTINE

   SUBROUTINE run_sw(sw, n, nt)

      class(unk), intent(inout) :: sw
      integer(ip), intent(in) :: n , nt

      integer(ip) :: ite , ic
!$omp target data map(to:sw,sw%u) map(tofrom:sw%h) map(alloc:sw%hold)
!$acc data copyin(sw,sw%u) copy(sw%h) create(sw%hold)
      do ite = 1,nt
         sw%hold=sw%h
!$omp target update to(sw%hold)
!$acc update device(sw%hold)
         !$omp target teams loop
         !$acc parallel loop
         do ic = 2,n-1
            sw%h(ic) = sw%h(ic) - ( sw%hold(ic+1) * sw%u(ic+1) )
         end do
      end do
!$omp end target data
!$acc end data

   END SUBROUTINE

END MODULE m_sw

PROGRAM test

   USE m_sw

   implicit none

   type(unk) :: sw

   integer(ip), parameter :: n = 100
   integer(ip), parameter :: nt = 1

   !call init_sw( sw , n )
   call sw%init(  n )

   !call run_sw( sw , n , nt )
   call sw%run( n , nt )
   print *, sw%h(1:10)

END PROGRAM test
% nvfortran test.F90 -mp=gpu  ; a.out
Segmentation fault (core dumped)
% nvfortran test.F90 -acc=gpu ; a.out
    1.000000000000000        -3.000000000000000        -3.000000000000000
   -3.000000000000000        -3.000000000000000        -3.000000000000000
   -3.000000000000000        -3.000000000000000        -3.000000000000000
   -3.000000000000000

-Mat

Topic		Replies	Views
Error compiling with GPU accelerated OpenMP and Debug flag nvc, nvc++ and nvfortran	4	825	October 12, 2022
OpenMP error: use of undefined value nvc, nvc++ and nvfortran	7	615	September 1, 2023
Allocatable arrays made private for openmp leads to out of bounds (0x0) writes with nvfortran nvc, nvc++ and nvfortran	17	434	June 21, 2024
Runtime error with nvfortran 20.7 nvc, nvc++ and nvfortran	7	815	March 24, 2022
Internal Compiler error in nvfortran with OMP-GPU offloading nvc, nvc++ and nvfortran hpc	3	657	May 17, 2022
Bounds error with OpenMP using array reduction in nvhpc 21.11 nvc, nvc++ and nvfortran	3	600	January 21, 2022
ICE: Stack error while lowering nvc, nvc++ and nvfortran	2	203	May 13, 2024
Parallelizing with a fortran routine Legacy PGI Compilers	4	3070	December 13, 2019
OpenACC + GPU: Attempt to call method of allocated derived type class: "Call to NVHPC runtime function not supported - pgf90_test_and_set_type_i8" nvc, nvc++ and nvfortran	11	74	October 10, 2024
Nvfortran 23.7 Fails to Recognize !$acc routine Directive in Modules Compiler Version: NGC GPU Cloud gpu-computing	0	8	April 24, 2025

OpenMP 4.5 and Fortran Type bound procedure

Related topics