# Procedure in derived type data

Hi,

I have a question about parallelization of nested loops inside which procedures in derived type data are used. A template of my code is formulated and shown below:

``````      do k = bound%idx_min(3), bound%idx_max(3)
do j = bound%idx_min(2), bound%idx_max(2)
do i = bound%idx_min(1), bound%idx_max(1)

call find_bc_indicies_from_cell( [i,j,k], bound%face_label,        &
bound%gc_stencil_size,            &
low_idx, high_idx, dir )

call find_bc_normal_from_cell( [i,j,k], bound%face_label, gblock,  &
normal )

call bound%fill_ghost_cells(                                       &
normal,                                                     &
sblock%rho(    low_idx(1):high_idx(1):dir(1),               &
low_idx(2):high_idx(2):dir(2),               &
low_idx(3):high_idx(3):dir(3) ),             &
sblock%vel( :, low_idx(1):high_idx(1):dir(1),               &
low_idx(2):high_idx(2):dir(2),               &
low_idx(3):high_idx(3):dir(3) ),             &
sblock%p(      low_idx(1):high_idx(1):dir(1),               &
low_idx(2):high_idx(2):dir(2),               &
low_idx(3):high_idx(3):dir(3) ),             &
sblock%temp(   low_idx(1):high_idx(1):dir(1),               &
low_idx(2):high_idx(2):dir(2),               &
low_idx(3):high_idx(3):dir(3) ) )

end do
end do
end do
``````

I want to parallelize this nested loop with three layers, however I found that I need to call bound%fill_ghost_cells but bound is an derived type data (declared as “class(bc_t), pointer :: bound”) and class bc_t is defined as:

``````  type, abstract :: bc_t

! Is this boundary connected to another boundary
logical :: is_connected = .false.

logical :: overwrites_flux = .false.
! True  = this BC calculates its own flux at boundary face
! False = this BC uses MUSCL to calculate flux at boundary face

integer :: block_id        = -1  ! What parent block this is on
integer :: bc_label        = -1  ! What bc type this is
integer :: face_label      = -1  ! What face this is on

! Number of interior cells in the stencil to fill the GCs
integer :: gc_stencil_size = -1

! Indicies of the interior cells that are along this boundary
integer, dimension(3) :: idx_min = -1
integer, dimension(3) :: idx_max = -1

! Out direction of this face
integer, dimension(3) :: out_dir

type(nbor_info_t) :: nbor

contains

private

! Common Procedures
procedure, public,   pass :: init_bc
procedure, public,   pass :: fill_nbor_info
procedure, public,   pass :: is_xi_face
procedure, public,   pass :: is_eta_face
procedure, public,   pass :: is_zeta_face
procedure, public,   pass :: is_min_face
procedure, public,   pass :: find_nbor_idx
procedure, public,   pass :: find_my_idx
procedure, public, nopass :: extrapolate_through_face_to_gc1
procedure, public, nopass :: extrapolate_state_to_gc
procedure, public, nopass :: gc_from_face_jacobians

! Deferred Procedures
procedure(fill_gc_i),      public, deferred, pass :: fill_ghost_cells
procedure(gc_jacobians_i), public, deferred, pass :: ghost_cell_jacobians

! Procedures which are not implemented in general but are overwritten if
! this%overwrites_flux
procedure,                 public,           pass :: boundary_flux
procedure,                 public,           pass :: boundary_flux_jacobians

end type bc_t

type :: bc_holder_t

class(bc_t), allocatable :: bc

end type
``````

Here you can see that fill_ghost_cells is a deferred procedure in bc_t. My concern is: if I directly parallize this nested loop, would there be an segmentation fault? And, does OpenACC support calling procedures inside a derived type data? If not, then how should I tackle with this situation? I am a beginner of OpenACC, and I would greatly appreciuate any help! Thanks a lot!

The way this code is written, it will be nearly impossible to accelerate it using OpenACC. The time/registers and overhead for every thread computing the bounds, creating the structures to pass the bounds, etc. will far outweigh the amount of time each thread spends doing real work.

I’d try to simplify it on the CPU first. It will probably speed up there too.

Got it! Thanks!

Hi Brent,

Sorry that I have another question. Last time you mentioned that it would be better to simplify our CPU code first. However, our CFD code is not that easy to be modified as a guy in our lab rewrote part of the whole code using Fortran 95/03, such as derived type data. I am wondering whether OpenACC can support calling a derived type’s procedure or not. Thus, I formulated a very simple code in which boundary enforcement is done by calling a procedure inside derived data and I attached it here. I cannot compile the formulated code successfully, and I really wanted to know whether it is because OpenACC does not support derived type data or because there are mistakes here.

Thank you very much for your help!

Best,

Weicheng

Here is the code:

`````` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! vel_derived_type is used to create a structured data type that stores the !!
!! values of velocity at different nodes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
module vel_derived_type

type soln_vel
real(8), allocatable, dimension(:,:,:) :: vel, vel_r
end type soln_vel

contains

!! allocate arrays
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine allocate_vel_array(imin, jmin, kmin, imax, jmax, kmax, soln)

integer, intent(in)           :: imin, jmin, kmin
integer, intent(in)           :: imax, jmax, kmax
type(soln_vel), intent(inout) :: soln

allocate(soln%vel(imin:imax,jmin:jmax,kmin:kmax),                         &
soln%vel_r(imin:imax,jmin:jmax,kmin:kmax))

end subroutine allocate_vel_array

!! deallocate arrays
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine deallocate_vel_array(imin, jmin, kmin, imax, jmax, kmax, soln)

integer, intent(in)           :: imin, jmin, kmin
integer, intent(in)           :: imax, jmax, kmax
type(soln_vel), intent(inout) :: soln

deallocate(soln%vel(imin:imax,jmin:jmax,kmin:kmax),                       &
soln%vel_r(imin:imax,jmin:jmax,kmin:kmax))

end subroutine deallocate_vel_array

end module vel_derived_type

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! boundary_setting is used to enforce boudary conditions
!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
module boundary_setting

type bc_t

! Indicies of the cells that are along this boundary
integer, dimension(3) :: idx_min = -1
integer, dimension(3) :: idx_max = -1

contains
procedure, public, nopass :: bc_wall
end type bc_t

contains

!! set wall boundary condition
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine bc_wall(imin,jmin,kmin,imax,jmax,kmax,i,j,vel_i,vel_ir)

!\$acc routine seq
implicit none

integer, intent(in) :: imin, jmin, kmin
integer, intent(in) :: imax, jmax, kmax
integer, value :: i, j
real(8), dimension(imin:imax,jmin:jmax,kmin:kmax), intent(inout) :: vel_i
real(8), dimension(imin:imax,jmin:jmax,kmin:kmax), intent(inout) :: vel_ir

call linear_extrap(imin,jmin,kmin,imax,jmax,kmax,i,j,vel_i,vel_ir)

end subroutine bc_wall

!! linear extrapolation
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine linear_extrap(imin,jmin,kmin,imax,jmax,kmax,i,j,vel_i,vel_ir)

!\$acc routine seq
implicit none

integer, intent(in) :: imin,jmin,kmin
integer, intent(in) :: imax,jmax,kmax
integer, value :: i, j
real(8), dimension(imin:imax,jmin:jmax,kmin:kmax), intent(inout) :: vel_i
real(8), dimension(imin:imax,jmin:jmax,kmin:kmax), intent(inout) :: vel_ir

vel_i(i,j,kmax) = 2*vel_i(i,j,kmax-1) - vel_i(i,j,kmax)
vel_ir(i,j,kmax) = 2*vel_i(i,j,kmax-1) - vel_ir(i,j,kmax)

end subroutine linear_extrap

end module boundary_setting

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! this program is used to test whether OpenACC direcives support procedures
!! in derived type data
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
program derived_procedure_test

use vel_derived_type
use boundary_setting

implicit none

type(soln_vel) :: soln
type(bc_t)     :: boundary

integer :: i, j, k
integer :: imin, jmin, kmin
integer :: imax, jmax, kmax

continue

imin = 1
jmin = 1
kmin = 1

imax = 3
jmax = 4
kmax = 5

call allocate_vel_array(imin, jmin, kmin, imax, jmax, kmax, soln)

do k = kmin, kmax
do j = jmin, jmax
do i = imin, imax
soln%vel(i,j,k) = k+j*10+i**2
soln%vel_r(i,j,k) = 0.0D0
enddo
enddo
enddo

boundary%idx_min = [imin, jmin, kmax]
boundary%idx_max = [imax, jmax, kmax]

!\$acc enter data create(soln)
!\$acc enter data copyin(soln%vel, soln%vel_r, boundary)

!\$acc kernels default(present) present(soln)
!\$acc loop independent
do j = boundary%idx_min(2), boundary%idx_max(2)
!\$acc loop independent
do i = boundary%idx_min(1), boundary%idx_max(1)
call boundary%bc_wall(imin, jmin, kmin, imax, jmax, kmax, i, j, &
soln%vel, soln%vel_r)
enddo
enddo
!\$acc end kernels

!\$acc exit data copyout(soln%vel, soln%vel_r)
!\$acc exit data delete(soln)

write(*,*) soln%vel(imax,2:jmax,kmax)
write(*,*) soln%vel_r(imax,2:jmax,kmax)

call deallocate_vel_array(imin, jmin, kmin, imax, jmax, kmax, soln)

end program derived_procedure_test
``````

If I do not use GPU, then it could compile and run without any issues. However, if I compile with “-acc -Minfo=accel” added, then it reports “unsupported statement type” error.

Yes, as you’ve found, we do not currently support derived-type procedures, type-bound procedures, or procedure pointers in GPU code.

Ok. Thank you very much!

Best,

Weicheng Xue

hi,
you can try the derived types in main program directly by using : !\$cuf kernel do <<<,>>> instead of acc loop directive, but it’s not efficient on acceleration at all. The more complicated the derived type, the much slower it works.

Pearlriver,