Fortran 2003 derived types in accelerator region


I am trying to GPU-accelerate a large CFD code using acc directives (PGI 12.10 compiler). The code is written in Fortran 2003 and makes extensive use of derived types, including allocatable arrays within derived types (which do not seem to be supported inside accelerator regions). I found an earlier post on this forum stating that support for allocatables inside derived types was being considered or worked on. Is this feature likely to be incorporated into the compiler, and if so, how soon?

An example of the type of loop I need to run is shown below.

    do k = 1, k_cells
      do j = 1, j_cells
        do i = 1, i_cells
          a = speed_of_sound( sblock%p(i,j,k), sblock%rho(i,j,k) )

          x_n = half*(grid_vars%xi_n(:,i,j,k) + grid_vars%xi_n(:,i+1,j,k))
          y_n = half*(grid_vars%eta_n(:,i,j,k) + grid_vars%eta_n(:,i,j+1,k))
          z_n = half*(grid_vars%zeta_n(:,i,j,k) + grid_vars%zeta_n(:,i,j,k+1))

          x_face_area = half *                                                 &
            ( grid_vars%xi_area(i,j,k) + grid_vars%xi_area(i+1,j,k) )
          y_face_area = half *                                                 &
            ( grid_vars%eta_area(i,j,k) + grid_vars%eta_area(i,j+1,k) )
          z_face_area = half *                                                 &
            ( grid_vars%zeta_area(i,j,k) + grid_vars%zeta_area(i,j,k+1) )

          sblock%dt(i,j,k) = CFL * grid_vars%volume(i,j,k) /                   &
            ( (abs(dot_product(sblock%vel(:,i,j,k), x_n)) + a)*x_face_area +   &
              (abs(dot_product(sblock%vel(:,i,j,k), y_n)) + a)*y_face_area +   &
              (abs(dot_product(sblock%vel(:,i,j,k), z_n)) + a)*z_face_area )

          sblock%dt_min = min(sblock%dt_min, sblock%dt(i,j,k))

        end do
      end do
    end do

If references such as grid_vars%xi_area(i,j,k) are not likely to be supported in acc regions soon, I will probably be looking at a significant refactor.

Thank you,

Hi 64ext,

Unfortunately, no progress has been made on this one. It’s proven to be extremely difficult to implement on an accelerator.
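
For reference, the kind of refactor mentioned in the question usually means passing the allocatable components to a routine that only sees plain arrays, so that nothing inside the accelerator region references a derived-type member. The following is only a minimal sketch under stated assumptions: `block_t`, `time_step`, the 4x4x4 size, and the stand-in formula are hypothetical and not taken from the original code, and the OpenACC `kernels` directive is assumed here rather than confirmed as what the original poster uses.

```fortran
! Sketch only: hoist derived-type components into plain dummy arrays so the
! accelerator region never references a derived-type member.
! block_t, time_step, the sizes and the formula are all hypothetical.
module dt_sketch
  implicit none
  type block_t
    real, allocatable :: rho(:,:,:), dt(:,:,:)
    real :: dt_min
  end type block_t
contains
  ! The accelerated routine sees only explicit-shape arrays and scalars.
  subroutine time_step(ni, nj, nk, rho, dt, dt_min)
    integer, intent(in)  :: ni, nj, nk
    real,    intent(in)  :: rho(ni,nj,nk)
    real,    intent(out) :: dt(ni,nj,nk)
    real,    intent(out) :: dt_min
    integer :: i, j, k
    !$acc kernels copyin(rho) copyout(dt)
    do k = 1, nk
      do j = 1, nj
        do i = 1, ni
          dt(i,j,k) = 1.0 / rho(i,j,k)   ! stand-in for the real time-step formula
        end do
      end do
    end do
    !$acc end kernels
    ! The minimum is taken on the host after the region, which keeps the
    ! reduction out of the accelerated loop in this sketch.
    dt_min = minval(dt)
  end subroutine time_step
end module dt_sketch

program demo
  use dt_sketch
  implicit none
  type(block_t) :: sblock
  allocate(sblock%rho(4,4,4), sblock%dt(4,4,4))
  sblock%rho = 2.0
  ! Pass the components, not the derived type, to the accelerated routine.
  call time_step(4, 4, 4, sblock%rho, sblock%dt, sblock%dt_min)
  print *, sblock%dt_min
end program demo
```

Without accelerator support enabled, the directives are treated as comments and the program runs on the host. This makes it possible to check the refactored routine against the original loop before involving the GPU.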