Invalid loop error in OpenACC

Hi, I would like some help with a problem compiling a program that uses OpenACC directives. A simplified version of the program is shown below:

module math
contains
SUBROUTINE pnm (A,N,IWK,WK)
!$acc routine seq
IMPLICIT NONE
INTEGER*4 N,IWK(6*N+150)
DOUBLE PRECISION WK(6*N+150)
COMPLEX*16 A(N)
INTEGER*4 K,ILL,JJ,K0,KB,JK,I,J,ITA,ITB

15 K = -IWK(ILL+K)
JJ = K0
K0 = JK * K + KB
I = 0
IF (K .NE. J) GO TO 15
55 A(K0+I+1) = DCMPLX(WK(ITA+I),WK(ITB+I))
I = I + 1
IF (I .LT. JK) GO TO 55

  END

end module math

program main
use math
implicit none

integer(kind=4)::nmax,n_north,n_east,i,j,k,m,num_grid,prd
integer,allocatable:: iwk(:)
real(kind=8),allocatable::grav(:,:),wk(:)
real(kind=8):: error
character(len=40) filedgcombine
complex*16,allocatable:: pnmdata_cpx(:)

write(*,*) "please input n_east,n_north,num_grid and nmax"
read(*,*) n_east,n_north,num_grid,nmax

allocate(grav(num_grid,4))
allocate(pnmdata_cpx(n_east))
allocate(iwk(6*n_east+150))
allocate(wk(6*n_east+150))
write(*,*) "please input the gravity anomaly file name and error "
read(*,*) filedgcombine,error
open(unit=10,file=filedgcombine)
do i=1,num_grid
	read(10,*) grav(i,1),grav(i,2),grav(i,3),grav(i,4)
end do
close(10)     

!$acc kernels
!$acc loop independent private(pnmdata_cpx)
do i=1, n_north
	pnmdata_cpx=dcmplx(0.D0,0.D0)
	!$acc loop independent
	do j=1, n_east
	        pnmdata_cpx(j)=dcmplx(grav((i-1)*n_east+j,4))
	end do
	call pnm(pnmdata_cpx,n_east,iwk,wk)
end do   ! loop i
!$acc end kernels

end ! the main program

If I compile it with the following command:
mpif90 -acc -gpu=cc70 -gpu=cuda11.0 -Minfo=accel example.f90 -o example
The error messages are shown below:
pnm:
0, Accelerator region ignored
16, Accelerator restriction: invalid loop
0 inform, 0 warnings, 1 severes, 0 fatal for pnm
main:
51, Generating implicit copy(wk(:),n_east) [if not already present]
Generating implicit copyin(grav(:,4)) [if not already present]
Generating implicit copy(iwk(:)) [if not already present]
53, Loop is parallelizable
Generating Tesla code
53, !$acc loop gang ! blockidx%x
54, !$acc loop vector(128) ! threadidx%x
56, !$acc loop vector(128) ! threadidx%x
54, Loop is parallelizable
56, Loop is parallelizable

It says that loop 15 in the subroutine is an invalid loop. There are two loops in the subroutine; the other one is loop 55. However, if I delete or comment out loop 55 (which of course changes the program), the program compiles successfully.

I could not find any problem in loop 15. Moreover, the program compiles successfully as a CPU-only program with gfortran.

Could someone tell me why the loop cannot be compiled for the GPU?

Many thanks!

Hi wliang246,

0, Accelerator region ignored
16, Accelerator restriction: invalid loop

We didn’t add support for parallelization of GOTO loops since they are rarely used any longer and are considered poor practice. You should consider updating these to use “do while” loops instead. Though given that “pnm” is a sequential device routine, the compiler wouldn’t parallelize these loops anyway (besides, they are while loops, which can’t be parallelized), so the messages are somewhat extraneous.
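
As a minimal sketch of that kind of conversion (the program and variable names here are made up for illustration and are not taken from the original routine): a backward GO TO forms a post-tested loop, so its body always runs at least once. The closest structured equivalent is a plain "do" with an "exit" at the bottom. A pre-tested "do while" behaves the same only when the condition is already true on entry.

```fortran
program goto_to_do
  implicit none
  integer :: i, n, total_goto, total_do

  n = 4

  ! Legacy form: the backward GO TO tests the condition after the body,
  ! so the body is executed at least once.
  total_goto = 0
  i = 0
10 continue
  total_goto = total_goto + i
  i = i + 1
  if (i .lt. n) go to 10

  ! Structured form with the same semantics: test at the bottom, leave with EXIT.
  total_do = 0
  i = 0
  do
     total_do = total_do + i
     i = i + 1
     if (i .ge. n) exit
  end do

  ! Both sums are 0+1+2+3 = 6.
  print *, total_goto, total_do
end program goto_to_do
```

With n set to 0, both loops above still execute their body once, whereas "do while (i .lt. n)" would skip it entirely, so it is worth checking the entry condition when translating.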

Though, the “pnm” routine does not look correct in that the “J” and “JK” variables are uninitialized. Maybe they are from a common block that wasn’t translated to a module variable? In any event, I suggest something like the following, though please correct the initial values:

module math
contains
SUBROUTINE pnm (A,N,IWK,WK)
!$acc routine seq
IMPLICIT NONE
INTEGER N,IWK(N+150)
DOUBLE PRECISION WK(N+150)
COMPLEX*16 A(N)
INTEGER*4 K,ILL,JJ,K0,KB,JK,I,J,ITA,ITB

J=0
JK=N
K=1
do while (K.NE.J)
K = -IWK(ILL+K)
JJ = K0
K0 = JK * K + KB
enddo
I = 0
do while(I.LT.JK)
A(K0+I+1) = DCMPLX(WK(ITA+I),WK(ITB+I))
I = I + 1
ENDDO

END
end module math


% nvfortran main.f90 -acc -Minfo=accel
pnm:
      3, Generating acc routine seq
         Generating Tesla code
main:
     54, Generating implicit copy(wk(:),n_east) [if not already present]
         Generating implicit copyin(grav(:,4)) [if not already present]
         Generating implicit copy(iwk(:)) [if not already present]
     56, Loop is parallelizable
         Generating Tesla code
         56, !$acc loop gang ! blockidx%x
         57, !$acc loop vector(128) ! threadidx%x
         59, !$acc loop vector(128) ! threadidx%x
     57, Loop is parallelizable
     59, Loop is parallelizable

-Mat

Hi, Mat,

Many thanks for your help!

After updating the GOTO loops to “do while” loops, the program works now.

Regarding the initialization: the code I posted is only part of the whole subroutine. The real subroutine does contain lines that initialize those variables; they are just not shown here.
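
For anyone adapting the same pattern: loop 55 in the original routine is really a counted loop, so it can also be written as an ordinary indexed "do". The following is only a sketch under assumptions. The subroutine name fill_complex, its argument list, and the test values are hypothetical and stand in for the initialization that the thread does not show. Note one difference from the GOTO version: a counted "do" runs zero times when jk is less than 1, while the GOTO loop always runs once.

```fortran
module fill_mod
  implicit none
contains
  ! Hypothetical helper: copies pairs of reals from wk into the complex array a.
  subroutine fill_complex(a, n, wk, nwk, k0, ita, itb, jk)
    !$acc routine seq
    integer, intent(in) :: n, nwk, k0, ita, itb, jk
    complex(kind=8), intent(inout) :: a(n)
    real(kind=8), intent(in) :: wk(nwk)
    integer :: i

    ! Counted replacement for the "55 ... GO TO 55" loop.
    do i = 0, jk - 1
       a(k0 + i + 1) = cmplx(wk(ita + i), wk(itb + i), kind=8)
    end do
  end subroutine fill_complex
end module fill_mod

program demo
  use fill_mod
  implicit none
  integer, parameter :: n = 4
  complex(kind=8) :: a(n)
  real(kind=8) :: wk(2*n)
  integer :: i

  ! Made-up test data: wk = 1, 2, ..., 8
  wk = [(real(i, kind=8), i = 1, 2*n)]
  a = (0.0d0, 0.0d0)

  ! Real parts come from wk(1:4), imaginary parts from wk(5:8),
  ! so a becomes (1,5), (2,6), (3,7), (4,8).
  call fill_complex(a, n, wk, 2*n, 0, 1, n + 1, n)
  print *, a
end program demo
```

Without OpenACC enabled, the "!$acc routine seq" line is treated as a comment, so the sketch can be checked on the CPU first.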