I am migrating from the Intel Fortran Compiler to PVF 14.2 in order to try OpenACC. My system has a GTX 760 and an i7-4770K, and my monitor is currently connected to the GTX 760. After spending a lot of time, I succeeded in compiling the code, but I can't run it because of error 702. Please help me out.
I wanted to start with something simple. The code below does the following: a 3-tuple (iz,ia,ih) indexes an individual whose preference is measured by “vf”; each individual maximizes its preference by choosing two variables indexed by (iia,iih); “temp_vf” is a big array that stores the measured preference for every possible choice of (iia,iih); the first loop nest computes “temp_vf”; and the second loop nest obtains “vf” by taking the maxval of “temp_vf” over (iia,iih).
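Written as formulas, the two loop nests amount to roughly the following. Here k is just a summation index I introduce for the dot_product (it runs over the second dimension of zT); every other name is taken from the code as-is.

```latex
\begin{aligned}
c &= \mathrm{pol\_inc}(iz,ia,ih)
     + \mathrm{unitP}\,\bigl(\mathrm{hG}(ih) - \mathrm{hG}(iih)\bigr)
     - \mathrm{aG}(iia) \\[4pt]
\mathrm{temp\_vf}(iz,ia,ih,iia,iih) &=
\begin{cases}
-10^{10}, & c \le 0 \\[2pt]
\bigl(c\,\mathrm{hG}(ih)\bigr)^{1-\mathrm{sig}}
  + \mathrm{beta}\sum_{k} \mathrm{zT}(iz,k)\,\mathrm{old\_vf}(k,iia,iih), & c > 0
\end{cases} \\[4pt]
\mathrm{vf}(iz,ia,ih) &= \max_{iia,\,iih}\ \mathrm{temp\_vf}(iz,ia,ih,iia,iih)
\end{aligned}
```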
At the moment, I am not concerned about data movement; I just want this code to run correctly.
124 subroutine dynamic_decision()
125
126   real(8), dimension(zn,an,hn,an,hn) :: temp_vf
127   real(8), dimension(2) :: policy_temp
128   real(8) :: c
129   integer :: iz,ia,ih,iia,iih
130
131   !$acc parallel loop
132   do iz = 1, zn
133     do ia = 1, an
134       do ih = 1, hn
135
136         do iia = 1, an
137           do iih = 1, hn
138
139             c = pol_inc(iz,ia,ih) + unitP*( hG(ih)-hG(iih) ) - aG(iia)
140
141             if (c <= 0.0d0) then
142               temp_vf(iz,ia,ih,iia,iih) = -1.0d10
143             else
144               temp_vf(iz,ia,ih,iia,iih) = ( ( c* (hG(ih)) ) ** (1.0d0-sig) ) &
145                 + beta * dot_product(zT(iz,:),old_vf(:,iia,iih))
146             end if
147
148           end do
149         end do
150       end do
151     end do
152   end do
153   !$acc end parallel loop
154   !$acc parallel loop
155   do iz = 1, zn
156     do ia = 1, an
157       do ih = 1, hn
158         vf(iz,ia,ih) = maxval(temp_vf(iz,ia,ih,:,:))
159       end do
160     end do
161   end do
162   !$acc end parallel loop
163
164 end subroutine
For extra information on the accelerator regions, here is the compiler output:
131, Accelerator kernel generated
    132, !$acc loop gang ! blockidx%x
    144, !$acc loop vector(256) ! threadidx%x
         Sum reduction generated for zt$r
131, Generating present_or_copyin(old_vf(:zt$sd+old_vf$sd-1,1:an,1:hn))
     Generating present_or_copyin(zt(1:zn,:))
     Generating present_or_copyin(ag(1:an))
     Generating present_or_copyin(pol_inc(1:zn,1:an,1:hn))
     Generating present_or_copyin(hg(1:hn))
     Generating present_or_copyout(temp_vf(:zn,:an,:hn,:an,:hn))
     Generating Tesla code
133, Loop is parallelizable
134, Loop is parallelizable
136, Loop is parallelizable
137, Loop is parallelizable
144, Loop is parallelizable
154, Accelerator kernel generated
    155, !$acc loop gang ! blockidx%x
    158, !$acc loop vector(256) ! threadidx%x
         Max reduction generated for temp_vf$r
154, Generating present_or_copyin(temp_vf(:zn,:an,:hn,:an,:hn))
     Generating present_or_copyout(vf(1:zn,1:an,1:hn))
     Generating Tesla code
156, Loop is parallelizable
157, Loop is parallelizable
158, Loop is parallelizable
The error I see when running the code:
Would anyone please kindly let me know how to fix this?