I am in the process of migrating from the Intel Fortran Compiler to PVF 14.2 in order to try OpenACC. My system has a GTX 760 and an i7-4770K, and my monitor is currently connected to the GTX 760. After spending a lot of time, I managed to compile the code, but I cannot run it because of error 702. Please help me out.
I wanted to start with something simple. The code below does the following. Each 3-tuple (iz,ia,ih) represents an individual whose preference is measured by “vf”. Each individual maximizes this preference by choosing two variables indexed by (iia,iih). “temp_vf” is a large array that stores the measured preference for every possible choice of (iia,iih). The first loop nest computes “temp_vf”, and the second obtains “vf” as the maxval of “temp_vf” over (iia,iih).
For now, I am not concerned about data movement; I just want this code to run correctly.
124 subroutine dynamic_decision()
125
126   real(8), dimension(zn,an,hn,an,hn) :: temp_vf
127   real(8), dimension(2) :: policy_temp
128   real(8) :: c
129   integer :: iz,ia,ih,iia,iih
130   ! Step 1: value of every candidate choice (iia,iih) for each state (iz,ia,ih)
131   !$acc parallel loop
132   do iz = 1, zn
133     do ia = 1, an
134       do ih = 1, hn
135
136         do iia = 1, an
137           do iih = 1, hn
138
139             c = pol_inc(iz,ia,ih) + unitP*( hG(ih)-hG(iih) ) - aG(iia)
140
141             if (c <= 0.0d0) then
142               temp_vf(iz,ia,ih,iia,iih) = -1.0d10
143             else
144               temp_vf(iz,ia,ih,iia,iih) = ( ( c* (hG(ih)) ) ** (1.0d0-sig) ) &
145                 + beta * dot_product(zT(iz,:),old_vf(:,iia,iih))
146             end if
147
148           end do
149         end do
150       end do
151     end do
152   end do
153   !$acc end parallel loop
154   !$acc parallel loop
155   do iz = 1, zn
156     do ia = 1, an
157       do ih = 1, hn
158         vf(iz,ia,ih) = maxval(temp_vf(iz,ia,ih,:,:))  ! Step 2: best value over all choices
159       end do
160     end do
161   end do
162   !$acc end parallel loop
163
164 end subroutine
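In case it helps to reproduce this in isolation, here is a reduced standalone sketch of the same two-step pattern (fill “temp_vf”, then take maxval). The grid sizes, the values of sig, beta and unitP, and all input data are made-up placeholders rather than my actual model, and the program name is hypothetical. Without OpenACC enabled, the directives are treated as comments and the loops run serially on the CPU.

```fortran
! Reduced standalone sketch of dynamic_decision().
! All sizes, parameters and input values are placeholders (assumptions),
! chosen only so that the program is small and self-contained.
program dynamic_decision_sketch
  implicit none
  integer, parameter :: zn = 3, an = 8, hn = 5         ! hypothetical grid sizes
  real(8), parameter :: sig = 2.0d0, beta = 0.95d0     ! hypothetical parameters
  real(8), parameter :: unitP = 1.0d0                  ! hypothetical unit price
  real(8) :: aG(an), hG(hn), zT(zn,zn)
  real(8) :: pol_inc(zn,an,hn), old_vf(zn,an,hn), vf(zn,an,hn)
  real(8) :: temp_vf(zn,an,hn,an,hn)
  real(8) :: c
  integer :: iz, ia, ih, iia, iih

  ! Placeholder inputs: simple grids, a uniform transition matrix,
  ! constant income and a zero continuation value.
  do ia = 1, an
    aG(ia) = 0.1d0 * ia
  end do
  do ih = 1, hn
    hG(ih) = 1.0d0 + 0.1d0 * ih
  end do
  zT = 1.0d0 / zn
  pol_inc = 1.0d0
  old_vf = 0.0d0

  ! Step 1: value of every candidate choice (iia,iih) for each state.
  !$acc parallel loop
  do iz = 1, zn
    do ia = 1, an
      do ih = 1, hn
        do iia = 1, an
          do iih = 1, hn
            c = pol_inc(iz,ia,ih) + unitP*( hG(ih)-hG(iih) ) - aG(iia)
            if (c <= 0.0d0) then
              ! Infeasible choice: assign a large penalty.
              temp_vf(iz,ia,ih,iia,iih) = -1.0d10
            else
              temp_vf(iz,ia,ih,iia,iih) = ( c*hG(ih) ) ** (1.0d0-sig) &
                + beta * dot_product(zT(iz,:), old_vf(:,iia,iih))
            end if
          end do
        end do
      end do
    end do
  end do
  !$acc end parallel loop

  ! Step 2: best attainable value over all choices for each state.
  !$acc parallel loop
  do iz = 1, zn
    do ia = 1, an
      do ih = 1, hn
        vf(iz,ia,ih) = maxval(temp_vf(iz,ia,ih,:,:))
      end do
    end do
  end do
  !$acc end parallel loop

  print *, 'largest vf =', maxval(vf)
  print *, 'smallest vf =', minval(vf)
end program dynamic_decision_sketch
```

With sizes this small, “temp_vf” holds only 4800 elements, so each kernel should finish almost instantly; I have not confirmed whether this reduced version shows the same error.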
For additional information, here is the compiler feedback for the accelerator regions:
131, Accelerator kernel generated
132, !$acc loop gang ! blockidx%x
144, !$acc loop vector(256) ! threadidx%x
Sum reduction generated for zt$r
131, Generating present_or_copyin(old_vf(:zt$sd+old_vf$sd- 1,1:an,1:hn))
Generating present_or_copyin(zt(1:zn,:))
Generating present_or_copyin(ag(1:an))
Generating present_or_copyin(pol_inc(1:zn,1:an,1:hn))
Generating present_or_copyin(hg(1:hn))
Generating present_or_copyout(temp_vf(:zn,:an,:hn,:an,:hn))
Generating Tesla code
133, Loop is parallelizable
134, Loop is parallelizable
136, Loop is parallelizable
137, Loop is parallelizable
144, Loop is parallelizable
154, Accelerator kernel generated
155, !$acc loop gang ! blockidx%x
158, !$acc loop vector(256) ! threadidx%x
Max reduction generated for temp_vf$r
154, Generating present_or_copyin(temp_vf(:zn,:an,:hn,:an,:hn))
Generating present_or_copyout(vf(1:zn,1:an,1:hn))
Generating Tesla code
156, Loop is parallelizable
157, Loop is parallelizable
158, Loop is parallelizable
The error I see when running the code:
and also
Would anyone please kindly let me know how to fix this?
Best,