Question about OpenACC data regions

Hello Mat.
I have some questions about OpenACC.

I have included the code, the compiler messages, and the results.

I usually compile with these options:
“ACC = -fast -acc -Minfo=accel -ta=tesla:cc80”
But when I want to compare the CPU and GPU results, I use these options instead:
“ACC = -fast -acc -Minfo=accel -ta=tesla:autocompare”

Q1.
I don’t understand why I have to use the present clause, since I already use !$acc enter data copyin for RHS1.
In my opinion, I don’t need the present clause on the parallel loop. Is that right?
When I don’t use the present clause, it looks like RHS1 is not copied at the parallel loop.
A copy is generated for the enter data directive, but not at the parallel loop:
Generating enter data copyin(rhs1(0:m1,0:m2,0:m3,3))

Q2.
As you can see, the Sample 1 and Sample 2 code differ only in how RHS1 is handled.

Sample1
!$acc parallel loop collapse(2) present(RHS1, AK, AKUV, TNUY, FIXKL, CK, CKUV, FIXKU, BK, GK)

Sample2
!$acc parallel loop collapse(2) copy(RHS1) present(AK, AKUV, TNUY, FIXKL, CK, CKUV, FIXKU,

Why is the result different?

Sample 1 - code

!$acc enter data copyin ( AK(1:M1, 1:M3), BK(1:M1, 1:M3), CK(1:M1, 1:M3), GK(1:M1, 1:M3) )
!$acc enter data copyin ( AI(1:M2, 1:M1), BI(1:M2, 1:M1), CI(1:M2, 1:M1), GI(1:M2, 1:M1) )
!$acc enter data copyin ( AJ(1:M1, 1:M2), BJ(1:M1, 1:M2), CJ(1:M1, 1:M2), GJ(1:M1, 1:M2) )
 
!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 3) )
!$acc enter data copyin ( FIXIL(1:M1M), FIXIU(1:M1M), FIXJL(1:M2M), FIXJU(1:M2M), FIXKL(1:M3M), FIXKU(1:M3M) )
!$acc enter data copyin ( TNU (0:M1, 0:M2, 0:M3), TNUX(0:M1, 0:M2, 0:M3), TNUY(0:M1, 0:M2, 0:M3), TNUZ(0:M1, 0:M2, 0:M3) )
!$acc enter data copyin ( AKW(1:M3), CKW(1:M3), AKUV(1:M3), CKUV(1:M3) )
!$acc enter data copyin ( XMP(0:M1), YMP(0:M2), ZMP(0:M3) )
!$acc enter data copyin ( X(M1), Y(M2), Z(M3) )
 
!=====ADI STARTS
if (N3M /= 1) then
    !-----Z-DIRECTION
    do J=1,N2M 
        !$acc parallel loop collapse(2) present(RHS1, AK, AKUV, TNUY, FIXKL, CK, CKUV, FIXKU, BK, GK)
        do K=1,N3M
        do I=I_BGPX,N1M
            AK(I,K)=AKUV(K)*(1.+CRE*TNUY(I,J,K)) *(1.-FIXKL(K)*FLOAT(KUB))
            CK(I,K)=CKUV(K)*(1.+CRE*TNUY(I,J,K+1))*(1.-FIXKU(K)*FLOAT(KUT))
             
            IF (IVELSRC .EQ. 1) THEN
                PRIK=0. 
                BK(I,K)=ACOEFI*(1+ACOEF*PRIK)-AK(I,K)-CK(I,K)
                GK(I,K)=ACOEFI*(1+ACOEF*PRIK)*RHS1(I,J,K,1)
            ELSE
                BK(I,K)=ACOEFI-AK(I,K)-CK(I,K)
                GK(I,K)=ACOEFI*RHS1(I,J,K,1)
            ENDIF
        enddo
        enddo
        !$acc end loop
    enddo
endif

Sample 1 - compile message

lhsu:
   1783, Generating enter data copyin(ck(1:m1,1:m3),bk(1:m1,1:m3),ak(1:m1,1:m3),gk(1:m1,1:m3))
   1784, Generating enter data copyin(ci(1:m2,1:m1),bi(1:m2,1:m1),ai(1:m2,1:m1),gi(1:m2,1:m1))
   1785, Generating enter data copyin(cj(1:m1,1:m2),bj(1:m1,1:m2),aj(1:m1,1:m2),gj(1:m1,1:m2))
   1788, Generating enter data copyin(rhs1(0:m1,0:m2,0:m3,3))
   1789, Generating enter data copyin(fixkl(1:m3m),fixju(1:m2m),fixjl(1:m2m),fixiu(1:m1m),fixil(1:m1m),fixku(1:m3m))
   1790, Generating enter data copyin(tnuy(0:m1,0:m2,0:m3),tnux(0:m1,0:m2,0:m3),tnu(0:m1,0:m2,0:m3),tnuz(0:m1,0:m2,0:m3))
   1791, Generating enter data copyin(akw(1:m3),ckuv(1:m3),akuv(1:m3),ckw(1:m3))
   1792, Generating enter data copyin(ymp(0:m2),xmp(0:m1),zmp(0:m3))
   1793, Generating enter data copyin(z(m3),y(m2),x(m1))
   1799, Generating present(gk(:,:),ak(:,:),tnuy(:,:,:),rhs1(:,:,:,:),ckuv(:),bk(:,:),ck(:,:),fixkl(:),akuv(:),fixku(:))
         Generating NVIDIA GPU code
       1800, !$acc loop gang, vector(128) collapse(2) ! blockidx%x threadidx%x
       1801,   ! blockidx%x threadidx%x collapsed
   1898, Generating exit data delete(ck(1:m1,1:m3),bk(1:m1,1:m3),ak(1:m1,1:m3),gk(1:m1,1:m3))
   1899, Generating exit data delete(ci(1:m2,1:m1),bi(1:m2,1:m1),ai(1:m2,1:m1),gi(1:m2,1:m1))
   1900, Generating exit data delete(cj(1:m1,1:m2),bj(1:m1,1:m2),aj(1:m1,1:m2),gj(1:m1,1:m2))
   1902, Generating exit data copyout(rhs1(0:m1,0:m2,0:m3,3))
   1903, Generating exit data copyout(fixkl(1:m3m),fixju(1:m2m),fixjl(1:m2m),fixiu(1:m1m),fixil(1:m1m),fixku(1:m3m))
   1904, Generating exit data copyout(tnuy(0:m1,0:m2,0:m3),tnux(0:m1,0:m2,0:m3),tnu(0:m1,0:m2,0:m3),tnuz(0:m1,0:m2,0:m3))
   1905, Generating exit data copyout(akw(1:m3),ckuv(1:m3),akuv(1:m3),ckw(1:m3))
   1906, Generating exit data copyout(ymp(0:m2),xmp(0:m1),zmp(0:m3))
   1907, Generating exit data copyout(z(m3),y(m2),x(m1))

Sample 1 - result

rhs1 lives at 0x145f6d956c80 size 156558528 partially present
Present table dump for device[1]: NVIDIA Tesla GPU 0, compute capability 8.0, threadid=1
host:0x768410 device:0x145ea39fb200 size:128 presentcount:0+1 line:1793 name:descriptor
host:0x7684a0 device:0x145ea39fb600 size:128 presentcount:0+1 line:1793 name:descriptor
host:0x768530 device:0x145ea39fba00 size:128 presentcount:0+1 line:1793 name:descriptor
host:0x768920 device:0x145ea33fce00 size:128 presentcount:0+1 line:1789 name:descriptor
host:0x7689b0 device:0x145ea33fd800 size:128 presentcount:0+1 line:1789 name:descriptor
host:0x768a40 device:0x145ea33fde00 size:128 presentcount:0+1 line:1789 name:descriptor
host:0x768ad0 device:0x145ea33fe400 size:128 presentcount:0+1 line:1789 name:descriptor
host:0x768b60 device:0x145ea33fee00 size:128 presentcount:0+1 line:1789 name:descriptor
host:0x768bf0 device:0x145ea33ff800 size:128 presentcount:0+1 line:1789 name:descriptor
host:0x769340 device:0x145ea39f9c00 size:128 presentcount:0+1 line:1792 name:descriptor
host:0x7693d0 device:0x145ea39fa200 size:128 presentcount:0+1 line:1792 name:descriptor
host:0x769460 device:0x145ea39fae00 size:128 presentcount:0+1 line:1792 name:descriptor
host:0x76a150 device:0x145ea39f6c00 size:128 presentcount:0+1 line:1791 name:descriptor
host:0x76a1e0 device:0x145ea39f7800 size:128 presentcount:0+1 line:1791 name:descriptor
host:0x76a270 device:0x145ea39f8400 size:128 presentcount:0+1 line:1791 name:descriptor
host:0x76a300 device:0x145ea39f9000 size:128 presentcount:0+1 line:1791 name:descriptor
host:0x76e4d0 device:0x145ea33fc400 size:272 presentcount:0+1 line:1788 name:descriptor
host:0x76f4f0 device:0x145ea33ffa00 size:224 presentcount:0+1 line:1790 name:descriptor
host:0x76f5e0 device:0x145ea33ffc00 size:224 presentcount:0+1 line:1790 name:descriptor
host:0x76f6d0 device:0x145ea33ffe00 size:224 presentcount:0+1 line:1790 name:descriptor
host:0x76f7c0 device:0x145ea39f6000 size:224 presentcount:0+1 line:1790 name:descriptor
host:0x1f0df60 device:0x145ea39fb000 size:8 presentcount:0+1 line:1793 name:x
host:0x1f0e290 device:0x145ea39fb400 size:8 presentcount:0+1 line:1793 name:y
host:0x1f0eac0 device:0x145ea39fb800 size:8 presentcount:0+1 line:1793 name:z
host:0x1f11210 device:0x145ea33fc600 size:2048 presentcount:0+1 line:1789 name:fixil
host:0x1f11a40 device:0x145ea33fd000 size:2048 presentcount:0+1 line:1789 name:fixiu
host:0x1f12270 device:0x145ea33fda00 size:768 presentcount:0+1 line:1789 name:fixjl
host:0x1f125a0 device:0x145ea33fe000 size:768 presentcount:0+1 line:1789 name:fixju
host:0x1f128d0 device:0x145ea33fe600 size:2048 presentcount:0+1 line:1789 name:fixkl
host:0x1f13100 device:0x145ea33ff000 size:2048 presentcount:0+1 line:1789 name:fixku
host:0x1f187d0 device:0x145ea39f9200 size:2064 presentcount:0+1 line:1792 name:xmp
host:0x1f19010 device:0x145ea39f9e00 size:784 presentcount:0+1 line:1792 name:ymp
host:0x1f19350 device:0x145ea39fa400 size:2064 presentcount:0+1 line:1792 name:zmp
host:0x1f1c910 device:0x145ea39f6200 size:2056 presentcount:0+1 line:1791 name:akw
host:0x1f1d140 device:0x145ea39f6e00 size:2056 presentcount:0+1 line:1791 name:ckw
host:0x1f1d970 device:0x145ea39f7a00 size:2056 presentcount:0+1 line:1791 name:akuv
host:0x1f1e1a0 device:0x145ea39f8600 size:2056 presentcount:0+1 line:1791 name:ckuv
host:0x1f31ee0 device:0x145ea32fa000 size:528392 presentcount:0+1 line:1783 name:ak
host:0x1fb89d0 device:0x145ea337b200 size:528392 presentcount:0+1 line:1783 name:bk
host:0x203f6d0 device:0x145ea3800000 size:528392 presentcount:0+1 line:1783 name:ck
host:0x20c65e0 device:0x145ea3881200 size:528392 presentcount:0+1 line:1783 name:gk
host:0x2147610 device:0x145ea3902400 size:199432 presentcount:0+1 line:1784 name:ai
host:0x2178140 device:0x145ea3933000 size:199432 presentcount:0+1 line:1784 name:bi
host:0x21a8c70 device:0x145ea3963c00 size:199432 presentcount:0+1 line:1784 name:ci
host:0x21d97a0 device:0x145ea3994800 size:199432 presentcount:0+1 line:1784 name:gi
host:0x220a2d0 device:0x145ea39c5400 size:199432 presentcount:0+1 line:1785 name:aj
host:0x223ae00 device:0x145ea3a00000 size:199432 presentcount:0+1 line:1785 name:bj
host:0x226b930 device:0x145ea3a30c00 size:199432 presentcount:0+1 line:1785 name:cj
host:0x229c460 device:0x145ea3a61800 size:199432 presentcount:0+1 line:1785 name:gj
host:0x145f6123e4c0 device:0x145e76000000 size:52186176 presentcount:0+1 line:1790 name:tnuz
host:0x145f644052b0 device:0x145e7a000000 size:52186176 presentcount:0+1 line:1790 name:tnuy
host:0x145f675cb0a0 device:0x145e7e000000 size:52186176 presentcount:0+1 line:1790 name:tnux
host:0x145f6a790e90 device:0x145e82000000 size:52186176 presentcount:0+1 line:1790 name:tnu
host:0x145f73ce0500 device:0x145e86000000 size:52186176 presentcount:0+1 line:1788 name:rhs1
allocated block device:0x145e76000000 size:52186624 thread:1
allocated block device:0x145e7a000000 size:52186624 thread:1
allocated block device:0x145e7e000000 size:52186624 thread:1
allocated block device:0x145e82000000 size:52186624 thread:1
allocated block device:0x145e86000000 size:52186624 thread:1
allocated block device:0x145ea32fa000 size:528896 thread:1
allocated block device:0x145ea337b200 size:528896 thread:1
allocated block device:0x145ea33fc400 size:512 thread:1
allocated block device:0x145ea33fc600 size:2048 thread:1
allocated block device:0x145ea33fce00 size:512 thread:1
allocated block device:0x145ea33fd000 size:2048 thread:1
allocated block device:0x145ea33fd800 size:512 thread:1
allocated block device:0x145ea33fda00 size:1024 thread:1
allocated block device:0x145ea33fde00 size:512 thread:1
allocated block device:0x145ea33fe000 size:1024 thread:1
allocated block device:0x145ea33fe400 size:512 thread:1
allocated block device:0x145ea33fe600 size:2048 thread:1
allocated block device:0x145ea33fee00 size:512 thread:1
allocated block device:0x145ea33ff000 size:2048 thread:1
allocated block device:0x145ea33ff800 size:512 thread:1
allocated block device:0x145ea33ffa00 size:512 thread:1
allocated block device:0x145ea33ffc00 size:512 thread:1
allocated block device:0x145ea33ffe00 size:512 thread:1
allocated block device:0x145ea3800000 size:528896 thread:1
allocated block device:0x145ea3881200 size:528896 thread:1
allocated block device:0x145ea3902400 size:199680 thread:1
allocated block device:0x145ea3933000 size:199680 thread:1
allocated block device:0x145ea3963c00 size:199680 thread:1
allocated block device:0x145ea3994800 size:199680 thread:1
allocated block device:0x145ea39c5400 size:199680 thread:1
allocated block device:0x145ea39f6000 size:512 thread:1
allocated block device:0x145ea39f6200 size:2560 thread:1
allocated block device:0x145ea39f6c00 size:512 thread:1
allocated block device:0x145ea39f6e00 size:2560 thread:1
allocated block device:0x145ea39f7800 size:512 thread:1
allocated block device:0x145ea39f7a00 size:2560 thread:1
allocated block device:0x145ea39f8400 size:512 thread:1
allocated block device:0x145ea39f8600 size:2560 thread:1
allocated block device:0x145ea39f9000 size:512 thread:1
allocated block device:0x145ea39f9200 size:2560 thread:1
allocated block device:0x145ea39f9c00 size:512 thread:1
allocated block device:0x145ea39f9e00 size:1024 thread:1
allocated block device:0x145ea39fa200 size:512 thread:1
allocated block device:0x145ea39fa400 size:2560 thread:1
allocated block device:0x145ea39fae00 size:512 thread:1
allocated block device:0x145ea39fb000 size:512 thread:1
allocated block device:0x145ea39fb200 size:512 thread:1
allocated block device:0x145ea39fb400 size:512 thread:1
allocated block device:0x145ea39fb600 size:512 thread:1
allocated block device:0x145ea39fb800 size:512 thread:1
allocated block device:0x145ea39fba00 size:512 thread:1
allocated block device:0x145ea3a00000 size:199680 thread:1
allocated block device:0x145ea3a30c00 size:199680 thread:1
allocated block device:0x145ea3a61800 size:199680 thread:1
FATAL ERROR: variable in data clause is partially present on the device: name=rhs1
 file:/home/jsera.lee/lica/LICA/Canopy/3_Main/_modi_main/src/lica.f90 lhsu line:1799

Sample 2 - code

!$acc enter data copyin ( AK(1:M1, 1:M3), BK(1:M1, 1:M3), CK(1:M1, 1:M3), GK(1:M1, 1:M3) )
!$acc enter data copyin ( AI(1:M2, 1:M1), BI(1:M2, 1:M1), CI(1:M2, 1:M1), GI(1:M2, 1:M1) )
!$acc enter data copyin ( AJ(1:M1, 1:M2), BJ(1:M1, 1:M2), CJ(1:M1, 1:M2), GJ(1:M1, 1:M2) )
 
!!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 3) )
!$acc enter data copyin ( FIXIL(1:M1M), FIXIU(1:M1M), FIXJL(1:M2M), FIXJU(1:M2M), FIXKL(1:M3M), FIXKU(1:M3M) )
!$acc enter data copyin ( TNU (0:M1, 0:M2, 0:M3), TNUX(0:M1, 0:M2, 0:M3), TNUY(0:M1, 0:M2, 0:M3), TNUZ(0:M1, 0:M2, 0:M3) )
!$acc enter data copyin ( AKW(1:M3), CKW(1:M3), AKUV(1:M3), CKUV(1:M3) )
!$acc enter data copyin ( XMP(0:M1), YMP(0:M2), ZMP(0:M3) )
!$acc enter data copyin ( X(M1), Y(M2), Z(M3) )
 
!=====ADI STARTS
if (N3M /= 1) then
    !-----Z-DIRECTION
    do J=1,N2M 
        !$acc parallel loop collapse(2) copy(RHS1) present(AK, AKUV, TNUY, FIXKL, CK, CKUV, FIXKU, BK, GK)
        do K=1,N3M
        do I=I_BGPX,N1M
            AK(I,K)=AKUV(K)*(1.+CRE*TNUY(I,J,K)) *(1.-FIXKL(K)*FLOAT(KUB))
            CK(I,K)=CKUV(K)*(1.+CRE*TNUY(I,J,K+1))*(1.-FIXKU(K)*FLOAT(KUT))
             
            IF (IVELSRC .EQ. 1) THEN
                !PRIK=PERI(X(I),YMP(J),ZMP(K))
                PRIK=0. 
                BK(I,K)=ACOEFI*(1+ACOEF*PRIK)-AK(I,K)-CK(I,K)
                GK(I,K)=ACOEFI*(1+ACOEF*PRIK)*RHS1(I,J,K,1)
            ELSE
                BK(I,K)=ACOEFI-AK(I,K)-CK(I,K)
                GK(I,K)=ACOEFI*RHS1(I,J,K,1)
            ENDIF
        enddo
        enddo
        !$acc end loop
    enddo
endif

Sample 2 - compile message

lhsu:
   1783, Generating enter data copyin(ck(1:m1,1:m3),bk(1:m1,1:m3),ak(1:m1,1:m3),gk(1:m1,1:m3))
   1784, Generating enter data copyin(ci(1:m2,1:m1),bi(1:m2,1:m1),ai(1:m2,1:m1),gi(1:m2,1:m1))
   1785, Generating enter data copyin(cj(1:m1,1:m2),bj(1:m1,1:m2),aj(1:m1,1:m2),gj(1:m1,1:m2))
   1789, Generating enter data copyin(fixkl(1:m3m),fixju(1:m2m),fixjl(1:m2m),fixiu(1:m1m),fixil(1:m1m),fixku(1:m3m))
   1790, Generating enter data copyin(tnuy(0:m1,0:m2,0:m3),tnux(0:m1,0:m2,0:m3),tnu(0:m1,0:m2,0:m3),tnuz(0:m1,0:m2,0:m3))
   1791, Generating enter data copyin(akw(1:m3),ckuv(1:m3),akuv(1:m3),ckw(1:m3))
   1792, Generating enter data copyin(ymp(0:m2),xmp(0:m1),zmp(0:m3))
   1793, Generating enter data copyin(y(m2),z(m3),x(m1))
   1799, Generating copy(rhs1(:,:,:,:)) [if not already present]
         Generating present(ak(:,:),tnuy(:,:,:),gk(:,:),ckuv(:),bk(:,:),ck(:,:),fixkl(:),akuv(:),fixku(:))
         Generating NVIDIA GPU code
       1800, !$acc loop gang, vector(128) collapse(2) ! blockidx%x threadidx%x
       1801,   ! blockidx%x threadidx%x collapsed
   1898, Generating exit data delete(ck(1:m1,1:m3),bk(1:m1,1:m3),ak(1:m1,1:m3),gk(1:m1,1:m3))
   1899, Generating exit data delete(ci(1:m2,1:m1),bi(1:m2,1:m1),ai(1:m2,1:m1),gi(1:m2,1:m1))
   1900, Generating exit data delete(cj(1:m1,1:m2),bj(1:m1,1:m2),aj(1:m1,1:m2),gj(1:m1,1:m2))
   1903, Generating exit data copyout(fixkl(1:m3m),fixju(1:m2m),fixjl(1:m2m),fixiu(1:m1m),fixil(1:m1m),fixku(1:m3m))
   1904, Generating exit data copyout(tnuy(0:m1,0:m2,0:m3),tnux(0:m1,0:m2,0:m3),tnu(0:m1,0:m2,0:m3),tnuz(0:m1,0:m2,0:m3))
   1905, Generating exit data copyout(akw(1:m3),ckuv(1:m3),akuv(1:m3),ckw(1:m3))
   1906, Generating exit data copyout(ymp(0:m2),xmp(0:m1),zmp(0:m3))
   1907, Generating exit data copyout(z(m3),y(m2),x(m1))

Sample 2 - result
No error; it runs without issues.

Hi leejsera,

Let’s back up a bit. Every compute region (parallel or kernels) has an implicit data region. If a variable used in the compute region is not listed in a data clause that is visible to the compiler, it has an implicit data attribute and the compiler must add a data clause for it. For aggregate types such as arrays, the compiler applies a “copy” clause.

However, “copy” clauses use “present_or” semantics, meaning the OpenACC runtime checks whether the variable is already present on the device. If it is, a pointer to the data on the device is passed to the compute kernel. Only if it is not present is the data copied.

Putting the variable in a “present” clause is useful and considered by me to be best practice, but it is not required. It has the effect that the compiler does not need to add an implicit copy but instead only checks that the variable is present on the device at runtime. If it is not present, the executable will error.

Putting a compute region within a structured data region (i.e. “!$acc data” / “!$acc end data”) does make the variables visible to the compiler, so it does not need to add an implicit copy. However, due to their nature, unstructured data regions (i.e. “!$acc enter data” / “!$acc exit data”) are not visible. The scope of an unstructured data region is defined at runtime, while the scope of a structured data region is defined at compile time.
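As a minimal sketch of the behavior described above (the module, subroutine, and variable names are hypothetical and not taken from the original code), the first loop below relies on the implicit copy and the second uses an explicit “present”. Both reuse the device copy created by the “enter data” directive. Built without -acc, the directives are treated as comments and the code runs on the host.

```fortran
module present_demo
  implicit none
contains
  ! Computes a = 2*a + 1, on the device when built with -acc.
  subroutine scale_and_shift(a, n)
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    integer :: i

    ! Unstructured data region: the whole array a(1:n) is created and copied in.
    !$acc enter data copyin(a(1:n))

    ! No data clause for "a": the compiler adds an implicit copy(a), which has
    ! present_or semantics, so the existing device copy is reused.
    !$acc parallel loop
    do i = 1, n
      a(i) = 2.0 * a(i)
    end do

    ! Explicit present(a): no implicit copy is added; the runtime only checks
    ! that the whole array is on the device and errors out otherwise.
    !$acc parallel loop present(a)
    do i = 1, n
      a(i) = a(i) + 1.0
    end do

    ! Copy the result back to the host and release the device copy.
    !$acc exit data copyout(a(1:n))
  end subroutine scale_and_shift
end module present_demo
```

With -Minfo=accel, I would expect the first loop to report a copy “[if not already present]” and the second to report a “present” for a, similar to the messages in the two samples.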

To answer your questions more directly:

Q1.
I don’t understand why I have to use the present clause, since I already use !$acc enter data copyin for RHS1.

“present” is not required (except in a few cases); it is just best practice. Because the data in an “enter/exit data” region is not visible to the compiler at compile time, the implicit copy is added when “present” is omitted. Semantically, using “present” and relying on the implicit copy are equivalent within an unstructured data region.

In my opinion, I don’t need the present clause on the parallel loop. Is that right?

Correct.

The only time it is really needed is for deep copies, and more for C/C++ than Fortran. In an effort not to confuse things, I won’t go into depth on this and will just state that, for your code, “present” is not required.

When I don’t use the present clause, it looks like RHS1 is not copied at the parallel loop.
A copy is generated for the enter data directive, but not at the parallel loop:
Generating enter data copyin(rhs1(0:m1,0:m2,0:m3,3))

I’m a bit unclear what you’re asking here. The feedback message implies RHS1 is in an “enter data” directive. What I’d expect, if you took RHS1 out of the “present” clause but kept it in the “enter data” directive, is that the compiler would print a message about putting it in an implicit copy.

A “partially present” error means that a variable is already on the device, but the present check finds that the size being requested differs from what is on the device.

Your enter data directive for RHS1 is this:

!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 3) )

Note the “3” in the fourth dimension. Since you don’t have a section, i.e. “:”, “:3” or “1:3”, only one element from this dimension is copied.

When you do a “present(RHS1)” it’s checking if the whole array is present, which it is not. Only one third of it is.

Also, note that where RHS1 is used, “GK(I,K)=ACOEFI*(1+ACOEF*PRIK)*RHS1(I,J,K,1)”, you have “1” in the fourth dimension, which doesn’t exist on the device.

Likely changing:

!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 3) )

to

!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 1:3) )

or

!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, :) )

will fix the issue.
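The present table dump in the question is consistent with this. As a rough check, the hypothetical helper below computes the byte size of the section RHS1(0:m1, 0:m2, 0:m3, 1:n4). The inputs are assumptions, not values stated in the thread: 8-byte reals, M1=257, M2=97, and M3=257 (inferred from the 2064-, 784-, and 2064-byte xmp, ymp, and zmp entries), and a fourth dimension of extent 3.

```fortran
module section_size
  use iso_fortran_env, only: int64
  implicit none
contains
  ! Bytes occupied by RHS1(0:m1, 0:m2, 0:m3, 1:n4) with elem_bytes per element.
  ! The first three dimensions start at 0, so each has extent m + 1.
  pure function rhs1_bytes(m1, m2, m3, n4, elem_bytes) result(nbytes)
    integer, intent(in) :: m1, m2, m3, n4, elem_bytes
    integer(int64) :: nbytes
    nbytes = int(m1 + 1, int64) * int(m2 + 1, int64) * int(m3 + 1, int64) &
           * int(n4, int64) * int(elem_bytes, int64)
  end function rhs1_bytes
end module section_size
```

With those assumed values, rhs1_bytes(257, 97, 257, 1, 8) evaluates to 52186176, the size of the rhs1 entry in the present table, and rhs1_bytes(257, 97, 257, 3, 8) evaluates to 156558528, the host size reported in the “partially present” message. The device holds one third of the array.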

Our compiler used to treat “3” as if it were “1:3”, but we needed to change this a few years ago. While nice for the user, it’s not standard-compliant and we got complaints. We have a flag, “-gpu=implicitsections”, to revert to the old behavior, but I’d still suggest explicitly adding the section.

Hope this helps,
Mat


I really appreciate your kind, clear answers.
Now I understand the ‘present’ clause.

I have a question about the last paragraph:

Our compiler used to treat “3” as if it were “1:3”, but we needed to change this a few years ago. While nice for the user, it’s not standard-compliant and we got complaints. We have a flag, “-gpu=implicitsections”, to revert to the old behavior, but I’d still suggest explicitly adding the section.

Is it correct that the following changes are needed to make the code correct?

from.
!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 3) )
to.
!$acc enter data copyin ( RHS1(0:M1, 0:M2, 0:M3, 1:3) )

and also
from.
!$acc enter data copyin ( X(M1), Y(M2), Z(M3) )
to.
!$acc enter data copyin ( X(1:M1), Y(1:M2), Z(1:M3) )

I understand the RHS1 case.
But should X(M1) also be changed to X(1:M1)?

But should X(M1) also be changed to X(1:M1)?

Correct. “X(M1)” says to create and copy a single element of X. “X(1:M1)” says to create and copy all elements from 1 to M1.

