Host memory usage increasing

I’m experimenting with the PGI Accelerator directives. I noticed that when I run the accelerator version of the code below, host memory usage increases linearly until the program finishes.

Could someone explain why this happens and which data clauses I should use to fix it?

      PROGRAM TEST
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
C
      PARAMETER (NAT=27,NRADPT=96,NLEBPT=1202,NGRIDPT=96*1202)
      PARAMETER (ONE=1.0D+00)
C
      DIMENSION
     > GTEMPA(NAT,NGRIDPT),
     > GTEMPB(NAT,NAT),
     > GTEMPC(NAT)
C
      DO IRAD=1,NRADPT
        DO IANG=1,NLEBPT
          IPTME=(IRAD-1)*NLEBPT+IANG
!$acc data region copyout(gtempa(1:nat,iptme))
!$acc region local(gtempb(1:nat,1:nat),gtempc(1:nat))
          DO IATM=1,NAT
            GTEMPC(IATM)=ONE
            DO JATM=1,NAT
              GTEMPB(JATM,IATM)=ONE
              IF(IATM .EQ. JATM) CYCLE
              GTEMPB(JATM,IATM)=DBLE(JATM+IATM)
            ENDDO
            GTEMPC(IATM)=PRODUCT(GTEMPB(1:NAT,IATM),1)
          ENDDO
          GTEMPA(1:NAT,IPTME)=GTEMPC(1:NAT)
!$acc end region
!$acc end data region
        ENDDO
      ENDDO
      WRITE(999,*) GTEMPA
      END



     15, Generating copyout(gtempa(:,iptme))
     16, Generating local(gtempb(:,:))
         Generating local(gtempc(:))
     17, Loop is parallelizable
         Accelerator kernel generated
         17, !$acc do parallel, vector(27) ! blockidx%x threadidx%x
             Using register for 'gtempc'
     19, Loop is parallelizable
     24, product reduction inlined
         Loop is parallelizable
     26, Loop is parallelizable
         Accelerator kernel generated
         26, !$acc do parallel, vector(27) ! blockidx%x threadidx%x

Hi sslgamess,

What flags are you using? I’m able to recreate the issue but only when I use the “time” profiling sub-option (-ta=nvidia,time). It seems fine without the “time” sub-option.

I’ve sent a report to our engineers (TPR#18371) to investigate whether this is a memory leak in the profiling code, expected behavior, or something else.

Thanks,
Mat

Hi Mat,

Yes, I am using the time flag.

Thanks for submitting the TPR.

We did indeed find a memory leak for programs using the -ta=nvidia,time option. It turned out that the runtime was creating new cudaEvents without recycling the old ones. That will be fixed in the 12.2 release.

Thanks.

Thanks for the update, Michael.

TPR 18371 was fixed in the 12.6 release.
Sorry about the delay.

dave