APM PGI 10.5 - !$acc region

I have a question.

!$acc region copyin(Cs), copyout(Ds)
Ds = Cs
DO i = 1, n
  DO j = 1, n
   ...
  ENDDO
ENDDO
!$acc end region

My question is whether the statement Ds=Cs is performed on the accelerator. If not, should I do something like this instead?

!$acc region copyin(Cs), copyout(Ds)
DO i = 1, n
  Ds(i,:) = Cs(i,:)
  DO j = 1, n
   ...
  ENDDO
ENDDO
!$acc end region

Thanks,
Tuan

Hi Tuan,

Since “Ds=Cs” is a whole-array assignment, the compiler treats it as an implied do loop over the array elements, so it will be accelerated.

% cat test.f90
program test

real, dimension(1024,1024) :: Ds, Cs
integer :: i,j,n

n = 1024
Cs = 0.231

!$acc region copyin(Cs), copyout(Ds)
Ds = Cs
DO i = 1, n
  DO j = 1, n
     Ds(i,j) = Ds(i,j) * (i+j)
  ENDDO
ENDDO
!$acc end region

print *, Cs(1,1), Ds(1,1), Cs(1024,1024), Ds(1024,1024)

end program test

% pgf90 -ta=nvidia -Minfo=accel test.f90 -V10.5
test:
      9, Generating copyin(cs(:,:))
         Generating copyout(ds(:,:))
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
     10, Loop is parallelizable    <<<<< Implied Do loop for Ds=Cs
         Accelerator kernel generated
         10, !$acc do parallel, vector(16)
             CC 1.0 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
             CC 1.3 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
     11, Loop is parallelizable
     12, Loop is parallelizable
         Accelerator kernel generated
         11, !$acc do parallel, vector(16)
         12, !$acc do parallel, vector(16)
             CC 1.0 : 8 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
             CC 1.3 : 8 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
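
To spell out why the compiler reports a loop at line 10: a whole-array assignment such as Ds = Cs is equivalent to an explicit loop nest over every element. Here is a minimal host-only sketch of that equivalence. The array Es, the program name, and the smaller size n = 64 are assumptions made for illustration and are not part of the original test; no accelerator directives are needed to check the result.

```fortran
program assign_equiv
  implicit none
  integer, parameter :: n = 64          ! small size chosen for illustration
  real, dimension(n,n) :: Cs, Ds, Es    ! Es is a hypothetical extra array
  integer :: i, j

  Cs = 0.231

  ! Whole-array assignment, as in the original question
  Ds = Cs

  ! Explicit loop nest performing the same element-by-element copy
  do j = 1, n
    do i = 1, n
      Es(i,j) = Cs(i,j)
    end do
  end do

  ! Both forms should leave identical contents, so this should print T
  print *, all(Ds == Es)
end program assign_equiv
```

Because each element is copied independently, either form has no loop-carried dependence, which is consistent with the "Loop is parallelizable" message above.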

Hope this helps,
Mat

Thanks, Mat.
I forgot to check the compiler’s output.

Tuan

Mat,

I know you knew this was coming: how can I get those nice cubin-like status messages out of pgfortran?

Hi Matt,

CC 1.0 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
CC 1.3 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy

These are new in 10.5. We took your advice and added the output of “--ptxas-options=-v” to the “-Minfo=accel” messages.

Sorry, I should have updated your post to let you know.

Thanks,
Mat

Ah. The real reason I never saw it is that I was still running 10.4; I hadn’t changed the 2010 symlinks.

Thanks!