pgprof Internal Error with acc data local & update host

Hello,

I’m using the latest version, pgi/12.8 (but I think all versions are impacted).

I get a strange “Internal Error” from “pgprof”, which I think is related to the “update” clause.
(If I change the code slightly to remove this update, everything works.)

Perhaps I’m doing something wrong with this update clause?

Compiling this simple example and running it under “pgcollect” works fine,
but “pgprof” then shows this error in a pop-up window
when I try to browse the profile of the main program:

pgprof: Internal Error. Couldn’t find line information for query
pgprof.out@simple_local_update_host.f90@simple_local_update_host@9

Here is the source, “simple_local_update_host.f90”
(extracted from the big real code ;-) ):

PROGRAM simple_local_update_host

  IMPLICIT NONE

  INTEGER , PARAMETER :: n=10 , MAXIT = 3
  INTEGER  :: j,it
  REAL     :: a(n)

  !$acc data region local (a)

  !$acc region
  DO  j = 1,n
     a(j) = j*1.0
  END DO
  !$acc end region

  do it=1,MAXIT

     !$acc region 
     DO  j = 1,n
        a(j) = a(j) * 2.0
     END DO
     !$acc end region

     !$acc update host (a(1:1),a(n:n))
     print*,"a=",a(1),a(n)

  end do

  !$acc end data region 

END PROGRAM simple_local_update_host

The compilation:

pgf90 -g -O0 -ta=nvidia,cc20  -Minfo=acc,ccff simple_local_update_host.f90 -o simple_local_update_host
simple_local_update_host:
      9, Generating local(a(:))
     11, Generating local(a(:))
         Generating compute capability 2.0 binary
     12, Loop is parallelizable
         Accelerator kernel generated
         12, !$acc loop gang, vector(32) ! blockidx%x threadidx%x
             CC 2.0 : 8 registers; 0 shared, 40 constant, 0 local memory bytes
     19, Generating local(a(:))
         Generating compute capability 2.0 binary
     20, Loop is parallelizable
         Accelerator kernel generated
         20, !$acc loop gang, vector(32) ! blockidx%x threadidx%x
             CC 2.0 : 8 registers; 0 shared, 40 constant, 0 local memory bytes
     25, Generating update host(a(10))
         Generating update host(a(:1))

The run under “pgcollect”:

pgcollect simple_local_update_host
 a=    2.000000        20.00000    
 a=    4.000000        40.00000    
 a=    8.000000        80.00000    
target process has terminated, writing profile data

A+

Juan

Hi Juan,

Thanks for the report. I was able to reproduce this issue and have sent a report (TPR#18908) to our Tools engineers. Your use of the update directive is correct and this looks like a problem with either pgcollect or pgprof. I’m not sure which.

In the meantime, you can view basic profiling information by setting the environment variable “PGI_ACC_TIME=1” instead of using pgcollect.
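
A minimal sketch of that workaround, assuming a bash-like shell (with csh/tcsh the equivalent would be `setenv PGI_ACC_TIME 1`):

```shell
# Ask the PGI accelerator runtime to print a basic timing summary
# (kernel and data-transfer times) when the program exits.
export PGI_ACC_TIME=1
```

With the variable set, run ./simple_local_update_host directly rather than under pgcollect; the summary is printed when the program terminates.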

Thanks again,
Mat

Thank you, Mat.

A+

Juan

Hello Mat,

I was able to test the simple example with an older pgi/10.9 compiler,
and the problem disappears.

With the example compiled with pgi/11.10, it comes back.

So the problem was probably introduced in the pgi/11.X releases.

So it must be a compiler bug: a profile generated with the pgi/10.9 compiler
can be read by pgprof from a later version, 11.10 or 12.8,
without any problem.

A+

Juan

Thanks, Juan. I added this information to TPR#18908.

- Mat