How do I compile a !$acc program?

I read the PGI 9.0.1 manual and wrote a sample program:

! include 'accel_lib.h'
program main
use accel_lib
implicit none
integer :: i
integer,parameter :: N=100000000
real :: x=0.0
!$acc region do parallel(8), private(x,i)
do i=1,N
x=x+i
enddo
print *, x
end program main

Then I compiled it:
[~]$ pgf95 1.f90 -ta=nvidia
ptxas /tmp/pgaccJTdfPuFB79TC.ptx, line 81; error : State space incorrect for instruction ‘st’
ptxas fatal : Ptx assembly aborted due to errors
PGF90-F-0000-Internal compiler error. pgnvd job exited with nonzero status code 0 (1.f90: 13)

What's wrong?

Thanks!

[~]$ pgaccelinfo
Device Number: 0
Device Name: GeForce 8400 SE
Device Revision Number: 1.1
Global Memory Size: 267714560
Number of Multiprocessors: 1
Number of Cores: 8
Concurrent Copy and Execution: No
Total Constant Memory: 65536
Total Shared Memory per Block: 16384
Registers per Block: 8192
Warp Size: 32
Maximum Threads per Block: 8192
Maximum Block Dimensions: 512 x 512 x 64
Maximum Grid Dimensions: 65535 x 65535 x 1
Maximum Memory Pitch: 262144B
Texture Alignment 256B
Clock Rate: 918 MHz

Hi l4linux,

The basic problem here is that your code is not parallelizable: every iteration updates the scalar x using the value left by the previous iteration. If we remove the “parallel(8)” clause from the “!$acc region do” directive, the compiler correctly detects that the code is not parallel and won’t generate a GPU kernel.

pgf90 test.f90 -ta=nvidia -Minfo=accel
main:
      9, No parallel kernels found, accelerator region ignored
     10, Scalar last value needed after loop for x
         Loop carried scalar dependence for x
     11, Accelerator restriction: scalar variable live-out from loop: x

However, when you use the “parallel” clause, you are telling the compiler to go ahead and parallelize the code anyway. Unfortunately, this leads to some nonsensical PTX code and the error from ptxas.

To fix this, promote x to an array and then do the reduction on the host. Note that we will support reductions on the GPU in the future, but this support is not available in the 9.0 release.

 cat test.f90
! include 'accel_lib.h'
program main
use accel_lib
implicit none
integer :: i
integer,parameter :: N=1000000
real :: x=0.0
real :: xarr(N)
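! Each iteration writes only its own element xarr(i), so the loop is parallel.
!$acc region do
do i=1,N
   xarr(i)=i
enddo

! Reduce on the host: this loop carries x from one iteration to the next.
do i=1,N
   x=x+xarr(i)
end do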
print *, x
end program main
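The same idea (give each independent piece of work its own array slot, then combine the slots on the host) can also be sketched with per-chunk partial sums. This is plain Fortran with no accelerator directives, and I have not checked whether the 9.0 compiler would offload a nested loop of this shape. The names partial_sums, nchunk, partial and total are made up for illustration. It accumulates in double precision, which is a choice of this sketch: the exact answer, 500000500000, cannot be represented exactly in a default single-precision real.

```fortran
program partial_sums
  implicit none
  integer, parameter :: n = 1000000
  integer, parameter :: nchunk = 8            ! hypothetical chunk count
  integer, parameter :: dp = kind(1.0d0)      ! double precision kind
  integer :: c, i, lo, hi
  real(dp) :: partial(nchunk), total

  ! Each chunk writes only partial(c), so nothing is carried between chunks.
  do c = 1, nchunk
     lo = (c - 1) * (n / nchunk) + 1
     hi = c * (n / nchunk)
     if (c == nchunk) hi = n                  ! last chunk takes any remainder
     partial(c) = 0.0_dp
     do i = lo, hi
        partial(c) = partial(c) + real(i, dp)
     end do
  end do

  ! Final reduction on the host, as in the example above.
  total = sum(partial)

  ! Self-check against the closed form n*(n+1)/2.
  if (total /= 500000500000.0_dp) stop 1
  print *, total
end program partial_sums
```

The inner loop still carries a dependence on partial(c), but the outer loop over chunks does not, and that outer loop is the one a parallelizing compiler would need to treat as independent.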

Note that the directory “$PGI/linux86-64/9.0-1/etc/samples” contains several accelerator examples which might be helpful.

  • Mat

Thank you very much!