The basic problem here is that your code is not parallelizable. If we remove the “parallel(8)” clause from the “!$acc region do” directive, the compiler correctly detects that the code is not parallel and won’t generate a GPU kernel.
pgf90 test.f90 -ta=nvidia -Minfo=accel
9, No parallel kernels found, accelerator region ignored
10, Scalar last value needed after loop for x
    Loop carried scalar dependence for x
11, Accelerator restriction: scalar variable live-out from loop: x
However, when you use the “parallel” clause, you are telling the compiler to go ahead and parallelize the loop anyway, even though each iteration depends on the value of x from the previous one. Unfortunately, this leads to nonsensical PTX code and the error from ptxas.
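For illustration, a loop of roughly the following shape produces the -Minfo messages shown earlier. This is only a minimal sketch: the loop body (adding 1.0 on each pass) is an assumption on my part, not your exact code.

```fortran
program main
  implicit none
  integer :: i
  integer, parameter :: N = 1000000
  real :: x = 0.0
  ! Every iteration reads the x written by the previous iteration
  ! (a loop-carried scalar dependence), and x is used after the loop
  ! (it is live-out), so the iterations cannot run independently.
!$acc region do
  do i = 1, N
     x = x + 1.0   ! hypothetical loop body, assumed for illustration
  end do
  print *, x
end program main
```

Without the “parallel” clause the compiler simply ignores the accelerator region for a loop like this and runs it on the host.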
To fix, promote x to an array so that each iteration writes only its own element, and then do the reduction on the host. Note that we will support reductions on the GPU in the future, but this support is not available in the 9.0 release.
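program main
! include 'accel_lib.h'
implicit none
integer :: i
integer,parameter :: N=1000000
real :: x=0.0
real :: xarr(N)
! Each iteration writes only its own element of xarr, so the loop is parallel.
!$acc region do
do i=1,N
   xarr(i) = 1.0   ! placeholder value: substitute your per-iteration term
end do
! Do the reduction on the host.
x = sum(xarr)
print *, x
end program main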
Note that the directory “$PGI/linux86-64/9.0-1/etc/samples” contains several accelerator examples which might be helpful.