Fortran accelerator problem

robschi · January 25, 2010, 10:53pm

Consider the following trivial program:

        program nv
        parameter(idim=10000)
        real a(idim),b(idim)
        a(1:idim)=0.0
        b(1:idim)=0.0
!$acc region
        do k=1,5000
          do i=1,idim
            a(i)=a(i)+ sin(i*1.0)*cos(i*1.0)
          end do
        end do
        b(1:idim)=a(1:idim)*0.71
        a(1)=b(idim)
!$acc end region
        write(*,*) a(1),a(idim),b(idim)
        end program nv

When I compile it with
pgf90 nv.f90 and run it I get

time ./a.out
1033.070 1455.028 1033.070

real 0m3.451s
user 0m3.403s
sys 0m0.003s

When I try to use the GPU and compile it with
pgf90 -ta=nvidia,cc11 nv.f90
I get

time ./a.out
2273.086 1455.028 1033.070

real 0m0.215s
user 0m0.114s
sys 0m0.097s

Note that the result is different: a(1) is 2273.086 instead of 1033.070. It seems that the assignment
a(1)=b(idim)
is not executed.

Is this correct behaviour? Are simple assignments not allowed in GPU code?

The compiler version is 10.1, and everything runs on a Fedora 11 system.
Thanks in advance.
Robert

MatColgrove · January 29, 2010, 8:17pm

Hi Robert,

Currently, kernels are only created from loops, so your scalar assignment is being missed. This is a known deficiency that our engineers are working to address. To help things along, I sent them a report (TPR#16548).
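
A quick arithmetic check, using only the values printed in the original post, is consistent with this explanation: the GPU run's a(1) matches what the loop alone would leave there, so the final assignment evidently never overwrote it. The closed form uses the identity sin x cos x = (1/2) sin 2x; the small gap to the printed 2273.086 is assumed to come from single-precision accumulation over 5000 iterations.

```latex
\begin{align*}
a(1)_{\text{loop only}} &= 5000\,\sin(1)\cos(1) = 2500\,\sin(2) \approx 2273.24 \\
b(\mathrm{idim}) &= 0.71 \times a(\mathrm{idim}) = 0.71 \times 1455.028 \approx 1033.070
\end{align*}
```

The host run prints the second value for a(1), which is what the program asks for.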

To work around this limitation, either perform the scalar assignment on the host or put the scalar assignment in a loop that’s executed only once.

For example:

        program nv
        parameter(idim=10000)
        real a(idim),b(idim)
        a(1:idim)=0.0
        b(1:idim)=0.0
!$acc region
        do k=1,5000
          do i=1,idim
            a(i)=a(i)+ sin(i*1.0)*cos(i*1.0)
          end do
        end do
        b(1:idim)=a(1:idim)*0.71
!$acc end region
        a(1)=b(idim)
        write(*,*) a(1),a(idim),b(idim)
        end program nv

or

        program nv
        parameter(idim=10000)
        real a(idim),b(idim)
        a(1:idim)=0.0
        b(1:idim)=0.0
!$acc region
        do k=1,5000
          do i=1,idim
            a(i)=a(i)+ sin(i*1.0)*cos(i*1.0)
          end do
        end do
        b(1:idim)=a(1:idim)*0.71
        do k=1,1
          a(k)=b(idim)
        enddo
!$acc end region
        write(*,*) a(1),a(idim),b(idim)
        end program nv

Hope this helps,
Mat

tull · November 18, 2011, 2:23am

Robert,

This was logged as TPR 16548, and our QA group has verified that this is
working as of the 11.6 release.

Thanks for the report.

regards,
dave
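
To confirm the behaviour on a particular compiler release, a minimal self-checking variant of Robert's program might look like the sketch below. The program name nvcheck and the printed messages are illustrative and not from the thread. The check relies on the fact that a(1) and b(idim) hold the same value only if the scalar assignment inside the region was actually executed. A compiler that does not recognise the !$acc region directive should treat it as a comment and run everything on the host.

```fortran
        program nvcheck
        implicit none
        integer, parameter :: idim = 10000
        real a(idim), b(idim)
        integer i, k
        a(1:idim) = 0.0
        b(1:idim) = 0.0
!$acc region
        do k = 1, 5000
          do i = 1, idim
            a(i) = a(i) + sin(i*1.0)*cos(i*1.0)
          end do
        end do
        b(1:idim) = a(1:idim)*0.71
        ! Scalar assignment that was skipped in the original report
        a(1) = b(idim)
!$acc end region
        ! If the assignment ran, both elements hold the same stored value
        if (a(1) == b(idim)) then
          write(*,*) 'scalar assignment executed'
        else
          write(*,*) 'scalar assignment skipped: a(1) =', a(1)
        end if
        end program nvcheck
```

Built with the same command as in the original post (pgf90 -ta=nvidia,cc11), the second message would indicate that the limitation is still present in that release.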
