Question: The OpenACC directive "!$acc parallel loop" does not work in Community Edition 19.4

I just installed Community Edition 19.4 and ran a small OpenACC test:

program testOpenAcc
implicit none

integer::i
integer::N
real(8)::a(10000)
a=0.0D+00
N=1000
write(*,*)a(1:5)
write(*,*)'Hello world!'
	!$acc data copyin(N,i,a(1:N))
	!$acc parallel loop
	do i=1,N
		a(i)=1.0D+00
	enddo
	!$acc end parallel loop
	
	!$acc update self(a(1:N))
	!$acc end data
write(*,*)a(1:5)
end program testOpenAcc

The output was as follows:

0.000000000000000 0.000000000000000 0.000000000000000
0.000000000000000 0.000000000000000
Hello world!
Failing in Thread:1
call to cuStreamSynchronize returned error 719: Launch failed (often invalid pointer dereference)

Failing in Thread:1
call to cuMemFreeHost returned error 719: Launch failed (often invalid pointer dereference)


The directive ‘!$acc parallel loop’ works with Community Edition 18.4 but fails with the current version, Community Edition 19.4.

Can you help find the reason?

Thanks a lot!

Jingbo

Hi Jingbo,

It looks like the compiler isn’t ignoring the “i” in the data clause. It should ignore it: the loop index variable is implicitly private, and making it shared (the default behavior of a data clause) causes this type of runtime error. I filed problem report TPR#27410 and sent it to our engineers to investigate.

The workaround is to remove “i” from the copyin clause. N isn’t needed there either, so I removed it as well.

% cat test.f90
program testOpenAcc
implicit none

integer::i
integer::N
real(8)::a(10000)
a=0.0D+00
N=1000
write(*,*)a(1:5)
write(*,*)'Hello world!'
        !$acc data copyin(a(1:N))
        !$acc parallel loop
        do i=1,N
                a(i)=1.0D+00
        enddo
        !$acc end parallel loop

        !$acc update self(a(1:N))
        !$acc end data
write(*,*)a(1:5)
end program testOpenAcc
% pgf90 test.f90 -ta=tesla -Minfo=accel -V19.4; a.out
testopenacc:
     11, Generating copyin(a(:n))
     12, Generating Tesla code
         13, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
     18, Generating update self(a(:n))
    0.000000000000000         0.000000000000000         0.000000000000000
    0.000000000000000         0.000000000000000
 Hello world!
    1.000000000000000         1.000000000000000         1.000000000000000
    1.000000000000000         1.000000000000000
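
A possible variant of the same workaround, shown only as a sketch: since "a" is needed on both the host and the device, a single copy clause on the data region could stand in for the copyin plus update self pair. The program name testOpenAccCopy is made up for illustration, and this version has not been checked against 19.4. As in the workaround above, the loop index "i" is kept out of every data clause.

```fortran
program testOpenAccCopy
implicit none

integer :: i
integer :: N
real(8) :: a(10000)

a = 0.0D+00
N = 1000
write(*,*) a(1:5)

! copy: a(1:N) is copied to the device on entry to the data region
! and copied back to the host at "end data", so no separate
! "update self" is needed. The loop index i is deliberately not listed.
!$acc data copy(a(1:N))
!$acc parallel loop
do i = 1, N
   a(i) = 1.0D+00
enddo
!$acc end parallel loop
!$acc end data

write(*,*) a(1:5)
end program testOpenAccCopy
```

It would be built the same way as above, e.g. pgf90 test.f90 -ta=tesla -Minfo=accel. Because the copy back only happens at the end of the data region, the host copy of "a" should not be read before "end data".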

Hope this helps,
Mat

Dear Mat,

Thank you for your comments!

It works very well!

Regards,

Jingbo

TPR#27410 is resolved in 20.1.