Code inside "if" statement in a parallel region

Hi,

I am trying to compile two almost identical versions of the same parallel program with PVF 14.10. The only difference is that in one of them I added a line of code (“ielem=2”) inside an “if” statement (placing the same line outside the “if” causes no problem). Here is the code:

!$acc parallel loop private(ielem,iel)
	do iarista=-numcc,nar
		if (iarista.eq.0) then
			goto 1234
		endif

		ielem=iar(3,iarista)

		if (idry(ielem).eq.2) then
			ielem=2   ! this is the line which, if present, produces the failure
			goto 1234
		endif

		iel=iar(4,iarista)

		if (idry(iel).eq.2) then
			goto 1234
		endif

1234	continue
	enddo
!$acc end parallel loop

The compiler gives no specific error; it just says “Program build failed” after compilation and linking finish (I have -Minfo=accel enabled). This is really strange, because the failure is quite random (sometimes it occurs, sometimes it does not). When the build does succeed, running the program gives the following error: “call to cuMemcpyDtoHAsync returned error 700”. The problem size is not large (numcc=80 and nar=760), and I have copied the needed variables to the device.

Even stranger: I have been running tests while writing this post. Now, after some attempts, the line “ielem=2” gives no error and the program runs fine. But if I add a new line right after it, “fluxup(iarista)=0.”, the same problem appears again: the build sometimes fails and sometimes succeeds, and when it succeeds, running the program gives the error “call to cuStreamSynchronize returned error 700”.

I apologise for such a confusing explanation. I am stuck on this and I wonder whether there could be a problem with the compiler. Could you please give me some advice?

Thank you very much in advance for your help.

Martí

Hi Marti,

From what you have written it’s difficult to tell what’s wrong. It could be a compiler issue but could also be an issue with your code. Can you please either post or send to PGI Customer Service (trs@pgroup.com) a reproducing example?

One thing I notice is that the “ielem=2” assignment has no effect, given that “ielem” is not used again in the loop. If its resulting value is used outside of this loop, the loop itself can’t be run in parallel as written. You can force parallelization with the “private” clause (as you do here), but the value of the host copy of “ielem” would not change, potentially giving you incorrect results.

If you do need the value of “ielem” later on the host, then remove it from the “private” clause and instead use “atomic” operations or a reduction.
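As a rough illustration of the difference between a private scalar and a reduction, here is a minimal sketch. It is not your actual code: the program name, the counter “ndry”, the array size and the contents of “idry” are all made up for the example, and I am assuming that what you would want back on the host is something like a count of dry elements.

```fortran
program private_vs_reduction
  implicit none
  integer, parameter :: n = 8      ! hypothetical problem size
  integer :: idry(n)               ! hypothetical wet/dry flags, 2 = dry
  integer :: i, ielem, ndry

  idry  = (/ 1, 2, 1, 2, 2, 1, 1, 2 /)
  ielem = -1
  ndry  = 0

  ! ielem is private: every iteration works on its own copy, so the
  ! host copy of ielem should not be relied on after the loop.
  ! ndry is a reduction variable: the partial sums are combined and
  ! the total is available on the host once the loop has finished.
  !$acc parallel loop private(ielem) reduction(+:ndry) copyin(idry)
  do i = 1, n
     ielem = i
     if (idry(ielem) == 2) then
        ndry = ndry + 1
     end if
  end do
  !$acc end parallel loop

  ! With the flags above, four elements are dry.
  print *, 'dry elements:', ndry
end program private_vs_reduction
```

Built without OpenACC enabled, the directives are treated as comments and the loop runs serially with the same count, which makes it easy to compare host and device results. An “atomic” update would be the alternative when the shared value is an array element rather than a single scalar.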

- Mat

Hi Mat,

I have sent a copy of the program to the PGI Customer Service.

The value of “ielem” is actually not used again outside of the loop, but I will keep your advice in mind.

Thank you again,

Martí