Loop carried dependence when using kernels on fortran arrays

caplanr · September 19, 2017, 11:20pm

I am having a weird problem.

I consistently use the “kernels” directive around Fortran array syntax, but with some array operations I get strange compiler messages and the code is not parallelized.

Example:

This code:

!$acc kernels default(present)
      pres=he_p*rho*temp/he_rho
!$acc end kernels

yields this output:

setp_acc:
      0, Accelerator kernel generated
         Generating Tesla code
  45208, Accelerator scalar kernel generated
         Generating implicit present(pres$f(:,:,:),pres$f,temp(:,:,:),rho(:,:,:),pres(:,:,:))
  45209, Loop carried dependence due to exposed use of pres$f(:,:,:) prevents parallelization
         Parallelization requires privatization of pres$f as well as last value
         Loop is parallelizable
         Loop carried reuse of pres prevents parallelization
         Inner sequential loop scheduled on accelerator
         Accelerator scalar kernel generated
         Accelerator kernel generated
         Generating Tesla code
      45209, !$acc loop seq

In this code segment, temp, rho, and pres are all allocatable, come from the same module, and are present on the device.
The scalars are stored in a different module.

Any ideas on why this is happening? What does “pres$f” mean?

MatColgrove · September 20, 2017, 5:03pm

Hi sumseq,

Per the Fortran standard, when array syntax is used, the right-hand side must be fully evaluated before it is assigned to the left-hand side. If there is no dependency between the right- and left-hand sides, the compiler can often skip this step. If there is a dependency, it must first create a temporary array to hold the results of the right-hand side and then copy that temporary array into the left-hand side. The “pres$f” is this temporary array.
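
To make the rule concrete, here is a minimal sketch of an array assignment whose right-hand side overlaps its left-hand side. The program and variable names are illustrative and not taken from the original code. Array syntax has to behave as if the whole right-hand side were evaluated first, while a naive element-by-element loop carries a value from one iteration to the next.

```fortran
program rhs_first
  implicit none
  integer :: a(5), b(5), i

  a = [1, 2, 3, 4, 5]
  b = a

  ! Array syntax: a(1:4) is evaluated in full (conceptually into a
  ! temporary) before any element of a(2:5) is overwritten.
  a(2:5) = a(1:4)

  ! Naive forward loop: each iteration reads the element written by
  ! the previous iteration, so b(1) is smeared across the array.
  do i = 2, 5
    b(i) = b(i-1)
  end do

  print *, a   ! 1 1 2 3 4
  print *, b   ! 1 1 1 1 1
end program rhs_first
```

The two results differ, which is why the compiler needs the temporary whenever it cannot prove that the two sides do not overlap.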

Is “he_p” or any of the other variables a pointer? If so, then this is most likely the problem. If there is a pointer that could potentially point at pres, the compiler has to assume a possible dependency and create the temporary array.

The workaround would be to make this an explicit loop since the above rule only applies to array syntax or forall loops.
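
As a rough sketch of that workaround, the array assignment from the original post could be written as an explicit loop nest. The array names and scalars come from the original snippet. The extents n1, n2, n3, the loop indices, the placeholder values, and the enclosing data region are assumptions added only to make the example self-contained.

```fortran
program explicit_loop
  implicit none
  integer, parameter :: n1 = 8, n2 = 8, n3 = 8   ! assumed extents
  real(8), allocatable :: pres(:,:,:), rho(:,:,:), temp(:,:,:)
  real(8) :: he_p, he_rho
  integer :: i, j, k

  allocate(pres(n1,n2,n3), rho(n1,n2,n3), temp(n1,n2,n3))
  he_p = 2.0d0          ! placeholder values, not from the original code
  he_rho = 4.0d0
  rho = 1.0d0
  temp = 3.0d0

  ! The data region stands in for wherever the real code puts the
  ! arrays on the device, so that default(present) is satisfied.
!$acc data copyin(rho, temp) copyout(pres)
!$acc kernels default(present)
  do k = 1, n3
    do j = 1, n2
      do i = 1, n1
        ! Each element is computed and stored directly; no array-syntax
        ! temporary is involved.
        pres(i,j,k) = he_p*rho(i,j,k)*temp(i,j,k)/he_rho
      end do
    end do
  end do
!$acc end kernels
!$acc end data

  print *, pres(1,1,1)
end program explicit_loop
```

If the arrays are pointers and the compiler still reports a dependence on the explicit loops, the OpenACC `independent` clause on a loop directive is one way to assert that iterations do not conflict. That assertion is only safe when the arrays genuinely do not overlap.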

-Mat

caplanr · September 24, 2017, 6:37am

Hi,

That is very interesting to know!

In this case, he_p and he_rho are simple real*8 scalars.
rho and temp are arrays declared as pointers but I was under the impression that in fortran, all pointers are assumed to be pointing to unique data (i.e. that there is no fortran equivalent to the C “restrict” keyword).

Is there a way to impose the equivalent of “restrict” for fortran pointers?

MatColgrove · September 25, 2017, 6:26pm

Hi sumseq,

but I was under the impression that in fortran, all pointers are assumed to be pointing to unique data (i.e. that there is no fortran equivalent to the C “restrict” keyword).

It’s valid Fortran to have multiple pointers point to the same target.
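
For illustration, here is a small sketch of two pointers associated with overlapping sections of the same target. All names are hypothetical. Because this is legal, the compiler cannot assume that a pointer on the right-hand side is distinct from the array on the left-hand side.

```fortran
program pointer_alias
  implicit none
  real(8), target  :: field(4)
  real(8), pointer :: p(:), q(:)

  field = [1.0d0, 2.0d0, 3.0d0, 4.0d0]

  ! Both pointers refer to overlapping sections of the same target.
  p => field(2:4)
  q => field(1:3)

  ! Legal Fortran: the right-hand side 2*q must be evaluated in full
  ! before p is written, because p and q share storage.
  p = 2.0d0*q

  print *, associated(p, field(2:4))   ! T
  print *, field                       ! 1, 2, 4, 6
end program pointer_alias
```

Here field ends up as 1, 2, 4, 6. An in-place forward loop without a temporary would instead produce 1, 2, 4, 8.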

Is there a way to impose the equivalent of “restrict” for fortran pointers?

No, sorry.

-Mat
