Loop carried dependence when using kernels on fortran arrays

I am having a weird problem.

I consistently put “kernels” around Fortran array syntax, but for some array operations I get strange compiler messages and the code is not parallelized.

Example:

This code:

!$acc kernels default(present)
      pres=he_p*rho*temp/he_rho
!$acc end kernels

yields this output:

setp_acc:
      0, Accelerator kernel generated
         Generating Tesla code
  45208, Accelerator scalar kernel generated
         Generating implicit present(pres$f(:,:,:),pres$f,temp(:,:,:),rho(:,:,:),pres(:,:,:))
  45209, Loop carried dependence due to exposed use of pres$f(:,:,:) prevents parallelization
         Parallelization requires privatization of pres$f as well as last value
         Loop is parallelizable
         Loop carried reuse of pres prevents parallelization
         Inner sequential loop scheduled on accelerator
         Accelerator scalar kernel generated
         Accelerator kernel generated
         Generating Tesla code
      45209, !$acc loop seq

In this code segment, temp, rho, and pres are all allocatable, from the same module, and on the device.
The scalars are stored in a different module.

Any ideas on why this is happening? What does “pres$f” mean?

Hi sumseq,

Per the Fortran standard, when using array syntax the right-hand side must be fully evaluated before it is assigned to the left-hand side. If there is no dependency between the right- and left-hand sides, the compiler can often assign directly without a temporary. If there is a dependency, though, it must first create a temp array to hold the result of the right-hand side and then copy that temp array into the left-hand side. The “pres$f” is this temporary array.
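
To make the rule concrete, here is a minimal sketch with hypothetical arrays `a` and `b` (not from the code above). It compares an overlapping array-syntax assignment, where the right-hand side is fully evaluated first, with a naive element-by-element loop, where each store is visible to the next read:

```fortran
program array_syntax_semantics
  implicit none
  integer :: a(5), b(5), i

  a = [1, 2, 3, 4, 5]
  b = a

  ! Array syntax: the whole right-hand side is evaluated before any store,
  ! so this behaves as if a(1:4) were first copied into a temporary.
  a(2:5) = a(1:4)

  ! Naive forward loop: each iteration reads the value just written
  ! by the previous one, so the first element is propagated.
  do i = 2, 5
     b(i) = b(i-1)
  end do

  print *, a   ! values: 1 1 2 3 4
  print *, b   ! values: 1 1 1 1 1
end program array_syntax_semantics
```

The two results differ only because source and destination overlap. When the compiler cannot rule out such an overlap, it has to keep the temporary.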

Are “he_p” or any of the other variables pointers? If so, that is most likely the problem. If a pointer could potentially point at pres, the compiler must create the temp array, since there is a potential dependency.

The workaround would be to make this an explicit loop since the above rule only applies to array syntax or forall loops.
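
A minimal sketch of that explicit-loop form, assuming rank-3 arrays as in the compiler output. The sizes `n1`, `n2`, `n3`, the test values, and the enclosing `data` region are hypothetical and only there to make the example self-contained. It uses `parallel loop` rather than `kernels`, which is a substitution: with `parallel loop` the programmer asserts that the iterations are independent, so it is only correct if the arrays really do not overlap.

```fortran
program explicit_loop_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n1 = 8, n2 = 8, n3 = 8   ! hypothetical sizes
  real(dp), pointer     :: rho(:,:,:), temp(:,:,:)
  real(dp), allocatable :: pres(:,:,:)
  real(dp) :: he_p, he_rho
  integer  :: i, j, k

  allocate(rho(n1,n2,n3), temp(n1,n2,n3), pres(n1,n2,n3))
  rho    = 2.0_dp
  temp   = 3.0_dp
  he_p   = 4.0_dp
  he_rho = 8.0_dp

  ! Hypothetical data region so that default(present) is satisfied here;
  ! in the original code the arrays are already on the device.
  !$acc data copyin(rho, temp) copyout(pres)
  !$acc parallel loop collapse(3) default(present)
  do k = 1, n3
     do j = 1, n2
        do i = 1, n1
           ! Element-wise form of pres = he_p*rho*temp/he_rho
           pres(i,j,k) = he_p*rho(i,j,k)*temp(i,j,k)/he_rho
        end do
     end do
  end do
  !$acc end data

  ! Sanity check: 4*2*3/8 = 3
  if (any(abs(pres - 3.0_dp) > 1.0e-12_dp)) error stop "unexpected result"
  print *, "pres(1,1,1) =", pres(1,1,1)

  deallocate(rho, temp, pres)
end program explicit_loop_sketch
```

Built without OpenACC enabled, the directives are treated as comments and the loop runs on the host, which is a quick way to check the arithmetic.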

-Mat

Hi,

That is very interesting to know!

In this case, he_p and he_rho are simple real*8 scalars.
rho and temp are arrays declared as pointers, but I was under the impression that in Fortran all pointers are assumed to point to unique data (i.e. that there is no Fortran equivalent to the C “restrict” keyword).

Is there a way to impose the equivalent of “restrict” for Fortran pointers?

Hi sumseq,

> but I was under the impression that in Fortran all pointers are assumed to point to unique data (i.e. that there is no Fortran equivalent to the C “restrict” keyword).

It’s valid Fortran to have multiple pointers point to the same target.
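
For illustration, a small sketch with hypothetical names (`x`, `p`, `q`) in which two pointers legally overlap the same target, so a store through one is visible through the other:

```fortran
program pointer_aliasing
  implicit none
  real, target  :: x(4)
  real, pointer :: p(:), q(:)

  x = [1.0, 2.0, 3.0, 4.0]
  p => x          ! p covers all of x
  q => x(2:4)     ! q overlaps p (and x) from the second element on

  q(1) = 10.0     ! also changes p(2) and x(2)

  print *, associated(p, x)   ! T: p is associated with the whole of x
  print *, p(2)               ! 10.0, written through q
end program pointer_aliasing
```

Because code like this is standard-conforming, a compiler that sees pointer arrays on the right-hand side generally has to allow for overlap with the left-hand side.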

> Is there a way to impose the equivalent of “restrict” for Fortran pointers?

No, sorry.

-Mat