Fortran -> C in OpenACC

jeonb · May 7, 2018, 3:55pm

Hi,

We are testing a basic OpenACC implementation, mostly by converting existing OpenMP codes. We are also testing reductions inside loops in both Fortran and C.

The following Fortran code works well, although it may not be best practice. Any advice is welcome.

    !$acc data copyin(a(1:n)) copy(r(1:n),e(1:n))
    !$acc kernels 
    !$acc loop reduction(+:npair, sum_acc)
    do i=1, n
       sum_loc = 0.0
       do j=1,n
          npair = npair + 1
          r(i) = r(i)+dexp(a(i) + a(j))
          e(i) = e(i)+dlog(a(i) + a(j))
          sum_loc = sum_loc + r(i)*0.1d0 + e(i)*0.2d0          
       end do
       sum_acc = sum_acc + sum_loc
    enddo
    !$acc end kernels
    !$acc end data
    !

The -Minfo=accel messages are as follows:

    41, Generating copy(e(1:n))
        Generating copyin(a(1:n))
        Generating copy(r(1:n))
    42, Generating implicit copy(sum_acc,npair)
    44, Loop is parallelizable
        Accelerator kernel generated
        Generating Tesla code
        44, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
            Generating reduction(+:sum_acc,npair)
        46, !$acc loop seq
    46, Complex loop carried dependence of r prevents parallelization
        Loop carried dependence of r,e prevents parallelization
        Loop carried backward dependence of r,e prevents vectorization
        Complex loop carried dependence of e prevents parallelization
        Inner sequential loop scheduled on accelerator

We then converted it to C, as shown below, but pgcc reports that it cannot parallelize the loops; all of them run sequentially.

    #pragma acc data copyin(a[0:N]) copy(r[0:N],e[0:N])
    #pragma acc kernels
    #pragma acc loop reduction(+:npair, sum_acc)
      for (i=0;i<N;i++) {
        sum_loc = 0.0;
        for (j=0;j<N;j++) {
          npair += 1;
          r[i] += exp(a[i] + a[j]);
          e[i] += log(a[i] + a[j]);
          sum_loc += r[i]*0.1 + e[i]*0.2;
        }
        sum_acc += sum_loc;
      }

The following message is from pgcc using -Minfo=acc.

    43, Generating copy(e[:N])
        Generating copyin(a[:N])
        Generating copy(r[:N])
    44, Generating implicit copy(sum_acc,npair)
    46, Complex loop carried dependence of a->,r->,e-> prevents parallelization
        Accelerator kernel generated
        Generating Tesla code
        46, #pragma acc loop seq
        48, #pragma acc loop seq
    48, Complex loop carried dependence of a->,r-> prevents parallelization
        Loop carried dependence due to exposed use of r[i1],e[i1] prevents parallelization
        Complex loop carried dependence of e-> prevents parallelization

I believe we converted almost the same loop from Fortran to C, yet the compiler responds very differently. Am I missing something? Any comments are appreciated.

Thanks,

BJ

PS. The PGI version is 18.3-0 (64-bit target on x86-64 Linux), and we’re testing on a P100 GPGPU card.

MatColgrove · May 7, 2018, 5:20pm

Hi BJ,

This is most likely due to pointer aliasing. C allows for pointers of the same type to point at the same object. In order to parallelize the loop, the compiler must prove that objects are disjoint, but because of the potential for aliasing, it can’t.
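
To illustrate why potential aliasing blocks parallelization, here is a minimal sketch that is not taken from the code above. The helper names `add_one_forward` and `add_one_backward` are made up for this example. Both run the same loop body and differ only in iteration order. If `dst` and `src` are separate arrays, the order does not matter. If they overlap, the two orders give different answers, so the iterations are not independent. From the function signature alone the compiler cannot tell which case it is in, so it has to assume the worst.

```c
#include <stddef.h>

/* Hypothetical helper: dst[i] = src[i] + 1, visiting i in increasing order. */
void add_one_forward(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1.0;
}

/* Same loop body, visiting i in decreasing order. */
void add_one_backward(double *dst, const double *src, size_t n)
{
    for (size_t i = n; i-- > 0; )
        dst[i] = src[i] + 1.0;
}
```

For example, starting from a zeroed `double buf[5]`, the call `add_one_forward(buf + 1, buf, 4)` leaves `0 1 2 3 4` in the buffer, because each iteration reads the value the previous one wrote. The same call with `add_one_backward` on a fresh zeroed buffer leaves `0 1 1 1 1`. A parallel schedule could produce either result, or something in between.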

Try adding the C99 “restrict” qualifier to your pointer declarations. “restrict” asserts to the compiler that the pointers don’t overlap.

Alternatively, you can use the flag “-Msafeptr” to assert that no pointers alias, but this is a big hammer and may result in runtime errors if your pointers do indeed overlap.

-Mat
