Reduction operation in C

Hi,
I read that the reduction operation is supported by PGI. I tried it in C with a simple loop like this:


for(i=0;i<n;i++)
    s=s+a[i];

But I get a "Loop carried scalar dependence for ‘s’" message. I tried the same simple loop in Fortran and it worked without any hassle. Is reduction supported only in Fortran? Thanks.

Hi phoenix07p,

Sum reduction works for C as well. My guess is something else is inhibiting the accelerator region. Can you post an example of what you’re seeing?

Thanks,
Mat

% cat testr.c 
#include <stdio.h>
#include <stdlib.h>

int main () {

   float s = 0.0f;
   float *a;
 
   a=malloc(sizeof(float)*1024);
   for (int i=0;i<1024;++i) {
      a[i]=i;
   }

#pragma acc region for
   for (int i=0;i<1024;++i) {
      s=s+a[i];
   }
  
  printf("S=%g\n", s);

}

% pgcc testr.c -ta=nvidia -Minfo -V11.3
main:
     14, Generating copyin(a[0:1023])
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     15, Loop is parallelizable
         Accelerator kernel generated
         15, #pragma acc for parallel, vector(256) /* blockIdx.x threadIdx.x */
             CC 1.0 : 7 registers; 1064 shared, 16 constant, 0 local memory bytes; 100% occupancy
             CC 1.3 : 7 registers; 1064 shared, 16 constant, 0 local memory bytes; 100% occupancy
             CC 2.0 : 10 registers; 1032 shared, 52 constant, 0 local memory bytes; 100% occupancy
         16, Sum reduction generated for s
% a.out
S=523776
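
For anyone wondering why the compiler can parallelize this loop even though every iteration updates s: a sum can be split into independent partial sums that are combined at the end. The sketch below shows that idea in plain C. It is only an illustration, not the code PGI actually generates; chunked_sum, its chunks parameter, and the 256-slot limit are names and choices made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of any PGI API): sums a[0..n-1] in two
   phases, the way a parallel sum reduction is usually organized. */
float chunked_sum(const float *a, size_t n, size_t chunks)
{
    float partial[256];                       /* one slot per chunk */
    assert(chunks > 0 && chunks <= 256);

    size_t len = (n + chunks - 1) / chunks;   /* chunk length, rounded up */

    /* Phase 1: each chunk accumulates into its own private variable.
       No chunk reads another chunk's sum, so the iterations of this
       outer loop are independent and could run in parallel. */
    for (size_t c = 0; c < chunks; ++c) {
        size_t lo = c * len;
        size_t hi = (lo + len < n) ? lo + len : n;
        float p = 0.0f;
        for (size_t i = lo; i < hi; ++i)
            p += a[i];
        partial[c] = p;
    }

    /* Phase 2: combine the partial sums into the final result. */
    float s = 0.0f;
    for (size_t c = 0; c < chunks; ++c)
        s += partial[c];
    return s;
}
```

With a[i]=i for i from 0 to 1023, as in testr.c, chunked_sum(a, 1024, 4) gives the same 523776 as the serial loop. In general the additions happen in a different order than in the serial loop, so floating-point results can differ in the last digits.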