Hi,
I’m using PGI 10.1 on 64-bit Linux.
I have problems with loop dependencies: I get the following messages when compiling with “pgcc -g -ta=nvidia,cc11 -Minfo -fastsse -c ./main.c -o main.o”:
calc:
11, No parallel kernels found, accelerator region ignored
15, Complex loop carried dependence of 'fArr2' prevents parallelization
17, Complex loop carried dependence of 'fArr2' prevents parallelization
Generated 4 alternate loops for the loop
Generated vector sse code for the loop
main:
38, Loop unrolled 4 times (completely unrolled)
39, Loop unrolled 4 times (completely unrolled)
Everywhere I look, the advice is to use either the restrict keyword or the -Msafeptr option, but neither of these works in my case.
I reduced my program to a small example that copies values from one array to another (see below). I know what it does is not really meaningful, but first I want to get rid of these dependencies.
Does anyone have an idea?
#include <stdio.h>
#include <stdlib.h>
void calc(float *restrict fArr1, float *restrict fArr2, int iCols, int iRows)
{
    int i, j;
    int n = iCols * iRows;
    float fVal;
    #pragma acc region copy(fArr1[0:n-1], fArr2[0:n-1])
    {
        #pragma acc for private(fVal,i,j)
        /* compute stencil, residual and update */
        for (j = 0; j < iRows; j++)
        {
            for (i = 0; i < iCols; i++)
            {
                fVal = 5.0f * fArr1[j*iCols+i];
                fArr2[j*iCols+i] = 2.0f * fVal;
            }
        }
    }
}
int main(int argc, char** argv)
{
    int retVal = 0; /* return value */
    int i, j;
    int iCols = 4;
    int iRows = 4;
    /* Init arrays */
    float *fArr1 = (float*) malloc(iCols * iRows * sizeof(float));
    float *fArr2 = (float*) malloc(iCols * iRows * sizeof(float));
    for (j = 0; j < iRows; ++j) {
        for (i = 0; i < iCols; ++i) {
            fArr1[j*iCols+i] = i;
            fArr2[j*iCols+i] = 0.0f;
        }
    }
    if (fArr1 && fArr2)
    {
        /* running calculations */
        calc(fArr1, fArr2, iCols, iRows);
        /* print one example result */
        printf("Result[%d]: %f\n", iRows*iCols-1, fArr2[iRows*iCols-1]);
    }
    else
    {
        printf(" Memory allocation failed ...\n");
        retVal = -1;
    }
    /* cleanup */
    free(fArr1);
    free(fArr2);
    return retVal;
}
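For reference, here is a plain CPU version of the same loop nest with the pragmas removed (calc_reference is just a name made up for this sketch, and the NULL assertion is an addition that the original calc does not have). With the 4x4 input from main, fArr1[15] is 3, so the last element should come out as 3 * 5 * 2 = 30. It is only meant as a sanity check for whatever the accelerator build eventually prints, not as a fix for the dependence messages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-only reference for calc(): same arithmetic,
 * no accelerator directives. Not part of the original program. */
void calc_reference(const float *fArr1, float *fArr2, int iCols, int iRows)
{
    int i, j;

    /* Added for the sketch: refuse NULL inputs in debug builds. */
    assert(fArr1 != NULL && fArr2 != NULL);

    for (j = 0; j < iRows; j++) {
        for (i = 0; i < iCols; i++) {
            /* Each output element depends only on the matching input element. */
            float fVal = 5.0f * fArr1[j * iCols + i];
            fArr2[j * iCols + i] = 2.0f * fVal;
        }
    }
}
```

Calling calc_reference(fArr1, fArr2, iCols, iRows) in place of calc in main should give 30.0 for element 15 with the initialization shown above.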