Hi everyone,
I have a question about an article on the Dr.Dobb’s website that combines OpenACC and OpenMP in a single program unit (http://www.drdobbs.com/parallel/the-openacc-execution-model/240006334?pgno=2). The code snippet in question is excerpted below. Can anyone tell me why the OpenACC pragma on line 16 has no reduction clause (i.e., reduction(+:tmp)), while the OpenMP pragma on line 15, which applies to the same loop, does have one?
Thanks,
Li
1 void gramSchmidt(restrict float Q[][COLS], const int rows, const int cols)
2 {
3 #pragma acc data copy(Q[0:rows][0:cols])
4   for(int k=0; k < cols; k++) {
5     double tmp = 0.;
6 #pragma omp parallel for reduction(+:tmp)
7 #pragma acc parallel reduction(+:tmp)
8     for(int i=0; i < rows; i++) tmp += (Q[i][k] * Q[i][k]);
9     tmp = sqrt(tmp);
10
11 #pragma omp parallel for
12 #pragma acc parallel loop
13     for(int i=0; i < rows; i++) Q[i][k] /= tmp;
14
15 #pragma omp parallel for reduction(+:tmp)
16 #pragma acc parallel loop
17     for(int j=k+1; j < cols; j++) {
18       tmp=0.;
19       for(int i=0; i < rows; i++) tmp += Q[i][k] * Q[i][j];
20       for(int i=0; i < rows; i++) Q[i][j] -= tmp * Q[i][k];
21     }
22   }
23 }
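
To make the question concrete: as far as I can tell, tmp in the loop starting at line 17 is reset to zero at the top of every j iteration, so it looks like a per-iteration scratch value rather than a sum accumulated across j. Below is a minimal sketch of that reading, with the scratch variable declared inside the loop body so that neither pragma needs a clause for it. The names project_out and dot, the fixed COLS of 3, and the copy clause on the OpenACC pragma are my own assumptions for a standalone example, not anything from the article, and I am not sure this is what the article intends.

```c
#include <math.h>

#define COLS 3  /* hypothetical fixed width, chosen only for this sketch */

/* Subtract from every column j > k its projection onto column k of Q.
   Column k is assumed to be normalized already (as after line 13 above). */
void project_out(float Q[][COLS], int rows, int cols, int k)
{
    /* Only one programming model is enabled at a time; with neither,
       the loop simply runs serially. */
#if defined(_OPENMP)
    #pragma omp parallel for
#elif defined(_OPENACC)
    #pragma acc parallel loop copy(Q[0:rows][0:COLS])
#endif
    for (int j = k + 1; j < cols; j++) {
        double dot = 0.0;  /* declared per iteration, so each j gets its own copy */
        for (int i = 0; i < rows; i++) dot += Q[i][k] * Q[i][j];
        for (int i = 0; i < rows; i++) Q[i][j] -= dot * Q[i][k];
    }
}
```

If this reading is right, the reduction(+:tmp) on line 15 would mainly be giving each OpenMP thread its own tmp, which is why I am puzzled that the OpenACC pragma on line 16 has no matching clause.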