In the fourth article under whitepapers on your resource page, you discuss data regions and the updatein and updateout compiler directives. You also state that a data region can contain one or more accelerator regions and even another data region.
The example given is only a code fragment, and only in Fortran, not C. Please direct me to, or give me, a complete C code example showing a program before and after adding a data region, with the advantages, and also show how a data region can contain another data region.
Also show by example the use of updatein and updateout in data regions.
Any help appreciated. Thanks in advance.
THX 1138
Below is a small example written in C that shows the use of a data region which contains multiple compute regions. It also shows the use of the update directive for transferring data both to the device and from the device.
#include <stdio.h>

int main(void)
{
    float a[100], b[100], c[100], d[100];
    float s;
    int i, j;
    int n = 100;
    int m = 100;

    /* Initialize the arrays on the host */
    for (i = 0; i < n; i++) {
        a[i] = (float) i;
        b[i] = (float) (n - i);
        c[i] = 0.0f;
        d[i] = 0.0f;
    }

    /* a and b are copied to the device once; c and d are copied in at
       entry and copied back to the host at the end of the data region */
    #pragma acc data region copyin(a,b) copy(c,d)
    {
        for (j = 0; j < m; j++) {
            #pragma acc region
            {
                for (i = 0; i < n; i++) {
                    c[i] += a[i] / b[i];
                }
            }
            /* Bring the device copy of c back to the host */
            #pragma acc update host(c)
            s = 0.0f;
            for (i = 0; i < n; i++) {
                s += c[i];
            }
            printf("Sum of C array: %f\n", s);
            if (s > 500.0f) {
                printf("Updating Device copy of C......\n");
                for (i = 0; i < n; i++) {
                    c[i] = (float) j;
                }
                /* Push the modified host copy of c to the device */
                #pragma acc update device(c)
            }
            #pragma acc region
            {
                for (i = 0; i < n; i++) {
                    d[i] += c[i] - (float) j;
                }
            }
        } /* j loop */
    } /* End of data region */

    s = 0.0f;
    for (i = 0; i < n; i++) {
        s += d[i];
    }
    printf("Sum of D: %f\n", s);
    return 0;
}
If I wanted to see the performance without the data region, I could take out these lines:
#pragma acc data region copyin(a,b) copy(c,d)
#pragma acc update host(c)
#pragma acc update device(c)
With these three lines removed (and replaced with blank lines), I can recompile the modified code and run it to see the program’s performance without any data regions. If I recompile with the
-ta=nvidia,time
command line option, I will then see the performance of the program without data regions.
Any help appreciated. Thanks in advance.
THX 1138
Yes, that is correct. Compiling with -ta=nvidia,time will cause the program upon exit to print out timing information regarding the GPU.
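For comparison, here is a rough sketch of what the version without the data region might look like. It is the same computation with the three directives removed; the function name run_without_data_region and the macro N are made up for this illustration, and m is passed in as a parameter so the loop count can be varied. My understanding is that without an enclosing data region, the compiler has to move the arrays to and from the device at each compute region, which is the extra cost the timing output should expose.

```c
#include <stdio.h>

#define N 100   /* array length (hypothetical macro for this sketch) */

/* Hypothetical helper: same computation as the example above, but with
   the data region and update directives removed. Returns the sum of d. */
float run_without_data_region(int m)
{
    float a[N], b[N], c[N], d[N];
    float s;
    int i, j;

    for (i = 0; i < N; i++) {
        a[i] = (float) i;
        b[i] = (float) (N - i);
        c[i] = 0.0f;
        d[i] = 0.0f;
    }

    for (j = 0; j < m; j++) {
        /* No enclosing data region: data movement for a, b and c is
           left to the compiler at each compute region */
        #pragma acc region
        {
            for (i = 0; i < N; i++) {
                c[i] += a[i] / b[i];
            }
        }
        /* c is already back on the host here, so no update is needed */
        s = 0.0f;
        for (i = 0; i < N; i++) {
            s += c[i];
        }
        printf("Sum of C array: %f\n", s);
        if (s > 500.0f) {
            for (i = 0; i < N; i++) {
                c[i] = (float) j;
            }
        }
        #pragma acc region
        {
            for (i = 0; i < N; i++) {
                d[i] += c[i] - (float) j;
            }
        }
    }

    s = 0.0f;
    for (i = 0; i < N; i++) {
        s += d[i];
    }
    return s;
}
```

Calling run_without_data_region(100) from main and building both versions with -ta=nvidia,time should let you compare the data transfer times reported at exit.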
Can you now work an example of the reflected clause into the existing code? Also, I guess we in the C language world cannot use mirror. Is there a workaround?
Thanks in advance.
THX 1138