But I don’t know how to execute an OpenACC program, or which program is used to run OpenACC code.
Can you please describe what you’ve done so far and what specific issues you are encountering?
Note that there’s an OpenACC “Getting Started Guide” in your installation’s “doc” directory (openAcc_gs.pdf) that may help walk you through the process. It’s also posted online at PGI Documentation Archive for Versions Prior to 17.7. (Note: as of today, 2/3/2014, the online guide points to an old version, but I’m working on getting this updated.)
Thank you very much, dear Mat.
I solved my problem.
I wrote a basic sample program myself to test that the PGI Workstation compiler (version 14.1) works, and it compiled with the command ‘pgcc -acc lab04.c -Minfo’.
But I have a weird problem: there is no output anywhere!
I’m sure the compiler works, and the kernels too, but I guess there must be an error somewhere.
And now I’m running into more problems with using the compiler than with OpenACC itself, haha.
Could you give me some advice?
I’ll post the entire code and command window.
#include <stdio.h>
#include <stdlib.h>

float scaled(float* v1, float* v2, float a, int n)
{
    int i;
    float sum = 0.0f;
    #pragma acc kernels loop
    for(i=0;i<n;i++)
    {
        v1[i]+=a*v2[i];
        sum+=v1[i];
    }
    return sum;
}

int main(int argc, char* argv[])
{
    int n;
    float *vector1;
    float *vector2;
    if( argc > 1 )
        n = atoi( argv[1] );
    else
        n = 100000;
    if( n <= 0 ) n = 100000;
    vector1=(float*)malloc(n*sizeof(float));
    vector2=(float*)malloc(n*sizeof(float));
    scaled(vector1, vector2, 3.3, n);
    printf("programming done\n");
    return 0;
}
To run your code, execute it with the command “lab04.exe”. On Windows, the default name of the executable is the name of the first file on the command line. You can use the “-o” flag to set the executable name to something else.
The code as written won’t accelerate, given the potential loop dependencies between the two pointers, v1 and v2. In C, pointers are allowed to point at the same location in memory. Although we can see that they do not, the compiler doesn’t have enough information, so it must assume that they do overlap and therefore that the loop can’t be parallelized. To fix this, you must do one of three things:
Use the C99 “restrict” keyword on your pointers to assert to the compiler that the two pointers point to different memory. (for example “float * restrict v1”)
Use the compiler flag “-Msafeptr”, which asserts to the compiler that all pointers point to different memory.
Use the OpenACC “independent” clause on your loop directive to tell the compiler to ignore dependencies in this loop.
The preferred method is to use “restrict”, but not all compilers support C99 (Microsoft’s, for one, does not), so you may not be able to. The next choice is “independent”. The quickest method is “-Msafeptr”, but it applies to the whole program and can lead to wrong answers if any pointers do overlap.
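As a rough sketch of the first and third options, the loop from the posted code could be written in either of the two ways below. The function names scaled_restrict and scaled_independent are made up for illustration, and the explicit reduction clause on “sum” is an addition that the posted code did not have (the compiler can often detect the reduction itself, but spelling it out is safer when “independent” is asserted).

```c
/* Sketch only: both function names are hypothetical, not from the original post. */

/* Option 1: C99 "restrict" asserts that v1 and v2 never point at the same memory. */
float scaled_restrict(float *restrict v1, const float *restrict v2, float a, int n)
{
    int i;
    float sum = 0.0f;
    /* reduction(+:sum) is an assumption added here; the original relied on auto-detection */
#pragma acc kernels loop reduction(+:sum)
    for (i = 0; i < n; i++) {
        v1[i] += a * v2[i];
        sum += v1[i];
    }
    return sum;
}

/* Option 3: plain pointers, with the "independent" clause telling the compiler
 * that the loop iterations do not depend on one another. */
float scaled_independent(float *v1, float *v2, float a, int n)
{
    int i;
    float sum = 0.0f;
#pragma acc kernels loop independent reduction(+:sum)
    for (i = 0; i < n; i++) {
        v1[i] += a * v2[i];
        sum += v1[i];
    }
    return sum;
}
```

With either version, the -Minfo output should show whether the loop was actually parallelized. The second option (“-Msafeptr”) needs no source change, only the extra flag on the pgcc command line.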
Your program also has a few general issues: the vectors aren’t initialized, you don’t keep the return value from scaled, and you don’t print out any results.
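A minimal sketch of how those three issues might be addressed is shown here. The helper run_scaled is a hypothetical name, and the initial values (1.0 and 2.0) and the scale factor 0.5 are arbitrary choices picked only so the sum is easy to check by hand; the posted code used 3.3. The “restrict” qualifiers assume a C99-capable compiler.

```c
#include <stdio.h>
#include <stdlib.h>

/* Same loop as the posted code, with restrict added so the pointers are known not to overlap. */
float scaled(float *restrict v1, const float *restrict v2, float a, int n)
{
    int i;
    float sum = 0.0f;
#pragma acc kernels loop
    for (i = 0; i < n; i++) {
        v1[i] += a * v2[i];
        sum += v1[i];
    }
    return sum;
}

/* Hypothetical helper: allocates and initializes the vectors, keeps the
 * return value of scaled(), prints it, and frees the memory.
 * Returns the sum, or -1.0f if allocation fails. Assumes n > 0. */
float run_scaled(int n)
{
    int i;
    float sum;
    float *vector1 = (float *)malloc((size_t)n * sizeof(float));
    float *vector2 = (float *)malloc((size_t)n * sizeof(float));

    if (vector1 == NULL || vector2 == NULL) {
        free(vector1);
        free(vector2);
        return -1.0f;
    }

    /* Initialize both vectors so the result is well defined (values are arbitrary). */
    for (i = 0; i < n; i++) {
        vector1[i] = 1.0f;
        vector2[i] = 2.0f;
    }

    sum = scaled(vector1, vector2, 0.5f, n);  /* keep the return value */
    printf("n = %d, sum = %f, v1[0] = %f\n", n, sum, vector1[0]);

    free(vector1);
    free(vector2);
    return sum;
}
```

The main() from the posted code is left out here; it would call run_scaled(n) after parsing n, in place of the malloc and scaled calls it has now.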
However, when I compile my code with the command ‘pgcc -acc -o test02.exe lab03.c -Minfo’ and then run ‘test02.exe’ in the PGI command window, I get the error message ‘Accelerator Fatal Error : No NVIDIA/CUDA version of this construct available for the current device’.
To solve that problem, I installed CUDA Toolkit version 4.1, but my computer didn’t recognize the new version.
After uninstalling version 4.1, I installed the CUDA 5.5 driver, the newest driver version, which also includes CUDA Toolkit 5.5, and my computer recognized it.