Can the PGI compiler automatically extract parallelism?


I wanted to know whether the PGI compiler for accelerators has any capability to automatically extract parallelism from a loop nest and annotate it with OpenACC directives.


Hi Thejas,

No, and you wouldn’t really want it to. On the CPU side, there is an auto-parallelization option (-Mconcur) that can automatically find parallelism opportunities in loops. However, on the CPU the compiler does not need to worry about data movement, since the threads all share the host's memory. While the compiler could also find parallelism for the accelerator (the same dependency analysis is used for both), it cannot optimize the data movement, since it does not have a global view of the data flow. Because data management is such a crucial part of GPU performance, the cost of getting the data movement wrong is too high. While auto-acceleration could work well for a small subset of code, the vast majority of code would still need OpenACC directives.
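
To make the data movement point concrete, here is a rough sketch of the kind of annotation that is left to the programmer. The function name scale_and_sum and its arguments are made up for illustration and are not from any real code. Both loops are easy to recognize as parallel, but the decision to wrap them in a single data region, so the array is copied to the device once and back once, depends on knowing how the data is used across the two loops:

```c
/* Hypothetical example: scale a vector in place, then sum it. */
double scale_and_sum(double *restrict a, int n, double factor)
{
    double sum = 0.0;

    /* One data region around both loops: a[0:n] is copied to the
       device on entry and copied back to the host on exit, rather
       than being transferred separately for each loop. */
    #pragma acc data copy(a[0:n])
    {
        /* Independent iterations: each element is scaled in place. */
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i)
            a[i] *= factor;

        /* Sum reduction over the scaled values already on the device. */
        #pragma acc parallel loop reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += a[i];
    }

    return sum;
}
```

Something like pgcc -acc -Minfo=accel should report what the compiler generated for each loop and which data clauses it applied. Without -acc the pragmas are ignored and the code runs serially on the host with the same result.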

  • Mat

Thanks Mat!