I have an OpenACC code in C with a nested loop. I just want to parallelise over the outer loop. Will the PGI compiler leave the inner loops alone or do I have to do something special to stop it from parallelising or vectorising the inner loops?
If you’re using the “kernels” directive and the inner loop is parallelizable, then the compiler will most likely parallelize it. The same is true for “parallel” unless the flag “-acc=noautopar” is used.
It’s best to put a “#pragma acc loop seq” directive immediately before the inner for loop to explicitly tell the compiler to run that loop sequentially.
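As a rough sketch of what that looks like (the function name, array layout and data clauses here are my own illustration, not taken from your code), the outer loop gets the parallel directive and the inner loop is marked seq:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum each row of an n x m matrix stored row-major.
 * The name row_sums and the data clauses are assumptions for illustration. */
void row_sums(const double *restrict a, double *restrict out, int n, int m)
{
    assert(a != NULL && out != NULL && n >= 0 && m >= 0);

    /* Outer loop: iterations are distributed across the device. */
    #pragma acc parallel loop copyin(a[0:n*m]) copyout(out[0:n])
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;

        /* Inner loop: "seq" asks for sequential execution inside each
         * outer iteration, so it is neither parallelised nor vectorised. */
        #pragma acc loop seq
        for (int j = 0; j < m; ++j) {
            sum += a[i * m + j];
        }
        out[i] = sum;
    }
}
```

If you compile with “-acc -Minfo=accel”, the compiler messages should show how each loop was scheduled, which is a quick way to confirm the inner loop stayed sequential. Without “-acc” the pragmas are ignored and the function runs as ordinary serial C.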
Hope this helps,
Thanks - that is very useful.