I apologize if this is a known issue; I couldn’t find any references to it. I’m using Community Edition 19.10.
I’m compiling and running an older model whose main loop splits the dataset across threads using OpenMP directives. Each time step fans out over the spatial data in a parallel section, recombines the data afterwards, and returns to a single thread before proceeding to the next time step.
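In case it helps, here is a minimal sketch of that loop structure. It is not the actual model code: the program name, array name, sizes, and the arithmetic are all hypothetical stand-ins, and it is written in free form for readability even though the real source is fixed form. I have not confirmed that this stripped-down version shows the same slowdown.

```fortran
! Illustrative sketch only, not the actual model source.
! All names and sizes below are hypothetical.
program timestep_sketch
  use omp_lib
  implicit none
  integer, parameter :: ncells = 1000000   ! size of the spatial domain
  integer, parameter :: nsteps = 200       ! number of time steps
  real(kind=8), allocatable :: field(:)
  real(kind=8) :: total, t0, t1
  integer :: i, step

  allocate(field(ncells))
  field = 1.0d0
  total = 0.0d0

  t0 = omp_get_wtime()
  do step = 1, nsteps
     ! Parallel region: the spatial points are divided among the threads.
     !$omp parallel do default(shared) private(i)
     do i = 1, ncells
        field(i) = 0.5d0 * field(i) + 0.25d0
     end do
     !$omp end parallel do

     ! Serial region: results are combined on a single thread
     ! before the next time step starts.
     total = sum(field)
  end do
  t1 = omp_get_wtime()

  print '(a,es12.5)', 'final sum       = ', total
  print '(a,f8.3,a)', 'wall-clock time = ', t1 - t0, ' s'
  deallocate(field)
end program timestep_sketch
```

Building this once with each compiler (for example with -fast -mp, and without -Mfixed since the sketch is free form) and comparing the reported wall-clock times should show whether the repeated entry into and exit from the parallel region is where the time goes.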
With the newer LLVM-based compiler, I get significantly worse performance than with the non-LLVM compiler. I haven’t profiled it yet, but watching “top”, the process repeatedly alternates between 200% (2 threads) and 100% (single thread). With the non-LLVM compiler, it stays pegged at 200% throughout. The LLVM build is roughly 40%-50% slower. Is this a known difference between the two? It almost seems as if there is more thread overhead when launching a parallel section with the LLVM compiler.
Compile options are:
pgf90 file.f -c -fast -Mfixed -mp
This isn’t a problem as long as I remember to switch back to the non-LLVM compiler after upgrading, but I would like to understand the reason for the difference.
Thank you for continuing to make this software available to the community.