Performance bug in nvfortran 20.7

heatx.txt (4.9 KB)

Thanks Robert. It looks like the compiler isn’t vectorizing the loops when “target” is used, since there could be other references to the array (for example through a pointer), which makes vectorization unsafe. I’ve added an issue report, TPR #29091, to see if the compiler’s dependency analysis can be improved to detect that there is no dependency in this case. You can force the compiler to disable the dependency check via “-Mnodepchk” and the time is recovered, but the code then reaches convergence much earlier, presumably because of the max reduction computation.
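To illustrate why “target” makes the compiler conservative, here is a minimal sketch that is not taken from heatx.txt; the program and variable names are made up for illustration. Because the array has the target attribute, a pointer can legally alias it, and the loop then carries a dependence from one iteration to the next that is not visible from the array name alone.

```fortran
! Hypothetical example: names and sizes are illustrative only.
program target_alias_sketch
  implicit none
  integer, parameter :: n = 8
  real, target  :: a(n)     ! "target" allows pointers to be associated with a
  real, pointer :: p(:)
  integer :: i

  a = [(real(i), i = 1, n)]
  p => a(1:n-1)             ! p now aliases the first n-1 elements of a

  ! p(i-1) is really a(i-1), which the previous iteration just updated,
  ! so the iterations cannot safely be executed together as a vector.
  do i = 2, n
     a(i) = a(i) + p(i-1)
  end do

  print *, a
end program target_alias_sketch
```

In a loop like this the dependence is real, so skipping the dependency check would change the results. That is the situation the compiler has to rule out before it can vectorize loops over a target array.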

Note, adding AVX512 (-Mvect=simd:512) and relaxed fp precision (-Mfprelaxed) helps the overall time as well.

Hi Robert,

Our engineers are working on this issue but wanted me to let you know that the core problem is the combination of the target attribute and the induction variables being declared in the module. Hence, a workaround is to move the declarations of “i”, “j”, and “k” from the module into the main program. The code vectorizes as expected in this case.
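As a rough sketch of the workaround (this is not the code from heatx.txt; the module name, array names, grid size, and stencil are assumptions for illustration), the arrays keep their target attribute in the module while the loop indices are declared in the main program:

```fortran
! Hypothetical layout: heat_data, t, tnew and n are illustrative names.
module heat_data
  implicit none
  integer, parameter :: n = 64
  real, target :: t(n, n, n), tnew(n, n, n)   ! target attribute left unchanged
  ! integer :: i, j, k   ! original placement: indices declared in the module
end module heat_data

program heat_sketch
  use heat_data
  implicit none
  integer :: i, j, k      ! workaround: indices declared in the main program
  real :: diff

  t = 0.0
  t(1, :, :) = 100.0      ! simple hot boundary face
  tnew = t
  diff = 0.0

  ! One Jacobi-style sweep with a max reduction on the change
  do k = 2, n - 1
     do j = 2, n - 1
        do i = 2, n - 1
           tnew(i, j, k) = (t(i-1, j, k) + t(i+1, j, k) &
                          + t(i, j-1, k) + t(i, j+1, k) &
                          + t(i, j, k-1) + t(i, j, k+1)) / 6.0
           diff = max(diff, abs(tnew(i, j, k) - t(i, j, k)))
        end do
     end do
  end do

  print *, 'max change:', diff
end program heat_sketch
```

Compiling with something like “nvfortran -fast -Minfo=vect” should report whether the inner loop was vectorized, which gives a quick way to compare the two placements of the index declarations.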