I have a Fortran 90 code that uses OpenACC.
For some directives, the line is very long, so I split it across multiple lines using “&”.
I found a problem when trying to reduce a lot of scalars in one directive.
If I do this:
!$acc parallel loop collapse(2) default(present) &
!$acc reduction(+:h_fluxp,h_fluxm,h_fluxp_pn,h_fluxm_pn, &
!$acc h_fluxp_ps,h_fluxm_ps,eqd1,eqd2,h_ax_dipole,h_area_ps)
then the compiler output says:
1701, Generating NVIDIA GPU code
    1704, !$acc loop gang, vector(128) collapse(2) ! blockidx%x threadidx%x
          Generating reduction(+:h_area_ps,h_fluxm,h_fluxm_ps,h_fluxm_pn,eqd2)
          Generating implicit reduction(+:h_area_pn)
and some of the resulting values are wrong!
I am using:
nvfortran 22.5-0 64-bit target on x86-64 Linux -tp haswell
UPDATE! (even though first post):
It turns out I had left h_area_pn out of the reduction clause, and the compiler was detecting it implicitly. However, in doing so, the compiler messed up all the other reductions (a nasty “silent” bug, since the code ran but gave wrong results!)
If I include h_area_pn in the reduction clause, everything is OK.
So it seems that the implicit detection of a reduction scalar is messing up the explicitly declared ones.
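For anyone hitting the same thing, here is a minimal sketch of the workaround: list every accumulated scalar explicitly in the reduction clause, then compare the results against host-side intrinsics. This is not my actual code. The array a, the sizes n and m, and the loop body are made up for illustration, and only a few of the scalar names are reused. I have not confirmed that this small case reproduces the bug when h_area_pn is removed from the clause.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 64, m = 32        ! hypothetical sizes
  integer :: i, j
  real(dp) :: a(n, m)                         ! hypothetical test data
  real(dp) :: h_fluxp, h_fluxm, h_area_pn, h_area_ps

  ! Integer-valued data so the sums are exact in any order.
  do j = 1, m
    do i = 1, n
      a(i, j) = real(i - j, dp)
    end do
  end do

  h_fluxp = 0.0_dp
  h_fluxm = 0.0_dp
  h_area_pn = 0.0_dp
  h_area_ps = 0.0_dp

  ! Every scalar that is accumulated in the loop is named in the
  ! reduction clause, so nothing is left to implicit detection.
  !$acc parallel loop collapse(2) copyin(a) &
  !$acc reduction(+:h_fluxp,h_fluxm, &
  !$acc h_area_pn,h_area_ps)
  do j = 1, m
    do i = 1, n
      if (a(i, j) > 0.0_dp) then
        h_fluxp = h_fluxp + a(i, j)
        h_area_pn = h_area_pn + 1.0_dp
      else
        h_fluxm = h_fluxm + a(i, j)
        h_area_ps = h_area_ps + 1.0_dp
      end if
    end do
  end do

  ! Host-side reference values from the array intrinsics.
  if (h_fluxp /= sum(a, mask=a > 0.0_dp)) stop 1
  if (h_fluxm /= sum(a, mask=a <= 0.0_dp)) stop 2
  if (h_area_pn /= real(count(a > 0.0_dp), dp)) stop 3
  if (h_area_ps /= real(count(a <= 0.0_dp), dp)) stop 4

  print *, 'all reductions match'
end program reduction_sketch
```

Building with -acc -Minfo=accel and checking that the "Generating reduction" line names every scalar, with no "Generating implicit reduction" line, is a quick way to spot a variable missing from the clause.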