Hi all,

I’m a bit at a loss as to how to proceed. I am working with a numerical weather prediction model. When I compile a specific routine of the model code using the -Mvect=sse option, the results are erroneous. After considerable effort to isolate the problem, I am now running an idealized test in which the winds should stay zero for the whole simulation (flat terrain, atmosphere in hydrostatic equilibrium, no wind). If I compile the subroutine in question without the -Mvect=sse option, the results look perfect (all winds <1e-15 m/s, which corresponds to numerical precision). When I switch on the -Mvect=sse option, the model gives erroneous results (winds of >25 m/s) after just a single timestep.

I cannot exclude an error in the code, but the code runs fine with several other compilers (pathf90, gfortran, Cray). I am not sure whether the optimizations done by -Mvect=sse can change numerical results, but I strongly doubt that an optimization should have such a large effect in such a simple and well-defined idealized test case. Any ideas on how to proceed would be highly appreciated!

Kind regards,
Oliver Fuhrer

PS. I am compiling the whole code with -O0 and no specific optimizations (no IPA). The routine in question is compiled with the following compiler call…

/opt/pgi/9.0.4/linux86-64/9.0-4/bin/pgf901 /lus/scratch/olifu/test_wol/lm_4.14/src/fast_waves_rk.f90 -opt 2 -inform inform -nohpf -nostatic -x 19 0x400000 -quad -x 59 4 -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c -x 58 0x10000 -x 124 0x1000 -x 57 0xfb0000 -x 58 0x78031040 -x 70 0x6c00 -x 47 0x400000 -x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /opt/pgi/9.0.4/linux86-64/9.0-4/include:/usr/local/include:/usr/lib64/gcc/x86_64-suse-linux/4.1.2/include:/usr/lib64/gcc/x86_64-suse-linux/4.1.2/include:/usr/include -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW= -def __extension__= -def __amd64__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSE4A__ -def __ABM__ -idir . -idir /lus/scratch/olifu/test_wol/lm_4.14/src -idir /opt/mpt/3.5.0/xt/mpich2-pgi/include -idir /opt/mpt/3.5.0/xt/mpich2-pgi/include -idir /opt/xt-libsci/10.4.0/pgi/lib -idir /opt/xt-libsci/10.4.0/pgi/include -idir /opt/mpt/3.5.0/xt/sma/include -idir /opt/mpt/3.5.0/xt/pmi/include -idir /opt/cray/hdf5/ -idir /opt/cray/netcdf/ -idir /opt/xt-pe/2.2.48B/include -def __MPICH2 -def __CRAYXT_COMPUTE_LINUX_TARGET -def __TARGET_LINUX__ -ccff -freeform -preprocess -dclchk -vect 48 -freeform -modexport /tmp/pgf90sF7f2jo_jJcy.cmod -modindex /tmp/pgf90IF7fMy-GC3wS.cmdx -output /tmp/pgf90cF7fgbEjF9YT.ilm
  0 inform,   0 warnings,   0 severes, 0 fatal for fast_waves_rk
  0 inform,   0 warnings,   0 severes, 0 fatal for fast_waves_runge_kutta
  0 inform,   0 warnings,   0 severes, 0 fatal for w_bbc_rk
  0 inform,   0 warnings,   0 severes, 0 fatal for w_bbc_rk_up5
PGF90/x86-64 Linux 9.0-4: compilation successful

Hi again,

I’ve investigated this further, and it really seems to me that this is a bug in how the PGI Fortran compiler vectorizes the code. I’ve reduced the code to a sample which does…

a = 1.0d0
u = 0.0d0
do i = 4, ie-4
  u(i,4,1,2) = a(i+1,4,1) - a(i,4,1)
end do
write(*,*) 'u:',u(:,4,1,2)

…where the arrays are dimensioned a(ie,7,1), b(ie,7,1) and u(ie,7,1,2). This should output only zeroes, since a has been initialized to the same constant value everywhere. This works fine if the above code is compiled without “-Mvect=sse”. If it is compiled with vectorization, the resulting array contains a one at location u(5,4,1,2).
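For anyone who wants to try this without the tarball, the reduced sample can be wrapped into a standalone program roughly as sketched below. This is not the testcase sent to support. The program name, the value ie = 16 and the PASS/FAIL check are assumptions added for illustration, and b is left out because the snippet never uses it. A cut-down program like this may or may not trigger the same vectorization path as the original routine.

```fortran
program vect_repro
  implicit none
  ! ASSUMPTION: ie = 16 is an arbitrary size; the original post does not give it.
  integer, parameter :: ie = 16
  double precision :: a(ie,7,1), u(ie,7,1,2)
  integer :: i

  a = 1.0d0
  u = 0.0d0

  ! Difference of a constant field: every element should be exactly zero.
  do i = 4, ie-4
    u(i,4,1,2) = a(i+1,4,1) - a(i,4,1)
  end do

  write(*,*) 'u:', u(:,4,1,2)

  ! Self-check added for illustration (not part of the original sample).
  if (any(u /= 0.0d0)) then
    write(*,*) 'FAIL: largest |u(:,4,1,2)| at i =', maxloc(abs(u(:,4,1,2)))
    stop 1
  else
    write(*,*) 'PASS: all elements of u are zero'
  end if
end program vect_repro
```

Building it once with and once without -Mvect=sse and comparing the two outputs should show whether the vectorized loop is the culprit. A correct build prints only zeroes followed by the PASS line.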

I’ve tested this on two different machines running pgf90 10.8-0 as well as 9.0-4. I will send the *.tar.gz with a testcase reproducing the bug to customer support.


Hi Oli,

Thank you for the example code. I’ve been able to recreate the error, have created a high priority problem report (TPR#17335), and have sent it on to our engineers for further investigation.

  • Mat