numerics question, ancient bug long fixed?

kcrowell42 · March 9, 2015, 9:52pm

Hello,

I run a Fortran 90 landscape evolution model on OS X, and have been doing so since the first-generation Mac Pros with the earliest PGI compilers for OS X. I’ve seen one niggling issue with an executable compiled, I believe, with 8.0-4, 8.0-5, or 8.0-6, using the quick-start optimization flags “-fast -Mipa=fast”.

In one pair of functions that mix materials with two different erosion parameters, the developer catches nonphysical cases of negative erosion and negative parameter values passed into the function, resets the bad values to the default input parameter values, and issues a warning that the nonphysical values were caught.
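For readers unfamiliar with this kind of guard, here is a minimal sketch of the pattern, not the model's actual code. The module name `erosion_guard`, the routine `sanitize_param`, the parameter name `k_erode`, and the numeric values are all hypothetical and chosen only for illustration.

```fortran
module erosion_guard
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Hypothetical guard: if `value` is negative (nonphysical), replace it
  ! with `default_value`, print a warning, and set `caught` to .true.
  subroutine sanitize_param(value, default_value, name, caught)
    real(dp), intent(inout)      :: value
    real(dp), intent(in)         :: default_value
    character(len=*), intent(in) :: name
    logical, intent(out)         :: caught

    caught = value < 0.0_dp
    if (caught) then
      write (*, '(a,a,a,es12.4,a,es12.4)') 'warning: nonphysical ', name, &
        ' = ', value, '; using default ', default_value
      value = default_value
    end if
  end subroutine sanitize_param
end module erosion_guard

program demo
  use erosion_guard
  implicit none
  real(dp) :: k
  logical  :: caught

  k = -1.0e-3_dp                      ! illustrative bad input
  call sanitize_param(k, 2.5e-5_dp, 'k_erode', caught)
  print *, 'k after guard:', k, ' caught:', caught
end program demo
```

A guard like this only reports that a bad value arrived; it says nothing about which upstream computation produced it, which is why the warnings alone don't pinpoint the cause.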

The locations of these occurrences suggest the flow is encountering a sudden slope reversal from roughness in the digital terrain. I’m assuming some sort of overflow or underflow was happening.

With no optimization, the nonphysical values never occur, and the resulting terrain is exactly that predicted by the more optimized version. I have never been able to replicate the problem at any optimization level with later PGI releases, nor with gfortran on the original MacPro1,1, a 3-year-old laptop, and the latest MacPro6,1, nor with g95 on older Mac Pros. All executables compiled with the later PGI releases predict the same terrain; gfortran predicted “close enough” if not exact matches. That particular executable does reproduce the errors on Mac Pros and laptops with different CPUs.

It doesn’t appear that those flags introduce anything particularly dangerous. Would this be chalked up to a subtle compiler bug from back in the day? If so, what sort of bug would be most suspect? I would just like to be able to say something reasonable if asked.

Thanks,
Kelly

MatColgrove · March 10, 2015, 5:03pm

Hi Kelly,

> Would this be chalked up to a subtle compiler bug from back in the day?

Given the evidence, it’s certainly plausible. Of course, without investigation, it’s impossible to tell for sure.

> If so what sort of bug would be most suspect?

I looked through all of our 8.0 bug reports. I see only one report of incorrect answers when optimization is applied, and it only occurs when converting a single-precision Cray pointer target to double precision (TPR#15761). For example:

    subroutine sub(n)
        common /aa/ aa(10000)
        real*8 aa
        pointer (pb, bb)
        real*4 bb(100000)
        do i = 1, 800
            aa(i) = dble(bb(i))   ! << Here
        enddo
    end

So if you use Cray pointers, that could be it. Otherwise, I’m not sure.

Mat

kcrowell42 · March 10, 2015, 6:22pm

Thanks Mat,

Cray pointers weren’t used, so that shouldn’t be associated with the problem. Since it has never occurred with any other releases or without optimization, nor with 64-bit gfortran, I’ll claim it was a likely subtle compiler bug.

The bad input values that trigger the warning in the functions result from a predictor-corrector routine solving a stiff system of equations, and the sudden rise the flow sees is problematic anyway. Luckily this all happens far enough outside our region of interest, which is why I never spent the effort to clean up the input terrain in the affected areas.

Cheers,
Kelly
