Hi.
I have been struggling with porting some fortran to pgfortran cuda. I have solved my problem, but in so doing have discovered something that may be of interest. pgfortran single precision breaks down more or less where you’d expect it to, however intel ifort somehow manages to pull off higher precision than you’d expect! Here is a test program that helped me identify my problem - all it does is compare the expressions 1-cos(a) and a^2/2 (which should tend to eachother for small a) and sin(a) and a (which again for small a, tend to eachother).
program testprec
real*4 :: a
print*,' real precision : ',precision(a)
do i=1,precision(a)
a = 0.1**i
write(6,'(A3,F20.16)') 'a: ', a
write(6,'(A10,F16.12,A9,F16.12)') '1-cos(a): ', 1.0-cos(a),
- ' a^2/2 : ', a**2/2.0
write(6,'(A10,F16.12,A9,F16.12)') ' sin(a): ', sin(a),
- ' a : ', a
enddo
end
Here is the pgfortran result:
real precision : 6
a: 0.1000000014901161
1-cos(a): 0.004995822906 a^2/2 : 0.005000000354
sin(a): 0.099833421409 a : 0.100000001490
a: 0.0100000007078052
1-cos(a): 0.000050008297 a^2/2 : 0.000050000006
sin(a): 0.009999834001 a : 0.010000000708
a: 0.0010000000474975
1-cos(a): 0.000000476837 a^2/2 : 0.000000500000
sin(a): 0.000999999931 a : 0.001000000047
a: 0.0001000000120257
1-cos(a): 0.000000000000 a^2/2 : 0.000000005000
sin(a): 0.000100000012 a : 0.000100000012
a: 0.0000100000015664
1-cos(a): 0.000000000000 a^2/2 : 0.000000000050
sin(a): 0.000010000002 a : 0.000010000002
a: 0.0000010000002248
1-cos(a): 0.000000000000 a^2/2 : 0.000000000001
sin(a): 0.000001000000 a : 0.000001000000
As you can see, the evaluation of 1-cos(a) fails for a=0.0001 - you might expect this for decimal place #4 (i.e. precision(a)/2 +1). The approximation a^2/2 is still OK though - so my code now uses the approximation for small a.
The intel compiler though gives this:
real precision : 6
a: 0.1000000014901161
1-cos(a): 0.004995835014 a^2/2 : 0.005000000354
sin(a): 0.099833421409 a : 0.100000001490
a: 0.0100000007078052
1-cos(a): 0.000049999591 a^2/2 : 0.000050000006
sin(a): 0.009999834001 a : 0.010000000708
a: 0.0010000000474975
1-cos(a): 0.000000500000 a^2/2 : 0.000000500000
sin(a): 0.000999999931 a : 0.001000000047
a: 0.0001000000047497
1-cos(a): 0.000000005000 a^2/2 : 0.000000005000
sin(a): 0.000100000005 a : 0.000100000005
a: 0.0000100000006569
1-cos(a): 0.000000000050 a^2/2 : 0.000000000050
sin(a): 0.000010000001 a : 0.000010000001
a: 0.0000010000001112
1-cos(a): 0.000000000001 a^2/2 : 0.000000000001
sin(a): 0.000001000000 a : 0.000001000000
As you can see, the expression for 1-cos(a) continues to be valid for smaller and smaller a. I suspect that the intel compiler deploys versions of sine and cosine that identify that a is small and fall back on the approximations should this be the case.
If GPUs continue to be much faster in single precision for the foreseeable future, it might be a good idea to improve the trig functions in pgfortran so they are at least as good as the intel versions. Just a thought - I know you have an infinite list of ideas/requests!!!
Rob.