I have to rewrite a Fortran program so that the new version uses CUDA and its results match the old one. It contains a lot of calculations, all of which use REAL variables with no kind specified, so they are actually REAL*4. The calculations include addition (+), subtraction (-), multiplication (*), division (/), exponentiation (**), and the functions sqrt() and abs(). The modified code compiles OK, with the same warnings as before about ‘Predefined intrinsic loses intrinsic property’ in the subroutine that uses MOD(Integer,Integer). Some of the calculations are now done on the CPU and some on the GPU.
However, the results don’t match. After reading the FAQ about execution precision, I suspected that the extended precision used on Intel processors might be the cause, so I tried compiling with -Kieee and with ‘-r4 -pc 32’. The old and new results still don’t match; the differences are quite small, but they are there.
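To put a number on "quite small", a check along the following lines could be used. This is only a sketch: the array names `r_old` and `r_new` and the values in them are hypothetical placeholders, not data from my program. It reports the largest relative difference next to the single-precision machine epsilon, which shows whether the mismatch is on the order of the last bit or much larger.

```fortran
program compare_results
  implicit none
  integer, parameter :: n = 4
  ! Hypothetical stand-ins for the CPU-only and the CPU+GPU results.
  real :: r_old(n), r_new(n)
  real :: max_rel

  ! Placeholder values only; the real program would fill these from its two runs.
  r_old = (/ 1.0, 2.5, -3.75, 1000.0 /)
  r_new = r_old
  r_new(2) = nearest(r_old(2), 1.0)   ! make one element differ in the last place

  ! Largest relative difference, guarded against division by zero.
  max_rel = maxval(abs(r_new - r_old) / max(abs(r_old), tiny(1.0)))

  print *, 'max relative difference :', max_rel
  print *, 'single-precision epsilon:', epsilon(1.0)
end program compare_results
```
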
At this point, how do I get the results to match? Does the GPU perform arithmetic on REAL variables in single or double precision? Do the options ‘-r4’ or ‘-pc xx’ affect GPU calculations at all?
I also have some questions about Fortran programming.
For division, do B = A / 3 and B = A / 3.0 behave the same? If B is a REAL * 4, what happens if I assign B = A / 3.0D0? Will the result be different?
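To make the question concrete, here is a minimal sketch of the three cases. It assumes A is also a default (4-byte) REAL, and the program and variable names are made up for illustration. As far as I understand the mixed-mode rules, the integer 3 is converted to REAL before the divide, while with 3.0D0 the divide is done in double precision and the quotient is rounded back to single when it is stored in B. I would like confirmation that this is right.

```fortran
program division_forms
  implicit none
  ! Default REAL, assumed here to be 4 bytes (no -r8 or similar option).
  real :: a, b_int, b_real, b_dble

  a = 10.0                 ! arbitrary test value

  b_int  = a / 3           ! integer constant: converted to REAL before dividing
  b_real = a / 3.0         ! default REAL constant
  b_dble = a / 3.0d0       ! a is promoted to DOUBLE PRECISION, divided, and the
                           ! quotient is rounded to REAL on assignment

  print *, b_int, b_real, b_dble
  print *, 'a/3   == a/3.0   ?', b_int  == b_real
  print *, 'a/3.0 == a/3.0d0 ?', b_dble == b_real
end program division_forms
```

Running this with a range of values for `a`, on the CPU and inside a device kernel, would show whether the three forms ever differ in practice.
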