Slight difference in CPU-serial and GPU-parallel solutions

hemanthgrylls · August 2, 2021, 8:38pm

I am porting my serial version of code on to the GPU using OpenACC. Although the speedup and the solution output is good, I am seeing a slight (very slight) difference in the solution output (in my case density of a flow-field) between the CPU (serial) and GPU (parallel) runs.

I want to know if this is common and observed commonly by others too, or I am having a tiny bug in my GPU porting that I should take care of. Also please let me know if any of you have experienced similar kind of stuff while porting your code.

It is worth noting that the nature of the program that I am testing is very sensitive to the initial conditions (because of the nature of equations involved in the background) and also to the intermediate states that occur in the time loop. Some major bit flips during execution can change the final solution to an observable extent.

I am running my program using Fortran in double precision on Nvidia V100 with pgfortran compiler.

I have posted this question on stack overflow but the question was removed for some reason… I hope my question is clear.

MatColgrove · August 2, 2021, 11:41pm

It’s not abnormal to see slight difference when using massive parallelization due to rounding error difference. In particular if the code uses reductions. It’s also not uncommon to see differences when just going to different CPU architectures as well (ARM vs x86_64 vs Power) or even between compilers.

You can try adding “-Kieee -Mnofma” to your compilation. -Kieee has the compiler use strict IEEE 754 compliance though wont help with differences due to parallelization and may hurt performance. Even though FMA (fuse-multiply add) instructions are more accurate, they can cause differences in accuracy when compared to a result that doesn’t use them.

I have posted this question on stack overflow but the question was removed for some reason… I hope my question is clear.

I saw that. No idea why it was closed, though the answer ‘njuffa’ gave was good.

system · October 1, 2021, 11:41pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.