Hi all, I know that recent generations of GPUs are IEEE-compliant and support double precision. However, I still wonder about the details of the floating-point units on GPUs. The FPU in an x86 processor may use an underlying 80-bit extended format for floating-point numbers. I cannot find technical details on the FPU of GPUs; does it also implement double precision on top of an 80-bit format?

This problem comes from my recent work. In some cases (although the probability is very, very low), floating-point arithmetic operations on the GPU behave differently from those on the CPU. On the CPU, I have forced double precision to use 64 bits rather than 80.

Additionally, I think this problem is also related to the compiler. My program did not show such problems with CUDA 2.0, but does with all later CUDA versions.

I have not figured out a way to isolate this problem from my complicated program, so I am sorry for the long and boring text…