The OpenCL header file from the CUDA SDK doesn’t define CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT, nor do NVIDIA OpenCL devices report that flag in their CL_DEVICE_SINGLE_FP_CONFIG information (http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf#page=234). This means division and square root computed on these devices are not correctly rounded as the IEEE 754 spec requires, and you can trivially check this for yourself.
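For anyone who wants to reproduce the check, here is a minimal host-side sketch in C. It assumes you have already read the device's single-precision config with clGetDeviceInfo(device, CL_DEVICE_SINGLE_FP_CONFIG, ...) and copied a kernel's result back to the host. The function names (reports_correct_div_sqrt, reference_divide, ulp_distance) are made up for illustration, and the fallback bit value (1 << 7) is the one from the OpenCL 1.2 cl.h, supplied only because an older header will not define it.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: value copied from the OpenCL 1.2 cl.h. An OpenCL 1.1 header
 * does not define this flag, so supply it when it is missing. */
#ifndef CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT
#define CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT (1 << 7)
#endif

/* fp_config is the cl_device_fp_config bitfield (a 64-bit unsigned value)
 * returned for CL_DEVICE_SINGLE_FP_CONFIG. Returns 1 if the device claims
 * correctly rounded divide and sqrt, 0 otherwise. */
int reports_correct_div_sqrt(uint64_t fp_config)
{
    return (fp_config & (uint64_t)CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT) != 0;
}

/* Host-side reference quotient: dividing in double and rounding once to
 * float yields the correctly rounded single-precision result. */
float reference_divide(float x, float y)
{
    return (float)((double)x / (double)y);
}

/* Distance in units in the last place between two floats.
 * Only meaningful for finite values of the same sign. */
uint32_t ulp_distance(float a, float b)
{
    uint32_t ia, ib;
    memcpy(&ia, &a, sizeof ia);   /* reinterpret the bits without aliasing issues */
    memcpy(&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
}
```

To use it, run a kernel that computes x / y on the device, then compare each result with reference_divide(x, y) using ulp_distance. A non-zero distance for any input means that quotient was not correctly rounded.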
This is a bit weird, since NVIDIA has published a paper highlighting how IEEE 754 compliant its GPUs are: https://developer.nvidia.com/sites/default/files/akamai/cuda/files/NVIDIA-CUDA-Floating-Point.pdf, and there is an entire list of the Compute Capabilities of various devices (https://developer.nvidia.com/cuda-gpus), yet we can’t do correctly rounded division or square roots on any of them using OpenCL.
Was this overlooked, is it planned for a later release, or is there an inherent reason why it won’t or can’t be implemented? Or have I just missed something, like downloading the wrong driver? (I have the CUDA 5.5 SDK.)