OpenCL sqrt() precision

I got different results computing sqrt(178868) on the GPU in two different ways.

  1. computing on GPU, 178868 given as a constant defined inside kernel

int a = 178868;
output[0] = sqrt( (float) a);

-> result = 422.927886962890625

  2. computing on GPU, 178868 given as a kernel parameter

output[1] = sqrt( (float) i); // i = 178868

-> result = 422.9278564453125

I suppose that in case 1 the OpenCL compiler computes sqrt(178868) itself, but why is the result different from case 2?

I also tried computing sqrtf(178868) on the CPU: the result is identical to GPU case 1.

The kernel code I use:

__kernel void testsqrt( __global float *output, int i)
{
    int a = 178868;
    output[0] = sqrt( (float) a);   // case 1: constant known at compile time

    output[1] = sqrt( (float) i);   // case 2: i = 178868 passed at run time
}

My config :
Intel Xeon E5620
Ubuntu 11.10 64 bits
CUDA toolkit 4.1.28
NVIDIA developer driver 285.05.33

This is common to many compilers: when arithmetic operations are performed on constants, or on variables holding constants (true constants, not data from constant memory :) ), the compiler pre-computes the result and uses it instead of compiling the computation itself.

int a = 178868;
b = sqrt( (float) a);
becomes, before translation to machine code:
b = 422.927886962890625;

As the compiler runs on the CPU, it uses the CPU's floating-point operations, which are IEEE-compliant, so the number is correctly rounded and naturally equal to the same operation computed independently on the same CPU (or on another IEEE-compliant CPU).
But your GPU is NOT designed to be fully IEEE-compliant, and the more complex floating-point operations take shortcuts to be faster instead of returning results with full-precision correctness. This is absolutely normal, and correct behavior for a GPU.
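
To put a number on how far apart the two results are, here is a minimal host-side sketch in C that counts the distance between two floats in units in the last place (ULP). The helper names float_bits and ulp_distance are mine, not part of any OpenCL or CUDA API, and the code assumes 32-bit IEEE-754 floats and strictly positive inputs. As far as I recall, the OpenCL 1.x specification allows single-precision sqrt an error of up to 3 ulp, so a small difference like the one you see is within what an implementation is permitted to return.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer.
   memcpy is used to avoid strict-aliasing problems. */
static int32_t float_bits(float x)
{
    int32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Distance in ULPs between two positive floats.
   For positive IEEE-754 floats, consecutive values have
   consecutive bit patterns, so the integer difference of
   the bit patterns is the number of representable steps. */
static int32_t ulp_distance(float a, float b)
{
    assert(a > 0.0f && b > 0.0f);   /* sketch only handles positive inputs */
    int32_t d = float_bits(a) - float_bits(b);
    return d < 0 ? -d : d;
}
```

With the two values from the question, ulp_distance(422.927886962890625f, 422.9278564453125f) gives 1, i.e. the run-time GPU result is one representable float below the correctly rounded value that the compiler (and sqrtf on the CPU) produced.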