OpenCL sqrt() precision

sqrt · March 23, 2012, 1:52pm

I get different results when computing sqrt(178868) on the GPU in two different ways.

Case 1: computing on the GPU, with 178868 given as a constant defined inside the kernel

int a = 178868;
output[0] = sqrt( (float) a);

→ result = 422.927886962890625

Case 2: computing on the GPU, with 178868 given as a kernel parameter

output[1] = sqrt( (float) i); // i = 178868

→ result = 422.9278564453125

I suppose that in case 1 the OpenCL compiler computes sqrt(178868) itself, but why is the result different from case 2?

I also tried computing sqrtf(178868) on the CPU: the result is identical to GPU case 1.

The kernel code I use:

__kernel void testsqrt(__global float *output, int i)
{
    int a = 178868;
    output[0] = sqrt((float) a);

    output[1] = sqrt((float) i);
}

My config:
NVIDIA GTS450, 2 GB RAM
Intel Xeon E5620
Ubuntu 11.10 64 bits
CUDA toolkit 4.1.28
NVIDIA developer driver 285.05.33

parallelis · March 27, 2012, 5:37pm

This is common to many compilers: when arithmetic operations are done on constants, or on variables that hold constants (true constants, not data from constant memory :) ), the compiler pre-computes the result and uses it instead of compiling the computation itself.

For example:
int a = 178868;
b = sqrt( (float) a);
becomes, before the translation to machine-language code:
b = 422.927886962890625;

As the compiler runs on the CPU, it uses the CPU's floating-point operations, which are IEEE-compliant, so the number is correctly rounded and naturally equal to the result of the same operation computed independently on the same CPU (or on another IEEE-compliant CPU).
But your GPU is NOT designed to return fully IEEE-compliant results for every operation: the more complex floating-point operations take shortcuts to be faster instead of returning results with full-precision correctness. Here the two results differ only in the last bit of the float. This is absolutely normal, and correct behavior for a GPU.
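
To put a number on the difference, here is a minimal host-side C sketch (not part of the original kernel) that compares the bit patterns of the two results reported in the question. The helper names float_bits, ulp_distance and reported_ulp_gap are made up for this illustration, and the code assumes that float is an IEEE 754 32-bit type on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer.
   memcpy avoids undefined behavior from pointer casts. */
static int32_t float_bits(float x)
{
    int32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Distance in units in the last place (ULPs) between two finite
   floats of the same sign. For such values, adjacent floats have
   adjacent integer bit patterns. */
static int32_t ulp_distance(float a, float b)
{
    int32_t ia = float_bits(a);
    int32_t ib = float_bits(b);
    assert((ia < 0) == (ib < 0)); /* precondition: same sign */
    int32_t d = ia - ib;
    return d < 0 ? -d : d;
}

/* The two values reported in the question. */
static int32_t reported_ulp_gap(void)
{
    const float folded  = 422.927886962890625f; /* case 1: constant inside the kernel */
    const float runtime = 422.9278564453125f;   /* case 2: kernel parameter */
    return ulp_distance(folded, runtime);
}
```

Calling reported_ulp_gap() returns 1: the value computed at run time on the GPU is exactly one ULP (2^-15 at this magnitude) below the folded constant. As far as I know, the OpenCL 1.x specification allows single-precision sqrt an error of up to 3 ulp, so both results are within the permitted tolerance.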