Why can't I see any difference between pow() and powf() in CUDA? Am I doing something wrong?
I tested them with the following code and compared the results with the value returned by pow() from math.h in C.
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void addKernel(double *dev_c)
{
    // Same base and exponent in both calls; only the function differs.
    dev_c[0] = pow (2.70134219723423422342334134, 2.70134219723423422342334134);
    dev_c[1] = powf(2.70134219723423422342334134, 2.70134219723423422342334134);
}

int main()
{
    double c[2] = {0.0, 0.0};
    double *dev_c;

    cudaMalloc((void**)&dev_c, 2 * sizeof(double));
    addKernel<<<1, 1>>>(dev_c);
    // Copy both results back to the host (this call also waits for the kernel).
    cudaMemcpy(c, dev_c, 2 * sizeof(double), cudaMemcpyDeviceToHost);

    printf("CUDA math double precision: %.24f \n", c[0]);
    printf("CUDA math single precision: %.24f \n", c[1]);

    cudaFree(dev_c);
    getchar();
    return 0;
}
The output is:
CUDA math double precision: 14.650218963623047000000000
CUDA math single precision: 14.650218963623047000000000
For comparison, pow() from math.h on the host gives:
C math.h double precision: 14.650221542435155000000000
Or is there something I failed to do that keeps CUDA double precision from working?
Many thanks in advance.