I’m new to CUDA programming, and I can’t seem to get my compute capability 2.1 card (Quadro NVS 4200M) to do double-precision math.

To compile I’ve been using: nvcc test.cpp -o test -arch=sm_13

I have compiled with both -arch=sm_13 and -arch=sm_20, but a simple program (below) still shows a loss of precision in the CUDA kernel. I get no compile-time warning about double being demoted to float.

What other compile flags are required to support double precision math?

Here is a simple example (a loop of 50 multiplications) where the result computed on the Linux (Ubuntu 10.04) host differs in precision from the result of the CUDA test kernel I wrote.

Thanks so much for the help!

double a = 1.112321232123212223432;
double b = 1.234323334323343234323;
double c = 1.0;

// Host Calculation
for(int i=0; i<50; ++i) c *= (a * b);

// CUDA Calculation
__global__ void multiplyLoop(double a, double b, double* c)
{
    if(threadIdx.x==0 && blockIdx.x==0)
    {
        *c = 1.0;
        for(int i=0; i<50; ++i) *c *= (a * b);
    }
}