Wrong results for double precision calculations: not setting arch=sm_13 causes incorrect results

Hi,

I have a CUDA program where the kernel uses doubles, e.g.

__global__ void saxpy_parallel(int n, double a, double *x, double *y)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a*x[i] + y[i];
    }
}
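For completeness, a stripped-down host-side launch of this kernel would look roughly like the following (array size, values and launch configuration are placeholders I made up, error checking omitted):

int n = 1 << 20;                              // placeholder problem size
size_t bytes = n * sizeof(double);
double *h_x = (double*)malloc(bytes);
double *h_y = (double*)malloc(bytes);
// ... fill h_x / h_y with input data ...
double *d_x, *d_y;
cudaMalloc((void**)&d_x, bytes);
cudaMalloc((void**)&d_y, bytes);
cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
cudaMemcpy(d_y, h_y, bytes, cudaMemcpyHostToDevice);
int threads = 256;
int blocks  = (n + threads - 1) / threads;    // enough blocks to cover n elements
saxpy_parallel<<<blocks, threads>>>(n, 2.0, d_x, d_y);
cudaMemcpy(h_y, d_y, bytes, cudaMemcpyDeviceToHost);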

If I set the GPU Architecture to sm_13, everything is fine. However, with the default sm_10 architecture (where double is not supported) I get incorrect results when running the application, even in parts of the code that don't use double precision at all!
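For reference, on the command line the target architecture is selected with nvcc's -arch switch (in the Visual Studio project this corresponds to the GPU Architecture setting of the CUDA build rule). The file name below is just an example; the first command generates native double precision code, the second one is the default and demotes doubles to float:

nvcc -arch=sm_13 -c saxpy.cu
nvcc -arch=sm_10 -c saxpy.cu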

The compiler only emits a warning: “warning : Double is not supported. Demoting to float”
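To rule out that the application runs on a device without native double support, one could check the compute capability at run time. This is just a sketch (the helper function and the device index are my own, not part of the original program):

#include <cstdio>
#include <cuda_runtime.h>

// Warn if the selected device cannot do native double precision (needs cc >= 1.3).
void checkDoubleSupport(int dev)   // dev = whichever device the application uses
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, dev);
    if (prop.major == 1 && prop.minor < 3)
        printf("Device %s has cc %d.%d: doubles are demoted to float\n",
               prop.name, prop.major, prop.minor);
}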

Can anyone reproduce the problem?

The funny thing is that on a different operating system (using the same GPU hardware and CUDA program) I also get the warning, but the application produces correct(!) results. Why not on Windows?

Greets, Sandra

My system:

Windows Server 2008, 64-bit

CUDA Toolkit 3.2 (version included in the Parallel Nsight installation)

2 GPUs of a Tesla S1070 (compute capability 1.3)

Visual Studio 2008

Driver 260.93
