Hi,
I have a CUDA program where the kernel uses doubles, e.g.
__global__ void saxpy_parallel(int n, double a, double *x, double *y)
{
    // One thread per element
    int i = blockIdx.x*blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a*x[i] + y[i];
    }
}
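For anyone trying to reproduce this, a minimal self-contained sketch might look like the following. Only the kernel is taken from the post; the host code, array size, input values, launch configuration, and CPU reference check are assumptions for illustration, not the original application.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cuda_runtime.h>

__global__ void saxpy_parallel(int n, double a, double *x, double *y)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a*x[i] + y[i];
    }
}

int main()
{
    const int n = 1 << 16;              // assumed problem size
    const double a = 2.0;               // assumed scalar
    const size_t bytes = n * sizeof(double);

    // Host buffers plus a CPU reference result to compare against
    double *hx  = (double*)malloc(bytes);
    double *hy  = (double*)malloc(bytes);
    double *ref = (double*)malloc(bytes);
    for (int i = 0; i < n; ++i) {
        hx[i]  = (double)i;
        hy[i]  = 1.0;
        ref[i] = a*hx[i] + hy[i];
    }

    // Device buffers are allocated and filled as doubles on the host side
    double *dx = 0, *dy = 0;
    cudaMalloc((void**)&dx, bytes);
    cudaMalloc((void**)&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;            // assumed block size
    const int blocks  = (n + threads - 1) / threads;
    saxpy_parallel<<<blocks, threads>>>(n, a, dx, dy);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("Kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // cudaMemcpy blocks until the kernel has finished
    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);

    // Count elements that differ from the CPU reference
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (fabs(hy[i] - ref[i]) > 1e-9) ++mismatches;
    }
    printf("%d of %d elements differ from the CPU reference\n", mismatches, n);

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy); free(ref);
    return 0;
}
```

Building this once for sm_13 and once for sm_10 and comparing the reported mismatch counts should show whether the behaviour described here can be reproduced on another setup.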
If I set the GPU architecture to sm_13, everything is fine. However, with the default sm_10 architecture (where double is not supported), I get incorrect results when running the application, even if I don't use double precision!
The compiler only issues a warning: “warning : Double is not supported. Demoting to float”
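One possible way to turn this warning into a hard build error is a compile-time guard on the `__CUDA_ARCH__` macro, which nvcc defines during device compilation. This is only a sketch: the error message is made up, and the guard is assumed to sit at the top of the .cu file that contains the kernel.

```cuda
// Hypothetical guard: stop the build instead of silently demoting double to float.
// __CUDA_ARCH__ is 130 when compiling device code for sm_13.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 130)
#error "Double precision needs compute capability 1.3: build with -arch=sm_13 or higher"
#endif
```

On the command line the target would be selected with something like `nvcc -arch=sm_13 saxpy.cu -o saxpy` (the file name is hypothetical); in Visual Studio it corresponds to the GPU Architecture setting mentioned above.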
Can anyone reproduce the problem?
The funny thing is that on a different operating system (using the same GPU hardware and CUDA program) I also get the warning, but the application produces correct(!) results. Why not on Windows?
Greets, Sandra
My system:
Windows Server 2008, 64-bit
CUDA Toolkit 3.2 (version included in the Parallel Nsight installation)
2 GPUs of a Tesla S1070 (compute capability 1.3)
Visual Studio 2008
Driver 260.93