Issues with double precision support on GT200

I recently obtained a GTX 260 and wanted to test double precision support. I figured a simple way to do it would be to switch all the float variables to double in the matrixMul example from the SDK.

I did that, and it compiles and runs without errors, but the numbers are completely wrong. Not only that, but commenting out the kernel call still gives me the exact same numbers, so it seems the kernel is either not doing anything or writing to the wrong part of memory.

I am confused about how double precision is supported. I'm running the latest version of CUDA (2.0 beta2, on Linux), which I thought would be all that was needed.

Just to make it clear here are a couple of snippets from the code (but it is literally just the matrixMul example with the float variables changed to double):



unsigned int mem_size_C = sizeof(double) * size_C;

// allocate device memory for result
double* d_C;
CUDA_SAFE_CALL(cudaMalloc((void**) &d_C, mem_size_C));

__global__ void
matrixMul( double* C, double* A, double* B, int wA, int wB)

__shared__ double As[BLOCK_SIZE][BLOCK_SIZE];

While your problem does not sound like it has anything to do with that, make sure you are compiling for compute model 1.3, otherwise you will only get single precision.
I assume you changed the randomInit function etc. to double, too?
Are the values calculated on the CPU via computeGold correct?
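As a side note, you can also confirm at runtime that the device itself has double-precision hardware (a GT200-based GTX 260 should report compute capability 1.3). A minimal sketch using the runtime API:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Query the first CUDA device and print its compute capability.
    // Double precision needs compute capability 1.3 or higher.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    if (prop.major == 1 && prop.minor < 3)
        printf("This device has no double-precision hardware.\n");
    return 0;
}
```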

That's because the pointer returned by cudaMalloc() might have been the same between the previous run (with the kernel) and the next run (without invoking the kernel). Since the global memory was not overwritten, you got the same output. That is a possible reason…

I have seen such behaviour before.
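One way to rule that out (a minimal sketch, reusing the d_C, mem_size_C, and h_C names from the matrixMul example): poison the result buffer before the launch, so leftover data from an earlier run cannot masquerade as kernel output.

```cuda
// Fill the result buffer with a recognizable garbage pattern.
// If the kernel never runs (or writes elsewhere), the copied-back
// h_C will contain this pattern instead of plausible products.
CUDA_SAFE_CALL(cudaMemset(d_C, 0xFF, mem_size_C));

// ... launch the matrixMul kernel here ...

CUDA_SAFE_CALL(cudaMemcpy(h_C, d_C, mem_size_C, cudaMemcpyDeviceToHost));
```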

Where do I check if I’m compiling for 1.3?

I did change all the helper functions, and computeGold is working fine, as I get basically the same numbers there as with single precision.

That is a good point, and I've seen it happen before, but it's not the case here: I've tried changing the kernel on purpose so it would give me different numbers, but they stayed the same.

You need to pass -arch compute_13 or -code compute_13 (depending on the compilation target, I think) to nvcc. Note that this is only in theory: I only tried it once and it did not work (probably I did something stupid).

Best to compile the code with -ptx (or, better, with -cubin and use decuda) to verify that it actually generates double-precision operations.

To enable double precision, you need to pass the flag “-arch sm_13” to nvcc

nvcc -arch sm_13
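To illustrate why the flag matters, here is a trivial double-precision kernel (hypothetical, not from the SDK sample) you can compile both ways. If I remember correctly, without -arch sm_13 nvcc silently demotes double to float (with a warning), so the generated PTX contains only .f32 operations instead of .f64 ones.

```cuda
// scale.cu -- minimal double-precision kernel for checking codegen.
__global__ void scale(double* x, double a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= a;   // should compile to mul.f64 with -arch sm_13
}
// Compile with:  nvcc -arch sm_13 -ptx scale.cu
// then look for .f64 instructions in scale.ptx to confirm that real
// double-precision operations were emitted.
```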

Thanks, this worked.