How to activate double-precision computation

Dear all:
I bought a GTX 260 card. According to the official documentation, this chip
supports double precision, but when I use “double” for my calculations,
the result is wrong. However, it works when I use emulation mode.

(My approach is to change “float” to “double” in the original code, which
works correctly with “float” computation.)

Does anyone have the same problem?

                      Lung Sheng

When compiling, pass the flag “-arch sm_13” (without the quotes) to the compiler, i.e.

nvcc -arch sm_13

This enables the program to be compiled for double precision. If you are still getting wrong answers after this, then the problem is with your code. I assume that you are using the latest version of the CUDA toolkit.
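If the answers still look wrong, one way to tell whether a value is really being carried in double rather than float is to probe it with an increment that float cannot represent. The following is a minimal host-side sketch in plain C++, not the original poster's code; the helper name `keeps_small_increment` is made up for this illustration, and the same arithmetic would have to be moved into a kernel to test the device path.

```cpp
#include <cassert>

// Hypothetical helper: returns true when type T can tell 1 + 2^-40 apart from 1.
// float has a 24-bit significand, so the increment is rounded away;
// double has a 53-bit significand, so the increment survives.
template <typename T>
bool keeps_small_increment() {
    volatile T one  = static_cast<T>(1);
    volatile T tiny = static_cast<T>(1.0 / 1099511627776.0);  // 2^-40
    volatile T sum  = one + tiny;  // volatile forces the result to be stored as T
    return sum != one;
}
```

Calling `keeps_small_increment<double>()` and `keeps_small_increment<float>()` side by side shows the difference between the two types on the host.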

Thanks a lot, it works.

However, when I use “-arch sm_13” with single-precision “float” code, it also works.

Does nvcc convert every “float” to “double”?

Besides, the nvcc manual says:


   --gpu-code <gpu architecture name>,... (-code)

       Specify the name of nVidia gpu to generate code for.

       Unless option -export-dir is specified (see below), nvcc will embed a compiled
       code image in the executable for each specified 'code' architecture, which is a
       true binary load image for each 'real' architecture (such as a sm_13), and ptx
       code for each virtual architecture (such as compute_10). During runtime, such
       embedded ptx code will be dynamically compiled by the cuda runtime system if no
       binary load image is found for the 'current' GPU, and provided that the ptx
       level is compatible with this current GPU.

       Architectures specified for options -arch and -code may be virtual as well as
       real, but the 'code' architectures must be compatible with the 'arch'
       architecture.  For instance, 'arch'=compute_13 is not compatible with 'code'=sm_10,
       because the earlier compilation stages will assume the availability of
       compute_13 features that are not present on sm_10.  This option defaults to the
       value of option '-arch'.

       Allowed values for this option: 'compute_10', 'compute_11', 'compute_13',
       'compute_14', 'compute_20', 'sm_10', 'sm_11', 'sm_13', 'sm_14', 'sm_20'.


I don’t see any further description of these choices, including sm_13.

Do you know of any documentation for “-arch”?


                                                      Lung Sheng

It’s in Appendix A of the programmer’s guide, I think, which lists the compute capabilities of each card.

-arch sm_13 does not convert all floats to doubles. However, if you have constants without an “f” suffix (2.0 instead of 2.0f, for example), they will be treated as doubles, because that is their type in standard C.
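The literal rule can be checked at compile time. The sketch below is plain C++ that a host compiler (or nvcc, for host code) should accept; the two helper functions are hypothetical names used only for illustration. It shows that an unsuffixed literal is a double and that mixing it with a float promotes the whole expression to double.

```cpp
#include <cassert>
#include <type_traits>

// An unsuffixed floating-point literal has type double.
static_assert(std::is_same<decltype(2.0), double>::value, "2.0 is a double");
// The f suffix makes the literal a float.
static_assert(std::is_same<decltype(2.0f), float>::value, "2.0f is a float");

// float * double is evaluated in double; float * float stays in float.
static_assert(std::is_same<decltype(1.0f * 2.0), double>::value, "promoted to double");
static_assert(std::is_same<decltype(1.0f * 2.0f), float>::value, "stays float");

// Hypothetical helpers: the first multiplies in double and narrows the
// result back to float on return; the second never leaves single precision.
inline float scale_with_double_literal(float x) { return x * 2.0; }
inline float scale_with_float_literal(float x)  { return x * 2.0f; }
```

In code meant to stay in single precision, writing the “f” suffix on every constant avoids these silent promotions.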

When I try the option -arch sm_13 I get the following error:

nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified

This is the custom build step I defined in Visual Studio for compiling the .cu file:

"$(CUDA_BIN_PATH)\nvcc.exe" –arch sm_13 -ccbin "$(VCInstallDir)bin" -c -D_DEBUG -DWIN64 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/Od,/RTC1,/Zi,/MTd -I"$(CUDA_INC_PATH)" -I./ -I"C:\NVIDIA GPU Computing SDK\C\common\inc" -o $(ConfigurationName)$(InputName).obj $(InputName).cu

The command line I was using for single precision was exactly the same, except for the option -arch sm_13 that was not there. Could someone please help me? Thank you.