enable double precision for SDK I can't figure out where in the makefile the -arch sm_13 should

JeremiahPalmer · October 15, 2009, 1:48pm

Howdy, all!

I have downloaded, compiled, and run the CUDA SDK. Now, I’d like to run some of the examples under double precision. However, every time I put the -arch sm_13 flag in a place in a makefile where I think it should go, the compiler complains. Has anyone out there had success in compiling the SDK with DP? If so, can you please tell me exactly where I should put the -arch sm_13 flag?

Thanks,
Jeremiah

jgoffeney · October 15, 2009, 3:30pm

It should directly follow each invocation of nvcc in the Makefile such as:

[codebox]nvcc -arch sm_13 -c -Xcompiler -m64,-pipe -I "…/CudaCommon" -I "src" -I "/home/jason.goffeney/Tools/Linux3rdParty/gdal/include" -I "/home/jason.goffeney/Tools/Linux3rdParty/glew/include" -I "/home/jason.goffeney/Tools/Linux3rdParty/sqlite3/include" -I "/usr/local/cuda/include" …/CudaCommon/kernels/itmFunctions/itm_kernels.cu -o ${OBJECTS_DIR}itm_kernels_cuda.obj[/codebox]

JeremiahPalmer · October 15, 2009, 3:48pm

Thanks for your reply; however, the SDK’s Makefiles use a lot of aliasing and split up their flags into different lines of the Makefile. Do you know where the flag should go in the SDK Makefiles, specifically?

jgoffeney · October 15, 2009, 4:04pm

Try looking in common.mk in the common directory of the SDK root which is included in each project Makefile. There is a line commented out as NVCCFLAGS += $(SMVERSIONFLAGS). If you uncomment it and in the project Makefile add SMVERSIONFLAGS = -arch sm_13, I think it will work.

JeremiahPalmer · October 15, 2009, 4:05pm

That did it. Thanks!

MacFan · November 13, 2009, 1:40pm

This thread is helping me a lot, as I just wasted a morning trying to compare some float and double codes for a certain function.

Nevertheless, I have to wonder why this is necessary. This is the first time I have encountered a compiler that needs the makefile to be edited to allow double precision arithmetic. I have only wasted 2 hours on this but I suspect others may have wasted a lot more. Is there a good reason for this, and why is dp not enabled by default?

I had just changed float to double in a global typedef and was baffled until I saw this thread! My dp code is still not behaving properly but the warnings have gone - anything else I should look out for? I am working with CUDA 3.0 beta on Snow Leopard 10.6.2 and a GTX 285 Mac Edition.

avidday · November 13, 2009, 1:46pm

Because the greatest proportion of CUDA capable GPUs in the wild can’t do double precision. Right now nvcc/nvopencc has to be able to compile for four different architectures. Soon it will be five. The architecture selection in nvcc isn’t any different to gcc requiring command line options to generate 64-bit code or SSE3 instructions, or any other architecture specific features.
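A host-side illustration of what is at stake, offered as a sketch rather than anything taken from nvcc itself: when device code is compiled for an architecture below sm_13, nvcc demotes double to float and prints a warning, so the arithmetic quietly runs at single precision. The helper names below are hypothetical, and the code only mimics that loss of precision in plain C++.

```cpp
#include <cassert>

// Hypothetical helper: performs the addition in single precision,
// roughly what demoted device code would do.
bool increment_survives_float(double base, double inc) {
    float b = static_cast<float>(base);
    float sum = b + static_cast<float>(inc);  // rounded to float on assignment
    return sum != b;
}

// Same check in genuine double precision
// (what compiling with -arch sm_13 makes available on the device).
bool increment_survives_double(double base, double inc) {
    double sum = base + inc;
    return sum != base;
}

// Usage: 1e-10 is below float resolution near 1.0 but well within double's.
void demo() {
    assert(!increment_survives_float(1.0, 1e-10));
    assert(increment_survives_double(1.0, 1e-10));
}
```

Results that agree to about seven significant digits and then diverge are a typical sign that a "double" kernel was actually built without the flag.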

JeremiahPalmer · November 13, 2009, 3:17pm

I also had to add some double precision versions of functions in cutil.h. In other words, the utility functions in the SDK lack dp.

MacFan, I have had lots of trouble with getting dp code to work correctly. In some cases, my kernels launch but don’t do the complete work. In other cases, I get ULFs (unspecified launch failures). I have had many eyes look over my code and no one can find anything wrong with it. I submitted a bug report a few weeks ago but haven’t heard back yet.

MacFan · November 16, 2009, 10:23am

I spent the weekend completely confused and then realized that (a) I did not have the sm_13 make option properly included and (b) I had not enforced the double type uniformly throughout the code. For (b) I have now moved the type definition to a single typedef statement that I can switch from float to double and that permeates the entire code. The code now compiles and the arithmetical gibberish has gone.
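The single-typedef approach described here might look roughly like the following. This is a minimal sketch, not MacFan's actual code: the names `real`, `USE_DOUBLE`, `scaled_sum`, and `kTenth` are hypothetical. It is plain C++ that should also pass through nvcc as host code. With `real` set to double, kernels still need the -arch sm_13 flag discussed above.

```cpp
#include <cstddef>

// Single switch point: build with -DUSE_DOUBLE to select double.
// (USE_DOUBLE and real are hypothetical names.)
#ifdef USE_DOUBLE
typedef double real;
#else
typedef float real;
#endif

// Every routine is written against `real`, never a bare float or double,
// so changing the typedef changes the precision everywhere at once.
real scaled_sum(real a, const real* x, std::size_t n) {
    real s = 0;
    for (std::size_t i = 0; i < n; ++i) {
        s += a * x[i];
    }
    return s;
}

// Literals need care too: 0.1f is a float literal, while a bare 0.1 is a
// double literal and promotes the surrounding expression to double.
// Converting through `real` keeps constants at the selected precision.
const real kTenth = static_cast<real>(0.1);
```

A stray `float` temporary or `f`-suffixed literal left anywhere in the call chain is enough to lose the extra digits, which matches point (b) above.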

What I am doing is some algorithms for inverse CDFs for probability applications, and I am looking at the speed improvements to be had by eliminating IF statements where one branch is slower than the others and produces a stall. As of late yesterday I have got a speedup of 2-3x in DP over a well known method - but I need to check that the answers are indeed correct and do have the desired precision, ~10^-15.
