Is double precision enabled by default when using CUDA 4.0 with a Compute Capability 2.0 GPU? I ask because I read that you have to pass the -arch=sm_13 flag to nvcc to enable it. Is that true only for older CUDA versions, or for CUDA 4.0 as well?
P.S. I do get two warnings while compiling my code saying that double is not supported and is being demoted to float. My understanding is that these warnings come only from compilation for the lower architectures, since the Makefile that ships with the sample CUDA programs compiles for all architectures. I am building with that default Makefile and have only edited it to use my source code instead of the sample source.
Any help is most appreciated. Thanks.
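You need to specify an architecture of sm_13 or higher (for example, -arch=sm_13, or -arch=sm_20 for your GPU). The default is sm_10 (Compute Capability 1.0), which has no double precision support.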
I would strongly suggest writing your own Makefile; relying on the one that ships with the SDK is a bad idea in the long run.
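As a minimal sketch of the point about the architecture flag, the program below does one double-precision division on the device and prints it next to the host result. The file name, kernel name, and variable names are made up for illustration; the only parts taken from the discussion are the use of double and the -arch flag shown in the comment.

```cuda
// double_check.cu (hypothetical file name)
// Build for a Compute Capability 2.0 GPU, for example:
//   nvcc -arch=sm_20 -o double_check double_check.cu
#include <cstdio>
#include <cuda_runtime.h>

// Divides two doubles on the device. When the target is below sm_13,
// nvcc demotes doubles in device code to float and warns about it.
__global__ void divide(const double *a, const double *b, double *out)
{
    *out = *a / *b;
}

int main()
{
    const double hA = 1.0, hB = 3.0;
    double hOut = 0.0;
    double *dA = NULL, *dB = NULL, *dOut = NULL;

    cudaMalloc((void **)&dA, sizeof(double));
    cudaMalloc((void **)&dB, sizeof(double));
    cudaMalloc((void **)&dOut, sizeof(double));

    cudaMemcpy(dA, &hA, sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, &hB, sizeof(double), cudaMemcpyHostToDevice);

    divide<<<1, 1>>>(dA, dB, dOut);

    // This blocking copy also surfaces any error from the kernel launch.
    cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(double),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print both values with enough digits to compare them beyond
    // float precision.
    printf("device: %.17g\n", hOut);
    printf("host:   %.17g\n", hA / hB);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dOut);
    return 0;
}
```

If the two printed values agree to many more digits than a float can hold, the device code was most likely built with real double support. Building the same file without an -arch flag on CUDA 4.0 should reproduce the demotion warning.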
Can you say why that is? Is there a performance hit if I use the binary generated by the default Makefile? Thanks for your time.
EDIT: Also, are those two warnings (demoting to float) appearing because the default Makefile compiles for all possible architectures? Can you help me with that?
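One way to check this, sketched here under assumptions: build a small test file for several architectures and ask the running kernel which one it was compiled for, using the __CUDA_ARCH__ macro that nvcc defines in device code. The file name and kernel name are hypothetical, and the -gencode line in the comment is an example of a multi-architecture build, not necessarily what the SDK Makefile passes.

```cuda
/* arch_check.cu (hypothetical file name)
 * Example multi-architecture build. The sm_10 pass is where a
 * "double is not supported" warning would be expected:
 *   nvcc -gencode arch=compute_10,code=sm_10
 *        -gencode arch=compute_20,code=sm_20
 *        -o arch_check arch_check.cu
 */
#include <cstdio>
#include <cuda_runtime.h>

// Stores the architecture this copy of the device code was compiled for.
__global__ void reportArch(int *arch)
{
#ifdef __CUDA_ARCH__
    *arch = __CUDA_ARCH__;   // e.g. 100 for sm_10, 200 for sm_20
#endif
}

int main()
{
    int hArch = 0;
    int *dArch = NULL;
    cudaDeviceProp prop;

    // Query the Compute Capability of device 0.
    cudaGetDeviceProperties(&prop, 0);
    printf("device compute capability: %d.%d\n", prop.major, prop.minor);

    cudaMalloc((void **)&dArch, sizeof(int));
    cudaMemset(dArch, 0, sizeof(int));

    reportArch<<<1, 1>>>(dArch);

    cudaError_t err = cudaMemcpy(&hArch, dArch, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("kernel was compiled for __CUDA_ARCH__ = %d\n", hArch);

    cudaFree(dArch);
    return 0;
}
```

If the binary contains code for several architectures, the runtime picks the best match for the GPU. A demotion warning from the sm_10 pass would then not affect the code that actually runs on a Compute Capability 2.0 device, provided the reported value is 200.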