CUDA and nvcc: using the preprocessor to choose between float and double

I have a header (.h) in which I want to define real as double when compiling for C/C++, or for CUDA with compute capability >= 1.3. When compiling for CUDA with compute capability < 1.3, real should be float.

I tried:

#   if defined(__CUDACC__)
#       warning * making definitions for cuda
#       if defined(__CUDA_ARCH__)
#           warning __CUDA_ARCH__ is defined
#       else
#           warning __CUDA_ARCH__ is NOT defined
#       endif
#       if (__CUDA_ARCH__ >= 130)
#           define real double
#           warning using double in cuda
#       elif (__CUDA_ARCH__ >= 0)
#           define real float
#           warning using float in cuda
#           warning how the hell is this printed when __CUDA_ARCH__ is not defined?
#       else
#           define real
#           error what the hell is the value of __CUDA_ARCH__ and how can I print it
#       endif
#   else
#       warning * making definitions for c/c++
#       define real double
#       warning using double for c/c++
#   endif

and compiled with:

nvcc -arch compute_13  -Ilibcutil testFloatDouble.cu

but it doesn’t work: in the second compilation pass, __CUDACC__ is defined but __CUDA_ARCH__ seems not to be:

* making definitions for cuda
__CUDA_ARCH__ is defined
using double in cuda
* making definitions for cuda
warning __CUDA_ARCH__ is NOT defined
warning using float in cuda
how the hell is this printed if __CUDA_ARCH__ is not defined now?
Undefined symbols for architecture i386:
  "myKernel(float*, int)", referenced from: ....

Any help is welcome. Thanks

This doesn’t work for host code, as __CUDA_ARCH__ is not defined there.
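That also accounts for the puzzling float warning on the host pass: inside an #if expression, any identifier that is not a defined macro is replaced by 0, so (__CUDA_ARCH__ >= 0) is evaluated as (0 >= 0) and is true. Here is a minimal sketch of that behaviour, plus a guarded chain that checks defined() first and a stringify helper for looking at a macro's value. HYPOTHETICAL_ARCH is a made-up name standing in for __CUDA_ARCH__ so that this builds with any C++ compiler; the helper names are my own too.

```cpp
#include <string>

// HYPOTHETICAL_ARCH stands in for __CUDA_ARCH__ during the host pass:
// it is deliberately never defined in this file.

// Unguarded chain, same shape as the header in the question.
#if (HYPOTHETICAL_ARCH >= 130)
#  define UNGUARDED_CHOICE "double"
#elif (HYPOTHETICAL_ARCH >= 0)
#  define UNGUARDED_CHOICE "float"      // taken: the undefined name counts as 0
#else
#  define UNGUARDED_CHOICE "none"
#endif

// Guarded chain: test defined() before comparing the value.
#if !defined(HYPOTHETICAL_ARCH)
#  define GUARDED_CHOICE "host pass"    // taken: nothing is known about the device here
#elif (HYPOTHETICAL_ARCH >= 130)
#  define GUARDED_CHOICE "double"
#else
#  define GUARDED_CHOICE "float"
#endif

// Two-level stringification: expands the macro first, then quotes the result.
// For an undefined name there is nothing to expand, so the name itself is quoted.
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

std::string unguarded_choice() { return UNGUARDED_CHOICE; }
std::string guarded_choice()   { return GUARDED_CHOICE; }
std::string arch_as_text()     { return STRINGIFY(HYPOTHETICAL_ARCH); }
```

Printing arch_as_text() from a test program shows either the numeric value or, when the macro is undefined, just its name, which is a quick way to tell the two passes apart.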

The rationale for leaving __CUDA_ARCH__ undefined in host code is that a single binary can include code for several different devices (using the --generate-code option of nvcc), so there is no way to know the compute capability of the device when the host code is compiled.

You’ll need to have host code for both cases, and then choose the right code path at runtime.
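A rough sketch of what that could look like, assuming the library can be written once as a template over the floating-point type. In a real CUDA program the two capability numbers would come from cudaGetDeviceProperties() (the major and minor fields of cudaDeviceProp); here they are plain parameters so the sketch builds without the toolkit. The names device_supports_double, run_library and run_on_device are made up for the illustration.

```cpp
#include <cstddef>

// Double precision on the device needs compute capability 1.3 or newer.
bool device_supports_double(int cc_major, int cc_minor) {
    return cc_major > 1 || (cc_major == 1 && cc_minor >= 3);
}

// One implementation, instantiated for both precisions. In the real library
// this is where Real buffers would be allocated and the matching kernel
// launched; the placeholder body only reports the buffer size in bytes.
template <typename Real>
std::size_t run_library(std::size_t n) {
    return n * sizeof(Real);
}

// Host-side dispatch: both code paths exist in the binary, and the choice
// is made at runtime from the capability of the device actually present.
std::size_t run_on_device(int cc_major, int cc_minor, std::size_t n) {
    return device_supports_double(cc_major, cc_minor)
        ? run_library<double>(n)
        : run_library<float>(n);
}
```

The point of the template is that the float and double variants cannot drift apart, and the host never has to guess at compile time which device it will meet.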

Thanks anyway. I’m porting a library whose functions use double for C/C++ or for CUDA with compute capability >= 1.3, and float otherwise.

Since __CUDA_ARCH__ is unusable for that, we’ll use a custom define (#define USE_FLOAT_IN_CUDA, for instance).
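For anyone finding this later, a minimal sketch of such a header, assuming the switch is passed on the nvcc command line as -DUSE_FLOAT_IN_CUDA. The include guard name and the real_name() helper are only there for illustration.

```cpp
#ifndef REAL_TYPE_H   // hypothetical include guard name
#define REAL_TYPE_H

// USE_FLOAT_IN_CUDA is a project-defined switch, not something nvcc sets.
// Because it does not depend on __CUDA_ARCH__, the host pass and the device
// pass of nvcc see the same definition of real.
#if defined(__CUDACC__) && defined(USE_FLOAT_IN_CUDA)
typedef float real;
#else
typedef double real;
#endif

// Small helper to report which type was selected (illustration only).
inline const char* real_name() {
    return sizeof(real) == sizeof(double) ? "double" : "float";
}

#endif // REAL_TYPE_H
```

One caveat with this layout: a .cpp file built by the plain host compiler still gets double, so any function taking real that is called across the .cpp/.cu boundary would mismatch when the switch is on. Dropping the __CUDACC__ test and passing the define to every compiler is one way around that.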