Fermi Flag


Is it possible to check the architecture of the GPU at runtime, without cudaGetDeviceProperties?

I am looking for a flag that I can check before I execute Fermi-specific functions, for better compatibility with the Tesla GPUs.

For example:

#ifdef FERMI




Your example would be a compile-time check. That’s possible using the __CUDA_ARCH__ preprocessor macro:

#if __CUDA_ARCH__ >= 200



Once you have that, you can compile the kernel for multiple architectures using the -gencode option to nvcc, so that code best matching the actual GPU is selected at runtime.
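
A minimal sketch of the approach described above, assuming a hypothetical kernel named archPath and a toolkit of that era that still supports sm_10 and sm_20 targets. The kernel is compiled once per architecture, and each copy takes a different branch of the __CUDA_ARCH__ check:

```cuda
// Build (assumed command line for a CUDA 3.x toolkit):
//   nvcc -gencode arch=compute_10,code=sm_10 \
//        -gencode arch=compute_20,code=sm_20 arch.cu -o arch
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: records which code path was compiled in.
__global__ void archPath(int *out)
{
#if __CUDA_ARCH__ >= 200
    *out = 1;   // Fermi (compute capability 2.0) or newer path
#else
    *out = 0;   // fallback path for older GPUs
#endif
}

int main()
{
    int *d_out = 0;
    int h_out = -1;

    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess) return 1;

    archPath<<<1, 1>>>(d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    if (cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) return 1;

    printf("Kernel used the %s path\n", h_out == 1 ? "Fermi" : "pre-Fermi");

    cudaFree(d_out);
    return 0;
}
```

The host code contains no architecture check at all; which version of the kernel runs is decided when the kernel image is loaded for the GPU in use.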

Thank you for the quick response! Your solution works fine…
Is there a flag for host code too, or only for device code?

__CUDA_ARCH__ only works in device code.

I would recommend testing if CUDART_VERSION >= 3000. That will allow your code to compile on older versions of CUDA, and it won’t hurt to call the cache config function on a GPU that doesn’t support it. The call will just be ignored.
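
A rough sketch of that guard in host code. The kernel name myKernel and the choice of cudaFuncCachePreferL1 are assumptions for illustration only:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, only here so there is something to configure.
__global__ void myKernel(float *data)
{
    data[threadIdx.x] *= 2.0f;
}

int main()
{
#if CUDART_VERSION >= 3000
    // Only compiled when the toolkit headers declare cudaFuncSetCacheConfig.
    cudaError_t err = cudaFuncSetCacheConfig(myKernel, cudaFuncCachePreferL1);
    printf("cudaFuncSetCacheConfig: %s\n", cudaGetErrorString(err));
#else
    printf("Toolkit older than CUDA 3.0, cache config call skipped\n");
#endif
    return 0;
}
```

CUDART_VERSION comes from the runtime headers, so this check reflects the toolkit used to compile, not the GPU the program later runs on.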

You could certainly query the device type from host code using cuDeviceComputeCapability().
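
A minimal sketch of such a query through the driver API, assuming device ordinal 0 is the GPU of interest and that the program is linked against the driver library (for example with -lcuda). Later toolkits mark cuDeviceComputeCapability() as deprecated in favor of cuDeviceGetAttribute(), so treat this as matching the CUDA 3.x era of this thread:

```cuda
#include <cstdio>
#include <cuda.h>

int main()
{
    CUdevice dev;
    int major = 0, minor = 0;

    // The driver API must be initialized before any other driver call.
    if (cuInit(0) != CUDA_SUCCESS) {
        fprintf(stderr, "cuInit failed\n");
        return 1;
    }
    // Assumption: device 0 is the GPU we want to inspect.
    if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) {
        fprintf(stderr, "cuDeviceGet failed\n");
        return 1;
    }
    if (cuDeviceComputeCapability(&major, &minor, dev) != CUDA_SUCCESS) {
        fprintf(stderr, "cuDeviceComputeCapability failed\n");
        return 1;
    }

    printf("Compute capability %d.%d (%s)\n", major, minor,
           major >= 2 ? "Fermi or newer" : "pre-Fermi");
    return 0;
}
```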

However, from a maintainability standpoint, if your kernel code has different capabilities on different CUDA devices, I’d rather suggest to announce those capabilities from the device code to the host code. E.g., create a variable in the device with flags for different functions available, and read that from the host code. That way you will never run into trouble by accidentally querying the wrong device in a multi-GPU setup, the driver selecting a code-blob other than the one you expect, or changing device code without updating host code.
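
One way this could look, as a sketch only. The flag name CAP_FERMI_PATH, the variable d_caps and the kernel reportCaps are all made up for illustration. A tiny kernel writes the flags that match the code actually compiled for the running GPU, and the host reads them back:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical capability bit announced by the device code.
#define CAP_FERMI_PATH 0x1u

// Device-side variable holding the capability flags.
__device__ unsigned int d_caps;

// Hypothetical kernel: each compiled version reports what it supports.
__global__ void reportCaps()
{
    unsigned int caps = 0;
#if __CUDA_ARCH__ >= 200
    caps |= CAP_FERMI_PATH;
#endif
    d_caps = caps;
}

int main()
{
    unsigned int caps = 0;

    reportCaps<<<1, 1>>>();
    if (cudaGetLastError() != cudaSuccess) return 1;

    // Copies from the device symbol to host memory after the kernel completes.
    if (cudaMemcpyFromSymbol(&caps, d_caps, sizeof(caps)) != cudaSuccess) return 1;

    if (caps & CAP_FERMI_PATH)
        printf("Device code reports the Fermi path is available\n");
    else
        printf("Device code reports the fallback path only\n");
    return 0;
}
```

Because the flags come from the same compiled image that will run the real kernels, the host never has to guess which device or code version was selected.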

The problem is that it is not possible to call the function “cudaFuncSetCacheConfig(…)” from device code, only from host code.





If I compile it on the system with the GTX480, the compilation succeeds. If I compile it on the system with the GTX285, I get a compilation error:

error: identifier "cudaFuncSetCacheConfig" is undefined

How can I compile this code for both the GTX480 and the GTX285?

I use CUDA Build Rule v3.0.14 and compile for sm_10 and sm_20…


Read my post again… You apparently do not have CUDA 3.0 installed on the GTX 285 system and need the #if CUDART_VERSION >= 3000 check for your code to compile with older versions of CUDA.

Both systems have CUDA 3.0, but the toolkit on the GTX285 system is from 01.2010 and the one on the GTX480 system is from 03.2010. It’s crazy: both toolkits have the same version number 3.0, but different versions of the files.
I installed the new version of CUDA 3.0 on the GTX285 system and now it compiles successfully on both systems. The “cudaFuncSetCacheConfig(…)” call is ignored on the GTX285 system and it works fine.

Thank you!!!