Conditional Compilation (__CUDA_ARCH__)

Vectorizer · September 19, 2013, 11:55am

Greetings,

I would like to compile conditionally based on the SM value (sm_20, sm_21, sm_30, sm_35, etc.) under Visual Studio.
For testing, I set GPU Architecture(1) to sm_21 under Properties of both the .cu file and the project, and left all other GPU Architecture(?) fields at 0.
The compile statement has: … -gencode=arch=compute_20,code="sm_21,compute_20" …
and the __CUDA_ARCH__ macro is set to 200, not the 210 I was expecting.
Is there a way to check the sm_ value in the code with an #ifdef?

njuffa · September 19, 2013, 5:20pm

The compiler invocation specifies arch=compute_20, so __CUDA_ARCH__ will be defined as 200. As best I can tell, there is no compute_21 or sm_21 as a compiler-defined architecture, and therefore the predefined symbol __CUDA_ARCH__ cannot take the value 210.

Instead, the compiler uses arch=compute_20 for all platforms with compute capability 2.x. The reason is presumably that purely from an instruction set perspective there is no difference between compute capability 2.0 and compute capability 2.1.

One can, however, tweak the code generation for a GPU with compute capability 2.1, which already happens in the example above: code="sm_21,compute_20".
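As a minimal sketch of what the predefined symbol does allow, the following branches on the "arch" value only. The names `HOST_DEVICE`, `compiled_arch`, and `items_per_thread`, and the returned numbers, are illustrative assumptions and not part of CUDA or of this thread. When the file is built as plain C++ or in nvcc's host pass, `__CUDA_ARCH__` is undefined and the host branches are taken; in a device pass for arch=compute_20 the macro would be 200 for both 2.0 and 2.1 targets.

```cpp
// HOST_DEVICE is a hypothetical convenience macro so this file also builds as plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Returns the value of __CUDA_ARCH__ for the compilation pass that produced this code,
// or 0 in host code, where the macro is not defined.
HOST_DEVICE inline int compiled_arch()
{
#if defined(__CUDA_ARCH__)
    return __CUDA_ARCH__;
#else
    return 0;
#endif
}

// Illustrative arch-dependent choice; the values 8, 4, and 1 are made up.
HOST_DEVICE inline int items_per_thread()
{
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 300)
    return 8;   // compute_30 and newer (Kepler path)
#elif defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 200)
    return 4;   // compute_20: covers both sm_20 and sm_21
#else
    return 1;   // host pass, macro undefined
#endif
}
```

Note that the second branch cannot tell sm_20 from sm_21, which is exactly the limitation described above.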

Vectorizer · September 19, 2013, 8:00pm

Thanks for the responses. That tweak comes from the standard installation. The pulldown menus in the property pages give me the 2.1 option, yet there is no straightforward way to detect it with a simple #if WHATEVER == sm_21 or something similar.

njuffa · September 19, 2013, 8:31pm

As far as I know, that is correct: while the “arch” setting is mapped to __CUDA_ARCH__, there is no equivalent mapping for the “code” setting.

What is the specific use case that would make that desirable, i.e. one that requires distinguishing between sm_20 and sm_21 at the source code level?
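If such a distinction were ever needed, one possible workaround is to pass a macro of your own with nvcc's `-D` option (for example `-DTARGET_SM=21`). This is only a sketch under assumptions: `TARGET_SM` and `tuned_for_sm21` are hypothetical names that the compiler knows nothing about, and the approach only makes sense when a given build targets a single “code” value.

```cpp
// TARGET_SM is a hypothetical, user-supplied macro (e.g. nvcc -DTARGET_SM=21 ...).
// The compiler never defines it; default to 0 ("unknown") when it is absent.
#ifndef TARGET_SM
#define TARGET_SM 0
#endif

// True only when the build was explicitly tagged as tuned for sm_21.
inline bool tuned_for_sm21()
{
#if TARGET_SM == 21
    return true;
#else
    return false;
#endif
}
```

Keeping the macro in step with the actual “code” setting is then the build system's responsibility; nothing checks that the two agree.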

Vectorizer · September 19, 2013, 8:47pm

Nothing concrete at the moment. But since the pulldown menus make that distinction, I should be able to catch it in the code. I can enclose the new Kepler code with __CUDA_ARCH__ >= 300.
I guess my question is why the pulldown menus let you specify sm_<major><minor> if there are no differences?
Thanks

njuffa · September 20, 2013, 2:32am

While compute capability 2.0 and 2.1 share the same instruction set, there are differences in hardware organization between GPUs with those two compute capabilities. The “arch” flag instructs the compiler to generate instructions from the sm_2x instruction set; the “code” flag can then be used to tweak the code generation (e.g. instruction selection, instruction scheduling) differently for sm_20 and sm_21. The same concept exists in other compilers. For example, gcc has -march and -mtune flags, where -march selects an ISA to target and -mtune tweaks the code for specific CPUs using that ISA.

The pulldown menu presumably gives you these choices for programmer convenience, so if you know you have a GPU with compute capability 2.1, you can simply select that.

I have never checked how much difference there is between code tweaked for sm_20 and code tweaked for sm_21. As far as I understand the differences between these two HW architectures, I would expect the generated code for the two targets to look quite similar, and performance-wise it is likely a second order effect.
