__CUDA_ARCH__ really is defined in both the 3.2rc and 3.2 final toolkits. Take this code:
#include <stdio.h>

#ifndef __CUDA_ARCH__
#warning cuda arch not defined
#else
#warning everything is normal
#endif

__global__ void helloCUDA(const float f)
{
#if __CUDA_ARCH__ >= 200
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
#endif
}

int main()
{
    helloCUDA<<<1, 5>>>(1.2345f);
    return cudaThreadExit();
}
I can safely compile it with the 2.3 toolkit:
avidday@cuda:~$ module load cuda/2.3
avidday@cuda:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2009 NVIDIA Corporation
Built on Thu_Jul_30_09:24:36_PDT_2009
Cuda compilation tools, release 2.3, V0.2.1221
avidday@cuda:~$ nvcc -arch=sm_13 cudaprintf.cu
cudaprintf.cu:6:2: warning: #warning everything is normal
avidday@cuda:~$ ./a.out
Under the 3.2 toolkit for sm_13:
avidday@cuda:~$ module switch cuda/2.3 cuda/3.2rc
avidday@cuda:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Wed_Sep__8_17:12:45_PDT_2010
Cuda compilation tools, release 3.2, V0.2.1221
avidday@cuda:~$ nvcc -arch=sm_13 cudaprintf.cu
cudaprintf.cu:6:2: warning: #warning everything is normal
cudaprintf.cu:4:2: warning: #warning cuda arch not defined
avidday@cuda:~$ ./a.out
And under the 3.2 toolkit for sm_20:
avidday@cuda:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Wed_Sep__8_17:12:45_PDT_2010
Cuda compilation tools, release 3.2, V0.2.1221
avidday@cuda:~$ nvcc -arch=sm_20 cudaprintf.cu
cudaprintf.cu:6:2: warning: #warning everything is normal
cudaprintf.cu:4:2: warning: #warning cuda arch not defined
avidday@cuda:~$ ./a.out
Hello thread 0, f=1.234500
Hello thread 1, f=1.234500
Hello thread 2, f=1.234500
Hello thread 3, f=1.234500
Hello thread 4, f=1.234500
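If you run your nvcc compile commands with the --dryrun option, you can see that -D__CUDA_ARCH__ is passed to every gcc call in the device-side compilation. The "cuda arch not defined" warning in the 3.2 output comes from the host-side pass, where the macro is deliberately left undefined. I don’t see what the problem is.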
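If it helps to see the same thing at runtime rather than in compiler warnings, here is a minimal sketch. The names compiledArch and reportArch are made up for this example and are not part of any toolkit API. The idea is that a __host__ __device__ function is compiled once per pass, so the host copy and the device copy end up taking different branches of the #ifdef.

```cuda
#include <stdio.h>

// Hypothetical helper (not a toolkit function): the body is compiled once
// for the host and once for each device architecture, so each copy sees a
// different state of __CUDA_ARCH__.
__host__ __device__ int compiledArch()
{
#ifdef __CUDA_ARCH__
    return __CUDA_ARCH__;   // device pass: the macro carries the target arch
#else
    return 0;               // host pass: the macro is not defined
#endif
}

// Hypothetical kernel: stores the value seen by the device-side copy.
__global__ void reportArch(int *out)
{
    *out = compiledArch();
}

int main()
{
    int *d_out = 0;
    int h_out = -1;         // stays -1 if the kernel never ran

    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    reportArch<<<1, 1>>>(d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    if (cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "kernel launch or copy failed\n");
        cudaFree(d_out);
        return 1;
    }
    cudaFree(d_out);

    printf("host copy sees %d, device copy sees %d\n", compiledArch(), h_out);
    return 0;
}
```

The host copy should always report 0. The device value depends on the -arch you build with, so I would expect it to change between sm_13 and sm_20 builds in the same way the #if __CUDA_ARCH__ >= 200 test above does.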