Hi, I recently learned that you can avoid JIT compilation when CUDA kernels are loaded by compiling for multiple specific targets. I am trying to target a GTS 250, a GTS 450 and a GTX 460, while ideally still supporting other cards through the JIT path. This works well for compute_11, compute_12, and compute_20. However, if I try to target compute_21, I get the following error.
The compiler line is:
/usr/local/cuda/bin/nvcc -D_DEBUG -gencode=arch=compute_11,\"code=sm_11,compute_11\" -gencode=arch=compute_20,\"code=sm_20,compute_20\" -gencode=arch=compute_21,\"code=sm_21,compute_21\" -I. -I/usr/local/cuda/include -I./Debug -DUNIX -g -o Debug//CudaWorker.cu_o -c CudaWorker.cu
The output is:
nvcc fatal : Unsupported gpu architecture 'compute_21'
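For reference, here is a minimal sketch of how the per-target flags could be assembled in a shell script. It is an illustration only, not my actual build: the variable names `ARCHS`, `GENCODE_FLAGS` and `q` are made up for this example, the list contains only the targets that compiled without error for me, and the escaped quotes are copied in the same form as in the command line above. The script only prints the resulting command and does not run nvcc.

```shell
#!/bin/sh
# Hypothetical helper: build one -gencode option per compute capability.
# ARCHS holds only the targets that compiled without error (11, 12, 20).
ARCHS="11 12 20"

# Literal backslash-quote, written the same way as in the command line above.
q='\"'

GENCODE_FLAGS=""
for cc in $ARCHS; do
    # Each option requests both the native binary (sm_XX) and the PTX
    # (compute_XX), so the PTX stays available for JIT on other cards.
    GENCODE_FLAGS="${GENCODE_FLAGS:+$GENCODE_FLAGS }-gencode=arch=compute_${cc},${q}code=sm_${cc},compute_${cc}${q}"
done

# Print the command instead of executing it.
printf '%s\n' "nvcc $GENCODE_FLAGS -c CudaWorker.cu"
```

Adding another entry to `ARCHS` adds one more `-gencode` option, which is how I ended up trying compute_21.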
My system setup is Ubuntu 10.04.2 with the 2.6.32-28-generic kernel, the 260.19.26 NVIDIA driver, and the latest CUDA 3.2.16 downloaded specifically for Ubuntu 10.04. I am trying to target a GTS 450 and a GTX 460, both of which report CUDA capability 2.1 when I poll the cards.
Anything I have done wrong?