I’m working on a CUDA library for sparse matrix computations: https://code.google.com/p/spgpu/
I noticed that if I compile my CUDA sources (with nvcc from CUDA 5.5) using
the gencode parameter “compute_11,sm_11 compute_20,sm_20”, my test programs run without problems
on a card with compute capability 1.1 and on another card with compute capability 2.0.
But when I try to run the same binaries on a Kepler GPU, I get the error “invalid device function”.
If I add the gencode parameter “compute_30,sm_30”, it also works on the Kepler card (GTX 660).
It also works if I remove the gencode parameters entirely.
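
Spelled out, the three builds described above look roughly like the commands below. This is only a sketch: kernels.cu is a placeholder file name (not an actual spgpu source), and the full -gencode arch=...,code=... spelling is my expansion of the shorthand quoted above.

```shell
# Build 1: runs on the CC 1.1 and CC 2.0 cards, fails on the GTX 660
# with "invalid device function".
# (kernels.cu is a hypothetical placeholder file name.)
nvcc -gencode arch=compute_11,code=sm_11 \
     -gencode arch=compute_20,code=sm_20 \
     -c kernels.cu -o kernels.o

# Build 2: same as build 1 plus a Kepler target; also runs on the GTX 660.
nvcc -gencode arch=compute_11,code=sm_11 \
     -gencode arch=compute_20,code=sm_20 \
     -gencode arch=compute_30,code=sm_30 \
     -c kernels.cu -o kernels.o

# Build 3: no -gencode options at all (nvcc defaults); also runs on the GTX 660.
nvcc -c kernels.cu -o kernels.o
```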
My question: as far as I know, the PTX should be included in the output fat binary,
so why can’t a newer card’s driver compile PTX code targeted at older GPUs?
(The platform is Linux x86_64.)