Supporting cards of different compute capabilities in a single executable

I did my initial development using a Quadro FX 5600, but I finally have a C1060 to play with. It seems that you cannot run a program built for compute capability 1.0 on a C1060.

First question:
Is that right? I originally expected that it wouldn’t run as fast as it could, but it would run. Is this a limitation with CUDA, or did I do something wrong?

Second question:
What options exist for targeting cards of different compute capabilities from a single executable? Note that I am not asking about using two different types of cards simultaneously. I just don’t want to make a different version of my software for each type of card. The SDK examples had a way to do it through macros, but I found that rather confusing. I think it would be pretty difficult to maintain as the program becomes more complex.

This sounds wrong. I compile my code for compute capability 1.0 (not setting any target options, so I get the default), and run the same library on compute capability 1.0, 1.1 and 1.3 cards. (Specifically the 8800 GTX, the 8800 GT, and an assortment of GTX 200 cards.)

What happens when you try to use your program on the C1060?

It ran without reporting an error; however, the answers were wrong. I am uninstalling and reinstalling the drivers to try again.

It’s possible your code was doing something incorrect or relying on incorrect assumptions.
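
On the second question, one option that avoids the macro approach is to build once for the lowest compute capability you support and pick a code path at run time from what the device reports. Here is a minimal host-side sketch in plain C++. It assumes the two numbers have already been read from the `major` and `minor` fields of `cudaDeviceProp` (as filled in by `cudaGetDeviceProperties`). The names `ComputeCapability`, `KernelPath`, `atLeast`, and `selectKernelPath` are hypothetical and only for illustration, and the three-way split is just one possible example, not something the SDK prescribes.

```cpp
#include <cassert>

// Hypothetical holder for the values a CUDA program would copy from
// cudaDeviceProp::major and cudaDeviceProp::minor.
struct ComputeCapability {
    int cc_major;
    int cc_minor;
};

// Hypothetical code paths; a real program would define its own.
enum class KernelPath { Baseline, GlobalAtomics, DoublePrecision };

// True when the device is at least compute capability req_major.req_minor.
bool atLeast(const ComputeCapability& cc, int req_major, int req_minor) {
    if (cc.cc_major != req_major) {
        return cc.cc_major > req_major;
    }
    return cc.cc_minor >= req_minor;
}

// Pick the most capable path the device supports, checking the highest
// requirement first so that newer cards do not fall into an older path.
KernelPath selectKernelPath(const ComputeCapability& cc) {
    assert(cc.cc_major >= 1 && cc.cc_minor >= 0);
    if (atLeast(cc, 1, 3)) {
        return KernelPath::DoublePrecision;  // double precision arrived with 1.3
    }
    if (atLeast(cc, 1, 1)) {
        return KernelPath::GlobalAtomics;    // 32-bit global atomics arrived with 1.1
    }
    return KernelPath::Baseline;             // safe on every 1.x device
}
```

With something like this, a Quadro FX 5600 (compute capability 1.0) would take the baseline path and a C1060 (1.3) the double-precision one, and the decision lives in one function instead of being spread across preprocessor macros. Any kernels that actually use 1.1 or 1.3 features would still have to be compiled for those targets.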